How to Extract Audio from Video with FFmpeg (MP3, WAV, and API Guide)

Javid Jamae · 6 min read
You recorded a webinar, a YouTube video, or a product demo. Now you need the audio track as a standalone file. Maybe you're repurposing video content as podcast episodes. Maybe your app needs to strip audio from user uploads for transcription. FFmpeg handles this in a single command, but getting the codec and container options right takes some trial and error.

How to Extract Audio from Video with FFmpeg

The core command is simple. The -vn flag tells FFmpeg to drop the video stream. FFmpeg infers the output container from the file extension, and the -c:a flag selects the audio codec.

ffmpeg -i input.mp4 -vn -c:a libmp3lame -b:a 192k output.mp3

This extracts the audio from input.mp4, re-encodes it as MP3 at 192 kbps, and writes output.mp3. The -c:a libmp3lame flag selects the LAME MP3 encoder, and -b:a 192k sets a bitrate that balances quality and file size.

If you just need the raw audio without re-encoding, use -c:a copy to stream-copy the existing audio codec:

ffmpeg -i input.mp4 -vn -c:a copy output.aac

Stream copying is faster because FFmpeg skips the decode/encode cycle entirely. But the output container must support whatever codec the source video uses. An MP4 with AAC audio can stream-copy to .aac or .m4a, but not directly to .mp3.
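One way to avoid guessing the extension is to ask ffprobe for the source codec and pick a container from it. The sketch below assumes a small, illustrative codec-to-extension table. The helper names ext_for_codec and copy_audio are hypothetical, and any codec not listed falls back to Matroska audio (.mka).

```shell
# Map an audio codec name (as reported by ffprobe) to a file extension
# whose container can hold that codec without re-encoding.
# Illustrative mapping only; it covers a handful of common codecs.
ext_for_codec() {
  case "$1" in
    aac)    echo "m4a" ;;
    mp3)    echo "mp3" ;;
    opus)   echo "opus" ;;
    vorbis) echo "ogg" ;;
    flac)   echo "flac" ;;
    *)      echo "mka" ;;  # Matroska audio accepts most common codecs
  esac
}

# Stream-copy the first audio track of $1 into a matching container.
copy_audio() {
  local input="$1" codec
  codec=$(ffprobe -v error -select_streams a:0 \
    -show_entries stream=codec_name \
    -of default=noprint_wrappers=1:nokey=1 "$input")
  ffmpeg -i "$input" -vn -c:a copy "${input%.*}.$(ext_for_codec "$codec")"
}
```

Calling copy_audio input.mp4 on an MP4 with AAC audio would write input.m4a without re-encoding.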

Extract Audio as WAV (Uncompressed)

WAV is the go-to when you need lossless audio for editing, transcription pipelines, or feeding into speech-to-text models like Whisper.

ffmpeg -i input.mp4 -vn -c:a pcm_s16le output.wav

The pcm_s16le codec produces 16-bit signed little-endian PCM, which is what most audio tools expect. WAV files are large: at 44.1 kHz stereo, a 5-minute video produces roughly 50 MB of uncompressed audio. If storage matters, use FLAC instead for lossless compression, typically at about half the size:

ffmpeg -i input.mp4 -vn -c:a flac output.flac
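The WAV size estimate above follows from simple arithmetic: sample rate times channels times bytes per sample times duration. As a rough sketch, the hypothetical helper wav_bytes below computes it. It ignores the small WAV header, and the 44.1 kHz stereo input is an assumption about the source.

```shell
# Estimate uncompressed PCM size in bytes:
#   sample_rate * channels * (bit_depth / 8) * duration_seconds
# The small WAV header is ignored.
wav_bytes() {
  local rate="$1" channels="$2" bits="$3" seconds="$4"
  echo $(( rate * channels * bits / 8 * seconds ))
}

wav_bytes 44100 2 16 300   # 5 min, 44.1 kHz stereo: 52920000 bytes (~53 MB)
wav_bytes 16000 1 16 300   # 5 min, 16 kHz mono:      9600000 bytes (~9.6 MB)
```

The second line shows why resampling to 16 kHz mono before transcription cuts storage to under a fifth of the stereo figure.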

Convert Video to MP3 with Custom Quality

MP3 is still the most universal audio format. For podcasts and general distribution, 128-192 kbps covers most use cases. For music or high-fidelity content, bump it to 320 kbps:

ffmpeg -i input.mp4 -vn -c:a libmp3lame -b:a 320k output.mp3

You can also use FFmpeg's variable bitrate mode with -q:a instead of -b:a. Lower values mean higher quality:

ffmpeg -i input.mp4 -vn -c:a libmp3lame -q:a 2 output.mp3

A -q:a value of 2 averages around 190 kbps and typically produces better quality than a fixed 192k bitrate at roughly the same file size, because the encoder spends more bits on complex passages and fewer on simple ones.
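The average bitrate a VBR encode actually lands on depends on the content, so it can be worth measuring. A minimal sketch, assuming ffprobe is installed alongside FFmpeg; the helper name avg_kbps is hypothetical:

```shell
# Print the overall bitrate of a media file in kbps, as reported by ffprobe.
# Returns non-zero if ffprobe does not report a numeric bitrate.
avg_kbps() {
  local bps
  bps=$(ffprobe -v error -show_entries format=bit_rate \
    -of default=noprint_wrappers=1:nokey=1 "$1")
  case "$bps" in
    ''|*[!0-9]*) return 1 ;;   # empty or non-numeric (e.g. N/A)
  esac
  echo $(( bps / 1000 ))
}
```

Running avg_kbps output.mp3 after the -q:a 2 command prints the measured average for that particular file.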

Common Gotchas When Extracting Audio

Container vs. codec mismatch. The most common error is trying to stream-copy audio into an incompatible container. If your source has Opus audio and you try -c:a copy output.mp3, FFmpeg will fail. Either re-encode or pick a container that supports the source codec (.ogg or .webm for Opus).

No audio stream. Some screen recordings and generated videos don't have an audio track at all. FFmpeg will error with "Output file does not contain any stream." Check first with ffprobe:

ffprobe -v quiet -show_streams -select_streams a input.mp4

If this returns nothing, the video has no audio to extract.
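In a script, the same check can act as a guard before extraction. The sketch below is one way to do it; the function names has_audio and extract_mp3 are hypothetical, and the MP3 settings simply reuse the 192 kbps example from earlier.

```shell
# Succeeds (exit 0) only when the file has at least one audio stream.
has_audio() {
  [ -n "$(ffprobe -v error -select_streams a \
      -show_entries stream=index -of csv=p=0 "$1")" ]
}

# Extract MP3 audio next to the input file, or skip files with no audio.
extract_mp3() {
  if has_audio "$1"; then
    ffmpeg -i "$1" -vn -c:a libmp3lame -b:a 192k "${1%.*}.mp3"
  else
    echo "skipping $1: no audio stream" >&2
    return 1
  fi
}
```

Because extract_mp3 returns a non-zero status for silent files, a calling loop can count or log skipped inputs instead of stopping on an FFmpeg error.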

Sample rate mismatches. When feeding extracted audio into pipelines that expect a specific sample rate (like 16kHz for Whisper), set it explicitly:

ffmpeg -i input.mp4 -vn -c:a libmp3lame -ar 16000 -ac 1 -b:a 64k output.mp3

The -ar 16000 flag resamples to 16kHz, and -ac 1 downmixes to mono.
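The same two flags combine with the WAV codec from the earlier section for pipelines that want uncompressed 16 kHz mono input. A small sketch, with a hypothetical helper name and output naming scheme:

```shell
# Convert the audio track of $1 to 16 kHz mono 16-bit PCM WAV.
# Output is written next to the input with a _16k.wav suffix.
to_speech_wav() {
  ffmpeg -i "$1" -vn -c:a pcm_s16le -ar 16000 -ac 1 "${1%.*}_16k.wav"
}
```

For example, to_speech_wav webinar.mp4 would write webinar_16k.wav.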

Extract Audio via API (No FFmpeg Installation)

All those CLI commands work great on your local machine. But if you're building an app that needs to extract audio from user uploads, or automating extraction across hundreds of videos, you don't want to install FFmpeg on a server and manage the infrastructure.

FFmpeg Micro is a cloud API that handles this with a single HTTP request. Send the video URL and your desired output format:

curl -X POST "https://www.ffmpeg-micro.com/v1/transcodes" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [{"url": "https://example.com/video.mp4"}],
    "outputFormat": "mp3",
    "options": [
      {"option": "-vn", "argument": ""},
      {"option": "-b:a", "argument": "192k"}
    ]
  }'

The API queues the job, processes it in the cloud, and gives you a download URL when it's done. No FFmpeg binary to install. No server to scale. No codec dependencies to manage.

For WAV extraction, swap the format:

curl -X POST "https://www.ffmpeg-micro.com/v1/transcodes" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "inputs": [{"url": "https://example.com/video.mp4"}],
    "outputFormat": "wav",
    "options": [
      {"option": "-vn", "argument": ""}
    ]
  }'

Poll GET /v1/transcodes/{id} until the status is completed, then grab the result from GET /v1/transcodes/{id}/download.
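The polling step might look like the sketch below. It rests on several assumptions not confirmed by this guide: that the response is JSON with a string field named "status", that a failed job reports "failed", and that the key is in an API_KEY environment variable. The function names are hypothetical, and jq would be a sturdier parser than sed if it is installed.

```shell
API_BASE="https://www.ffmpeg-micro.com/v1"

# Read a JSON body on stdin and print the value of its "status" string field.
# Assumption: the API response includes a field with that name.
json_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Poll a transcode job until it completes.
# $1 = job id, $2 = max attempts (default 60), 5 seconds between attempts.
wait_for_transcode() {
  local id="$1" attempts="${2:-60}" status
  while [ "$attempts" -gt 0 ]; do
    status=$(curl -s -H "Authorization: Bearer $API_KEY" \
      "$API_BASE/transcodes/$id" | json_status)
    case "$status" in
      completed) return 0 ;;
      failed)    return 1 ;;   # hypothetical failure status
    esac
    attempts=$((attempts - 1))
    sleep 5
  done
  return 1   # gave up before the job completed
}
```

Once wait_for_transcode returns successfully, request GET /v1/transcodes/{id}/download with the same Authorization header to retrieve the result.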

Batch Audio Extraction with the API

The API approach really pays off with batch processing. Say you have 50 YouTube videos and you need MP3 versions of each for a podcast feed. With CLI FFmpeg, you'd need a script, a server, and error handling for every edge case.

With the API, loop through your URLs:

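# Placeholder URLs; replace them with your own video links
for VIDEO_URL in \
  "https://example.com/video1.mp4" \
  "https://example.com/video2.mp4"; do
  curl -s -X POST "https://www.ffmpeg-micro.com/v1/transcodes" \
    -H "Authorization: Bearer YOUR_API_KEY" \
    -H "Content-Type: application/json" \
    -d "{
      \"inputs\": [{\"url\": \"$VIDEO_URL\"}],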
      \"outputFormat\": \"mp3\",
      \"options\": [
        {\"option\": \"-vn\", \"argument\": \"\"},
        {\"option\": \"-b:a\", \"argument\": \"192k\"}
      ]
    }"
done

Each job runs in parallel on cloud infrastructure. No queue management. No server provisioning. You pay per minute of audio processed.
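For longer lists, it may be easier to keep the URLs in a text file, one per line. The sketch below assumes a hypothetical file such as urls.txt and an API_KEY environment variable. The helper names are illustrative, and the payload builder assumes URLs contain no double quotes or backslashes.

```shell
# Build the JSON body for one MP3 extraction job.
# Assumes the URL contains no double quotes or backslashes.
build_payload() {
  printf '{"inputs": [{"url": "%s"}], "outputFormat": "mp3", "options": [{"option": "-vn", "argument": ""}, {"option": "-b:a", "argument": "192k"}]}' "$1"
}

# Submit one job per non-empty line of the file given as $1.
# Each API response is printed on its own line.
submit_all() {
  local url
  while IFS= read -r url; do
    [ -n "$url" ] || continue
    curl -s -X POST "https://www.ffmpeg-micro.com/v1/transcodes" \
      -H "Authorization: Bearer $API_KEY" \
      -H "Content-Type: application/json" \
      -d "$(build_payload "$url")"
    echo
  done < "$1"
}
```

Running submit_all urls.txt > responses.txt keeps a record of every response, which is handy for looking up job IDs later.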

For a deeper dive on working with audio in FFmpeg, check out the Learn FFmpeg: Audio lesson.

FAQ

Does FFmpeg preserve audio quality when extracting?

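With -c:a copy, FFmpeg stream-copies the exact audio data with zero quality loss. Re-encoding to a lossy format such as MP3 involves a generation loss, but at 192 kbps and above it's imperceptible for speech and most music. Converting to WAV or FLAC adds no further loss beyond what the source codec already introduced.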

What audio format should I use for podcast distribution?

MP3 at 128 kbps mono for spoken word, 192 kbps stereo for music-heavy podcasts. MP3 has universal player support. AAC is technically better at the same bitrate, but some older podcast apps don't handle it well.

Can I extract just a portion of the audio?

Yes. Use -ss to set the start time and -t for duration: ffmpeg -i input.mp4 -ss 00:01:00 -t 00:02:00 -vn -c:a libmp3lame -b:a 192k clip.mp3. This extracts 2 minutes of audio starting at the 1-minute mark.
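On long recordings, placing -ss before -i makes FFmpeg seek in the input rather than decode everything up to the start point, which is usually faster. A minimal sketch of that variant, with a hypothetical helper name:

```shell
# Extract a clip as MP3.
#   $1 = input file, $2 = start time, $3 = duration, $4 = output file
# -ss comes before -i, so FFmpeg seeks in the input instead of
# decoding from the beginning of the file.
clip_audio() {
  ffmpeg -ss "$2" -i "$1" -t "$3" -vn -c:a libmp3lame -b:a 192k "$4"
}
```

For example, clip_audio input.mp4 00:01:00 00:02:00 clip.mp3 produces the same 2-minute clip as the command above.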

How do I extract audio from multiple videos automatically?

Use the FFmpeg Micro API to submit extraction jobs in parallel via HTTP requests. Each job runs independently on cloud infrastructure, so you can process hundreds of videos without managing a server. Sign up for the free tier to get started.

What's the difference between -c:a copy and re-encoding?

Stream copy (-c:a copy) preserves the original codec and quality but limits your output format to containers compatible with that codec. Re-encoding lets you target any format (MP3, WAV, FLAC) but takes longer, and lossy targets such as MP3 introduce a small quality loss.

About Javid Jamae

Founder & CEO at FFmpeg Micro

Javid is a software engineer, author, and entrepreneur with over 25 years of professional software development experience across enterprise, startup, and consulting environments. He founded FFmpeg Micro to make video processing accessible to developers through a simple, automation-first REST API.
