ScribbleGhost/Converting audio to AAC with Fraunhofer FDK AAC (libfdk_aac) in FFmpeg.md

Last active March 30, 2025 17:15

Converting audio to AAC with Fraunhofer FDK AAC (libfdk_aac) in FFmpeg

Check if you have an FFmpeg build supporting libfdk_aac

Run:

ffmpeg -hide_banner -h encoder=libfdk_aac

If you have an FFmpeg version that does not include libfdk_aac, you will see this:

Codec 'libfdk_aac' is not recognized by FFmpeg.

If you have a build that includes libfdk_aac you will see this:

Encoder libfdk_aac [Fraunhofer FDK AAC]:
    General capabilities: delay small 
    Threading capabilities: none
    Supported sample rates: 96000 88200 64000 48000 44100 32000 24000 22050 16000 12000 11025 8000
    Supported sample formats: s16
    Supported channel layouts: mono stereo 3.0 4.0 5.0 5.1 7.1(wide) 7.1
libfdk_aac AVOptions:
  -afterburner       <int>        E...A...... Afterburner (improved quality) (from 0 to 1) (default 1)
  -eld_sbr           <int>        E...A...... Enable SBR for ELD (for SBR in other configurations, use the -profile parameter) (from 0 to 1) (default 0)
  -eld_v2            <int>        E...A...... Enable ELDv2 (LD-MPS extension for ELD stereo signals) (from 0 to 1) (default 0)
  -signaling         <int>        E...A...... SBR/PS signaling style (from -1 to 2) (default default)
     default         -1           E...A...... Choose signaling implicitly (explicit hierarchical by default, implicit if global header is disabled)
     implicit        0            E...A...... Implicit backwards compatible signaling
     explicit_sbr    1            E...A...... Explicit SBR, implicit PS signaling
     explicit_hierarchical 2            E...A...... Explicit hierarchical signaling
  -latm              <int>        E...A...... Output LATM/LOAS encapsulated data (from 0 to 1) (default 0)
  -header_period     <int>        E...A...... StreamMuxConfig and PCE repetition period (in frames) (from 0 to 65535) (default 0)
  -vbr               <int>        E...A...... VBR mode (1-5) (from 0 to 5) (default 0)
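
For scripts, the same check can be automated. This is a minimal sketch assuming a POSIX shell with awk. The function name has_encoder is made up for illustration, and it relies on `ffmpeg -encoders` printing the encoder name in the second column of each line.

```shell
# Succeed (exit status 0) if the local ffmpeg lists the given encoder.
# has_encoder is a hypothetical helper name, not part of FFmpeg.
has_encoder() {
    # -encoders prints one line per encoder: flags first, then the name.
    ffmpeg -hide_banner -encoders 2>/dev/null |
        awk -v enc="$1" '$2 == enc { found = 1 } END { exit !found }'
}

# Usage:
#   has_encoder libfdk_aac && echo "libfdk_aac is available"
```

If ffmpeg is missing altogether, the pipeline produces no lines and the function simply reports failure.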

How to get an FFmpeg build with libfdk_aac

FFmpeg supports two AAC-LC encoders (aac and libfdk_aac) and one HE-AAC (v1/2) encoder (libfdk_aac). The license of libfdk_aac is not compatible with the GPL, so binaries that combine it with GPL-licensed code may not be distributed. The encoder has therefore been designated "non-free", and you cannot download a pre-built ffmpeg that supports it. The workaround is to compile ffmpeg yourself.
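
To confirm that a self-compiled build was configured as intended, you can inspect its configure flags. This is a rough sketch: the helper name built_with is hypothetical, and the two flags in the usage comment are the ones visible in the build configuration a commenter posted further down this page.

```shell
# Succeed if the local ffmpeg was configured with the given flag.
# built_with is a hypothetical helper name, not part of FFmpeg.
built_with() {
    # -buildconf prints the configure flags the binary was built with.
    ffmpeg -hide_banner -buildconf 2>/dev/null | grep -qF -e "$1"
}

# Usage (a libfdk_aac build needs both flags):
#   built_with --enable-libfdk-aac && built_with --enable-nonfree && echo "ok"
```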

My way of building a custom FFmpeg

I set up a clean install of Windows 10 in a VM and run media-autobuild_suite: https://github.com/m-ab-s/media-autobuild_suite

My go-to preset for highest quality regardless of file size

ffmpeg -i input.wav -ac 2 -c:a libfdk_aac -cutoff 20000 -afterburner 1 -vbr 0 output.m4a

-ac 2 Downmix to a stereo track

-c:a libfdk_aac Use Fraunhofer FDK AAC (libfdk_aac).

-cutoff 20000 Raises the low-pass filter, which libfdk_aac sets to around 14 kHz by default. 20000 Hz is the maximum available.

-afterburner 1 Afterburner is "a type of analysis by synthesis algorithm which increases the audio quality but also the required processing power." Fraunhofer recommends always activating this feature. 1 = On and 0 = Off.

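-vbr 0 Setting -vbr to 0 disables VBR (variable bitrate) mode, so libfdk_aac encodes at a constant bitrate (CBR). The target bitrate comes from -b:a; when -b:a is omitted, as in the command above, the encoder picks a default based on the profile, sample rate and channel count rather than the maximum available. For the best quality regardless of file size, add a high -b:a value such as 256k, at the cost of larger files.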
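
To apply these options to a whole folder, a small loop is enough. The sketch below is an illustration under assumptions: encode_dir is a made-up name, the 256k default bitrate is borrowed from the tests in the comments rather than from the preset itself, and `-n` tells ffmpeg never to overwrite an existing output file.

```shell
# Encode every .wav file in a directory to .m4a next to the source file.
# encode_dir is a hypothetical helper; 256k is an assumed default bitrate.
encode_dir() {
    dir=${1:-.}
    rate=${2:-256k}
    for src in "$dir"/*.wav; do
        # With no .wav files the glob stays literal, so skip it.
        [ -e "$src" ] || continue
        ffmpeg -hide_banner -n -i "$src" -ac 2 -c:a libfdk_aac \
            -cutoff 20000 -afterburner 1 -b:a "$rate" "${src%.wav}.m4a"
    done
}

# Usage:
#   encode_dir ~/Music/rips 256k
```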

ddelange commented Sep 14, 2023

Hi 👋

Some testing I did using spek for my tool yt:

tl;dr:

- aac_at outperforms libfdk_aac
  - same size but much more detail at CBR@256
  - best libfdk_aac setting is CBR@256 (VBR performs consistently worse)
  - libfdk_aac CBR@256 (12.6 MB) contains even less detail than aac_at @ aac_he_v2 (3.1 MB) 🔥
- aac_at @ aac_he_v2 is really impressive size/quality-wise
- best setting for me is aac_at CBR@256, or VBR@q2 for 9% size savings with virtually no degradation

input.wav

84.9 MB

libfdk_aac (--enable-nonfree)

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -cutoff 20000 -b:a 256k -afterburner 1 -vbr 0 output.m4a
12.6 MB

-afterburner 1 -vbr 0 is the default setting, can be removed

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -b:a 256k -afterburner 1 -vbr 0 output.m4a
12.6 MB

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -b:a 256k -afterburner 1 output.m4a
12.6 MB

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -b:a 256k output.m4a
12.6 MB

VBR performs consistently worse

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -vbr 5 -cutoff 20000 output.m4a
10.4 MB

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -vbr 1 -cutoff 20000 output.m4a
7.3 MB

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -vbr 5 output.m4a
10.5 MB

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -vbr 1 output.m4a
5.2 MB

aac_at (--enable-audiotoolbox)

CBR

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -b:a 256k output.m4a
12.6 MB

VBR

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -q:a 0 -aac_at_mode vbr output.m4a
15.6 MB

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -q:a 1 -aac_at_mode vbr output.m4a
13.6 MB

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -q:a 2 -aac_at_mode vbr output.m4a
11.5 MB

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -q:a 3 -aac_at_mode vbr output.m4a
10.5 MB

aac_he

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -profile:a 4 output.m4a
4.2 MB

aac_he_v2

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -profile:a 28 output.m4a
3.1 MB

ABR

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -b:a 256k -aac_at_mode abr output.m4a
12.6 MB

CVBR

ffmpeg -i input.wav -vn -ac 2 -c:a aac_at -b:a 256k -aac_at_mode cvbr output.m4a
13.6 MB
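
File sizes like the ones listed above can be collected with a small helper when repeating such a comparison. This is only a sketch: size_mb is a hypothetical name, and it assumes 1 MB = 1,000,000 bytes, which may not be how the figures above were measured.

```shell
# Print the size of a file in megabytes with one decimal place.
# Assumption: 1 MB = 1,000,000 bytes.
size_mb() {
    bytes=$(wc -c < "$1")
    awk -v b="$bytes" 'BEGIN { printf "%.1f MB\n", b / 1000000 }'
}

# Usage:
#   for f in *.m4a; do printf '%s: ' "$f"; size_mb "$f"; done
```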

ddelange commented Sep 14, 2023

"the only that matters when stepping down from lossless to lossy) is compare the perceived quality with double-blind tests"

very true! right now I'm only visually comparing the spectrum analysis.

and you're right, fdkaac CBR@256 and VBR@5 have visually almost the same spectrum (VBR loses some decibels at the higher frequency ranges), with VBR having considerably lower file size :)
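
For a scriptable version of this kind of visual comparison, FFmpeg's own showspectrumpic filter can render one spectrogram image per file. Note that this swaps spek for a different tool, so the pictures will not look identical to the ones discussed here. The helper name spectrogram and the 1024x512 image size are arbitrary choices for illustration.

```shell
# Render a spectrogram PNG next to the given audio file.
# spectrogram is a hypothetical helper; the image size is an arbitrary choice.
spectrogram() {
    ffmpeg -hide_banner -y -i "$1" -lavfi showspectrumpic=s=1024x512 "${1%.*}.png"
}

# Usage:
#   spectrogram cbr256.m4a && spectrogram vbr5.m4a
```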

ddelange commented Sep 14, 2023

When I leave out the cutoff in the VBR@5 command above, the output is definitely worse 🤔

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -vbr 5 -cutoff 20000 output.m4a
10.4 MB

ffmpeg -i input.wav -vn -ac 2 -c:a libfdk_aac -vbr 5 output.m4a
10.5 MB

"Strangely, your spek results seem inconsistent with mine."

That's weird indeed. I compiled in March or so, so all should be relatively recent:

$ ffmpeg -h encoder=libfdk_aac
ffmpeg version 6.0 Copyright (c) 2000-2023 the FFmpeg developers
  built with Apple clang version 12.0.0 (clang-1200.0.32.29)
  configuration: --prefix=/usr/local/Cellar/ffmpeg/6.0-with-options --enable-shared --cc=clang --host-cflags= --host-ldflags= --enable-gpl --enable-libaom --enable-libdav1d --enable-libmp3lame --enable-libopus --enable-libsnappy --enable-libtheora --enable-libvorbis --enable-libvpx --enable-libx264 --enable-libx265 --enable-libfontconfig --enable-libfreetype --enable-frei0r --enable-libass --enable-demuxer=dash --enable-opencl --enable-audiotoolbox --enable-videotoolbox --disable-htmlpages --enable-libfdk-aac --enable-libsrt --enable-libxvid --enable-nonfree
  libavutil      58.  2.100 / 58.  2.100
  libavcodec     60.  3.100 / 60.  3.100
  libavformat    60.  3.100 / 60.  3.100
  libavdevice    60.  1.100 / 60.  1.100
  libavfilter     9.  3.100 /  9.  3.100
  libswscale      7.  1.100 /  7.  1.100
  libswresample   4. 10.100 /  4. 10.100
  libpostproc    57.  1.100 / 57.  1.100
Encoder libfdk_aac [Fraunhofer FDK AAC]:
    General capabilities: dr1 delay small
    Threading capabilities: none
    Supported sample rates: 96000 88200 64000 48000 44100 32000 24000 22050 16000 12000 11025 8000
    Supported sample formats: s16
    Supported channel layouts: mono stereo 3.0 4.0 5.0 5.1 6.1(back) 7.1(wide) 7.1 7.1(top)
libfdk_aac AVOptions:
  -afterburner       <int>        E...A...... Afterburner (improved quality) (from 0 to 1) (default 1)
  -eld_sbr           <int>        E...A...... Enable SBR for ELD (for SBR in other configurations, use the -profile parameter) (from 0 to 1) (default 0)
  -eld_v2            <int>        E...A...... Enable ELDv2 (LD-MPS extension for ELD stereo signals) (from 0 to 1) (default 0)
  -signaling         <int>        E...A...... SBR/PS signaling style (from -1 to 2) (default default)
     default         -1           E...A...... Choose signaling implicitly (explicit hierarchical by default, implicit if global header is disabled)
     implicit        0            E...A...... Implicit backwards compatible signaling
     explicit_sbr    1            E...A...... Explicit SBR, implicit PS signaling
     explicit_hierarchical 2            E...A...... Explicit hierarchical signaling
  -latm              <int>        E...A...... Output LATM/LOAS encapsulated data (from 0 to 1) (default 0)
  -header_period     <int>        E...A...... StreamMuxConfig and PCE repetition period (in frames) (from 0 to 65535) (default 0)
  -vbr               <int>        E...A...... VBR mode (1-5) (from 0 to 5) (default 0)

Author

ScribbleGhost commented Aug 31, 2024

Fraunhofer FDK AAC used to be preferred over the built-in FFmpeg AAC encoder, but not anymore. Read more here: https://www.ffmpeg.org/index.html#aac_encoder_stable

bardware commented Jan 3, 2025

"but not anymore"

I was curious and collected information.

You link to a page from January 2016.
There's also an entry on the Hydrogenaudio forum from that time that concludes: "The FDK-AAC was the clear winner. Compared to 2016/01/17 version of the FFmpeg's native encoder, Fraunhofer's FDK-AAC library delivered the same sound quality at 32kbps reduced bitrates."

This is the relevant commit comment on their git, which says "the quality of this encoder rivals and surpasses libfdk_aac in some situations".

There's an archived version of the wiki from that time, February 2016, that said: "The native FFmpeg AAC encoder. This is currently the second highest-quality AAC encoder available in FFmpeg and does not require an external library like the other AAC encoders described here. This is the default AAC encoder."

An earlier version from September 2015 said: "The native FFmpeg AAC encoder is included with ffmpeg and is does not require an external library like the other AAC encoders described here. Note that you will not get as good results as with libfdk_aac."

Author

ScribbleGhost commented Jan 5, 2025

Thanks for the information, @bardware. I didn't spend enough time researching this. I guess I will stick to Fraunhofer FDK AAC then. Thinking about it, I realize that lossy is lossy either way, so I don't care too much about the quality. If anything, Opus is a superior codec anyway.
