According to Fast Technology on July 18th, FFmpeg developers have once again achieved significant performance improvements through handwritten assembly code. The developers stated: “Handwritten assembly code has made FFmpeg run 100 times faster, which is potentially the largest speed increase I’ve ever seen.”
However, they quickly clarified that this 100-fold increase applies to a specific function, not the entire FFmpeg application.
With the latest handwritten assembly patch, the “rangedetect8_avx512” function saw a 100.73x speedup. Even for users whose processors do not support AVX-512, the “rangedetect8_avx2” code path still delivered a substantial 65.63x speedup.
The developers reiterated in subsequent posts: “This is a single function that is now 100 times faster, not the entire FFmpeg.” They added that the function showing such a dramatic speedup belongs to a “relatively niche filter.”
Precisely because of its niche status, this particular filter had not been prioritized by developers until now. The rewritten filter code relies on SIMD (Single Instruction, Multiple Data) processing, in which one instruction operates on many data elements at once, so far more data is handled per instruction than in a one-element-at-a-time loop.
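To make the SIMD idea concrete, here is a minimal sketch in C. It is not FFmpeg's code and not handwritten assembly: it assumes, purely for illustration, that a range detector boils down to finding the smallest and largest byte in a buffer, and it uses the GCC/Clang vector extension (an assumption about the compiler) to process 16 bytes per step. The names `minmax_scalar` and `minmax_vector` are hypothetical. AVX2 and AVX-512 registers are wider (32 and 64 bytes), but the principle is the same.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 16 unsigned bytes handled as one value (GCC/Clang vector extension). */
typedef uint8_t v16u8 __attribute__((vector_size(16)));

/* Scalar reference: examines one byte per loop iteration. */
static void minmax_scalar(const uint8_t *p, size_t n, uint8_t *lo, uint8_t *hi)
{
    assert(lo && hi);
    uint8_t mn = 255, mx = 0;
    for (size_t i = 0; i < n; i++) {
        if (p[i] < mn) mn = p[i];
        if (p[i] > mx) mx = p[i];
    }
    *lo = mn;
    *hi = mx;
}

/* Vector version: examines 16 bytes per loop iteration. */
static void minmax_vector(const uint8_t *p, size_t n, uint8_t *lo, uint8_t *hi)
{
    assert(lo && hi);
    v16u8 vmin, vmax;
    memset(&vmin, 0xFF, sizeof vmin);   /* running per-lane minimum */
    memset(&vmax, 0x00, sizeof vmax);   /* running per-lane maximum */

    size_t i = 0;
    for (; i + 16 <= n; i += 16) {
        v16u8 v;
        memcpy(&v, p + i, sizeof v);    /* unaligned-safe load */
        v16u8 lt = (v16u8)(v < vmin);   /* 0xFF in lanes where v is smaller */
        v16u8 gt = (v16u8)(v > vmax);   /* 0xFF in lanes where v is larger */
        vmin = (v & lt) | (vmin & ~lt); /* lane-wise select of the minimum */
        vmax = (v & gt) | (vmax & ~gt); /* lane-wise select of the maximum */
    }

    /* Horizontal reduction: collapse the 16 lanes into single values. */
    uint8_t mn = 255, mx = 0;
    for (int k = 0; k < 16; k++) {
        if (vmin[k] < mn) mn = vmin[k];
        if (vmax[k] > mx) mx = vmax[k];
    }

    /* Tail: leftover bytes that do not fill a whole vector. */
    for (; i < n; i++) {
        if (p[i] < mn) mn = p[i];
        if (p[i] > mx) mx = p[i];
    }
    *lo = mn;
    *hi = mx;
}
```

Both functions return the same result; the vector version simply performs each comparison on 16 lanes at once. Handwritten assembly goes further by choosing the exact instructions and registers rather than leaving that to the compiler.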
It’s evident that compilers still struggle to compete with handwritten assembly, or as FFmpeg puts it: “The compiler’s register allocator is terrible.” This highlights a persistent challenge in software development where automated optimization, while convenient, may not always reach the peak performance achievable through manual, low-level tuning.
FFmpeg stands out as one of the few projects that continues to embrace handwritten assembly code for optimization. The team even runs a “school” dedicated to teaching the intricacies of this low-level programming technique. This commitment to manual optimization underscores the project’s dedication to maximizing performance, especially in areas where compiler optimizations might fall short.
FFmpeg is a comprehensive suite of video and audio solutions, offering a wide array of functionalities including decoding, encoding, and post-processing. It provides robust support for the vast and often eclectic range of video and audio codecs found in the digital world.
