Saturday, 25 January 2014

Building FFmpeg in Visual Studio

The default build chain for the FFmpeg project uses the standard (well standard for gnu open source projects) gnu autotools. The use of configure and make may be rather familiar to those who compile often on linux but for many Windows developers these can seem like somewhat alien tools. This is compounded by the fact that to use these tools on Windows requires setting up a MSYS/Cygwin environment which can often be easier said than done. Even after that most build chains using this environment require a gcc based compiler which on Windows is MinGW. Gcc is a good compiler but MinGW can have some issues (which in its defence is generally always around Windows specific things) which can make it less than ideal.

FFmpegs default build tools do currently support compiling natively in msvc (Microsoft's C compiler) and will even convert the C99 FFmpeg code to msvc compliant C89 code. But this still requires setting up a MSYS shell and any additional FFmpeg dependencies don't offer msvc support from within the same build tools.

So as powerful as the configure/make build chains can be and say what you like about the msvc compiler (I agree its standards compliance is abysmal) but for those wanting to develop natively on Windows having Visual Studio support is the simplest and most robust. Since FFmpeg wont maintain a native Visual Studio build chain (and I cant blame them for not wanting to), I got a little bored and decided to take it upon myself to provide an alternative.

The result of some holiday free time is that I wrote up a FFmpeg Visual Studio project generator. This is a simple piece of software that will scan in the existing FFmpeg configure/make files and then use those to dynamically generate a visual studio project file that can be used to natively compile FFmpeg. This program not only builds the project files but will also generate the config files dynamically based on a combination of rules found in the original configure script and any passed in arguments. In fact the generator excepts many of the same configuration arguments as the configure script allowing fine-tuned control of what gets enabled/disabled.
Example command line options:

ffmpeg_generator.exe --enable-gpl --enable-bzlib --enable-iconv 
   --enable-zlib --disable-encoders --enable-encoder=vorbis

The program doesn't support all the options of the configure script but it supports many of them and in some cases actually exposes more than the original. Any option that shows up in the config.h file can be manually enabled/disabled through the command line. And even if an invalid option is set the generator uses all the inbuilt configuration checks found in the original build script to automatically validate each setting. Options such as 32/64bit are some of the options that are not supported, mainly because they are not relevant as the generated config.h file can be used for both 32/64 bit and even detects and enables inline asm if the Intel compiler is made available.

#define ARCH_X86 1
#if defined( __x86_64 ) || defined( _M_X64 )
#   define ARCH_X86_32 0
#else
#   define ARCH_X86_32 1
#endif

All of this happens dynamically so for those building from git master, once any commits are and added to your repo you just have to rerun the generator program and it will automatically detect any changes (new/deleted files, configuration changes, new/deleted options etc.) and generate a new project accordingly.

Now I should point out that this generator was rather quickly (and half-assed) thrown together so its not exactly very brilliant code. But for something quick and dirty it gets the job done. It supports all current additional dependencies but not all have been checked and it takes certain liberties with respect to library naming. This is because many dependency libraries don't have consistent naming conventions so the link include the generated project uses may not be exactly the same as the actually file you are linking against. Should this happen then you'll just have to manually tweak the file-name in the include option as without any kind of naming consistency there's not much that can be done about it.

The generator also by default generates a project with default Intel compiler settings. Those without Intel compiler may have to change a few settings in the project properties if they want to set it back to the standard msvc. Intel is chosen as default as Visual Studio 2012 doesn't support enough C99 features to be able to compile FFmpeg so only the Intel compiler can be used with 2012 to build the project. For those with Visual Studio 2013 the default compiler adds enough C99 to be able to get it to work but for the moment the generator is built to default to 2012. The same project can be loaded in both 2012/2013 and all that needs to be changed is the compiler being used. If there is enough interest I may add an option to the generator to allow for people to specify whether they want Intel support or not at generation time but in the meantime you'll just have to change the build tool in the project properties.

Update: The current version of the generator allows for specifying the compiler that will be used as default in the output project file. Of course this can also be changed directly in the project after it is generated but for convenience the "toolchain" option is now processed by the generator. With newer patches to FFmpeg Visual Studio 2013 can compile it without problems. The toolchian parameter accepts either "msvc" for default Microsoft compiler or "icl" for the Intel compiler which also supports inline assembly.

ffmpeg_generator.exe --toolchain=msvc

The project generator can be found in my git repo below. Also in the repo is a pre-built project that is built using the following command options:

ffmpeg_generator.exe --enable-gpl --enable-version3 --enable-avisynth
   --enable-nonfree --enable-bzlib --enable-iconv --enable-zlib
   --enable-libmp3lame --enable-libvorbis --enable-libspeex
   --enable-libopus --enable-libfdk-aac --enable-libtheora
   --enable-libx264 --enable-libxvid --enable-libvpx --enable-libmodplug
   --enable-libsoxr --enable-libfreetype --enable-fontconfig 
   --enable-libass --enable-openssl --enable-librtmp --enable-libssh

Update: The default projects now include libcdio, libbluray, opengl, opencl and sdl enabled. More will be enabled as they are tested. All of these dependencies have working Visual Studio projects found in the SMP directories in each of their repos found at the parent ShiftMediaProject repository.

Your free to use this project directly as I keep it up to date and all the necessary dependency projects can also be found in my github.

git repository:
https://github.com/ShiftMediaProject/FFmpeg

Sunday, 5 January 2014

Building FFmpeg on Windows with in-line asm and the Intel compiler (Part 2).

In my last post I talked about how to get FFmpeg to compile under Windows with the Intel compiler. This has advantages in that the Intel compiler supports compilation of AT&T style inline assembly. This means that its possible to use the hand tuned optimised code found in FFmpeg while natively compiling for Windows (something that is otherwise not possible). Unfortunately the assembly support in the Windows version of the compiler is not complete and so the inline asm wont compile without some changes.

The previous post on this subject had a patch that allowed for compilation with inline asm. However since then I have cleaned up the patch and fixed a few things that I wasnt entirely happy about. In fact after talking to Michael Niedermayer and others on the FFmpeg mailing list I ended up writing an entirely new patch. This patch is currently still pending review but for those who are interested in getting this working now they can grab the patch from the end of this post.

The main changes in this version is the way the inline asm was changed from using direct symbol references to something defined in an asm-interface. From my previous post I mentioned that Intel compiler does not support direct symbol references in code so previously I had changed all of these to asm-interfaces. However the FFmpeg developers didn't want to change any existing code and there was some concerns over how moving variables from direct symbols into the interface may affect Position Independent Code compilation. So based on a suggestion from Michael the patch was changed to use named constraints. For those familiar with named constraints you'll know that to use these all you have to do is replace a direct symbol reference with a named constraint.

Example of existing direct symbol reference:
"movq "MANGLE(ff_pb_80)", %%mm0     \n\t"
"lea         (%3, %3, 2), %1        \n\t"

Since FFmpeg already had a macro for name mangling the direct symbol references it was rather simple to just change this macro to generate a named constraint instead. In order to do this the definition of MANGLE was changed so that with Intel compiler on Windows it generates a named constraint.

// Determine if compiler supports direct symbol references in inline asm
#if defined(__INTEL_COMPILER) && defined(_MSC_VER)
#   define HAVE_DIRECT_SYMBOL_REF 0
#else
#   define HAVE_DIRECT_SYMBOL_REF 1
#endif

#if HAVE_DIRECT_SYMBOL_REF
    //The standard version of MANGLE for direct symbol references
#   define MANGLE(a) EXTERN_PREFIX LOCAL_MANGLE(a)
#else
    //A version of mangle that instead generates named constraints
#   define MANGLE(a) "%["#a"]"
#endif

Of course using a named constraint by itself wont work as the constraint still needs to be added to the asm-interface. Since this additional interface is only required for Intel on Windows it is not desirable to have it all the time. So in keeping with not changing any existing code the asm-interfaces were added using a new macro that simply does nothing for all other build chains. Using this the above inline asm becomes:

__asm__ volatile (
   "movq "MANGLE(ff_pb_80)", %%mm0     \n\t"
   "lea         (%3, %3, 2), %1        \n\t"
   put_signed_pixels_clamped_mmx_half(0)
   "lea         (%0, %3, 4), %0        \n\t"
   put_signed_pixels_clamped_mmx_half(64)
   : "+&r"(pixels), "=&r"(line_skip3)
   : "r"(block), "r"(line_skip)
     NAMED_CONSTRAINTS_ADD(ff_pb_80)
   : "memory"
);

The resulting changes are rather minimal and when compiling using any previously supported build chains there is no apparent difference in the code before and after this change. The macro NAMED_CONSTRAINTS is used to add each of the names of any directly accessed symbols. This macro just needs the name and can take a comma separated list of up to 10 values.

However adding the macro NAMED_CONSTRAINTS required slightly more work than I would have liked but it at least worked as required. The difficulty was due to a bug in both the Intel and Microsoft compilers where variadic arguments where not properly expanded (see my post on the subject http://siliconandlithium.blogspot.com/2014/01/macro-variadic-argument-expansion-on.html). Using the working variadic for-each from my previous post the implementation of NAMED_CONSTRAINTS is:

#if HAVE_DIRECT_SYMBOL_REF
#   define NAMED_CONSTRAINTS_ADD(...)
#   define NAMED_CONSTRAINTS(...)
#else
#   define NAME_CONSTRAINT(x) [x] "m"(x)
    // Parameters are a list of each symbol reference required
#   define NAMED_CONSTRAINTS_ADD(...) , FOR_EACH_VA(NAME_CONSTRAINT,__VA_ARGS__)
    // Same but without comma for when there are no previously defined constraints
#   define NAMED_CONSTRAINTS(...) FOR_EACH_VA(NAME_CONSTRAINT,__VA_ARGS__)
#endif

Putting this all together and for the most part all existing inline asm will compile without any problems. However you'll notice I said 'most'. The Intel compiler is extremely fussy about the use of inline AT&T assembly. This is most likely because this is considered a rarely used feature and so does not see much in the way of support. Unfortunately this means that using inline asm is much harder than it needs to be. So even after the missing support for direct symbol references the compiler still has many issues. The main one is that the compiler will sporadically decide it doesn't like some particular assembly line and generate an error. The same assembly line will be working fine until you change a compilation option (or in some cases just move the assembly block somewhere else) then all of a sudden it will generate an error. Ideally it would be nice if Intel cleaned up some these inconsistencies but in the mean time the patch had to work around them.

So the patch does change a couple of things. In fact there are 2 instances (in x86/motion_est.c and vf_fspp.c) where a direct symbol reference had to be removed and replaced with an asm-interface. This may be seen as changing the original code but in both instances the variable in question already existed in an asm-interface so there additional inclusion should have absolutely zero affect. In fact just to be sure I checked the generated code from gcc before and after the patch and ensured that every line was identical.

The patch passes all FATE tests and compiles on Intel (under normal release compilation options) and has zero impact on any other build chain. For those interested the full patch can be downloaded below.

Update 2: New and improved patches have been created and these are now available directly in FFmpeg master (no patching required). See my post for more information (Building FFmpeg on Windows with in-line asm and the Intel compiler (Part 3)).
Update: A new patch is available that adds support for an extra file. This file is only used when FAST_CMOV is enabled so was missed previously. The new patch should be grabbed from here:
https://github.com/ShiftMediaProject/FFmpeg/commit/26acab2672f27b923510ec35cbbade69a7776c9d.diff

Download patch file:
https://github.com/ShiftMediaProject/FFmpeg/commit/6eea177a4761629df5d6f02359fccc4efb116aed.diff

Note: Unlike my last patch this one is done directly against the current master (at time of writing).

Macro variadic argument expansion on Windows.

Recently while trying to write some code I decided to use a variadic macro. For those familiar with the concept you'll know that they can be very powerful. Unfortunately I ran into a slight problem when using them. That problem being that the Microsoft compiler (and the Intel compiler for Windows) were both incorrectly expanding the variadic arguments (__VA_ARGS__). Or more to the point they both were completely failing to expand the arguments. This meant that it was not directly possible to get each element individually as the compiler was essentially treating the whole variadic argument as one long string (commas included). This is directly in violation of what the standards require and so both compilers are being non-compliant. The Microsoft compiler has never embraced C99 so this should not be to much of a surprise but Intels failure here is a bit more egregious.

Luckily there is a way to get around this. It is not quite as elegant as what works on other compilers but it will at least work on Windows while still building on other build chains. So the example I will give is a for-each macro. This uses the variadic expansion and then sends each argument element to a specific macro. The trick here was that each expansion had to be explicitly handled. This means that if you want to support a variadic macro that can take up to 10 arguments then you need to create 10 explicit variants of the expansion. This limits the maximum number of arguments that can be supported (by however many expansions you can be bothered to write) and is also not very elegant but at least it will work.

#define FE_0(P,X) P(X)
#define FE_1(P,X,X1) P(X), FE_0(P,X1)
#define FE_2(P,X,X1,X2) P(X), FE_1(P,X1,X2)
#define FE_3(P,X,X1,X2,X3) P(X), FE_2(P,X1,X2,X3)
#define FE_4(P,X,X1,X2,X3,X4) P(X), FE_3(P,X1,X2,X3,X4)
#define FE_5(P,X,X1,X2,X3,X4,X5) P(X), FE_4(P,X1,X2,X3,X4,X5)
#define FE_6(P,X,X1,X2,X3,X4,X5,X6) P(X), FE_5(P,X1,X2,X3,X4,X5,X6)
#define FE_7(P,X,X1,X2,X3,X4,X5,X6,X7) P(X), FE_6(P,X1,X2,X3,X4,X5,X6,X7)
#define FE_8(P,X,X1,X2,X3,X4,X5,X6,X7,X8) P(X), FE_7(P,X1,X2,X3,X4,X5,X6,X7,X8)
#define FE_9(P,X,X1,X2,X3,X4,X5,X6,X7,X8,X9) P(X), FE_8(P,X1,X2,X3,X4,X5,X6,X7,X8,X9)
#define GET_FE_IMPL(_0,_1,_2,_3,_4,_5,_6,_7,_8,_9,NAME,...) NAME
#define GET_FE(A) GET_FE_IMPL A
#define GET_FE_GLUE(x, y) x y
#define FOR_EACH_VA(P,...) GET_FE_GLUE(GET_FE((__VA_ARGS__,FE_9,FE_8,FE_7,FE_6,FE_5,FE_4,FE_3,FE_2,FE_1,FE_0)), (P,__VA_ARGS__))

This requires 3 pieces. The first is GET_FE_IMPL which is used to pass the variadic arguments to the explicit expansion macro (FE_0, FE_1 etc.). The second piece is GET_FE. This macro just passes directly to GET_FE_IMPL and on other compiler is completely redundant but on Windows it is used to add an extra level of indirection that forces the input parameters to be correctly 'stringified'. Remove it and everything will fail horribly. The final piece required to get this to work is the GET_FE_GLUE macro. This is perform the same task as GET_FE in that it is required to correctly pass the variadic arguments. Both of these are required in order to get around the bugs in the compilers.

This new macro can be used to create any new macro that should perform a desired operation on each of the input arguments. All that is required is to call FOR_EACH_VA and pass it a macro followed by the variadic arguments. The passed macro will then be called and passed each individual element in the argument list.

Example:
#define DO_SOMETHING(x) /*Insert whatever code you wish to operate on the element 'x'*/
//The definition of the new variadic macro
#define DO_SOMETHING_VA(...) FOR_EACH_VA(DO_SOMETHING,__VA_ARGS__)

With the above you should be able to create any variadic macro you want.

Example usage:
//Perform DO_SOMETHING on each of the 3 arguments
DO_SOMETHING_VA(element1,element2,element3)

If you want to support an additional number of input arguments (the above only supports a maximum of 10) then all you have to do is add more explicit expansion macros (e.g. FE_10, FE_11 etc.) and then add these in sequence to GET_FE_IMPL and to FOR_EACH_VA.