Waf vs GNU Make — Incremental Build

21 January 2012, 21:00:54

I don't usually post in English, but since I couldn't find this anywhere on Google, I've decided to share the results in a common language. Since the Waf community is quite small, it seemed almost pointless to write in Polish.

Waf is a promising build tool. It uses checksums to determine if anything needs updating. I like this feature because when working with Git and switching branches back and forth, I often get different timestamps for files that didn't really change.

The problem with checksums is that it takes some time to compute them, so GNU Make, in theory, could be faster here... I decided to check that. The aim was to measure how long it takes to determine that nothing changed and nothing needs to be rebuilt.
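A measurement like this can be sketched in a few lines of Python. This is only an illustration, not the harness used for the numbers in this post; the function name `time_command` and the `runs` parameter are made up for the example. The idea is to run the build command on an already built tree several times and look at the wall-clock time of each run.

```python
import subprocess
import time


def time_command(cmd, cwd=None, runs=5):
    """Return wall-clock durations (seconds) of `runs` executions of cmd.

    Hypothetical helper: on an up-to-date tree each run measures only the
    time the tool needs to decide that nothing has to be rebuilt.
    """
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(
            cmd,
            cwd=cwd,
            check=True,  # fail loudly if the build tool reports an error
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        durations.append(time.perf_counter() - start)
    return durations
```

Assuming the project lives in a directory called `project`, something like `min(time_command(["make"], cwd="project"))` or `min(time_command(["./waf", "build"], cwd="project"))` would give the best no-op time for each tool. Taking the minimum reduces noise from disk caches and other processes.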

I generated a sample C++ project. The generator was extremely stupid. I played a little with its parameters: the number of files, their size and the number of #include statements in the generated code. After some tuning to get enough files for the build to take some time, but not forever, I got a project with:

Note: kids, use the pimpl idiom and all that stuff to reduce the number of #includes in your code. It takes forever (an hour on my PC) to compile a project that contains nothing but comments and #include statements. To get a somewhat realistic dependency graph, I decided that headers would include only from the set of the first 50 headers, while implementation files could include anything.
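A generator of this kind could look roughly like the sketch below. It is not the original script; the file naming scheme, the default parameter values and the function name `generate_project` are assumptions made for the example. One more assumption: a header picks its includes only from core headers that come before it, so the include graph has no cycles.

```python
import random
from pathlib import Path


def generate_project(root, n_files=100, n_includes=10, core=50, seed=0):
    """Write n_files header/source pairs containing only comments and #includes.

    Hypothetical sketch: headers include only from the first `core` headers,
    implementation files may include any header. Returns the number of files.
    """
    rng = random.Random(seed)
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    headers = [f"file{i:04d}.h" for i in range(n_files)]

    for i, header in enumerate(headers):
        # Headers: choose from the core set, restricted to earlier headers
        # (an assumption of this sketch, to keep the graph acyclic).
        candidates = headers[:min(i, core)]
        picked = rng.sample(candidates, min(n_includes, len(candidates)))
        lines = ["#pragma once", f"// generated header {i}"]
        lines += [f'#include "{name}"' for name in sorted(picked)]
        (root / header).write_text("\n".join(lines) + "\n")

        # Sources: include their own header plus a random sample of any headers.
        picked = set(rng.sample(headers, min(n_includes, n_files))) | {header}
        lines = [f"// generated source {i}"]
        lines += [f'#include "{name}"' for name in sorted(picked)]
        (root / f"file{i:04d}.cpp").write_text("\n".join(lines) + "\n")

    return 2 * n_files
```

Calling `generate_project("project", n_files=1000)` would write the files into a `project` directory. The fixed seed makes the output reproducible, which matters when comparing build tools on the same tree.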

When the project was ready, I generated several sets of build files.

First, a standard makefile (marked (1) below) in the style I would write by hand, i.e. using variables, %.o: %.cpp and all that stuff. It was a single file, no recursion. Dependencies for .o and .so files were listed explicitly; dependencies on headers were generated using g++ -MMD and included. The makefile had 480 lines.

Another makefile (2) used no variables (except for $@ and $^ in the recipes); instead, it listed all dependencies and recipes explicitly. The dependencies on included files were generated as before. It had over 15,900 lines.

The Waf scripts were exactly what you would expect: every task generator got its target's name and a list of sources. Nothing fancy. I only played with the function responsible for calculating checksums. In the first case (3) it was the standard implementation, which calculates the md5 of the file's content. Then I tried to get rid of the md5s by using just the timestamp, and I also tried the md5_tmstamp extension. This extension tries to get the best of both worlds by updating the checksum only if the timestamp changed. The results were very close, so I'm showing only the timestamp-based version (4).
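The three strategies can be illustrated with a small Python sketch. This is not Waf's code and it does not use Waf's API; all names here (`content_signature`, `timestamp_signature`, `CachedSignature`) are hypothetical. It only shows the idea: hash the content, use the timestamp alone, or rehash only when the timestamp has changed.

```python
import hashlib
import os


def content_signature(path):
    """Signature from file content: unaffected by touched timestamps, but reads the file."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def timestamp_signature(path):
    """Signature from metadata only: cheap, but changes whenever mtime changes."""
    st = os.stat(path)
    return f"{st.st_mtime_ns}:{st.st_size}"


class CachedSignature:
    """Hybrid: recompute the content hash only when the timestamp changed."""

    def __init__(self):
        self._cache = {}  # path -> (timestamp signature, content signature)

    def __call__(self, path):
        stamp = timestamp_signature(path)
        cached = self._cache.get(path)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # timestamp unchanged: trust the stored hash
        digest = content_signature(path)
        self._cache[path] = (stamp, digest)
        return digest
```

With the hybrid approach, a branch switch in Git that rewrites a file with identical content costs one extra hash but yields the same signature, so nothing depending on that file has to be rebuilt. In a real tool the cache would have to be saved between runs; this sketch keeps it in memory only.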

A few things I should stress. I was using a non-recursive makefile written as one big file, so this is probably the best you can get with GNU Make. Most projects either use a set of recursive makefiles or at least split the makefile into several files and include them, often using magic macros. I was planning to measure the recursive makefile but I lost interest ;-). Furthermore, I was not using any of the variables that you would use in a real-life project, like CC, CFLAGS, etc.

Another thing to note about GNU Make is that it is actually faster at parsing a huge makefile than at computing the same values using variables and standard string substitutions.

Waf runs about 3 seconds slower here. I've noticed that it takes something around 3 seconds between starting waf and the first message about entering the build directory. I don't know waf internals but I'm guessing it's the time taken to load the state of the last build.

The calculation of checksums adds almost no penalty to the build time, while in some setups it greatly reduces the number of updated targets. Keep in mind, however, that my sources were empty so the resulting binaries were minimal. In a real life project these binaries would be quite big.

Update

I've measured a solution that uses recursive makefiles. Such a setup runs in 3.5 s. It seems that the biggest challenge in my sample project was finding the order of tasks using the dependency graph. When that graph was split into several smaller parts, we got a nice speedup.

Now I'm wondering... normally, when using recursive make, you may get some incomplete dependencies resulting in incomplete builds. But my sample project didn't have such edges in its graph, so the difference is clearly triggered by the size of the graph. It seems like the algorithms used by Waf and Make could use some help from the outside.

m

21 January 2012, 22:34:56

Waf runs about 3 seconds slower here. I've noticed that it takes something around 3 seconds between starting waf and the first message about entering the build directory. I don't know waf internals but I'm guessing it's the time taken to load the state of the last build.

I'd bet the 3 s is the time needed to start the Python interpreter + check whether the files have changed + compile them.

Would you have time to try PyPy instead of regular Python: http://pypy.org/ ? :)

m

21 January 2012, 22:36:54

I mean checking whether the .py files have changed and, if necessary, compiling them to bytecode (pyc).

QRX

21 January 2012, 23:21:28

Would you have time to try PyPy instead of regular Python: http://pypy.org/

'build' finished successfully (8.378s)
~/src/pypy-1.7/bin/pypy waf 8,87s user 0,27s system 99% cpu 9,150 total

It's faster, but the phase in which nothing is displayed lasts longer. It's probably not Python's startup time, because then it would be constant across different cases (waf is the same), whereas for other, small projects it's faster.

PyPy gave it a nice kick. Unfortunately, waf running under PyPy was not stable. Every few hundred compilations it announced that a command had failed and aborted. I could also see that a few seconds earlier one CPU stopped receiving tasks. I'm not going to investigate the cause.
