Waf vs GNU Make — part two
Last time I determined that the biggest problem with both make and waf was a big dependency graph. To verify that, I prepared an even bigger project to test. Results follow, but first I've got some news.
Tup is another build tool. But not yet another. Mike Shal, the author of Tup, took a completely different approach to the build problem and inverted the dependency graph. The algorithm starts with the changed nodes and builds the graph by adding the nodes that depend on them (in this case, the products). This results in a much smaller graph to solve. I read the results on the page and they're stunning.
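To make the inverted-graph idea concrete, here is a minimal Python sketch. It is not Tup's actual code or data model: the `products_of` mapping and the `rebuild_order` function are names I made up for illustration. It assumes the graph is stored as "file -> things built from it", walks forward from the changed files only, and then orders that small partial graph.

```python
def rebuild_order(products_of, changed):
    """Return the products affected by `changed`, in a valid build order.

    products_of maps a node to the nodes built directly from it
    (hypothetical layout, chosen only for this sketch).
    """
    # Step 1: collect only the part of the graph reachable from the changed files.
    reachable = set()
    stack = list(changed)
    while stack:
        node = stack.pop()
        for product in products_of.get(node, ()):
            if product not in reachable:
                reachable.add(product)
                stack.append(product)

    # Step 2: count, for each affected product, how many of its inputs
    # are themselves affected products that must be rebuilt first.
    pending = {node: 0 for node in reachable}
    for node in reachable:
        for product in products_of.get(node, ()):
            pending[product] += 1

    # Step 3: topological sort (Kahn's algorithm) of the partial graph.
    ready = sorted(node for node, count in pending.items() if count == 0)
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for product in sorted(products_of.get(node, ())):
            pending[product] -= 1
            if pending[product] == 0:
                ready.append(product)
    return order


graph = {
    "a.h": ["a.o", "b.o"],
    "a.c": ["a.o"],
    "b.c": ["b.o"],
    "a.o": ["foo"],
    "b.o": ["foo"],
    "c.c": ["c.o"],
    "c.o": ["bar"],
}
affected = rebuild_order(graph, ["a.c"])
```

With this graph, changing only a.c touches a.o and foo. The c.c, c.o and bar nodes are never even visited, which is the whole point: the work is proportional to what changed, not to the size of the project.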
There are a few things I like about Tup. The inverted graph is a good idea. Who cares how to build program foo? If a source file changed, and this source file is probably not there for fun, then compile it. That's what an incremental build is all about. Another nice feature is the implicit handling of subdirectories (note, it's not recursion, just splitting the build files). If you want to add a subdirectory, just put a Tupfile there, no need to mention it in the parent.
Another interesting idea is the way dependencies are handled. Tup monitors the applications run during the build, and if those apps access the filesystem, it sees that and adds that info to its database. No need to process #include directives anymore. If gcc opens a file, it means the file is needed. And it works for all types of files.
What I don't like about Tup... first of all, yet another syntax to learn. I really agree with Waf, SCons and others that using an existing language is better. The second thing is that it needs FUSE. My current kernel has no FUSE support, so right now I'm not able to test Tup myself.
A bigger problem is that it is so damn small. I like small things, don't get me wrong. But from a build tool I would expect support for some common usage patterns like build variants, unit tests, configuration, etc. I like Waf's way of expressing the build process in terms of task generators, i.e. gimme a program built from these sources, instead of the node-centered view that Tup, make and many others use. Compile the first source, compile another source, link. Crap. Every C++ program is built basically the same way. Don't repeat yourself.
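The difference between the two views can be sketched in a few lines of Python. This is not Waf's API: `program` is a hypothetical helper, and the g++ command lines are just the simplest ones that would work. The idea is that you state "a program built from these sources" once and the tool expands it into the node-level steps you would otherwise write by hand.

```python
def program(target, sources, compiler="g++"):
    """Hypothetical task generator: expand one request into node-level commands."""
    # One object file per source, e.g. main.cpp -> main.o
    objects = [src.rsplit(".", 1)[0] + ".o" for src in sources]
    # One compile step per source...
    commands = [
        [compiler, "-c", src, "-o", obj] for src, obj in zip(sources, objects)
    ]
    # ...and a single link step at the end.
    commands.append([compiler, *objects, "-o", target])
    return commands


steps = program("foo", ["main.cpp", "util.cpp"])
```

Adding a third source file means adding one name to the list, not another compile rule plus an edit to the link rule.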
Tup implements some functionality to monitor the filesystem; for example, it automatically cleans unused build files (e.g. when you rename an output, the old file is now unused). I haven't used it yet, so I'm not sure whether I like it or not.
OK, back to Waf and Make. It's now clear that they both suck, let's just see how. The project was bigger this time:
- 9100 files
- 300 MB of code
- 30487 includes in .cpp files
- 15727 includes in .h files
Yesterday I dug into Waf's code. I also paid a little more attention to the first build, because it turned out to be a hint of what's going on. Make sucks at finding the proper build order, but it's damn fast at checking the files. Waf does exactly the opposite. Why? Because Waf uses domain knowledge to its aid when ordering tasks. In simpler (and not entirely true) words, it just knows that you first need to compile sources, then link.
The observed behavior is that, before actually doing anything, make takes the same amount of time for the first build and for an incremental build. Every time it is run, it checks the dates on all files and determines the order. Waf starts almost immediately on the first build, when its cache is empty (there is just some lag caused by Python, parsing, etc.). On an incremental build it reads its cache, which takes forever... and consumes 0.5 GB of memory.
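The date check make does per target is cheap, which is why it stays fast even when the ordering part hurts. A simplified sketch of that rule in Python (not make's actual implementation; `is_out_of_date` is a name I picked for illustration): a target needs rebuilding if it doesn't exist or if any prerequisite is newer than it.

```python
import os


def is_out_of_date(target, prerequisites):
    """Simplified make-style check: missing target or a newer prerequisite."""
    try:
        target_mtime = os.stat(target).st_mtime_ns
    except FileNotFoundError:
        # Nothing built yet, so it has to be built.
        return True
    # One stat() call per prerequisite, no cache to load first.
    return any(os.stat(p).st_mtime_ns > target_mtime for p in prerequisites)
```

This needs no stored state at all, only stat() calls, so its cost is the same on the first build and on every later one. That matches what I observed with make.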
So for the initial build, Waf starts just as fast no matter how big the project is. The incremental build, however, is a huge challenge on (very) big projects. Finally, the results:
- GNU Make variables (1): 20,05s user 1,11s system
- Waf standard (3): 243,03s user 4,38s system