It's hard to develop an intuition about what is fast versus what is simpler to write, test and maintain, and it’s easy to fall into the trap of not confirming an assumption.
I once traced unexpectedly poor performance back to code that purportedly prefaulted some memory into the process. In fact, the compiler was smart enough to notice the code wasn't using the memory it was touching and happily optimized the whole prefaulting code away!
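The original prefaulting code isn't shown here, so the following is only a sketch of how such a loop might be written so that the compiler cannot discard it. The helper name prefault and the 4096-byte page size are assumptions made for illustration. Accessing memory through a volatile pointer is one common way to make each touch an observable side effect.

```cpp
#include <cstddef>

// Assumed page size for illustration only; real code would ask the OS.
constexpr std::size_t kAssumedPageSize = 4096;

// Hypothetical helper: touches one byte in every page of the buffer so the
// OS has to map each page in. Returns the number of pages touched.
std::size_t prefault(char* buffer, std::size_t size) {
    // Reads and writes through a volatile pointer count as observable
    // behavior, so the optimizer is not allowed to remove this loop.
    volatile char* p = buffer;
    std::size_t touched = 0;
    for (std::size_t offset = 0; offset < size; offset += kAssumedPageSize) {
        p[offset] = p[offset];  // read the byte and write it back unchanged
        ++touched;
    }
    return touched;
}
```

Without the volatile qualifier, a loop like this that only touches memory with no visible effect is exactly the kind of code an optimizer is free to delete.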
A few years ago, I was debating a performance trade-off with a colleague. We wondered whether iterating over a std::vector using index-based accesses was faster or slower than using either iterators or the new range-based for.
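To make the question concrete, here is a minimal sketch of the three loop styles being compared. The function names are hypothetical, and summing integers is just a stand-in for whatever work the real loops did.

```cpp
#include <cstddef>
#include <vector>

// Index-based access.
int sum_by_index(const std::vector<int>& v) {
    int total = 0;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Explicit iterators.
int sum_by_iterator(const std::vector<int>& v) {
    int total = 0;
    for (auto it = v.begin(); it != v.end(); ++it) {
        total += *it;
    }
    return total;
}

// Range-based for (C++11 and later).
int sum_by_range_for(const std::vector<int>& v) {
    int total = 0;
    for (int x : v) {
        total += x;
    }
    return total;
}
```

All three compute the same result, so any difference between them shows up only in the generated assembly, which is what makes comparing compiler output a natural way to settle the question.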
Instead of measuring directly, we decided to compile our test programs and compare the assembly output. This quickly became burdensome, so I put together a setup with one terminal running vi and another running something like: watch 'gcc -S example.cc -o - | c++filt'.
As the code was updating, we got immediate feedback on what the generated assembly would look like. Thus, the idea behind Compiler Explorer was born.
Compiler Explorer is a website version of that original script: one side of the website has a text editor where code can be edited, and the other side interactively shows the assembly output of the compiler running on the same snippet of code. Since those simple beginnings, it has gained improvements such as color coding that shows how the source code maps to the assembly code.
In 2012, DRW allowed me to open source Compiler Explorer, and since then it's been developed in the open on GitHub. I also run and support a public instance of Compiler Explorer for a variety of languages: C++, D, Go and Rust. It's been amazing to see it take off! I've heard it’s being used by Clang and GCC developers to help them improve code generation, and it has been awarded a Google Open Source Peer Bonus by a member of their compiler team.
Compiler Explorer has helped members of the C++ community determine best practices and has been used as the focus of several presentations at C++ conferences, including some demonstrating the zero-overhead principles of C++.
We've used Compiler Explorer to help understand some surprising optimizer behavior, interesting performance differences between signed and unsigned indices, and whether it's optimal to pass unique_ptr by r-value reference or value. At DRW, so much of what we do is proprietary and kept internal for competitive reasons, but I appreciate that we can share with the open source community when it makes sense.
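As an illustration of the unique_ptr question mentioned above, the sketch below shows the two signatures side by side. The function names are hypothetical, and this shows only the semantic difference between the two, not which one generates better code.

```cpp
#include <memory>
#include <utility>

// By value: the parameter is move-constructed from the caller's pointer,
// so ownership always transfers and the caller is left with nullptr.
int consume_by_value(std::unique_ptr<int> p) {
    return *p;
}

// By r-value reference: only a reference is bound. Ownership moves only if
// the function itself moves from p, which this one does not.
int inspect_by_rvalue_ref(std::unique_ptr<int>&& p) {
    return *p;
}
```

A caller writes std::move(ptr) in both cases, but only the by-value version guarantees that ptr is empty afterwards. Pasting both into Compiler Explorer is one way to see what each costs at the call site.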
So, what did we decide about our original question: is it better to loop over std::vectors using indices or with modern range-based fors? Take a look for yourself!