Someone changed a line in the GCC compiler and got a 12% improvement on modern Intel and AMD chips.


Summary

  • Adding 3 to GCC’s branch misprediction scale makes it more cautious about branch mispredictions.

  • The NAB SPEC CPU 2017 test showed a ~12% speedup on modern AMD and Intel CPUs.

  • The change will arrive in GCC 17, in 2027.

It’s been a strange month for tiny code tweaks that deliver noticeable performance improvements. Just a few days ago, we learned that someone modified three lines of code in the Linux kernel and achieved a 5% increase in storage speed. Now, someone has come forward with a single-line change to the GCC compiler that gave modern AMD and Intel chips a 12% performance boost in a SPEC CPU 2017 benchmark.

Adding 3 to a variable reaped big benefits for new AMD and Intel processors

To be fair, it was a very shocking addition of 3


As spotted by Phoronix, Intel software engineer Lili Cui has found a way to get more performance out of code built with GCC by making a minimal change to the compiler. The mechanism behind the gain is a bit complex, so let’s break it down.

When a CPU executes code, it tries to “cheat” to increase its performance. When a CPU encounters a decision in the code (such as an if/else statement), it “should” wait for calculations to tell it which path to take. However, with a process called “speculative execution,” the CPU predicts which path the program will take and begins processing subsequent code in advance.

It’s like texting your friend to ask whether they want a burger or a pizza, then guessing that they’ll want a burger and starting to grill one before they reply. If you are right, the burger is ready sooner and you impress your friend with your speed. If you are wrong, you have to stop, clean everything up, and make a pizza instead. Similarly, an incorrect guess by the CPU means that it has to throw away the work it did, go back to the decision, and take the other path.
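To make the cost of a wrong guess concrete, here is a minimal C sketch of a toy predictor. It is an illustration only: real branch predictors are far more sophisticated, and the function name `count_mispredictions` and its "same as last time" guessing rule are assumptions made for this example, not anything taken from Cui's patch or from real hardware.

```c
#include <stddef.h>

/* Toy model only: guess that each branch goes the same way as the
   previous one, and count how often that guess turns out wrong. */
static size_t count_mispredictions(const int *values, size_t n, int threshold)
{
    size_t misses = 0;
    int predicted_taken = 0;                 /* initial guess: not taken */

    for (size_t i = 0; i < n; i++) {
        int taken = values[i] >= threshold;  /* the real outcome */
        if (taken != predicted_taken)
            misses++;                        /* wrong guess: work is discarded */
        predicted_taken = taken;             /* remember the latest outcome */
    }
    return misses;
}
```

With this toy rule, sorted input such as 1 through 8 against a threshold of 5 produces a single wrong guess, while input that alternates between small and large values produces a wrong guess on almost every element. Branches of the second kind are the ones that hurt on a deep pipeline.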

This is called a “branch misprediction,” and Cui noticed that on modern CPUs these cost more performance than was previously assumed:

Modern CPUs have deeper pipelines, making branch mispredictions more expensive. Increasing this cost encourages if-conversion, avoiding pipeline stalls due to poorly predicted branches.

To address this issue, Cui modified the line of code that defines the branch misprediction scale, a value GCC’s internal cost model uses to judge whether a branch is worth the risk. All Cui did was add 3 to the scale, and now the compiler is much more cautious about generating ordinary branching code. That makes it more likely to emit an alternative, such as a branchless sequence.
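As a rough picture of what “branchless” means here, the following C sketch writes the same small decision two ways. It is hand-written for illustration and is not output from GCC; the function names `min_branchy` and `min_branchless` are made up for this example, and whether the compiler actually emits a branch or a conditional move for either version depends on the target, the optimization level, and its cost model.

```c
#include <stdint.h>

/* Branching form: the CPU has to guess the outcome of (a < b). */
static int32_t min_branchy(int32_t a, int32_t b)
{
    if (a < b)
        return a;
    return b;
}

/* Branchless form: turn the comparison into a mask and blend the
   two inputs, so there is no branch to mispredict. */
static int32_t min_branchless(int32_t a, int32_t b)
{
    int32_t mask = -(int32_t)(a < b);  /* all ones if a < b, else zero */
    return (a & mask) | (b & ~mask);
}
```

The branchless version always does a little extra arithmetic, which is why a compiler only prefers it when it believes a misprediction would cost more. Raising the misprediction scale shifts that trade-off toward the second form.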

Once that was done, Cui ran 544.nab_r, the Nucleic Acid Builder (NAB) benchmark from SPEC CPU 2017, which calculates the physics and chemistry of molecules. Cui saw a roughly 12% increase in performance on Intel and AMD chips, as they spent less time backtracking from wrong guesses and more time executing useful code.

It will be a while until we see this change, as it was merged for GCC 17, which will be released next year. However, it is an interesting story about how a small adjustment can make a big difference.

