Modern compilers apply dozens of these optimizations across multiple passes. It's like proofreading a manuscript several times, with each pass focusing on a different improvement: one pass eliminates ...
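To make the idea of separate passes concrete, here is a minimal sketch, not how any particular compiler does it. The tiny instruction format, the two passes chosen (constant folding and dead-code elimination), and every function name are assumptions made up for illustration. It also assumes each variable is assigned exactly once.

```python
# Hypothetical mini-IR: each instruction is (dest, op, a, b).
# op is "const", "add", or "mul"; operands are ints or variable names.
# Assumption: every variable is assigned exactly once.

def fold_constants(program):
    """Pass 1: replace operations on known constants with their result."""
    known = {}  # variable name -> constant value
    out = []
    for dest, op, a, b in program:
        # Substitute operands whose values are already known.
        if isinstance(a, str):
            a = known.get(a, a)
        if isinstance(b, str):
            b = known.get(b, b)
        if op == "const":
            known[dest] = a
        elif isinstance(a, int) and isinstance(b, int):
            a = a + b if op == "add" else a * b
            op, b = "const", None
            known[dest] = a
        out.append((dest, op, a, b))
    return out


def eliminate_dead_code(program, outputs):
    """Pass 2: drop instructions whose results are never used."""
    live = set(outputs)
    kept = []
    # Walk backwards so uses are seen before the definitions they need.
    for dest, op, a, b in reversed(program):
        if dest in live:
            kept.append((dest, op, a, b))
            live.update(x for x in (a, b) if isinstance(x, str))
    kept.reverse()
    return kept


def optimize(program, outputs):
    """Run the passes in order; each one focuses on a single improvement."""
    program = fold_constants(program)
    return eliminate_dead_code(program, outputs)


program = [
    ("x", "const", 2, None),
    ("y", "const", 3, None),
    ("z", "add", "x", "y"),       # both operands known at compile time
    ("w", "mul", "z", "n"),       # "n" is only known at run time
    ("unused", "mul", "x", "x"),  # result is never read
]
optimized = optimize(program, outputs=["w"])
print(optimized)
```

Notice that the second pass only becomes effective because of the first: once `z` is folded into `w`, the instructions defining `x`, `y`, and `z` are no longer needed and can be dropped. That interaction is one reason passes are often run in sequence.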
That is a lot of pointer chasing. Every pointer dereference is a potential cache miss: on a miss, the CPU has to fetch data from main memory (~100 ns) instead of the fast cache (~1 ns). Tree nodes are scattered ...
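A rough sketch can put numbers on this. The `Node` class, the `evaluate` function, and the example expression are hypothetical. The latencies are the approximate figures quoted above, and the cost model (every pointer follow either always hits or always misses) is a deliberate simplification. Python itself will not reproduce these timings; the code only counts pointer follows and does the arithmetic.

```python
from dataclasses import dataclass
from typing import Optional

MAIN_MEMORY_NS = 100  # approximate figure from the text
CACHE_NS = 1          # approximate figure from the text


@dataclass
class Node:
    op: str                          # "num", "add", or "mul"
    value: int = 0                   # only used when op == "num"
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def evaluate(node, follows):
    """Evaluate the tree, counting one pointer follow per node reached."""
    follows[0] += 1
    if node.op == "num":
        return node.value
    lhs = evaluate(node.left, follows)
    rhs = evaluate(node.right, follows)
    return lhs + rhs if node.op == "add" else lhs * rhs


# The expression (1 + 2) * 3 as a tree of separately allocated nodes.
tree = Node(
    "mul",
    left=Node("add", left=Node("num", 1), right=Node("num", 2)),
    right=Node("num", 3),
)

follows = [0]
result = evaluate(tree, follows)

# Simplified bounds: every follow misses the cache vs. every follow hits it.
worst_case_ns = follows[0] * MAIN_MEMORY_NS
best_case_ns = follows[0] * CACHE_NS

print(result, follows[0], worst_case_ns, best_case_ns)
```

Even this three-operand expression needs five pointer follows, so under these assumed figures the gap between the all-miss and all-hit cases is already 500 ns versus 5 ns.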