Resources
- Crafting Interpreters
- A Map of the Territory · Crafting Interpreters
- Compiler Explorer
- Writing a C Compiler, Part 1 and other parts by Nora Sandler
- A Guide To Parsing Algorithms And Terminology
- Introduction to Compilers and Language Design by Douglas Thain (PDF)
- some old school, OG stuff
- Let’s Build a Compiler, by Jack Crenshaw - This fifteen-part series, written from 1988 to 1995, is a non-technical introduction to compiler construction.
- Compiler Basics - quite succinct e.g. File 2 on Basic Language Theory, Syntax, Semantics, Chomsky Normal Form, Extended Backus Naur Form (EBNF) and Computer Languages
- Compilers and Interpreters HackerNoon
- A crash course in compilers – Increment Programming Languages
- ANTLR (ANother Tool for Language Recognition) is a powerful parser generator for reading, processing, executing, or translating structured text or binary files. It’s widely used to build languages, tools, and frameworks. From a grammar, ANTLR generates a parser that can build and walk parse trees.
- ANTLR Lab - web applet to try ANTLR
- Docs
- Watch a lecture by Dr. Parr
- Introduction to yacc
Concepts
C Compiler Series by Nora Sandler
Series of posts on Writing a C Compiler by Nora Sandler
- C Compiler, Part 10: Global Variables
- C Compiler, Part 9: Functions
- C Compiler, Part 8: Loops
- Writing a C Compiler, Part 7
- Writing a C Compiler, Part 6
- Writing a C Compiler, Part 5
- Writing a C Compiler, Part 4
- Writing a C Compiler, Part 3
- Writing a C Compiler, Part 2
- Writing a C Compiler, Part 1
Compiler Optimizations
Compiler optimisations aim to improve the performance of generated code by transforming the source code or intermediate representations. Here are some common compiler optimizations:
- Constant Folding: Evaluating constant expressions at compile-time instead of runtime.
- Dead Code Elimination: Removing code that is never executed or has no effect on the program’s output.
- Common Subexpression Elimination: Identifying and eliminating redundant computations.
- Loop Optimization:
- Loop-Invariant Code Motion: Moving computations outside loops if they do not depend on loop variables.
- Loop Unrolling: Duplicating loop bodies to reduce loop overhead and increase instruction-level parallelism.
- Loop Fusion: Combining consecutive loops to reduce memory accesses and improve cache utilization.
- Loop Interchange: Reordering nested loops to improve spatial locality.
- Loop Vectorization: Transforming scalar loops into SIMD (Single Instruction, Multiple Data) instructions for parallel execution on vector units.
- Data Flow Analysis:
- Constant Propagation: Replacing variables with their known constant values.
- Copy Propagation: Replacing variables with their assigned values.
- Common Subexpression Elimination: Identifying and eliminating redundant computations.
- Dead Code Elimination: Removing unreachable code.
- Register Allocation: Assigning variables to processor registers for faster access.
- Inline Expansion: Replacing function calls with the actual function code to reduce function call overhead.
- Strength Reduction: Replacing expensive operations with cheaper ones (e.g., replacing multiplication with addition or shift operations).
- Control Flow Optimization:
- Branch Prediction: Rearranging code to improve the accuracy of branch prediction.
- Tail Recursion Elimination: Replacing tail-recursive function calls with loops to reduce stack usage.
- Loop Parallelization: Transforming loops to execute in parallel on multiple processors or cores.
- Function Inlining: Inlining the code of small functions directly into the calling code to avoid function call overhead.
- Interprocedural Optimization: Optimizing across multiple functions or translation units.
- Memory Optimization:
- Loop Blocking: Partitioning arrays to fit into cache and exploit spatial locality.
- Loop Tiling: Dividing loops into smaller blocks to improve cache utilization.
- Memory Alias Analysis: Analyzing memory references to determine if they overlap.
- Memory Copy Propagation: Replacing redundant memory copies with direct memory assignments.
These optimizations represent a subset of the many techniques employed by modern compilers to improve code performance. The exact set of optimizations may vary depending on the compiler and its configuration.