Implementation of Programming Languages
Binding refers to finding certain attributes of certain objects. There are 6 steps involved here:
- Conceptualization
- Coding
- Compiling
- Linking - Obtain addresses of external functions, data
- Loading - Actual addresses of code and data
- Execution - Variable values, user inputs are obtained. There are no unbound objects at execution stage
We are concerned with the “binding time” taken between the coding and compiling stages.
Implementation Mechanisms
There are two methods;
-
Translator which translates the source program into a target program, after which the machine executes the target program. Two distinct steps are present here, translation and execution. Inputs are taken at execution and not needed at translation.
-
Interpreter is a program which takes both source program and input data to execute the code.
Note that the source program is never executed directly on the machine in either case. This is because there is a “gap” between the “levels” between the source program and execution. That is, the source code is high-level but the machine operates on low-level.
Level | State | Operations |
---|---|---|
High Level | Variable values | Expressions, control flow |
Low Level | Register values | Machine Instructions |
-
Translator reduces the level of specification to execution level.
-
Interpreter raises execution level to specification level. The program (interpreter) executes the code, not the system.
Translation | Interpretation | |
---|---|---|
Analysis + Synthesis | Analysis + Execution | |
Bindings before runtime | Bindings at runtime | |
Turn around time | compilation + execution | analysis + execution |
The turnaround time for interpreters is faster, but compilation needs to be done only once for multiple executions, saving time.
Language | Method |
---|---|
C, C++ | Compilation (Backend) |
Java, C#, Python | Interpretation + JIT Compilation (Virtual Machine) |
JIT (Just In Time) Compilation is where a repeated part of the byte code is compiled by the virtual machine for faster execution. Do note that this is done in runtime.
Frontend - Language Backend - The machine ISA
$m \text{ frontends } + n\text{ backends } = m\times n \text{ compilers}$
Compiler Structure
The first two steps in compilation are scanning and parsing. Parsing creates a “concrete Parse tree” with terminal and non-terminal nodes.
An abstract syntax tree (AST) is constructed by the semantic analyzer to ensure that there are no type errors.
The abstract syntax tree is then used to construct the TAC list (Three Address Code). TAC stores the intermediary variables. This linearizes the control flow by flattening the nested control constructs.
RTL List converts the instructions in the TAC list to the ISA. Addresses are still unknown at this stage. RTL List is converted to the Assembly Code wherin proper assembly language with register addresses and offsets using ISA.