Summary of Chapter 6


Pick an intermediate representation: An intermediate representation is typically some combination of a graphical notation and three-address code. As in syntax trees, a node in a graphical notation represnets a construct; the children of a node represent its subconstructs. Three address code takes its name from instructions of the form x = y op z, with at most one operator per instruction. There are additional instructions for control flow.

Translate expressions: Expression with built-up operations can be unwound into a sequence of individual operations by attaching actions to each production of the form E -> E1 op E2. The action either creates a node for E with the nodes for E1 and E2 as children, or it generates a three-address instruciton that applies op to the addresses for E1 and E2 and puts the result into a new temporary name, which becomes the address for E.

Check types: The type of an expression E1 op E2 is determined by the operator op and the types of E1 and E2. A coercion is an implicit type conversion, such as from integer to float. Intermediate code contains explicit type conversions to ensure an exact match between operand types and the types expected by an operator.

Use a symbol table to implement declarations: A declaration specifies the type of a name. The width of a type is the amount of storage needed for a name with that type. Using widths, the relative address of a name at run time can be computed as an offset from the start of a data area. The type and relative address of a name are put into the symbol table due to a declaration, so the translator can subsequently get them when the name appears in an expression.

Flatten arrays: For quick access, array elements are stored in consecutive locations. Arrays of arrays are flattened so they can be treated as a one-dimensional array of individual elements. The type of an array is used to calculate the address of an array element relative to the base of the array.

Generate jumping code for boolean expressions: In short-circuit or jumping code, the value of a boolean expression is implicit in the position reached in the code. Jumping code is useful because a boolean expression is typically used for control flow, as in if (B S. Boolean values can be computed by jumping to t = true or t = false, as appropriate, where t is a temporary name. Using labels for jumps, a boolean expression can be translated by inheriting labels corresponding to its true and false exits. The constants true and false translate into a jump to the true and false exits, respectively.

Implementing statements using control flow: Statements can be translated by inheriting a label next, where next marks the first instruction after the code for this statement. The conditional S -> if (B) S1 can be translated by attaching a new label marking the beginning of the code for S1 and passing the new label and for the true and false exits, respectively.

Alternatively, use backpatching: Backpatching is a technique for generating code for boolean expressions and statements in one pass. The idea is to maintain lists of incomplete jumps, where all the jump instruction on a list have the same target. When the target becomes known, all the instructions on its list are completed by filling in the target.

Implement records: Field names in a record or class can be treated as a sequence of declarations. A record type encodes the types and relative addresses of the fields. A symbol table object can be used for this purpose.