Sunday, March 14, 2010

Building a simple compiler series(4)

Symbol Table
Symbol table is a mechanical that associate the attribute with identifiers. Just like we declare a variable in our source code, then we can refer to this variable according to the name, like temp or x. When compiling the source code, compiler would scan and parser the code and construct a table for the symbols.
There are some choices for data structure of the symbol table.
  1. Liner list
  2. Various Binary Search Tree
  3. Hash Table
Liner list is the simplest way. You only need a array or a linked list. But this way is inefficiency. The principle symbol table operation are insert , lookup and delete, other operation may be necessary. Take the liner list as an example, there are n symbols in this list.The average operation for inserting is constant O(1), the average operation for lookup is O(n).If the list is array, the average operation for delete is O(n) and you also need shift the array.
Binary Search tree is efficiency than liner list. The best average operation for insert and lookup is O(log(n)). But deleting is more complicated than that of liner list. If your language is not so complicated and the number of symbol is small, you can use the liner list or binary search tree.
Actually, the hash table is the choice. In fact, most of commercial  compiler use hash table to implement their symbol table in practice.  The hash table can provide constant time for operation.
Collisions is one of the most problems hash table meet. Perhaps the best scheme for compiler construction is open addressing, called separate chaining. In this method, each bucket actually a liner list. When collisions happen, a new item would be inserted into the bucket list.
Declarations
The behavior of a symbol table depends heavily on the properties of declarations of the language being translated. There are four basic kind of declarations in programming language:constant declarations, type declarations,variable declarations and function declarations.

No comments:

Post a Comment