Semantic analysis

Semantic analysis checks that the code makes programmatic sense by making sure that variables, constructs, and expressions are used correctly according to the programming language’s rules.

This is accomplished by using the parse tree (generated during syntax analysis) and a symbol table to perform semantic checks.

Symbol Tables

A symbol (identifier) table is generated during semantic analysis. It stores the attributes of each identifier, including:

  • Name
  • Data type
  • Scope: where in the source code it was declared (global or within a function)

For example, for the line of code below,

            x = 150;

the following table entry may be generated:

Identifier | Type | Value | Memory Location | Scope
x          | int  | 150   | 0x1001          | Global

Note that in practice, symbol tables may store more metadata; we have simplified the concept for easy understanding.
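To make the idea concrete, here is a minimal sketch in C of how such a table could be represented. It assumes fixed-size text fields and a simple linear search; the names Symbol, SymbolTable, add_symbol and find_symbol are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_SYMBOLS 64   /* arbitrary capacity for this sketch */

/* One row of the simplified table: identifier, type, value, location, scope. */
typedef struct {
    char name[32];
    char type[16];
    char value[32];        /* kept as text so any type can be shown */
    unsigned int address;  /* illustrative memory location */
    char scope[32];
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    int count;
} SymbolTable;

/* Append a row and return a pointer to the stored entry. */
Symbol *add_symbol(SymbolTable *t, const char *name, const char *type,
                   const char *value, unsigned int address, const char *scope) {
    assert(t->count < MAX_SYMBOLS);
    Symbol *s = &t->entries[t->count++];
    snprintf(s->name, sizeof s->name, "%s", name);
    snprintf(s->type, sizeof s->type, "%s", type);
    snprintf(s->value, sizeof s->value, "%s", value);
    s->address = address;
    snprintf(s->scope, sizeof s->scope, "%s", scope);
    return s;
}

/* Find an identifier by name; returns NULL if it is not in the table. */
const Symbol *find_symbol(const SymbolTable *t, const char *name) {
    for (int i = 0; i < t->count; i++) {
        if (strcmp(t->entries[i].name, name) == 0) {
            return &t->entries[i];
        }
    }
    return NULL;
}
```

With a zero-initialised SymbolTable, calling add_symbol(&table, "x", "int", "150", 0x1001, "Global") stores the row shown above, and find_symbol(&table, "x") retrieves it.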

Symbol Table Example With A Full Program

Code
#include <stdio.h>

float PI = 3.14159265359;  // Define PI as a variable
//function declaration was omitted for simplicity.
float calculateArea(float radius) {
    return PI * radius * radius;
}

int main() {
    float radius;

    printf("Enter the radius of the circle: ");
    scanf("%f", &radius);

    float area = calculateArea(radius);
    
    printf("The area of the circle with radius %.2f is %.2f\n", radius, area);

    return 0;
}

Symbol Table

The following simplified symbol table is generated:

Identifier    | Type    | Value         | Memory Location | Scope
PI            | float   | 3.14159265359 | 0x1001          | Global
calculateArea | float() |               |                 | Global
main          | int()   |               |                 | Global
radius        | float   |               | 0x1002          | main
area          | float   |               | 0x1003          | main
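The Scope column is what lets a compiler decide which identifiers are visible at a given point in the program. The sketch below illustrates one possible lookup rule, assumed here for illustration: prefer an entry declared in the current function, otherwise fall back to a Global entry. The names Entry, program_table and resolve are hypothetical, and only the Identifier, Type and Scope columns are copied from the simplified table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    const char *type;
    const char *scope;
} Entry;

/* Rows taken from the simplified table (values and addresses omitted). */
static const Entry program_table[] = {
    {"PI",            "float",   "Global"},
    {"calculateArea", "float()", "Global"},
    {"main",          "int()",   "Global"},
    {"radius",        "float",   "main"},
    {"area",          "float",   "main"},
};

/* Resolve a name as seen from inside current_scope:
   a local entry wins, otherwise a Global entry is used.
   Returns NULL if the name is not visible from that scope. */
const Entry *resolve(const char *name, const char *current_scope) {
    assert(name != NULL && current_scope != NULL);
    const Entry *global = NULL;
    size_t n = sizeof program_table / sizeof program_table[0];
    for (size_t i = 0; i < n; i++) {
        const Entry *e = &program_table[i];
        if (strcmp(e->name, name) != 0) {
            continue;
        }
        if (strcmp(e->scope, current_scope) == 0) {
            return e;               /* declared in the current function */
        }
        if (strcmp(e->scope, "Global") == 0) {
            global = e;             /* remember the global fallback */
        }
    }
    return global;
}
```

Under this rule, resolve("PI", "calculateArea") finds the global PI, while resolve("area", "calculateArea") returns NULL because area belongs to main.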

Taking an overview of how these checks are performed, we observe some facts of compiler design:

  • All parse trees and symbol tables are checked to see if they obey the rules of the language.
  • There are rules for the structure of the parse tree.
  • There are rules for the symbol table.
  • Together, these rules enforce the semantic rules of the language.

In this set of examples, we consider a hypothetical compiler of our own design.

Recall the parse tree generated from example 1:

Line 1: Real D,b,a,c;

Line 2: D = b*b - 4*a*c;

The tokens making up line 2 (shown below)…

D = b * b - 4 * a * c ;

…will generate the following tree:

[Figure: parse tree for D = b*b - 4*a*c]

Example 2 – Check Symbol Tables for consistency

Another possible semantic check is to see whether every identifier token has been declared. Suppose that, in example 1, we had made a typo on line 2:

Line 1: Real D,b,a,c;

Line 2: D = bb - 4*a*c;

Because of the typo, our tokens would read as:

D = bb - 4 * a * c ;

At this stage, we should be able to check that each identifier was declared. In this case, when we check our symbol table, we see that bb was not declared. The compiler can now record an error for this line stating that bb was not declared.
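This check can be sketched in C as follows. The sketch assumes the tokens arrive as an array of strings and that any token starting with a letter is an identifier; the names declared, is_declared, is_identifier and check_declared are hypothetical, and the declared list stands in for a real symbol table.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Identifiers declared on line 1: Real D,b,a,c; */
static const char *declared[] = {"D", "b", "a", "c"};

static int is_declared(const char *name) {
    for (size_t i = 0; i < sizeof declared / sizeof declared[0]; i++) {
        if (strcmp(declared[i], name) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Assumption of this sketch: a token starting with a letter is an identifier. */
static int is_identifier(const char *tok) {
    return isalpha((unsigned char)tok[0]) != 0;
}

/* Check every identifier token on one line.
   Writes a message for the first undeclared name into msg and
   returns the number of undeclared identifiers found. */
int check_declared(const char *tokens[], int n, int line,
                   char *msg, size_t msg_size) {
    assert(msg != NULL && msg_size > 0);
    int errors = 0;
    for (int i = 0; i < n; i++) {
        if (is_identifier(tokens[i]) && !is_declared(tokens[i])) {
            if (errors == 0) {
                snprintf(msg, msg_size, "Line %d: %s was not declared",
                         line, tokens[i]);
            }
            errors++;
        }
    }
    return errors;
}
```

Passing the ten tokens of the mistyped line 2 with line number 2 returns 1 and fills msg with "Line 2: bb was not declared".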
 

Example 3 – Check Parse tree for invalid operands (Type Mismatch)

Consider the following code:

Line 1: Real D,b,a,c;

Line 2: D = "Fred";

Line 2 produces the tokens:

D = "Fred" ;

which produce the tree:

[Figure: parse tree for D = "Fred"]

Note that D is of type Real. We could implement a new rule for our compiler:

All child nodes on the same level must have the same data type found in the symbol table.

Thus, after checking our tree, we can record a type mismatch error for the current line.
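A minimal sketch of this rule in C is shown below. It assumes a binary tree in which the assignment operator is the parent of D and "Fred", and in which every leaf already carries the type found in the symbol table (or the type of the literal). The names DataType, Node and check_types are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TYPE_REAL, TYPE_STRING } DataType;

typedef struct Node {
    const char *text;           /* token text, e.g. "D" or "=" */
    DataType type;              /* leaves only: type from the symbol table or literal */
    struct Node *left, *right;  /* both NULL for a leaf */
} Node;

/* Apply the rule "child nodes on the same level must have the same type".
   Returns 1 if the subtree is consistent and stores its type in *result;
   returns 0 as soon as two sibling nodes disagree.
   Assumption: every non-leaf node has exactly two children. */
int check_types(const Node *n, DataType *result) {
    assert(n != NULL && result != NULL);
    if (n->left == NULL && n->right == NULL) {
        *result = n->type;      /* leaf: use its recorded type */
        return 1;
    }
    DataType left_type, right_type;
    if (!check_types(n->left, &left_type) ||
        !check_types(n->right, &right_type)) {
        return 0;               /* mismatch deeper in the tree */
    }
    if (left_type != right_type) {
        return 0;               /* siblings disagree: type mismatch */
    }
    *result = left_type;        /* an operator node takes its children's type */
    return 1;
}
```

For the tree of D = "Fred", the children of the assignment are of type Real and string, so check_types returns 0 and the compiler can record the mismatch.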

Symbol Table and Parse Tree Analysis Summary

  1. All symbol tables and trees are checked against rules that identify violations of the source language's rules.
  2. Every time an error is found, its line location is recorded with an appropriate message.
  3. After all symbols and trees are checked, if any errors were recorded, generate a list of errors and stop compilation.
  4. If no errors were recorded, move on to the next step: intermediate code generation.
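The bookkeeping described in steps 2 to 4 can be sketched in C as follows, assuming a fixed-capacity list of errors. The names SemanticError, ErrorList, record_error and finish_semantic_analysis are hypothetical and only illustrate the flow.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_ERRORS 32   /* arbitrary capacity for this sketch */

typedef struct {
    int line;
    char message[80];
} SemanticError;

typedef struct {
    SemanticError items[MAX_ERRORS];
    int count;
} ErrorList;

/* Step 2: remember the line and a message for each error found. */
void record_error(ErrorList *list, int line, const char *message) {
    assert(list->count < MAX_ERRORS);
    SemanticError *e = &list->items[list->count++];
    e->line = line;
    snprintf(e->message, sizeof e->message, "%s", message);
}

/* Steps 3 and 4: print every recorded error, then report whether
   compilation may continue (1) or must stop (0). */
int finish_semantic_analysis(const ErrorList *list, FILE *out) {
    for (int i = 0; i < list->count; i++) {
        fprintf(out, "Line %d: %s\n",
                list->items[i].line, list->items[i].message);
    }
    return list->count == 0;
}
```

A caller would start from a zero-initialised ErrorList, call record_error from each check, and proceed to intermediate code generation only if finish_semantic_analysis returns 1.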

© 2024  Vedesh Kungebeharry. All rights reserved. 
