Angel Cui: lc3542
Nikolaus Holzer: nh2677
For hw3 grading, please refer to the Code Generation section.
- Make sure you are in the viper directory.
- Run `chmod +x vpc.sh` to ensure executable access to the shell script.
- Run `source ./vpc.sh`. This will activate a virtual environment called viper, set you up with the required dependencies, run the five code examples, and return the outputs. If `source ./parser.sh` fails, for grading please run `python vpc.py`, which will run the Viper compiler on the ex{num}.vp files in ./examples and generate the corresponding ex{num}.py code.
- You can then run the generated .py code and compare its output with the expected output. We ran the code and compared the output in demo video 2.
Please find the demo video at: https://drive.google.com/drive/folders/1Dgx0PggO9zZVhKScSzEIXWRdzvQyOG7i?usp=sharing
The demo video folder is open to CU email accounts. Please log in to see the videos or request access; we will grant access as soon as we can.
Our code generator converts the AST into correct Python code. We check compile-time type consistency in the Viper code and then translate it to Python. Please refer to ex{num}_correct.py for the expected output of the ex{num}.vp Viper code. Every provided example comes with its expected Python output and expected behavior.
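The comparison between generated and expected files can also be scripted. The following is only a minimal sketch, not part of the repository: `run_script` and `compare_examples` are hypothetical helper names, and it assumes each generated ex{num}.py sits next to its ex{num}_correct.py in the same directory.

```python
import subprocess
import sys
from pathlib import Path


def run_script(path: Path) -> str:
    """Run a Python file with the current interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, str(path)], capture_output=True, text=True, check=False
    )
    return result.stdout


def compare_examples(directory: Path) -> dict:
    """Map each ex{num} name to True if its output equals the expected output."""
    results = {}
    if not directory.is_dir():
        return results
    for expected in sorted(directory.glob("ex*_correct.py")):
        # Assumption: ex1_correct.py is the reference for ex1.py, and so on.
        generated = expected.with_name(expected.name.replace("_correct", ""))
        if generated.exists():
            results[generated.stem] = run_script(generated) == run_script(expected)
    return results


if __name__ == "__main__":
    # Assumption: both kinds of file live in ./examples.
    for name, ok in compare_examples(Path("examples")).items():
        print(f"{name}: {'match' if ok else 'MISMATCH'}")
```

Only standard output is compared here; exit codes and standard error are ignored, which may or may not be what a grader wants.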
For hw2 grading, please refer to the Tokenizing section.
- Make sure you are in the viper directory.
- Run `chmod +x parser.sh` to ensure executable access to the shell script.
- Run `source ./parser.sh`. This will activate a virtual environment called viper, set you up with the required dependencies, run the five code examples, and return the outputs. If `source ./parser.sh` fails, for grading please run `python parser`, which will execute the `__main__.py` file in the scanner dir.
Viper -> StatementList
StatementList -> Statement StatementList | ε
Statement -> TypeDeclaration Statement' | ExpressionStatement
Statement' -> <VAR> <ASSIGN> Expression <SEMICOLON> // VariableDeclaration
| <DEF> <FUNC> <LPAREN> ParameterList <RPAREN> <PYTHON_CODE, :> FunctionBody // FunctionDefinition
TypeDeclaration -> <TYPE> <TYPE_DEC>
ParameterList -> Parameter ParameterListRest | ε
ParameterListRest -> <PYTHON_CODE, ,> Parameter ParameterListRest | ε
Parameter -> TypeDeclaration <VAR>
FunctionBody -> <LBRACE> StatementList ReturnStatement <RBRACE>
ReturnStatement -> <PYTHON_CODE, return> ExpressionStatement <SEMICOLON> | ε
ExpressionStatement -> Expression <SEMICOLON> | Expression
Expression -> SimpleExpression ExpressionPrime
ExpressionPrime -> SimpleExpression ExpressionPrime | ε
SimpleExpression -> PYTHON_CODE | VAR | FunctionCall | Loop | ParenthesizedExpression | Range | OP
Range -> <LPAREN> Python Var Python Var Range <RPAREN> | <LPAREN> Python Var Python Var Range <RPAREN> <SEMICOLON> | ε
Python -> <PYTHON_CODE> Python | ε
Var -> <VAR> Var | ε
ArithmeticExpression -> Expression <OP> Expression
FunctionCall -> <FUNC> <LPAREN> ArgumentList <RPAREN> <SEMICOLON>
ArgumentList -> Expression ArgumentListRest | ε
ArgumentListRest -> <PYTHON_CODE, ,> Expression ArgumentListRest | ε
Loop -> <PYTHON_CODE, for> <PYTHON_CODE> <PYTHON_CODE, in> Python <LBRACE> StatementList <RBRACE>
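To show how rules like these map onto a recursive-descent parser, here is a minimal sketch covering only `TypeDeclaration`, `Parameter`, and `ParameterList`. It is an illustration under assumptions, not our actual parser: the class `ParameterParser`, its `match` helper, and the tuple shapes of the returned nodes are hypothetical, and the right-recursive `ParameterListRest` rule is written as a loop.

```python
from typing import List, Optional, Tuple

Token = Tuple[str, str]  # (token class, lexeme), e.g. ("TYPE", "int")


class ParameterParser:
    """Hypothetical mini-parser covering only the ParameterList rules."""

    def __init__(self, tokens: List[Token]) -> None:
        self.tokens = tokens
        self.position = 0

    def match(self, kind: str, lexeme: Optional[str] = None) -> Optional[Token]:
        # Consume the next token if it has the expected class (and lexeme).
        if self.position < len(self.tokens):
            tok = self.tokens[self.position]
            if tok[0] == kind and (lexeme is None or tok[1] == lexeme):
                self.position += 1
                return tok
        return None

    def parse_type_declaration(self):
        # TypeDeclaration -> <TYPE> <TYPE_DEC>
        start = self.position
        type_tok = self.match("TYPE")
        if type_tok is not None and self.match("TYPE_DEC") is not None:
            return ("TypeDeclaration", type_tok[1])
        self.position = start  # backtrack on failure
        return None

    def parse_parameter(self):
        # Parameter -> TypeDeclaration <VAR>
        start = self.position
        type_decl = self.parse_type_declaration()
        if type_decl is not None:
            var = self.match("VAR")
            if var is not None:
                return ("Parameter", type_decl, var[1])
        self.position = start  # backtrack on failure
        return None

    def parse_parameter_list(self):
        # ParameterList -> Parameter ParameterListRest | ε
        params = []
        first = self.parse_parameter()
        if first is None:
            return ("ParameterList", params)  # the ε alternative
        params.append(first)
        # ParameterListRest -> <PYTHON_CODE, ,> Parameter ParameterListRest | ε
        while True:
            start = self.position
            if self.match("PYTHON_CODE", ",") is None:
                break
            param = self.parse_parameter()
            if param is None:
                self.position = start  # give the comma back
                break
            params.append(param)
        return ("ParameterList", params)


tokens = [
    ("TYPE", "int"), ("TYPE_DEC", "::"), ("VAR", "a"), ("PYTHON_CODE", ","),
    ("TYPE", "int"), ("TYPE_DEC", "::"), ("VAR", "b"), ("RPAREN", ")"),
]
parser = ParameterParser(tokens)
tree = parser.parse_parameter_list()
print(tree)
```

Each `parse_*` method saves the current position and restores it on failure, so a failed alternative leaves the token stream untouched for the next one.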
Our parser handles syntactic errors. We implement a stack-based error handler that creates a new local error context for each recursive call and only propagates errors that stem from true syntactic errors. In this way, our parser does not report errors that merely come from the regular tree search of the recursive parser. The relevant code is shown below.

    """Error handling: stack management"""
    def push_error_context(self):
        # Create a new temporary error context on the stack
        self.error_context_stack.append([])

    def pop_error_context(self, success):
        # Pop the top error context and commit to main error list if unsuccessful
        if self.error_context_stack:
            temp_errors = self.error_context_stack.pop()
            if not success and temp_errors:
                self.err.extend(temp_errors)

    def add_error(self, message):
        # token = self.current_token()
        token = None
        error = ParserError(message, self.position, token)
        if self.error_context_stack:
            # Record the error in the innermost local context
            self.error_context_stack[-1].append(error)
        else:
            self.err.append(error)

    """Error handling: Local error context"""
    def parse_statement(self):
        pos = self.position
        self.push_error_context()
        # First alternative: TypeDeclaration Statement'
        type_decl = self.parse_type_declaration()
        if type_decl is not None:
            stmt_prime = self.parse_statement_prime()
            if stmt_prime is not None:
                self.pop_error_context(True)
                return ("Statement", type_decl, stmt_prime)
            self.add_error("Expected StatementPrime after TypeDeclaration")
        # Second alternative: backtrack and try an ExpressionStatement
        self.position = pos
        expr_stmt = self.parse_expression_statement()
        if expr_stmt is not None:
            self.pop_error_context(True)
            return ("Statement", expr_stmt)
        # Both alternatives failed: commit the collected errors
        self.pop_error_context(False)
        self.add_error("Expected a TypeDeclaration or ExpressionStatement")
        return None

We show examples that illustrate what parsed Viper code looks like. Only the short examples are included below; when you run our scanner as instructed above, you will see the full list of inputs and outputs (same as the expected output).
Running test case 1:
Code:
int :: x_a = 10; list :: y = range(0,x_a); for i in y: { print(i); };
Tokens:
<TYPE, int>, <TYPE_DEC, ::>, <VAR, x_a>, <ASSIGN, =>, <PYTHON_CODE, 10>,
<SEMICOLON, ;>, <TYPE, list>, <TYPE_DEC, ::>, <VAR, y>, <ASSIGN, =>,
<TYPE, range>, <LPAREN, (>, <PYTHON_CODE, 0>, <PYTHON_CODE, ,>, <VAR, x_a>,
<RPAREN, )>, <SEMICOLON, ;>, <PYTHON_CODE, for>, <PYTHON_CODE, i>,
<PYTHON_CODE, in>, <PYTHON_CODE, y:>, <LBRACE, {>, <PYTHON_CODE, print>,
<LPAREN, (>, <PYTHON_CODE, i>, <RPAREN, )>, <SEMICOLON, ;>, <RBRACE, }>, <SEMICOLON, ;>
AST:
- VariableDeclaration
  - TypeDeclaration
    - <TYPE, int> <TYPE_DEC, ::>
  - Statement'
    - <VAR, x_a> <ASSIGN, => <PYTHON_CODE, 10> <SEMICOLON, ;>
- VariableDeclaration
  - TypeDeclaration
    - <TYPE, list> <TYPE_DEC, ::>
  - Statement'
    - <VAR, y> <ASSIGN, => <TYPE, range> <LPAREN, (> <PYTHON_CODE, 0> <PYTHON_CODE, ,> <VAR, x_a> <RPAREN, )> <SEMICOLON, ;>
- ExpressionStatement
  - Expression
    - Loop
      - <PYTHON_CODE, for> <PYTHON_CODE, i> <PYTHON_CODE, in> <PYTHON_CODE, y:> <LBRACE, {> <PYTHON_CODE, print> <LPAREN, (> <PYTHON_CODE, i> <RPAREN, )> <SEMICOLON, ;> <RBRACE, }>
  - <SEMICOLON, ;>
----------------------------------------
Running test case 2: (Error caught in parsing: python_code type_dec is not valid)
Code:
str :: def say_hello_world(){ string :: text = 'hello world'; print(text);};
Tokens:
<TYPE, str>, <TYPE_DEC, ::>, <DEF, def>, <FUNC, say_hello_world>, <LPAREN, (>,
<RPAREN, )>, <LBRACE, {>, <PYTHON_CODE, string>, <TYPE_DEC, ::>, <VAR, text>,
<ASSIGN, =>, <PYTHON_CODE, 'hello>, <PYTHON_CODE, world'>, <SEMICOLON, ;>,
<PYTHON_CODE, print>, <LPAREN, (>, <VAR, text>, <RPAREN, )>, <SEMICOLON, ;>,
<RBRACE, }>, <SEMICOLON, ;>
AST:
- VariableDeclaration
  - TypeDeclaration
    - <TYPE, str> <TYPE_DEC, ::>
  - Statement'
    - <DEF, def> <FUNC, say_hello_world> <LPAREN, (> <RPAREN, )> <LBRACE, {> <PYTHON_CODE, string> ERROR (Here there should be an error)
----------------------------------------
Running test case 3:
Code:
int :: def func(int :: a, int :: b):{ int :: c = a + b; return c;}
Tokens:
<TYPE, int>, <TYPE_DEC, ::>, <DEF, def>, <FUNC, func>, <LPAREN, (>,
<TYPE, int>, <TYPE_DEC, ::>, <VAR, a>, <PYTHON_CODE, ,>, <TYPE, int>,
<TYPE_DEC, ::>, <VAR, b>, <RPAREN, )>, <PYTHON_CODE, :>, <LBRACE, {>,
<TYPE, int>, <TYPE_DEC, ::>, <VAR, c>, <ASSIGN, =>, <VAR, a>,
<OP, +>, <VAR, b>, <SEMICOLON, ;>, <PYTHON_CODE, return>, <VAR, c>,
<SEMICOLON, ;>, <RBRACE, }>
AST:
- VariableDeclaration
  - TypeDeclaration
    - <TYPE, int> <TYPE_DEC, ::>
  - Statement'
    - <DEF, def> <FUNC, func> <LPAREN, (>
    - ParameterList
      - <TYPE, int> <TYPE_DEC, ::> <VAR, a> <PYTHON_CODE, ,> <TYPE, int> <TYPE_DEC, ::> <VAR, b>
    - <RPAREN, )>
    - FunctionBody
      - <LBRACE, {>
      - StatementList
        - TypeDeclaration
          - <TYPE, int> <TYPE_DEC, ::>
        - Statement'
          - <VAR, c> <ASSIGN, =>
          - ExpressionStatement
            - ArithmeticExpression
              - <VAR, a> <OP, +> <VAR, b>
            - <SEMICOLON, ;>
      - ReturnStatement
        - <PYTHON_CODE, return>
        - ExpressionStatement
          - Expression
            - <VAR, c>
          - <SEMICOLON>
      - <RBRACE, }>
----------------------------------------
- Make sure you are in the viper directory.
- Run `chmod +x scanner.sh` to ensure executable access to the shell script.
- Run `source ./scanner.sh`. This will activate a virtual environment called viper, set you up with the required dependencies, run the five code examples, and return the outputs. If `source ./scanner.sh` fails, for grading please run `python scanner`, which will execute the `__main__.py` file in the scanner dir.
For grading, please refer to lexical_dfa.jpg for the DFA of the scanner.
This will run the 5 examples shown below and parse them.
Note: we include only 5 simple examples below because the full output is too long for the README. The script prints more complicated examples together with their expected output. The expected output below is formatted differently from the printed output so that it reads better in the README.
We define thirteen new token classes that the viper tokenizer recognizes.
- `<TYPE_DEC>`: type declaration token → "::"
  - The `::` token is used to declare the type of a variable or return value. It serves as the separator between variable names and their type annotations.
  - Example: `int :: x = 1;`
- `<TYPE>`: Type tokens → "Any", "bool", "Callable", "complex", "dict", "float", "frozenset", "int", "list", "Optional", "range", "set", "str", "tuple", "Union", "NoneType"
  - Tokens representing data types such as `str`, `float`, `list`, `tuple` are recognized and reserved.
  - Example: `list :: y = [];`
- `<SEMICOLON>`: Semicolon → ";"
  - The semicolon terminates logical lines, offering an alternative to Python's newline-delineated logical lines.
  - Example: `print("Hello world!");`
- `<LPAREN>`: Paren left → "("
  - Left parenthesis
- `<RPAREN>`: Paren right → ")"
  - Right parenthesis
- `<LCBRACE>`: Curly brace left → "{"
  - Curly braces define blocks of code (such as function bodies or control flow syntax) and replace Python's indentation-based syntax.
  - Example: `if a == True: {print("Hello world!");}`
- `<RCBRACE>`: Curly brace right → "}"
  - Curly braces define blocks of code (such as function bodies or control flow syntax) and replace Python's indentation-based syntax.
  - Example: `if a == True: {print("Hello world!");}`
- `<VAR>`: Variable name → "a"
  - Any variable name that comes after `<TYPE> <TYPE_DEC>`.
  - Example: `int :: a;`
- `<FUNC>`: Function name → "a"
  - Any function name that comes after `<DEF>`.
- `<DEF>`: Function definition → "def"
  - Function definition
- `<ASSIGN>`: Defines the `=` operator, which assigns values to variables.
- `<PYTHON_CODE>`: All regular Python tokens → special token for unchecked tokens; Viper relies on the Python interpreter for correctness.
- `<OP>`: all type operators → ["**", "*", "+", "-", "//", "/", "%"]
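As a rough illustration of how lexemes could be assigned to these classes, the sketch below classifies a list of lexemes that has already been split. It is not our DFA-based scanner: the function `classify`, the lookup tables, and the rule that a name counts as `VAR` once it has been declared are assumptions made for this example, and the brace classes use the names `LBRACE` and `RBRACE` as they appear in the printed token output.

```python
from typing import List, Tuple

TYPES = {
    "Any", "bool", "Callable", "complex", "dict", "float", "frozenset", "int",
    "list", "Optional", "range", "set", "str", "tuple", "Union", "NoneType",
}
OPS = {"**", "*", "+", "-", "//", "/", "%"}
# Lexemes that always map to a single token class.
FIXED = {
    "::": "TYPE_DEC", ";": "SEMICOLON", "(": "LPAREN", ")": "RPAREN",
    "{": "LBRACE", "}": "RBRACE", "def": "DEF", "=": "ASSIGN",
}


def classify(lexemes: List[str]) -> List[Tuple[str, str]]:
    """Assign a token class to each lexeme (hypothetical helper)."""
    tokens = []
    declared = set()  # variable names seen after <TYPE_DEC>
    prev = None       # class of the previous token
    for lexeme in lexemes:
        if lexeme in FIXED:
            kind = FIXED[lexeme]
        elif lexeme in TYPES:
            kind = "TYPE"
        elif lexeme in OPS:
            kind = "OP"
        elif prev == "DEF":
            kind = "FUNC"
        elif prev == "TYPE_DEC" and lexeme.isidentifier():
            kind = "VAR"
            declared.add(lexeme)
        elif lexeme in declared:
            kind = "VAR"
        else:
            # Default: pass the lexeme through as unchecked Python code.
            kind = "PYTHON_CODE"
        tokens.append((kind, lexeme))
        prev = kind
    return tokens


lexemes = [
    "int", "::", "def", "func", "(", "int", "::", "a", ",", "int", "::", "b",
    ")", ":", "{", "int", "::", "c", "=", "a", "+", "b", ";", "return", "c",
    ";", "}",
]
for kind, lexeme in classify(lexemes):
    print(f"<{kind}, {lexeme}>")
```

For this input the sketch yields the same token classes as test case 3 in the examples further down.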
Our lexical parser only handles errors directly related to malformed type declarations.
We implement panic mode for handling malformed lexemes of the types that we define in our lexical grammar. By default, our lexer passes all other lexemes through to the Python code class; any errors then appear when attempting to run the Python program.
We show examples that illustrate what tokenized Viper code looks like. Only the short examples are included below; when you run our scanner as instructed above, you will see the full list of inputs and outputs (same as the expected output).
Running test case 1:
Code:
int :: x_a = 10; list :: y = range(0,x_a); for i in y: { print(i); }
<TYPE, int>, <TYPE_DEC, ::>, <VAR, x_a>, <ASSIGN, =>, <PYTHON_CODE, 10>,
<SEMICOLON, ;>, <TYPE, list>, <TYPE_DEC, ::>, <VAR, y>, <ASSIGN, =>,
<TYPE, range>, <LPAREN, (>, <PYTHON_CODE, 0>, <PYTHON_CODE, ,>, <VAR, x_a>,
<RPAREN, )>, <SEMICOLON, ;>, <PYTHON_CODE, for>, <PYTHON_CODE, i>,
<PYTHON_CODE, in>, <PYTHON_CODE, y:>, <LBRACE, {>, <PYTHON_CODE, print>,
<LPAREN, (>, <PYTHON_CODE, i>, <RPAREN, )>, <SEMICOLON, ;>, <RBRACE, }>
----------------------------------------
Running test case 2:
Code:
str :: def say_hello_world(){ string :: text = 'hello world'; print(text);}
<TYPE, str>, <TYPE_DEC, ::>, <DEF, def>, <FUNC, say_hello_world>, <LPAREN, (>,
<RPAREN, )>, <LBRACE, {>, <PYTHON_CODE, string>, <TYPE_DEC, ::>, <VAR, text>,
<ASSIGN, =>, <PYTHON_CODE, 'hello>, <PYTHON_CODE, world'>, <SEMICOLON, ;>,
<PYTHON_CODE, print>, <LPAREN, (>, <VAR, text>, <RPAREN, )>, <SEMICOLON, ;>,
<RBRACE, }>
----------------------------------------
Running test case 3:
Code:
int :: def func(int :: a, int :: b):{ int :: c = a + b; return c;}
<TYPE, int>, <TYPE_DEC, ::>, <DEF, def>, <FUNC, func>, <LPAREN, (>,
<TYPE, int>, <TYPE_DEC, ::>, <VAR, a>, <PYTHON_CODE, ,>, <TYPE, int>,
<TYPE_DEC, ::>, <VAR, b>, <RPAREN, )>, <PYTHON_CODE, :>, <LBRACE, {>,
<TYPE, int>, <TYPE_DEC, ::>, <VAR, c>, <ASSIGN, =>, <VAR, a>,
<OP, +>, <VAR, b>, <SEMICOLON, ;>, <PYTHON_CODE, return>, <VAR, c>,
<SEMICOLON, ;>, <RBRACE, }>
----------------------------------------
Additionally, here are examples of how our program handles errors. Our scanner has two recovery strategies for different errors.
Incorrect type assignment error: We consider the easy mistake of omitting the space between a type and the declarative ::. If the scanner has consumed a valid type string and the next character is a : rather than a space, it inserts the whitespace and tokenizes correctly.
Code:
str:: t = 'hello world!';
<TYPE, str>, <TYPE_DEC, ::>, <VAR, t>, <ASSIGN, =>, <PYTHON_CODE, 'hello>,
<PYTHON_CODE, world!'>, <SEMICOLON, ;>
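The missing-space recovery described above can be approximated as a preprocessing step. This is only a sketch of the idea: our scanner performs the recovery while consuming characters, whereas the hypothetical `insert_missing_space` below swaps in a regular expression applied to the whole source string.

```python
import re

TYPES = [
    "Any", "bool", "Callable", "complex", "dict", "float", "frozenset", "int",
    "list", "Optional", "range", "set", "str", "tuple", "Union", "NoneType",
]
# A known type name at a word boundary, immediately followed by "::".
_MISSING_SPACE = re.compile(
    r"\b(" + "|".join(sorted(TYPES, key=len, reverse=True)) + r")::"
)


def insert_missing_space(source: str) -> str:
    """Insert the space between a type name and '::' when it is missing."""
    return _MISSING_SPACE.sub(r"\1 ::", source)


print(insert_missing_space("str:: t = 'hello world!';"))
# prints: str :: t = 'hello world!';
```

One limitation of this shortcut is that it also rewrites matches inside string literals, which a character-level scanner can avoid.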
Mixing ints and strs: We consider the case where a variable name starts with digits or, conversely, an integer includes letters. Our compiler resorts to panic mode, deleting str characters and _ until it reaches a valid token, which may be more ints or a whitespace.
Code:
int :: 123x_a;
<TYPE, int>, <TYPE_DEC, ::>, <VAR, x_a>, <SEMICOLON, ;>
Invalid characters: Our default error handling strategy for characters that are not valid in Python is to pass them through as Python code. Since Viper does not recognize them, the scanner assumes that they are valid Python code and tokenizes them. In this case, the Python interpreter will notify the programmer of the invalid input.
Code:
int :: a = @;
<TYPE, int>, <TYPE_DEC, ::>, <VAR, a>, <ASSIGN, =>, <PYTHON_CODE, @>, <SEMICOLON, ;>
More test cases, each producing the expected output for our language, are in scanner/__main__.py; you can see their output by running the shell script as instructed previously.
For developers: make sure you are in the viper directory and run `source ./setup.sh`.
This should activate a virtual environment called viper and set you up with required dependencies.