This project implements a compiler that processes input source code using lexical and syntactic analysis. For lexeme identification, it builds a Non-deterministic Finite Automaton (NFA) from each regular expression and converts it to a Deterministic Finite Automaton (DFA); parsing uses an LL(1) grammar.
./
│-- lexeme.txt
│-- codeinput.txt
│-- grammar.conf
│-- output.txt
src/compiler_project
│-- DFA.java
│-- DFA_State.java
│-- LL1_Grammer.java
│-- MakeNFA.java
│-- NFA.java
│-- NFA_State.java
│-- compiler_project.java
│-- make_Analysis_table.java
Lexical Analysis (NFA and DFA)
- MakeNFA.java: Converts regular expressions into NFAs.
- NFA.java & NFA_State.java: Handles NFA representation and state transitions.
- DFA.java & DFA_State.java: Converts NFAs to DFAs for efficient lexeme recognition.
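The NFA-to-DFA step is commonly done with the subset construction. The following is a minimal sketch of that technique, not the code in DFA.java: the class name `SubsetConstructionSketch`, the map-based NFA layout, and the use of `'\0'` as the epsilon marker are all assumptions made for illustration.

```java
import java.util.*;

/** Illustrative only: not the DFA/NFA classes from src/compiler_project. */
class SubsetConstructionSketch {
    static final char EPS = '\0'; // assumed marker for an epsilon transition

    /** Adds the edge from --c--> to; the NFA maps state -> symbol -> target states. */
    static void edge(Map<Integer, Map<Character, Set<Integer>>> nfa, int from, char c, int to) {
        nfa.computeIfAbsent(from, k -> new HashMap<>())
           .computeIfAbsent(c, k -> new TreeSet<>())
           .add(to);
    }

    /** All states reachable from the given states through epsilon edges only. */
    static Set<Integer> closure(Map<Integer, Map<Character, Set<Integer>>> nfa, Set<Integer> states) {
        Set<Integer> result = new TreeSet<>(states);
        Deque<Integer> work = new ArrayDeque<>(states);
        while (!work.isEmpty()) {
            int s = work.pop();
            for (int t : nfa.getOrDefault(s, Collections.emptyMap())
                            .getOrDefault(EPS, Collections.emptySet())) {
                if (result.add(t)) work.push(t);
            }
        }
        return result;
    }

    /** States reachable from the given states by consuming exactly one symbol c. */
    static Set<Integer> move(Map<Integer, Map<Character, Set<Integer>>> nfa, Set<Integer> states, char c) {
        Set<Integer> out = new TreeSet<>();
        for (int s : states) {
            out.addAll(nfa.getOrDefault(s, Collections.emptyMap())
                          .getOrDefault(c, Collections.emptySet()));
        }
        return out;
    }

    /** Subset construction: every DFA state is a set of NFA states. */
    static Map<Set<Integer>, Map<Character, Set<Integer>>> toDfa(
            Map<Integer, Map<Character, Set<Integer>>> nfa, int start, Set<Character> alphabet) {
        Map<Set<Integer>, Map<Character, Set<Integer>>> dfa = new LinkedHashMap<>();
        Deque<Set<Integer>> work = new ArrayDeque<>();
        Set<Integer> startSet = closure(nfa, Collections.singleton(start));
        dfa.put(startSet, new TreeMap<>());
        work.add(startSet);
        while (!work.isEmpty()) {
            Set<Integer> cur = work.poll();
            for (char c : alphabet) {
                Set<Integer> next = closure(nfa, move(nfa, cur, c));
                if (next.isEmpty()) continue; // no transition on c
                dfa.get(cur).put(c, next);
                if (!dfa.containsKey(next)) {
                    dfa.put(next, new TreeMap<>());
                    work.add(next);
                }
            }
        }
        return dfa;
    }

    /** Hypothetical NFA for a*b: 0 --eps--> 1, 1 --a--> 1, 1 --b--> 2. */
    static Map<Integer, Map<Character, Set<Integer>>> exampleNfa() {
        Map<Integer, Map<Character, Set<Integer>>> nfa = new HashMap<>();
        edge(nfa, 0, EPS, 1);
        edge(nfa, 1, 'a', 1);
        edge(nfa, 1, 'b', 2);
        return nfa;
    }

    public static void main(String[] args) {
        Set<Character> alphabet = new TreeSet<>(Arrays.asList('a', 'b'));
        toDfa(exampleNfa(), 0, alphabet)
            .forEach((state, edges) -> System.out.println(state + " -> " + edges));
    }
}
```

A DFA state built this way is accepting whenever its set contains an accepting NFA state; the sketch leaves that bookkeeping out to stay short.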
Syntactic Analysis (LL(1) Grammar Parsing)
- LL1_Grammer.java: Implements LL(1) parsing techniques.
- make_Analysis_table.java: Constructs parsing tables based on grammar rules.
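An LL(1) parsing table is usually derived from FIRST and FOLLOW sets. The sketch below shows that textbook construction; it is not taken from make_Analysis_table.java. The class name `Ll1TableSketch`, the `#eps` and `$` markers, the array-based production layout, and the small example grammar are assumptions.

```java
import java.util.*;

/** Illustrative only: not the code in LL1_Grammer.java or make_Analysis_table.java. */
class Ll1TableSketch {
    static final String EPS = "#eps"; // assumed marker for the empty string
    static final String END = "$";    // assumed end-of-input marker

    /** FIRST of syms[from..]; a symbol without a FIRST entry is treated as a terminal. */
    static Set<String> firstOfSeq(String[] syms, int from, Map<String, Set<String>> first) {
        Set<String> out = new TreeSet<>();
        for (int i = from; i < syms.length; i++) {
            Set<String> f = first.get(syms[i]);
            if (f == null) { out.add(syms[i]); return out; }
            for (String t : f) if (!t.equals(EPS)) out.add(t);
            if (!f.contains(EPS)) return out;
        }
        out.add(EPS); // every symbol was nullable, or the sequence is empty
        return out;
    }

    /** Each production is {lhs, rhs1, rhs2, ...}; an empty right-hand side is {lhs}. */
    static Map<String, Set<String>> first(List<String[]> prods) {
        Map<String, Set<String>> first = new TreeMap<>();
        for (String[] p : prods) first.put(p[0], new TreeSet<>());
        boolean changed = true;
        while (changed) { // iterate until no set grows
            changed = false;
            for (String[] p : prods) changed |= first.get(p[0]).addAll(firstOfSeq(p, 1, first));
        }
        return first;
    }

    static Map<String, Set<String>> follow(List<String[]> prods, Map<String, Set<String>> first, String start) {
        Map<String, Set<String>> follow = new TreeMap<>();
        for (String nt : first.keySet()) follow.put(nt, new TreeSet<>());
        follow.get(start).add(END);
        boolean changed = true;
        while (changed) {
            changed = false;
            for (String[] p : prods) {
                for (int i = 1; i < p.length; i++) {
                    if (!first.containsKey(p[i])) continue; // terminals have no FOLLOW set
                    Set<String> rest = firstOfSeq(p, i + 1, first);
                    boolean nullable = rest.remove(EPS);
                    changed |= follow.get(p[i]).addAll(rest);
                    if (nullable) changed |= follow.get(p[i]).addAll(follow.get(p[0]));
                }
            }
        }
        return follow;
    }

    /** table[nonterminal][lookahead] = production; throws if two productions collide. */
    static Map<String, Map<String, String[]>> table(List<String[]> prods, String start) {
        Map<String, Set<String>> first = first(prods);
        Map<String, Set<String>> follow = follow(prods, first, start);
        Map<String, Map<String, String[]>> table = new TreeMap<>();
        for (String[] p : prods) {
            Set<String> select = firstOfSeq(p, 1, first);
            if (select.remove(EPS)) select.addAll(follow.get(p[0]));
            for (String t : select) {
                String[] old = table.computeIfAbsent(p[0], k -> new TreeMap<>()).put(t, p);
                if (old != null) {
                    throw new IllegalStateException("not LL(1): conflict at [" + p[0] + ", " + t + "]");
                }
            }
        }
        return table;
    }

    public static void main(String[] args) {
        // Hypothetical grammar: E := T R, R := + T R, R := (empty), T := id
        List<String[]> g = Arrays.asList(
            new String[] {"E", "T", "R"},
            new String[] {"R", "+", "T", "R"},
            new String[] {"R"},
            new String[] {"T", "id"});
        table(g, "E").forEach((nt, row) ->
            row.forEach((t, p) -> System.out.println("[" + nt + ", " + t + "] = " + String.join(" ", p))));
    }
}
```

The conflict check is the useful part in practice: a grammar whose productions share a lookahead for the same non-terminal is rejected instead of silently overwriting a table cell.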
Main Compiler Logic
compiler_project.java: The main entry point for the compiler, responsible for:
- Reading input files (lexeme.txt, codeinput.txt, grammar.conf)
- Constructing NFAs and DFAs for tokenization
- Generating parse tables and performing syntax analysis
Features
- Lexical Analysis
  - Reads lexeme definitions from lexeme.txt
  - Converts regular expressions to NFA and then DFA
  - Tokenizes input code based on defined lexemes
- Syntactic Analysis
  - Reads grammar rules from grammar.conf
  - Constructs an LL(1) parsing table
  - Parses the tokenized input for syntactic correctness
- File Handling
  - Reads input source code from codeinput.txt
  - Writes tokenized output to output.txt
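Tokenizing against a set of lexemes is typically done with a longest-match ("maximal munch") rule. The sketch below illustrates that rule only; it uses `java.util.regex` as a stand-in for the project's DFAs, and the class name `LongestMatchSketch`, the whitespace skipping, the "earliest declaration wins ties" rule, and the example patterns are assumptions rather than the project's behaviour.

```java
import java.util.*;
import java.util.regex.*;

/** Illustrative only: java.util.regex stands in for the project's DFA matching. */
class LongestMatchSketch {
    /** Returns "name:text" entries, always taking the longest lexeme at each position. */
    static List<String> tokenize(String input, Map<String, Pattern> lexemes) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) { pos++; continue; } // assumption: skip blanks
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> e : lexemes.entrySet()) {
                Matcher m = e.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly longer wins, so on a tie the earlier declaration is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = e.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no lexeme matches at offset " + pos);
            }
            out.add(bestName + ":" + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return out;
    }

    /** Hypothetical lexemes written directly as Java regular expressions. */
    static Map<String, Pattern> exampleLexemes() {
        Map<String, Pattern> lexemes = new LinkedHashMap<>(); // keeps declaration order
        lexemes.put("number_integer", Pattern.compile("[0-9]+"));
        lexemes.put("number_real", Pattern.compile("[0-9]+\\.[0-9]+"));
        lexemes.put("key_word", Pattern.compile("while|int|char|for|if"));
        lexemes.put("id", Pattern.compile("[a-zA-Z][a-zA-Z_0-9]*"));
        return lexemes;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int intx 3.14 42", exampleLexemes()));
    }
}
```

Declaring `key_word` before `id` is what lets `int` come out as a keyword while `intx` is still recognised as an identifier. One caveat: Java's regex alternation is ordered rather than longest-match, so a DFA-based matcher can behave differently on patterns such as `a|ab`.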
Requirements
- Java Development Kit (JDK) 8 or higher.
- Apache Commons Lang library (used in LL1_Grammer.java).
How to use
First compile the project, then run the compiler:
javac -d bin src/compiler_project/*.java
java -cp bin compiler_project.compiler_project
Example lexeme.txt:
digit := [0-9]
left_brace := \[
l := [a-z]|[A-Z]
char_literal := '
string_literal := "
operation := \*|-|\+\+|--|/
number_real := \digit+\.\digit+
number_integer := \digit+
comment := @@#(\number_integer)-(\string_literal).*(\string_literal)
key_word := while|int|char|for|if
id := \l(\l|_|\digit)+
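The definitions above appear to use a backslash in two ways: to escape a metacharacter (`\[`, `\.`) and to refer to an earlier definition (`\digit`, `\l`). Under that reading, which is an assumption and not confirmed by the project sources, a loader could expand references as sketched below. The class name `LexemeSpecSketch` and its methods are hypothetical.

```java
import java.util.*;
import java.util.regex.Pattern;

/** Illustrative only: one possible reading of the lexeme.txt format. */
class LexemeSpecSketch {
    /** Parses "name := regex" lines, expanding \name references to earlier definitions. */
    static Map<String, String> parse(List<String> lines) {
        Map<String, String> defs = new LinkedHashMap<>();
        for (String line : lines) {
            int sep = line.indexOf(":=");
            if (sep < 0) continue; // not a definition line
            String name = line.substring(0, sep).trim();
            String body = line.substring(sep + 2).trim();
            defs.put(name, expand(body, defs));
        }
        return defs;
    }

    /** Replaces \name with (definition) when name is known; otherwise keeps the escape. */
    static String expand(String body, Map<String, String> defs) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < body.length()) {
            char c = body.charAt(i);
            if (c != '\\') { out.append(c); i++; continue; }
            int j = i + 1; // scan the identifier that follows the backslash
            while (j < body.length()
                    && (Character.isLetterOrDigit(body.charAt(j)) || body.charAt(j) == '_')) {
                j++;
            }
            String ref = body.substring(i + 1, j);
            if (defs.containsKey(ref)) {
                out.append('(').append(defs.get(ref)).append(')');
                i = j;
            } else { // ordinary escape such as \[ or \.
                out.append(c);
                if (i + 1 < body.length()) out.append(body.charAt(i + 1));
                i += 2;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> defs = parse(Arrays.asList(
            "digit := [0-9]",
            "left_brace := \\[",
            "number_real := \\digit+\\.\\digit+"));
        defs.forEach((name, regex) -> System.out.println(name + " => " + regex));
        // Java's regex dialect is used here only as a quick check of the expansion.
        System.out.println(Pattern.matches(defs.get("number_real"), "3.14"));
    }
}
```

Wrapping each expanded reference in parentheses keeps operators such as `+` applied to the whole referenced definition rather than to its last character.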
Example codeinput.txt:
int x = 10;
if (x > 0) {
print(x);
}
Example grammar.conf:
<Prog> := { <Stmts> }
<Stmts> := <Stmt> <Stmts>
<Stmts> :=
<Stmt> := id = <Expr> ;
<Stmt> := key_word ( <Expr> ) <Stmt>
<Stmt> := <Expr> = number_real ;
<Expr> := number_integer <Etail>
<Expr> := id <Etail>
<Etail> := + <Expr>
<Etail> := <Expr>
<Etail> :=
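The rules above follow a simple line format: a non-terminal in angle brackets, `:=`, then whitespace-separated symbols, with an empty right-hand side standing for the empty production. As a rough sketch of how such a file could be loaded (the class name `GrammarConfSketch` and its methods are hypothetical, and the project's own loader may differ):

```java
import java.util.*;

/** Illustrative only: a possible loader for the grammar.conf line format. */
class GrammarConfSketch {
    /** Maps each left-hand side to its alternatives, in file order. */
    static Map<String, List<List<String>>> parse(List<String> lines) {
        Map<String, List<List<String>>> rules = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(":=", 2);
            if (parts.length < 2) continue; // not a rule line
            String lhs = parts[0].trim();
            String rhs = parts[1].trim();
            // An empty right-hand side is the empty (epsilon) production.
            List<String> symbols = rhs.isEmpty()
                ? new ArrayList<String>()
                : Arrays.asList(rhs.split("\\s+"));
            rules.computeIfAbsent(lhs, k -> new ArrayList<>()).add(symbols);
        }
        return rules;
    }

    /** Assumption: non-terminals are exactly the symbols written as <Name>. */
    static boolean isNonTerminal(String symbol) {
        return symbol.length() > 2 && symbol.startsWith("<") && symbol.endsWith(">");
    }

    public static void main(String[] args) {
        Map<String, List<List<String>>> rules = parse(Arrays.asList(
            "<Stmts> := <Stmt> <Stmts>",
            "<Stmts> :=",
            "<Stmt> := id = <Expr> ;"));
        rules.forEach((lhs, alts) -> System.out.println(lhs + " -> " + alts));
    }
}
```

Grouping the alternatives by left-hand side makes it straightforward to feed the result into a FIRST/FOLLOW computation afterwards.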
This project is licensed under the MIT License. See the LICENSE file for details.
Enjoy, and let me know if you have any questions.