A university C++ project focused on building a custom programming language inspired by Moroccan Arabic (Darija).
Darija-Engine is an educational project where we design and implement a mini interpreter from scratch. Instead of traditional English keywords, the language uses Darija words, making programming more culturally expressive while learning core compiler concepts.
Darija is a custom-made programming language that replaces common programming keywords with Moroccan Arabic equivalents. Darija (Moroccan Arabic) is the colloquial language spoken in Morocco, and this project brings it to the programming domain.
| Darija Keyword | Meaning | English Equivalent |
|---|---|---|
| ila | if | if |
| kteb | print | print/cout |
| wadifa | function | function/def |
| binma | while | while |
| khod | input | input/cin |
This approach keeps the syntax simple while demonstrating how programming languages are designed internally. The Darija-Engine project is part of a broader ecosystem of efforts to bring natural language processing and programming to the Moroccan dialect, complementing research in Darija language processing and preservation.
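As a rough illustration of how the keyword table could be represented inside an interpreter, here is a minimal sketch of a lookup table. The `Keyword` enum and the `lookupKeyword` helper are hypothetical names chosen for this example; the actual engine may organize its keywords differently.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical keyword kinds; the real engine may name these differently.
enum class Keyword { If, Print, Function, While, Input };

// Maps each Darija keyword from the table to its meaning.
inline const std::unordered_map<std::string, Keyword>& keywordTable() {
    static const std::unordered_map<std::string, Keyword> table = {
        {"ila", Keyword::If},
        {"kteb", Keyword::Print},
        {"wadifa", Keyword::Function},
        {"binma", Keyword::While},
        {"khod", Keyword::Input},
    };
    return table;
}

// Returns the keyword for a word, or std::nullopt when the word is not
// a keyword (for example, a variable name).
inline std::optional<Keyword> lookupKeyword(const std::string& word) {
    const auto& table = keywordTable();
    const auto it = table.find(word);
    if (it == table.end()) return std::nullopt;
    return it->second;
}
```

With a table like this, supporting a new keyword is a one-line change, and any word that is not found can be treated as an identifier.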
For the complete grammar specification, including all keywords, operators, and data types, refer to docs/GRAMMAR.md.
To learn Darija step-by-step, see the tutorial. For detailed examples, check docs/EXAMPLES.md.
The main goal is to understand how compilers and interpreters work by implementing each stage manually. This educational approach aligns with best practices for language design and implementation.
graph TD
A[Source Code] -->|Lexer| B(Tokens)
B -->|Parser| C(AST)
C -->|Interpreter| D[Output]
By the end of the project, the engine should be capable of interpreting Darija programs. This tree-walk interpreter architecture follows established patterns in interpreter design.
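The first stage of the pipeline, turning source text into tokens, can be sketched as follows. This is a simplified illustration and not the engine's actual lexer: the `TokenKind`, `Token`, and `tokenize` names are assumptions, and every non-alphanumeric character is treated as a one-character symbol.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token categories for this sketch.
enum class TokenKind { Word, Number, String, Symbol };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits source text into tokens, skipping whitespace.
std::vector<Token> tokenize(const std::string& src) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    const auto isDigit = [](char ch) { return std::isdigit(static_cast<unsigned char>(ch)) != 0; };
    const auto isAlnum = [](char ch) { return std::isalnum(static_cast<unsigned char>(ch)) != 0; };
    while (i < src.size()) {
        const char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) { ++i; continue; }
        std::size_t start = i;
        if (std::isalpha(static_cast<unsigned char>(c)) || c == '_') {
            // Words may contain digits after the first character (rja3, sa7i7).
            while (i < src.size() && (isAlnum(src[i]) || src[i] == '_')) ++i;
            tokens.push_back({TokenKind::Word, src.substr(start, i - start)});
        } else if (isDigit(c)) {
            // Digits and dots are collected together; validation is left out.
            while (i < src.size() && (isDigit(src[i]) || src[i] == '.')) ++i;
            tokens.push_back({TokenKind::Number, src.substr(start, i - start)});
        } else if (c == '"') {
            ++i;  // skip the opening quote
            start = i;
            while (i < src.size() && src[i] != '"') ++i;
            tokens.push_back({TokenKind::String, src.substr(start, i - start)});
            if (i < src.size()) ++i;  // skip the closing quote if present
        } else {
            tokens.push_back({TokenKind::Symbol, std::string(1, c)});
            ++i;
        }
    }
    return tokens;
}
```

A later step would classify each `Word` token as a keyword or an identifier before handing the token list to the parser.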
First, clone the repository from GitHub using either HTTPS or SSH to get all the source code and examples.
# Using HTTPS (recommended for most users)
git clone https://github.com/Manak-hash/Darija-Engine.git
# Using SSH (if you have SSH keys configured with GitHub)
git clone git@github.com:Manak-hash/Darija-Engine.git
Quick verification / helpful git tips:
- To see the remote URL after cloning:
git remote -v
- To fetch the latest changes:
git pull origin main
- For a shallow clone (smaller download):
git clone --depth 1 https://github.com/Manak-hash/Darija-Engine.git
Then navigate into the project directory:
cd Darija-Engine
Prerequisites:
- A C++ compiler (GCC / Clang / MSVC)
- C++17 standard support
Build on Windows:
g++ -std=c++17 src/Darija-Engine.cpp -o bin/Darija-Engine.exe
Build on macOS / Linux:
# Create bin directory if it doesn't exist
mkdir -p bin
# Compile using G++ or Clang
g++ -std=c++17 src/Darija-Engine.cpp -o bin/Darija-Engine
After building, run any .darija file with the interpreter:
bin/Darija-Engine.exe path/to/program.darija
For example, try the hello world program:
# For Windows
bin/Darija-Engine.exe examples/hello.darija
# For macOS & Linux
bin/Darija-Engine examples/hello.darija
This should output:
Hello, Darija!
Sum:
30
Darija-Engine/
├── bin/                  # Executables
├── docs/                 # Documentation
│   ├── GRAMMAR.md
│   ├── TUTORIAL.md
│   └── EXAMPLES.md
├── examples/             # Sample Darija scripts
│   ├── calculator.darija
│   ├── factorial.darija
│   ├── hello.darija
│   └── loops.darija
├── src/                  # Source code
│   └── Darija-Engine.cpp
├── CMakeLists.txt        # Build configuration
└── README.md             # This file
- Tutorial: Learn Darija programming.
- Examples: Detailed walkthrough of sample code.
- Grammar: Complete language specification.
- Lexer: Tokenizing Darija keywords (`sa7i7`, `kteb`...) and literals.
- Parser: Recursive descent parser for expressions and precedence.
- AST: Tree structures for `BinaryExpr`, `VarDecl`, `BlockStmt`, etc.
- Runtime Environment: Variable storage, scoping, and type system (`int`, `float`, `string`, `bool`).
- Conditionals: `ila` (if) ... `ola` (else) logic.
- Loops: `binma` (while) and `li` (for) iteration.
- Input/Output: `kteb` (print) and `khod` (input) integration.
- Functions: Defining `wadifa` with arguments and `rja3` (return) statements.
- Native Functions: Adding standard library calls (Math, Time).
- File I/O: Reading and writing files from Darija scripts.
- Optimization: Constant folding and dead code elimination.
- Bytecode Compiler: Switch from tree-walk interpreter to a Stack VM.
- Error Handling: Detailed error messages with suggestions.
The engine is implemented as a Tree-Walk Interpreter:
- Lexer: Converts raw source text into a list of `Tokens`.
- Parser: Uses Recursive Descent to build an Abstract Syntax Tree (AST) from tokens.
- Interpreter: Traverses the AST recursively, evaluating expressions and executing statements.
This approach is well-established in interpreter design and provides a clear, understandable implementation path for educational purposes.
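To make the parser and interpreter stages concrete, the sketch below parses arithmetic expressions with recursive descent and then evaluates the resulting tree by walking it. It is an illustration under simplifying assumptions, not the engine's code: the `Expr`, `Parser`, and `evaluate` names are hypothetical, only numbers, parentheses, and `+ - * /` are supported, and the parser reads characters directly instead of consuming a token list.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// AST node: a number literal (op == 0) or a binary operation.
struct Expr {
    char op = 0;
    double value = 0.0;
    std::unique_ptr<Expr> lhs;
    std::unique_ptr<Expr> rhs;
};

class Parser {
public:
    explicit Parser(std::string source) : src_(std::move(source)) {}

    std::unique_ptr<Expr> parse() {
        auto root = expression();
        if (peek() != '\0') throw std::runtime_error("unexpected trailing input");
        return root;
    }

private:
    std::string src_;
    std::size_t pos_ = 0;

    // Skips whitespace and returns the next character, or '\0' at the end.
    char peek() {
        while (pos_ < src_.size() && std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
        return pos_ < src_.size() ? src_[pos_] : '\0';
    }

    static std::unique_ptr<Expr> binary(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
        auto node = std::make_unique<Expr>();
        node->op = op;
        node->lhs = std::move(l);
        node->rhs = std::move(r);
        return node;
    }

    // expression := term (('+' | '-') term)*   (lowest precedence)
    std::unique_ptr<Expr> expression() {
        auto left = term();
        while (peek() == '+' || peek() == '-') {
            const char op = src_[pos_++];
            auto right = term();
            left = binary(op, std::move(left), std::move(right));
        }
        return left;
    }

    // term := primary (('*' | '/') primary)*   (binds tighter than + and -)
    std::unique_ptr<Expr> term() {
        auto left = primary();
        while (peek() == '*' || peek() == '/') {
            const char op = src_[pos_++];
            auto right = primary();
            left = binary(op, std::move(left), std::move(right));
        }
        return left;
    }

    // primary := number | '(' expression ')'
    std::unique_ptr<Expr> primary() {
        if (peek() == '(') {
            ++pos_;
            auto inner = expression();
            if (peek() != ')') throw std::runtime_error("expected ')'");
            ++pos_;
            return inner;
        }
        const std::size_t start = pos_;  // peek() already skipped whitespace
        while (pos_ < src_.size() &&
               (std::isdigit(static_cast<unsigned char>(src_[pos_])) || src_[pos_] == '.')) ++pos_;
        if (start == pos_) throw std::runtime_error("expected a number");
        auto node = std::make_unique<Expr>();
        node->value = std::stod(src_.substr(start, pos_ - start));
        return node;
    }
};

// Tree-walk evaluation: evaluate both children, then apply the operator.
double evaluate(const Expr& e) {
    if (e.op == 0) return e.value;
    const double l = evaluate(*e.lhs);
    const double r = evaluate(*e.rhs);
    switch (e.op) {
        case '+': return l + r;
        case '-': return l - r;
        case '*': return l * r;
        case '/':
            if (r == 0.0) throw std::runtime_error("division by zero");
            return l / r;
        default: throw std::runtime_error("unknown operator");
    }
}
```

Each grammar rule becomes one function, and operator precedence falls out of which function calls which: `expression` calls `term`, so multiplication is grouped before addition. For instance, `evaluate(*Parser("2 + 3 * 4").parse())` yields 14 rather than 20.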
- Dynamic Typing: Variables can hold Integers, Floats, Strings, or Booleans.
- Scope Management: Supports block scoping using a chain of `Environment` objects.
- Control Flow: `if` (ila), `else` (ola), `while` (binma), and `for` (li..fi..ila) loops.
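Dynamic typing and block scoping can be sketched together with a small `Environment` class whose instances link to their enclosing scope. The `Value` alias and the `define`, `get`, and `assign` method names are assumptions made for this example; the engine's real `Environment` may expose a different interface.

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

// A dynamically typed value: integer, float, string, or boolean.
// Construct values with an exact type (10LL, 2.5, std::string("..."), true)
// so the intended variant alternative is selected unambiguously.
using Value = std::variant<long long, double, std::string, bool>;

class Environment {
public:
    explicit Environment(std::shared_ptr<Environment> parent = nullptr)
        : parent_(std::move(parent)) {}

    // Declares a variable in this scope, shadowing any outer variable
    // with the same name.
    void define(const std::string& name, Value value) {
        values_[name] = std::move(value);
    }

    // Looks the name up in this scope, then walks outward through parents.
    const Value& get(const std::string& name) const {
        const auto it = values_.find(name);
        if (it != values_.end()) return it->second;
        if (parent_) return parent_->get(name);
        throw std::runtime_error("undefined variable: " + name);
    }

    // Updates the nearest existing binding; the new value may have a
    // different type than the old one.
    void assign(const std::string& name, Value value) {
        const auto it = values_.find(name);
        if (it != values_.end()) {
            it->second = std::move(value);
            return;
        }
        if (parent_) {
            parent_->assign(name, std::move(value));
            return;
        }
        throw std::runtime_error("undefined variable: " + name);
    }

private:
    std::unordered_map<std::string, Value> values_;
    std::shared_ptr<Environment> parent_;
};
```

In this design, entering a block creates a new `Environment` whose parent is the current one, and leaving the block discards it, so variables declared inside the block disappear while outer variables remain reachable through the chain.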
This project is part of a growing ecosystem of Darija language technology projects:
- ArmaLang: Another Darija-based programming language focused on making coding accessible to Moroccan speakers.
- Darija Open Dataset (DODa): A collaborative project providing 150,000+ Darija entries for NLP research.
- Machine Learning Applications: Research into Darija TTS systems, speech recognition, and language models continues to advance Moroccan dialect technology.
For more information on interpreter design patterns, see foundational resources on tree-walking interpreters and language implementation techniques.
Esseddyq, I. (n.d.). "ArmaLang: A Programming Language Based on Darija." GitHub Repository. Retrieved from https://github.com/ibrahimesseddyq/ArmaLang
Tommy. (2021). "Introduction to Designing/Crafting a Programming Language in C++." YouTube Software Coding Tutorials Channel. Retrieved from https://www.youtube.com/watch?v=wZFnM05gi1I
Renaghan, Q. (2024). "Simple Tree-Walking Interpreter in C." GitHub Repository. Retrieved from https://github.com/quinnrenaghan/simple_tree_walking_interpreter_c
Talkpal. (2025). "Unlocking Machine Learning Darija: A Beginner's Guide to AI in Moroccan Arabic." Retrieved from https://talkpal.ai/unlocking-machine-learning-darija-a-beginners-guide-to-ai-in-moroccan-arabic/
Darija Open Dataset Project. "Welcome to the Darija Open Dataset (DODa)." GitHub Repository. Retrieved from https://github.com/darija-open-dataset/dataset