Manak-hash/Darija-Engine

🚀 Darija-Engine

A university C++ project focused on building a custom programming language inspired by Moroccan Arabic (Darija).

Darija-Engine is an educational project where we design and implement a mini interpreter from scratch. Instead of traditional English keywords, the language uses Darija words, making programming more culturally expressive while learning core compiler concepts.


✨ What is Darija?

Darija is a custom-made programming language that replaces common programming keywords with Moroccan Arabic equivalents. Darija (Moroccan Arabic) is the colloquial language spoken in Morocco, and this project brings it to the programming domain.

🔤 Example Keywords

| Darija Keyword | Meaning  | English Equivalent |
|----------------|----------|--------------------|
| ila            | if       | if                 |
| kteb           | print    | print/cout         |
| wadifa         | function | function/def       |
| binma          | while    | while              |
| khod           | input    | input/cin          |

This approach keeps the syntax simple while demonstrating how programming languages are designed internally. The Darija-Engine project is part of a broader ecosystem of efforts to bring natural language processing and programming to the Moroccan dialect, complementing research in Darija language processing and preservation.
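The keyword table above maps naturally onto a lookup structure inside a lexer. The following is a minimal sketch of that idea, not the engine's actual code: the `TokenType` enum and the `lookupKeyword` function are hypothetical names introduced only for illustration.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical token types for the five keywords listed in the table.
enum class TokenType { If, Print, Function, While, Input };

// Returns the token type for a Darija keyword, or std::nullopt when the
// word is not a keyword (for example, a user-defined identifier).
std::optional<TokenType> lookupKeyword(const std::string& word) {
    static const std::unordered_map<std::string, TokenType> keywords = {
        {"ila", TokenType::If},
        {"kteb", TokenType::Print},
        {"wadifa", TokenType::Function},
        {"binma", TokenType::While},
        {"khod", TokenType::Input},
    };
    auto it = keywords.find(word);
    if (it == keywords.end()) return std::nullopt;
    return it->second;
}
```

Calling `lookupKeyword("kteb")` yields `TokenType::Print`, while a non-keyword such as `lookupKeyword("x")` yields an empty optional. Keeping the keywords in one table means that adding a new Darija word only touches this map.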

For the complete grammar specification, including all keywords, operators, and data types, refer to docs/GRAMMAR.md.

To learn Darija step-by-step, see the tutorial (docs/TUTORIAL.md). For detailed examples, check docs/EXAMPLES.md.


🎯 Project Objectives

The main goal is to understand how compilers and interpreters work by implementing each stage manually. This educational approach aligns with best practices for language design and implementation.

🧠 Compiler Pipeline

graph TD
    A[Source Code] -->|Lexer| B(Tokens)
    B -->|Parser| C(AST)
    C -->|Interpreter| D[Output]

By the end of the project, the engine should be capable of interpreting Darija programs. This tree-walk interpreter architecture follows established patterns in interpreter design.
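To make the first pipeline stage (source code to tokens) concrete, here is a small lexer sketch. It is an illustration under assumptions, not the project's implementation: the `TokenKind` enum, the `Token` struct, and the `tokenize` function are hypothetical, and real Darija lexing rules (multi-character operators, floats, comments) are not covered.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical token kinds; the engine's real token set is not shown here.
enum class TokenKind { Number, Identifier, String, Operator };

struct Token {
    TokenKind kind;
    std::string text;
};

// Splits source text into tokens: integer literals, words (keywords or
// identifiers), double-quoted strings, and single-character operators.
std::vector<Token> tokenize(const std::string& source) {
    std::vector<Token> tokens;
    std::size_t i = 0;
    auto at = [&](std::size_t k) { return static_cast<unsigned char>(source[k]); };
    while (i < source.size()) {
        if (std::isspace(at(i))) { ++i; continue; }
        std::size_t start = i;
        if (std::isdigit(at(i))) {
            while (i < source.size() && std::isdigit(at(i))) ++i;
            tokens.push_back({TokenKind::Number, source.substr(start, i - start)});
        } else if (std::isalpha(at(i)) || source[i] == '_') {
            // Digits are allowed after the first letter, so words such as
            // "sa7i7" or "rja3" stay in one token.
            while (i < source.size() && (std::isalnum(at(i)) || source[i] == '_')) ++i;
            tokens.push_back({TokenKind::Identifier, source.substr(start, i - start)});
        } else if (source[i] == '"') {
            start = ++i;  // skip the opening quote
            while (i < source.size() && source[i] != '"') ++i;
            tokens.push_back({TokenKind::String, source.substr(start, i - start)});
            if (i < source.size()) ++i;  // skip the closing quote
        } else {
            tokens.push_back({TokenKind::Operator, std::string(1, source[i])});
            ++i;
        }
    }
    return tokens;
}
```

For example, `tokenize("kteb 10 + 20")` produces four tokens: the word `kteb`, the number `10`, the operator `+`, and the number `20`. A later step would classify words as keywords or identifiers before handing them to the parser.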


🛠️ Build & Run

Clone the repository

First, clone the repository from GitHub using either HTTPS or SSH to get all the source code and examples.

# Using HTTPS (recommended for most users)
git clone https://github.com/Manak-hash/Darija-Engine.git


# Using SSH (if you have SSH keys configured with GitHub)
git clone git@github.com:Manak-hash/Darija-Engine.git

Quick verification / helpful git tips:

  • To see the remote URL after cloning: git remote -v
  • To fetch the latest changes: git pull origin main
  • For a shallow clone (smaller download): git clone --depth 1 https://github.com/Manak-hash/Darija-Engine.git

Then navigate into the project directory:

cd Darija-Engine

Requirements

  • A C++ compiler (GCC / Clang / MSVC)
  • C++17 standard support

Build Steps

🪟 Windows (MinGW/Git Bash)

# Create bin directory if it doesn't exist
mkdir -p bin

g++ -std=c++17 src/Darija-Engine.cpp -o bin/Darija-Engine.exe

๐ŸŽ macOS & ๐Ÿง Linux

# Create bin directory if it doesn't exist
mkdir -p bin

# Compile using G++ or Clang
g++ -std=c++17 src/Darija-Engine.cpp -o bin/Darija-Engine

Running Darija Programs

After building, run any .darija file with the interpreter (on macOS and Linux, omit the .exe suffix):

bin/Darija-Engine.exe path/to/program.darija

For example, try the hello world program:

# For Windows
bin/Darija-Engine.exe examples/hello.darija

# For macOS & Linux
bin/Darija-Engine examples/hello.darija

This should output:

Hello, Darija!
Sum:
30

📁 Project Structure

Darija-Engine/
├── bin/              # Executables
├── docs/             # Documentation
│   ├── GRAMMAR.md
│   ├── TUTORIAL.md
│   └── EXAMPLES.md
├── examples/         # Sample Darija scripts
│   ├── calculator.darija
│   ├── factorial.darija
│   ├── hello.darija
│   └── loops.darija
├── src/              # Source code
│   └── Darija-Engine.cpp
├── CMakeLists.txt    # Build configuration
└── README.md         # This file ^ ^

📚 Documentation

  • Tutorial: Learn Darija programming.
  • Examples: Detailed walkthrough of sample code.
  • Grammar: Complete language specification.

🗺️ Roadmap

✅ Phase 1: Core Interpreter

  • Lexer: Tokenizing Darija keywords (sa7i7, kteb...) and literals.
  • Parser: Recursive descent parser for expressions and precedence.
  • AST: Tree structures for BinaryExpr, VarDecl, BlockStmt, etc.
  • Runtime Environment: Variable storage, scoping, and type system (int, float, string, bool).
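The way a recursive descent parser encodes operator precedence can be sketched with one function per grammar level. This is a simplified illustration, not the engine's parser: it reads characters directly instead of tokens, handles only integers and `+ - * /` with parentheses, does no error reporting, and the `Expr`, `Parser`, and `toSExpr` names are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical AST node: op == 0 marks a number leaf, any other value
// marks a binary expression with that operator.
struct Expr {
    char op = 0;
    int value = 0;
    std::unique_ptr<Expr> left, right;
};

class Parser {
public:
    explicit Parser(std::string text) : text_(std::move(text)) {}

    // expression := term (('+' | '-') term)*   (lowest precedence)
    std::unique_ptr<Expr> parseExpression() {
        auto node = parseTerm();
        while (peek() == '+' || peek() == '-') {
            char op = text_[pos_++];
            auto rhs = parseTerm();
            node = makeBinary(op, std::move(node), std::move(rhs));
        }
        return node;
    }

private:
    // term := factor (('*' | '/') factor)*     (binds tighter than + and -)
    std::unique_ptr<Expr> parseTerm() {
        auto node = parseFactor();
        while (peek() == '*' || peek() == '/') {
            char op = text_[pos_++];
            auto rhs = parseFactor();
            node = makeBinary(op, std::move(node), std::move(rhs));
        }
        return node;
    }

    // factor := NUMBER | '(' expression ')'
    std::unique_ptr<Expr> parseFactor() {
        if (peek() == '(') {
            ++pos_;
            auto inner = parseExpression();
            if (peek() == ')') ++pos_;
            return inner;
        }
        auto leaf = std::make_unique<Expr>();
        while (pos_ < text_.size() && std::isdigit(static_cast<unsigned char>(text_[pos_])))
            leaf->value = leaf->value * 10 + (text_[pos_++] - '0');
        return leaf;
    }

    // Skips whitespace and returns the next character, or '\0' at the end.
    char peek() {
        while (pos_ < text_.size() && std::isspace(static_cast<unsigned char>(text_[pos_]))) ++pos_;
        return pos_ < text_.size() ? text_[pos_] : '\0';
    }

    static std::unique_ptr<Expr> makeBinary(char op, std::unique_ptr<Expr> l,
                                            std::unique_ptr<Expr> r) {
        auto node = std::make_unique<Expr>();
        node->op = op;
        node->left = std::move(l);
        node->right = std::move(r);
        return node;
    }

    std::string text_;
    std::size_t pos_ = 0;
};

// Renders the tree in prefix form so the grouping is visible.
std::string toSExpr(const Expr& e) {
    if (e.op == 0) return std::to_string(e.value);
    return "(" + std::string(1, e.op) + " " + toSExpr(*e.left) + " " + toSExpr(*e.right) + ")";
}
```

Parsing `1 + 2 * 3` and passing the result to `toSExpr` gives `(+ 1 (* 2 3))`: the multiplication ends up deeper in the tree because `parseTerm` is called from inside `parseExpression`, which is how the call structure itself expresses precedence.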

✅ Phase 2: Control Flow

  • Conditionals: ila (if) ... ola (else) logic.
  • Loops: binma (while) and li (for) iteration.
  • Input/Output: kteb (print) and khod (input) integration.

🚧 Phase 3: Functions & Modularization (Planned)

  • Functions: Defining wadifa with arguments and rja3 (return) statements.
  • Native Functions: Adding standard library calls (Math, Time).
  • File I/O: Reading and writing files from Darija scripts.

🔮 Phase 4: Beyond Interpretation (Future)

  • Optimization: Constant folding and dead code elimination.
  • Bytecode Compiler: Switch from tree-walk interpreter to a Stack VM.
  • Error Handling: Detailed error messages with suggestions.
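Since constant folding is listed as a future optimization, the sketch below shows what such a pass could look like on a small expression tree. Everything here is hypothetical (the `Node` layout, the helper constructors, and the `fold` function); it only illustrates the technique of replacing operator nodes whose operands are both literals with a single literal.

```cpp
#include <memory>
#include <string>
#include <utility>

// Hypothetical expression node used only for this illustration.
struct Node {
    enum class Kind { Number, Variable, Binary };
    Kind kind = Kind::Number;
    double number = 0;   // used when kind == Number
    std::string name;    // used when kind == Variable
    char op = 0;         // used when kind == Binary
    std::unique_ptr<Node> left, right;
};

std::unique_ptr<Node> makeNumber(double v) {
    auto n = std::make_unique<Node>();
    n->number = v;
    return n;
}

std::unique_ptr<Node> makeVariable(std::string name) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Variable;
    n->name = std::move(name);
    return n;
}

std::unique_ptr<Node> makeBinary(char op, std::unique_ptr<Node> l, std::unique_ptr<Node> r) {
    auto n = std::make_unique<Node>();
    n->kind = Node::Kind::Binary;
    n->op = op;
    n->left = std::move(l);
    n->right = std::move(r);
    return n;
}

// Folds bottom-up: children first, then the node itself if both children
// turned out to be number literals. Division by zero is left unfolded so
// the error can still surface at run time.
std::unique_ptr<Node> fold(std::unique_ptr<Node> node) {
    if (node->kind != Node::Kind::Binary) return node;
    node->left = fold(std::move(node->left));
    node->right = fold(std::move(node->right));
    if (node->left->kind != Node::Kind::Number || node->right->kind != Node::Kind::Number)
        return node;
    double a = node->left->number;
    double b = node->right->number;
    switch (node->op) {
        case '+': return makeNumber(a + b);
        case '-': return makeNumber(a - b);
        case '*': return makeNumber(a * b);
        case '/': if (b != 0) return makeNumber(a / b); break;
    }
    return node;
}
```

Folding the tree for `x + 2 * 3` leaves the addition in place, because `x` is unknown before execution, but rewrites its right operand to the literal `6`.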

💡 Implementation Notes

Architecture

The engine is implemented as a Tree-Walk Interpreter:

  • Lexer: Converts raw source text into a list of Tokens.
  • Parser: Uses Recursive Descent to build an Abstract Syntax Tree (AST) from tokens.
  • Interpreter: Traverses the AST recursively, evaluating expressions and executing statements.

This approach is well-established in interpreter design and provides a clear, understandable implementation path for educational purposes.
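The recursive traversal described above can be illustrated with a tiny evaluator in which each AST node knows how to evaluate itself. This is a sketch under assumptions: `BinaryExpr` is named in the roadmap, but the class layout, the `NumberExpr` type, and the `num`/`bin` helpers are hypothetical and may differ from the engine's source.

```cpp
#include <memory>
#include <stdexcept>
#include <utility>

// Base class for expression nodes; evaluate() walks the subtree.
struct Expr {
    virtual ~Expr() = default;
    virtual double evaluate() const = 0;
};

// Leaf node holding a numeric literal.
struct NumberExpr : Expr {
    double value;
    explicit NumberExpr(double v) : value(v) {}
    double evaluate() const override { return value; }
};

// Interior node: evaluates both children, then applies the operator.
struct BinaryExpr : Expr {
    char op;
    std::unique_ptr<Expr> left, right;
    BinaryExpr(char o, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r)
        : op(o), left(std::move(l)), right(std::move(r)) {}
    double evaluate() const override {
        double a = left->evaluate();
        double b = right->evaluate();
        switch (op) {
            case '+': return a + b;
            case '-': return a - b;
            case '*': return a * b;
            case '/':
                if (b == 0) throw std::runtime_error("division by zero");
                return a / b;
        }
        throw std::runtime_error("unknown operator");
    }
};

// Small helpers that make hand-built trees easier to read.
std::unique_ptr<Expr> num(double v) { return std::make_unique<NumberExpr>(v); }
std::unique_ptr<Expr> bin(char op, std::unique_ptr<Expr> l, std::unique_ptr<Expr> r) {
    return std::make_unique<BinaryExpr>(op, std::move(l), std::move(r));
}
```

Building `bin('+', num(10), bin('*', num(4), num(5)))` and calling `evaluate()` on it returns 30: the call on the root recurses into the multiplication first, then adds 10. Statements such as `kteb` or `binma` would follow the same pattern with an `execute()` method instead of `evaluate()`.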

Features

  • Dynamic Typing: Variables can hold Integers, Floats, Strings, or Booleans.
  • Scope Management: Supports block scoping using a chain of Environment objects.
  • Control Flow: if (ila), else (ola), while (binma), and for (li..fi..ila) loops.
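Dynamic typing and block scoping, as listed above, are commonly combined by storing tagged values in a chain of environments. The following sketch assumes a `std::variant`-based `Value` and an `Environment` class with `define`, `lookup`, and `assign` methods; these names and signatures are illustrative assumptions, not the engine's actual API.

```cpp
#include <memory>
#include <optional>
#include <string>
#include <unordered_map>
#include <utility>
#include <variant>

// A dynamically typed value: the variant records which type is held.
using Value = std::variant<int, double, std::string, bool>;

class Environment {
public:
    explicit Environment(std::shared_ptr<Environment> parent = nullptr)
        : parent_(std::move(parent)) {}

    // Declares (or redeclares) a variable in this scope only.
    void define(const std::string& name, Value value) {
        values_[name] = std::move(value);
    }

    // Searches this scope first, then each enclosing scope in turn.
    std::optional<Value> lookup(const std::string& name) const {
        auto it = values_.find(name);
        if (it != values_.end()) return it->second;
        if (parent_) return parent_->lookup(name);
        return std::nullopt;
    }

    // Updates the nearest existing binding; returns false if none exists.
    bool assign(const std::string& name, Value value) {
        auto it = values_.find(name);
        if (it != values_.end()) {
            it->second = std::move(value);
            return true;
        }
        return parent_ ? parent_->assign(name, std::move(value)) : false;
    }

private:
    std::unordered_map<std::string, Value> values_;
    std::shared_ptr<Environment> parent_;
};
```

A block scope is created with `std::make_shared<Environment>(outer)`; variables defined inside it shadow outer ones and disappear when the block's environment is dropped. One C++17 caveat with this variant: pass strings as `std::string("...")` rather than bare literals, because a `const char*` argument converts to the `bool` alternative.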

🌐 Related Projects & Resources

This project is part of a growing ecosystem of Darija language technology projects:

  • ArmaLang: Another Darija-based programming language focused on making coding accessible to Moroccan speakers.
  • Darija Open Dataset (DODa): A collaborative project providing 150,000+ Darija entries for NLP research.
  • Machine Learning Applications: Research into Darija TTS systems, speech recognition, and language models continues to advance Moroccan dialect technology.

For more information on interpreter design patterns, see foundational resources on tree-walking interpreters and language implementation techniques.


👥 Authors


📖 References

Esseddyq, I. (n.d.). "ArmaLang: A Programming Language Based on Darija." GitHub Repository. Retrieved from https://github.com/ibrahimesseddyq/ArmaLang

Tommy. (2021). "Introduction to Designing/Crafting a Programming Language in C++." YouTube Software Coding Tutorials Channel. Retrieved from https://www.youtube.com/watch?v=wZFnM05gi1I

Renaghan, Q. (2024). "Simple Tree-Walking Interpreter in C." GitHub Repository. Retrieved from https://github.com/quinnrenaghan/simple_tree_walking_interpreter_c

Talkpal. (2025). "Unlocking Machine Learning Darija: A Beginner's Guide to AI in Moroccan Arabic." Retrieved from https://talkpal.ai/unlocking-machine-learning-darija-a-beginners-guide-to-ai-in-moroccan-arabic/

Darija Open Dataset Project. "Welcome to the Darija Open Dataset (DODa)." GitHub Repository. Retrieved from https://github.com/darija-open-dataset/dataset
