Implementing Type Checking and Records in JoeML

This guide will walk you through implementing type checking and record types in JoeML, explaining the core concepts and design decisions along the way.

Understanding Type Systems
Type Checker Architecture
Implementing Basic Type Checking
Adding Record Types
Type Inference
Error Handling

Understanding Type Systems

Why Add Types?

JoeML currently generates JavaScript without any type checking. Adding types provides:

Early error detection: Catch type mismatches before runtime
Better tooling: Enable autocompletion and refactoring
Documentation: Types serve as inline documentation
Optimization opportunities: Type information can enable better code generation

Type System Choices

Before implementing, we need to decide on our type system's characteristics:

Static vs Dynamic: We'll implement static typing (checked at compile-time)
Explicit vs Inferred: We'll support type inference with optional annotations
Structural vs Nominal: Records will use structural typing (shape matters, not name)
Gradual vs Strict: Start strict, can add gradual features later

Type Checker Architecture

Core Components

// 1. Type Representation
type Type =
  | { kind: 'number' }
  | { kind: 'string' }
  | { kind: 'boolean' }
  | { kind: 'function', params: Type[], returns: Type }
  | { kind: 'record', fields: { [key: string]: Type } }
  | { kind: 'typevar', id: number }  // For type inference
  | { kind: 'error', message: string };

// 2. Type Environment
type TypeEnv = {
  [name: string]: Type;
};

// 3. Type Checker State
interface TypeCheckerState {
  environment: TypeEnv;
  errors: TypeError[];
  nextTypeVar: number;  // For generating fresh type variables
}

Why This Architecture?

Type representation: Algebraic data types make pattern matching easy
Environment: Track variable/function types in scope
Error collection: Accumulate errors instead of failing fast
Type variables: Essential for type inference

Implementing Basic Type Checking

Step 1: Set Up Type Infrastructure

// types.ts
export type Type = 
  | { kind: 'number' }
  | { kind: 'string' }
  | { kind: 'boolean' }
  | { kind: 'function'; params: Type[]; returns: Type }
  | { kind: 'error'; message: string };

export type TypeEnv = { [name: string]: Type };

// typechecker.ts
export class TypeChecker {
  private env: TypeEnv = {};
  private errors: TypeError[] = [];

  constructor() {
    // Initialize with built-in functions
    this.env = {
      'print': { kind: 'function', params: [{ kind: 'string' }], returns: { kind: 'void' } },
      '+': { kind: 'function', params: [{ kind: 'number' }, { kind: 'number' }], returns: { kind: 'number' } },
      // ... other built-ins
    };
  }
}

Why: Starting with a minimal type system lets us incrementally add features while ensuring the core works correctly.

Step 2: Implement Expression Type Checking

class TypeChecker {
  checkExpression(expr: Expression, env: TypeEnv): Type {
    switch (expr.type) {
      case 'number':
        return { kind: 'number' };
      
      case 'string':
        return { kind: 'string' };
      
      case 'identifier':
        const type = env[expr.name];
        if (!type) {
          this.errors.push(new TypeError(`Undefined variable: ${expr.name}`));
          return { kind: 'error', message: 'undefined variable' };
        }
        return type;
      
      case 'application':
        const funcType = this.checkExpression(expr.func, env);
        if (funcType.kind !== 'function') {
          this.errors.push(new TypeError(`Cannot call non-function`));
          return { kind: 'error', message: 'not a function' };
        }
        
        // Check argument types match parameters
        const argTypes = expr.args.map(arg => this.checkExpression(arg, env));
        for (let i = 0; i < funcType.params.length; i++) {
          if (!this.typesMatch(argTypes[i], funcType.params[i])) {
            this.errors.push(new TypeError(`Type mismatch in argument ${i + 1}`));
          }
        }
        
        return funcType.returns;
      
      // ... other cases
    }
  }
}

Why: This bottom-up approach checks subexpressions first, then uses their types to check the whole expression.

Step 3: Type Checking Functions

checkFunction(func: FunctionDeclaration, env: TypeEnv): Type {
  // Create new environment with parameters
  const funcEnv = { ...env };
  const paramTypes: Type[] = [];
  
  for (const param of func.params) {
    // For now, require type annotations on parameters
    const paramType = this.parseTypeAnnotation(param.typeAnnotation);
    paramTypes.push(paramType);
    funcEnv[param.name] = paramType;
  }
  
  // Check function body
  const returnType = this.checkExpression(func.body, funcEnv);
  
  // If function has return type annotation, verify it matches
  if (func.returnType) {
    const declaredReturn = this.parseTypeAnnotation(func.returnType);
    if (!this.typesMatch(returnType, declaredReturn)) {
      this.errors.push(new TypeError(`Function return type mismatch`));
    }
  }
  
  return {
    kind: 'function',
    params: paramTypes,
    returns: returnType
  };
}

Why: Functions create new scopes, so we extend the environment with parameters before checking the body.

Adding Record Types

Step 1: Extend the Type System

type Type = 
  // ... existing types
  | { kind: 'record'; fields: { [key: string]: Type } };

// Add record expression type
type Expression =
  // ... existing expressions
  | { type: 'record'; fields: { key: string; value: Expression }[] }
  | { type: 'field-access'; record: Expression; field: string };

Step 2: Record Type Checking

checkExpression(expr: Expression, env: TypeEnv): Type {
  switch (expr.type) {
    // ... existing cases
    
    case 'record':
      const fields: { [key: string]: Type } = {};
      for (const field of expr.fields) {
        fields[field.key] = this.checkExpression(field.value, env);
      }
      return { kind: 'record', fields };
    
    case 'field-access':
      const recordType = this.checkExpression(expr.record, env);
      if (recordType.kind !== 'record') {
        this.errors.push(new TypeError(`Cannot access field on non-record type`));
        return { kind: 'error', message: 'not a record' };
      }
      
      const fieldType = recordType.fields[expr.field];
      if (!fieldType) {
        this.errors.push(new TypeError(`Record has no field '${expr.field}'`));
        return { kind: 'error', message: 'field not found' };
      }
      
      return fieldType;
  }
}

Why: Record typing requires checking that field accesses are valid and tracking field types.

Step 3: Structural Typing for Records

typesMatch(type1: Type, type2: Type): boolean {
  if (type1.kind !== type2.kind) return false;
  
  switch (type1.kind) {
    case 'record':
      if (type2.kind !== 'record') return false;
      
      // Check that all fields in type2 exist in type1 with matching types
      for (const [field, fieldType2] of Object.entries(type2.fields)) {
        const fieldType1 = type1.fields[field];
        if (!fieldType1 || !this.typesMatch(fieldType1, fieldType2)) {
          return false;
        }
      }
      
      // For structural typing, type1 can have extra fields
      return true;
    
    // ... other cases
  }
}

Why: Structural typing matches types based on shape, not name, making the type system more flexible.

Type Inference

Step 1: Add Type Variables

type Type =
  // ... existing types
  | { kind: 'typevar'; id: number };

class TypeChecker {
  private nextTypeVar = 0;
  
  freshTypeVar(): Type {
    return { kind: 'typevar', id: this.nextTypeVar++ };
  }
}

Step 2: Constraint Collection

type Constraint = {
  type1: Type;
  type2: Type;
  location: SourceLocation;  // For error messages
};

class TypeChecker {
  private constraints: Constraint[] = [];
  
  addConstraint(type1: Type, type2: Type, location: SourceLocation) {
    this.constraints.push({ type1, type2, location });
  }
  
  inferExpression(expr: Expression, env: TypeEnv): Type {
    switch (expr.type) {
      case 'identifier':
        if (!env[expr.name]) {
          // Create a fresh type variable for unknown identifiers
          const typeVar = this.freshTypeVar();
          env[expr.name] = typeVar;
          return typeVar;
        }
        return env[expr.name];
      
      case 'application':
        const funcType = this.inferExpression(expr.func, env);
        const argTypes = expr.args.map(arg => this.inferExpression(arg, env));
        
        // Create a fresh type variable for the result
        const resultType = this.freshTypeVar();
        
        // Add constraint: funcType = (argTypes) -> resultType
        this.addConstraint(funcType, {
          kind: 'function',
          params: argTypes,
          returns: resultType
        }, expr.location);
        
        return resultType;
      
      // ... other cases
    }
  }
}

Why: Type inference works by collecting constraints on type variables, then solving them.

Step 3: Unification Algorithm

class TypeChecker {
  unify(type1: Type, type2: Type): Type | null {
    if (type1.kind === 'typevar') {
      return this.bindTypeVar(type1.id, type2);
    }
    
    if (type2.kind === 'typevar') {
      return this.bindTypeVar(type2.id, type1);
    }
    
    if (type1.kind !== type2.kind) {
      return null;  // Types don't unify
    }
    
    switch (type1.kind) {
      case 'function':
        if (type2.kind !== 'function') return null;
        
        // Unify parameters and return type
        if (type1.params.length !== type2.params.length) return null;
        
        for (let i = 0; i < type1.params.length; i++) {
          const unifiedParam = this.unify(type1.params[i], type2.params[i]);
          if (!unifiedParam) return null;
        }
        
        const unifiedReturn = this.unify(type1.returns, type2.returns);
        if (!unifiedReturn) return null;
        
        return type1;
      
      // ... other cases
    }
  }
  
  bindTypeVar(id: number, type: Type): Type {
    // Check for circular references (occurs check)
    if (this.occursIn(id, type)) {
      return null;  // Cannot create infinite type
    }
    
    // Update all constraints to replace this type variable
    this.substitute(id, type);
    
    return type;
  }
}

Why: Unification finds the most general type that satisfies all constraints.

Error Handling

Step 1: Create Error Types

class TypeError extends Error {
  constructor(
    message: string,
    public location: SourceLocation,
    public type1?: Type,
    public type2?: Type
  ) {
    super(message);
  }
}

class TypeChecker {
  private errors: TypeError[] = [];
  
  get hasErrors(): boolean {
    return this.errors.length > 0;
  }
  
  getErrors(): TypeError[] {
    return [...this.errors];
  }
}

Step 2: User-Friendly Error Messages

class TypeChecker {
  private formatType(type: Type): string {
    switch (type.kind) {
      case 'number': return 'Number';
      case 'string': return 'String';
      case 'function':
        const params = type.params.map(p => this.formatType(p)).join(', ');
        const ret = this.formatType(type.returns);
        return `(${params}) -> ${ret}`;
      case 'record':
        const fields = Object.entries(type.fields)
          .map(([k, v]) => `${k}: ${this.formatType(v)}`)
          .join(', ');
        return `{ ${fields} }`;
      case 'typevar':
        return `?${type.id}`;  // Show unresolved type variables
      default:
        return 'Unknown';
    }
  }
  
  reportError(message: string, location: SourceLocation, type1?: Type, type2?: Type) {
    let fullMessage = message;
    
    if (type1 && type2) {
      fullMessage += `\n  Expected: ${this.formatType(type1)}`;
      fullMessage += `\n  Found: ${this.formatType(type2)}`;
    }
    
    this.errors.push(new TypeError(fullMessage, location, type1, type2));
  }
}

Why: Good error messages help users fix their code quickly.

Integration with Existing Compiler

Step 1: Add Type Checking Phase

// compiler.ts
class Compiler {
  compile(source: string) {
    // 1. Parse
    const ast = parse(source);
    
    // 2. Type check
    const typeChecker = new TypeChecker();
    const typedAst = typeChecker.check(ast);
    
    if (typeChecker.hasErrors) {
      const errors = typeChecker.getErrors();
      throw new CompilationError('Type errors found', errors);
    }
    
    // 3. Generate code
    return generateJavaScript(typedAst);
  }
}

Step 2: Update AST with Type Information

interface TypedExpression extends Expression {
  inferredType?: Type;
}

class TypeChecker {
  check(ast: Program): TypedProgram {
    // Decorate AST nodes with type information
    const typedAst = this.decorateWithTypes(ast);
    
    // Run type checking
    this.checkProgram(typedAst);
    
    return typedAst;
  }
}

Why: Type-decorated ASTs enable type-driven code generation optimizations.

Implementation Order

Basic type system: Start with simple types (number, string)
Function types: Add function type checking
Type environment: Implement scoping
Error handling: Create error reporting system
Record types: Add structural typing
Type inference: Implement unification
Integration: Connect to existing compiler

Testing Strategy

Unit tests: Test each type checking component
Integration tests: Test full programs
Error tests: Verify error messages
Regression tests: Prevent reintroduction of bugs

Example test:

describe('TypeChecker', () => {
  it('detects type mismatches in function calls', () => {
    const source = `
      function add(x: Number, y: Number): Number {
        return x + y;
      }
      
      add("hello", 42);  // Should error
    `;
    
    const checker = new TypeChecker();
    const result = checker.check(parse(source));
    
    expect(checker.hasErrors).toBe(true);
    expect(checker.getErrors()[0].message).toContain('Type mismatch');
  });
});

Common Pitfalls and Solutions

Circular type references: Use occurs check in unification
Scope confusion: Carefully track environment changes
Performance issues: Cache type checking results
Error explosion: Implement error recovery
Type variable conflicts: Use fresh variables

Advanced Topics for Future Work

Generic types: Parametric polymorphism
Type aliases: Named types for better errors
Recursive types: For tree structures
Gradual typing: Mix typed and untyped code
Type guards: Refine types in conditionals

This guide provides a foundation for implementing a type system in JoeML. Start with the basics and incrementally add features as needed.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implementing Type Checking and Records in JoeML

Table of Contents

Understanding Type Systems

Why Add Types?

Type System Choices

Type Checker Architecture

Core Components

Why This Architecture?

Implementing Basic Type Checking

Step 1: Set Up Type Infrastructure

Step 2: Implement Expression Type Checking

Step 3: Type Checking Functions

Adding Record Types

Step 1: Extend the Type System

Step 2: Record Type Checking

Step 3: Structural Typing for Records

Type Inference

Step 1: Add Type Variables

Step 2: Constraint Collection

Step 3: Unification Algorithm

Error Handling

Step 1: Create Error Types

Step 2: User-Friendly Error Messages

Integration with Existing Compiler

Step 1: Add Type Checking Phase

Step 2: Update AST with Type Information

Implementation Order

Testing Strategy

Common Pitfalls and Solutions

Advanced Topics for Future Work

FilesExpand file tree

IMPLEMENTATION_GUIDE.md

Latest commit

History

IMPLEMENTATION_GUIDE.md

File metadata and controls

Implementing Type Checking and Records in JoeML

Table of Contents

Understanding Type Systems

Why Add Types?

Type System Choices

Type Checker Architecture

Core Components

Why This Architecture?

Implementing Basic Type Checking

Step 1: Set Up Type Infrastructure

Step 2: Implement Expression Type Checking

Step 3: Type Checking Functions

Adding Record Types

Step 1: Extend the Type System

Step 2: Record Type Checking

Step 3: Structural Typing for Records

Type Inference

Step 1: Add Type Variables

Step 2: Constraint Collection

Step 3: Unification Algorithm

Error Handling

Step 1: Create Error Types

Step 2: User-Friendly Error Messages

Integration with Existing Compiler

Step 1: Add Type Checking Phase

Step 2: Update AST with Type Information

Implementation Order

Testing Strategy

Common Pitfalls and Solutions

Advanced Topics for Future Work