Organizations

@African-Center-for-Data-Science

josephgitau/readme.md

Portfolio · LinkedIn · Zindi · africdsa

Signal

I am a machine learning engineer and data scientist building from Nairobi, Kenya. My work lives where real-world data gets difficult: noisy property listings, cadastral maps, low-resource language tasks, competition datasets, and product interfaces that need to make model output understandable.

I like the whole system, not just the notebook: scrape the data, clean the edges, build the model, validate hard, ship the API, and make the result useful to someone.

current_mode:
  build:     Nairobi real estate intelligence
  compete:   Zindi ML challenges
  explore:   speech, LLMs, OCR, geospatial AI
  ship_with: Python, FastAPI, Next.js, Supabase

Competition Brain

Validation, leakage checks, feature engineering, ensembles, and leaderboard discipline.

Product Hands

APIs, dashboards, scheduled pipelines, and interfaces that make data products usable.

Local Lens

AI for African housing, maps, language, agriculture, and public-interest datasets.


📈 Live Zindi Stats

Rank · Points · Best Rank · Country


🔗 View full profile on Zindi →

Last updated: 2026-06-05 07:30:06 UTC


Build Map

| Track | How I use it |
| --- | --- |
| Models | Train, validate, compare, and explain ML systems for messy tabular, text, vision, and map data |
| Pipelines | Scrape, clean, enrich, schedule, and store analytics-ready datasets |
| Maps | Extract parcels, detect boundaries, run OCR, and turn geospatial files into usable layers |
| Products | Wrap models in APIs, dashboards, and workflows people can actually use |

Field Projects

01. Nairobi Property Pricing Platform

An end-to-end housing intelligence system for Nairobi. It turns raw listings into structured market signals: prices, bedroom counts, neighborhoods, affordability bands, and dashboard-ready summaries.

Pipeline: listing scrape -> parsing -> Supabase -> analytics -> dashboard
Stack: Python, Supabase, Next.js, TypeScript, GitHub Actions
Why it matters: housing data is scattered and inconsistent; the product turns it into something searchable, comparable, and decision-ready.
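
The parsing step above can be sketched in a few lines. This is a minimal illustration, not the platform's actual parser: the listing text format, the `Listing` fields, the regular expressions, and the affordability thresholds are all assumptions made up for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    # Hypothetical structured record; field names are illustrative only.
    price_kes: Optional[int]
    bedrooms: Optional[int]
    neighborhood: Optional[str]


# Assumed text patterns, e.g. "3 bedroom apartment in Kilimani, KSh 85,000".
PRICE_RE = re.compile(r"\b(?:KSh|KES)s?\.?\s*(\d[\d,]*)", re.IGNORECASE)
BEDROOM_RE = re.compile(r"(\d+)\s*(?:bedroom|bed|br)\b", re.IGNORECASE)
NEIGHBORHOOD_RE = re.compile(r"\bin\s+([A-Z][A-Za-z']+(?:\s[A-Z][A-Za-z']+)*)")


def parse_listing(text: str) -> Listing:
    """Pull price, bedroom count, and neighborhood out of free-form listing text."""
    price = PRICE_RE.search(text)
    beds = BEDROOM_RE.search(text)
    hood = NEIGHBORHOOD_RE.search(text)
    return Listing(
        price_kes=int(price.group(1).replace(",", "")) if price else None,
        bedrooms=int(beds.group(1)) if beds else None,
        neighborhood=hood.group(1) if hood else None,
    )


def affordability_band(price_kes: int) -> str:
    """Bucket a monthly rent into a band; the thresholds are invented for illustration."""
    if price_kes < 50_000:
        return "entry"
    if price_kes < 150_000:
        return "mid"
    return "upper"


listing = parse_listing("3 bedroom apartment in Kilimani, KSh 85,000 per month")
print(listing, affordability_band(listing.price_kes))
```

Fields that cannot be found are left as `None` rather than guessed, so downstream analytics can count missing values instead of absorbing bad ones.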

Backend · Frontend · Live site

02. Barbados Lands and Surveys Plot Automation Challenge

A geospatial AI pipeline for cadastral survey maps, built to move from scanned map imagery to structured polygon and text outputs.

Pipeline: raster maps -> boundary segmentation -> polygon cleaning -> OCR -> merged GIS output
Modeling: segmentation for parcels, post-processing for valid geometries, OCR for map labels
Result: Public score 0.965006861 · Private score 0.970242006
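
As a rough picture of what "post-processing for valid geometries" can involve, here is a standard-library sketch of ring cleaning. It is an assumption about the kind of step such a pipeline needs, not the code from the repository: the function names and the `min_area` threshold are hypothetical, and a real pipeline would likely lean on a geometry library.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def signed_area(ring: List[Point]) -> float:
    """Shoelace formula; the result is positive for counter-clockwise rings."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0


def clean_ring(points: List[Point], min_area: float = 1.0) -> Optional[List[Point]]:
    """Return a cleaned, counter-clockwise ring, or None if it is degenerate.

    min_area is a hypothetical threshold in the map's coordinate units.
    """
    # Drop consecutive duplicate vertices left over from contour tracing.
    ring: List[Point] = []
    for p in points:
        if not ring or p != ring[-1]:
            ring.append(p)
    # Drop an explicit closing vertex so the ring is stored open.
    if len(ring) > 1 and ring[0] == ring[-1]:
        ring.pop()
    if len(ring) < 3:
        return None
    area = signed_area(ring)
    if abs(area) < min_area:
        return None  # sliver or collapsed polygon
    if area < 0:
        ring.reverse()  # normalize orientation to counter-clockwise
    return ring


square = [(0, 0), (0, 2), (2, 2), (2, 0), (0, 0)]  # clockwise, explicitly closed
print(clean_ring(square))
```

This sketch does not detect self-intersections; it only handles duplicates, slivers, and orientation.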

Repository · Data prep notebook

03. Sentiment Story Generation Bot

An NLP experiment that detects emotional signal from text and uses it to generate contextual story responses.

Idea: sentiment -> context -> generated story
Focus: language understanding, generation, and interaction design
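
The sentiment -> context -> generated story flow can be sketched with a toy lexicon. Everything here is hypothetical: the word lists, the tone mapping, and the prompt wording are stand-ins, and the bot itself may use a trained classifier and a real text generator instead.

```python
import re

# Tiny illustrative lexicons; a real system would use a trained sentiment model.
POSITIVE = {"happy", "joy", "love", "excited", "great"}
NEGATIVE = {"sad", "angry", "lonely", "tired", "afraid"}

# Hypothetical mapping from detected sentiment to the tone of the story.
TONE = {
    "positive": "a bright, hopeful",
    "negative": "a gentle, comforting",
    "neutral": "a calm, curious",
}


def detect_sentiment(text: str) -> str:
    """Score text by counting lexicon hits; ties and misses are neutral."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def build_prompt(text: str) -> str:
    """Turn the user's message and its sentiment into a prompt for a story generator."""
    sentiment = detect_sentiment(text)
    return (
        f"Write {TONE[sentiment]} short story that responds to this message: "
        f'"{text}"'
    )


print(build_prompt("I feel so happy and excited today"))
```

The generation step is left out on purpose: the prompt string is what would be handed to whichever text generator the bot uses.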

Repository


Operating System

input:   raw datasets, maps, listings, language, competition briefs
process: clean -> validate -> model -> evaluate -> package
output:  notebooks, APIs, dashboards, repositories, field-ready insights

| I care about | Because |
| --- | --- |
| Strong baselines | They expose whether the complex idea is actually useful |
| Validation design | A good score only matters when it survives reality |
| Data quality | Most model problems start before training begins |
| Shipping | A useful model needs a path into a workflow |
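
To make the baseline and validation points concrete, here is a small standard-library sketch that scores a candidate model against a mean-prediction baseline under k-fold cross-validation. The helper names, the contiguous fold scheme, and the toy data are all assumptions for illustration; real work would shuffle or group folds to match how the data was collected.

```python
from statistics import mean
from typing import Callable, Iterator, List, Sequence, Tuple

Model = Callable[[float], float]


def kfold_indices(n: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size


def fit_mean(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Baseline: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Model:
    """Candidate: one-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    var = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var if var else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def cv_mae(fit, xs: Sequence[float], ys: Sequence[float], k: int = 3) -> float:
    """Mean absolute error over held-out folds; the model never sees its test rows."""
    errors: List[float] = []
    for train, test in kfold_indices(len(xs), k):
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        errors.extend(abs(model(xs[i]) - ys[i]) for i in test)
    return mean(errors)


xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2 * x + 1 for x in xs]  # toy data with an exact linear relationship
print("baseline MAE:", cv_mae(fit_mean, xs, ys))
print("line MAE:    ", cv_mae(fit_line, xs, ys))
```

If the candidate cannot clearly beat the baseline on held-out folds, the extra complexity is not yet earning its place.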

Tools I Use

Python · PyTorch · TensorFlow · scikit-learn · Pandas · NumPy · OpenCV · FastAPI · Next.js · TypeScript · Supabase · Docker · GitHub Actions



Current Missions

  • Make Nairobi property data easier to search, compare, and understand
  • Build stronger competition pipelines for NLP, vision, geospatial, and tabular ML
  • Package geospatial OCR and document-understanding workflows into reusable tools
  • Push deeper into speech and language systems for low-resource African contexts

Connect

Portfolio LinkedIn Zindi WhatsApp

Messy data in. Useful systems out.


Popular repositories

  1. The-African-Trust-Safety-LLM-Challenge (Public)

    Python 10 3

  2. Barbados-Lands-and-Surveys-Plot-Automation-Challenge (Public)

    Barbados Lands and Surveys Plot Automation Challenge

    Jupyter Notebook 8 2

  3. employee_leave_predictionn (Public)

    Jupyter Notebook 2

  4. Sentiment_Story_Generation_Bot (Public)

    Python 2

  5. nairobi-property-pricing-frontend (Public)

    TypeScript 2 1

  6. DEVCLASS (Public)

    1