gameofdatas/README.md

Rahul Singh

Lead Data Engineer — building real-time data platforms for financial trading systems.

Currently owning the data backbone for a global FX/CFD broker: 3+ TB data lake on S3, 50,000+ trades/day across MT4/MT5/cTrader, 45+ Airflow DAGs in production. Previously built data pipelines at Apple (Maps/GIS platform).


What I Build

| Domain | What I Do |
|---|---|
| Real-Time Streaming | Kafka, Kafka Streams (stateful position tracking), Kafka Connect, RabbitMQ, FIX Protocol, Protobuf |
| Batch ETL | Spark (PySpark/SQL), Apache Hudi (CDC → Lakehouse), Airflow (MWAA), EMR on EKS |
| Data Platform | CDC pipelines (DMS → Avro → Hudi → Redshift Spectrum), medallion architecture, data modeling |
| Infrastructure | Terraform, Kubernetes, Docker, AWS (Redshift, EMR, EKS, S3, DMS, MWAA), GCP |
| Data Quality | Built a self-healing reconciliation system that took missing transaction incidents from ~10/month to zero |
| AI + Analytics | LLM-powered Text-to-SQL bot, DuckDB embedded analytics, Delta Lake, Marimo notebooks |

Currently Building

| Project | Description | Stack |
|---|---|---|
| DataForge | AI-powered AWS infrastructure architect: describe your data problem, get a sized, costed architecture with Terraform | Python, AWS Pricing Engine |
| Trading Analytics Platform | Real-time tick data processing with hot/cold symbol partitioning, OHLCV bars, anomaly detection | Spark Streaming, Kafka, MinIO |
| RaftBadger | Distributed key-value store with Raft consensus + BadgerDB | Golang |
| Datalake | Local data lake with Apache Hudi + Debezium CDC | Hudi, Debezium, Docker |
| Order Book | Financial order book processing for real-time trading insights | Golang |
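
For readers unfamiliar with the OHLCV bars mentioned under Trading Analytics Platform, here is a minimal Python sketch of the idea: ticks are bucketed into fixed time windows, and each window keeps the open, high, low, close, and total volume. This is an illustration only, not the project's code (which is listed as using Spark Streaming). The names `Bar` and `build_bars` and the `(symbol, epoch_seconds, price, volume)` tick layout are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Hypothetical tick layout: (symbol, epoch_seconds, price, volume)
Tick = Tuple[str, int, float, float]


@dataclass
class Bar:
    symbol: str
    start: int      # start of the bar window, in epoch seconds
    open: float
    high: float
    low: float
    close: float
    volume: float


def build_bars(ticks: Iterable[Tick], interval_s: int = 60) -> List[Bar]:
    """Aggregate ticks into OHLCV bars of interval_s seconds.

    Assumes ticks arrive in time order per symbol, so the first tick seen
    in a window is the open and the last one seen is the close.
    """
    bars: Dict[Tuple[str, int], Bar] = {}
    for symbol, ts, price, volume in ticks:
        start = ts - ts % interval_s  # floor the timestamp to its window
        key = (symbol, start)
        bar = bars.get(key)
        if bar is None:
            bars[key] = Bar(symbol, start, price, price, price, price, volume)
        else:
            bar.high = max(bar.high, price)
            bar.low = min(bar.low, price)
            bar.close = price
            bar.volume += volume
    return sorted(bars.values(), key=lambda b: (b.symbol, b.start))
```

In a streaming setting the same logic would typically run as a windowed aggregation with handling for late or out-of-order ticks, which this batch sketch leaves out.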

Tech Stack

Languages       Golang • Python • Scala • SQL • PySpark
Streaming       Kafka • Kafka Streams • Kafka Connect • RabbitMQ • Protobuf • Avro • FIX Protocol
Processing      Spark (SQL/DataFrames) • Hudi (CoW/MoR) • Polars • NiFi
Storage         Redshift (+ Spectrum) • S3 • PostgreSQL • DynamoDB • MongoDB • DuckDB • Delta Lake
Cloud           AWS (EMR, EKS, MWAA, DMS, Glue, Redshift) • GCP (BigQuery, GCS)
Infrastructure  Terraform • Kubernetes • Docker • GitHub Actions
Monitoring      Datadog (APM, SLO, Dashboards) • CloudWatch • PagerDuty


Writing

I write about data engineering, streaming systems, and infrastructure on Medium.


Connect

LinkedIn • Medium • Twitter • Email


AWS Solutions Architect – Professional | 9+ years | Hyderabad, India

Pinned

  1. fake-data-producer (Public)

    Go 1

  2. datalake (Public)

    Data lake on a local machine

    Shell 2 1

  3. caching (Public)

    In-memory caching eviction algorithm

    Go

  4. orderbook (Public)

    Order book programming problem

    Go 3

  5. raftbadger (Public)

    Distributed key-value store that combines Raft consensus

    Go

  6. ridetrends (Public)

    Spark Scala project that produces ride trends

    Scala