Lead Data Engineer — building real-time data platforms for financial trading systems.
Currently owning the data backbone for a global FX/CFD broker: 3+ TB data lake on S3, 50,000+ trades/day across MT4/MT5/cTrader, 45+ Airflow DAGs in production. Previously built data pipelines at Apple (Maps/GIS platform).

| Domain | What I Do |
|---|---|
| Real-Time Streaming | Kafka, Kafka Streams (stateful position tracking), Kafka Connect, RabbitMQ, FIX Protocol, Protobuf |
| Batch ETL | Spark (PySpark/SQL), Apache Hudi (CDC → Lakehouse), Airflow (MWAA), EMR on EKS |
| Data Platform | CDC pipelines (DMS → Avro → Hudi → Redshift Spectrum), medallion architecture, data modeling |
| Infrastructure | Terraform, Kubernetes, Docker, AWS (Redshift, EMR, EKS, S3, DMS, MWAA), GCP |
| Data Quality | Built a self-healing reconciliation system — took missing transaction incidents from ~10/month to zero |
| AI + Analytics | LLM-powered Text-to-SQL bot, DuckDB embedded analytics, Delta Lake, Marimo notebooks |

| Project | Description | Stack |
|---|---|---|
| DataForge | AI-powered AWS infrastructure architect — describe your data problem, get a sized, costed architecture with Terraform | Python, AWS Pricing Engine |
| Trading Analytics Platform | Real-time tick data processing with hot/cold symbol partitioning, OHLCV bars, anomaly detection | Spark Streaming, Kafka, MinIO |
| RaftBadger | Distributed key-value store with Raft consensus + BadgerDB | Golang |
| Datalake | Local data lake with Apache Hudi + Debezium CDC | Hudi, Debezium, Docker |
| Order Book | Financial order book processing for real-time trading insights | Golang |

| Category | Tools |
|---|---|
| Languages | Golang • Python • Scala • SQL • PySpark |
| Streaming | Kafka • Kafka Streams • Kafka Connect • RabbitMQ • Protobuf • Avro • FIX Protocol |
| Processing | Spark (SQL/DataFrames) • Hudi (CoW/MoR) • Polars • NiFi |
| Storage | Redshift (+ Spectrum) • S3 • PostgreSQL • DynamoDB • MongoDB • DuckDB • Delta Lake |
| Cloud | AWS (EMR, EKS, MWAA, DMS, Glue, Redshift) • GCP (BigQuery, GCS) |
| Infrastructure | Terraform • Kubernetes • Docker • GitHub Actions |
| Monitoring | Datadog (APM, SLO, Dashboards) • CloudWatch • PagerDuty |

I write about data engineering, streaming systems, and infrastructure on Medium.

AWS Solutions Architect – Professional | 9+ years | Hyderabad, India



