I build production-grade AI systems focused on inference optimization, distributed ML infrastructure, agent orchestration, and scalable backend platforms.
I am currently pursuing a B.Tech in Electronics & Communication Engineering at BIT Mesra (CGPA: 9.0/10.0) while building systems that improve latency, throughput, reliability, and deployment efficiency for modern AI applications.
- LLM Inference Optimization
- Triton & CUDA-based Systems
- AI Infrastructure Engineering
- Agent Orchestration Frameworks
- Distributed Systems
- Production ML Platforms
- Reliability & Observability
- Amazon ML Summer School Scholar (Top 0.2% Nationwide)
- CDAC Merit Scholar
- Open Source Contributor (GSSOC)
- ML Engineer @ Elevate Labs
- AI Systems Intern @ OutriX
- Cybersecurity Intern @ CDAC India
- Algorithmic Trading Intern @ Lunor AI
- Designed PyTorch training and inference pipelines for NLP and computer vision tasks; improved experiment reproducibility through structured preprocessing and automated evaluation tooling.
- Optimized inference workflows via latency profiling and batching strategies, reducing average inference time by ~18% across 3 deployed model variants.
- Built an ML evaluation harness for model validation, benchmarking, and regression testing across 5 model iterations.
- Built an LLM evaluation pipeline processing 1M+ records (automated scoring, regression testing, and failure triage), cutting experimentation turnaround time by 30%.
- Owned ETL/ELT data workflows feeding inference benchmarking dashboards; instrumented with OpenTelemetry for end-to-end latency observability.
- Profiled high-throughput AI inference workflows, identifying 3 bottleneck stages and optimizing them to reduce p95 latency by ~18%.
- Built a 3-stage anomaly detection pipeline on structured network-intrusion datasets (~50K samples): feature extraction → threshold calibration → alert triage, reducing manual review queue by ~35%.
- Implemented distributed validation and monitoring workflows for automated anomaly scoring across multi-source security data streams.
- Developed deterministic multi-asset trading strategies using SQL-backed financial time-series datasets.
- Built backtesting systems evaluating Sharpe ratio, volatility and maximum drawdown for strategy validation.
- Implemented volatility-adjusted optimization techniques improving risk-adjusted returns and portfolio stability.
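
For reference, the backtesting metrics named above (Sharpe ratio, volatility, and maximum drawdown) can be sketched in a few lines of standard-library Python. This is only an illustrative sketch, not the code used in the internship: the function names, the 252-trading-day annualization factor, and the assumption of simple daily returns are all hypothetical choices made for the example.

```python
import math
from statistics import mean, stdev

# Assumed annualization factor for daily return series (hypothetical choice).
TRADING_DAYS = 252


def annualized_volatility(returns):
    """Sample standard deviation of per-period returns, scaled to a year."""
    return stdev(returns) * math.sqrt(TRADING_DAYS)


def sharpe_ratio(returns, risk_free_rate=0.0):
    """Annualized Sharpe ratio; risk_free_rate is expressed per period."""
    excess = [r - risk_free_rate for r in returns]
    vol = stdev(excess)
    if vol == 0:
        # A constant return series has no defined Sharpe ratio; report 0.0.
        return 0.0
    return mean(excess) / vol * math.sqrt(TRADING_DAYS)


def max_drawdown(returns):
    """Largest peak-to-trough decline of the compounded equity curve,
    returned as a positive fraction (0.25 means a 25% drawdown)."""
    equity = peak = 1.0
    worst = 0.0
    for r in returns:
        equity *= 1.0 + r
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst
```

Each function takes a plain list of per-period simple returns (for example `[0.01, -0.02, 0.015]`), so the same helpers could be applied to any strategy's return series pulled from a SQL-backed dataset.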
Python • C++ • TypeScript • SQL
PyTorch • Transformers • LLMs • RAG • CNNs • Agent Systems
CUDA • Triton • TensorRT • FlashAttention • Quantization • KV Cache Optimization
FastAPI • Redis • PostgreSQL • Docker • Kubernetes • AWS
OpenTelemetry • MLflow • Monitoring • Performance Profiling
AsyncIO • Event-Driven Architecture • Scheduling • Caching • Message Queues
I enjoy solving engineering problems involving:
- GPU Utilization Optimization
- Inference Throughput Scaling
- Low-Latency Architectures
- Distributed Scheduling
- Agent Systems
- AI Reliability Engineering
- Production AI Deployment
💼 LinkedIn
💻 GitHub
🧠 LeetCode