aksingh4545/README.md

👋 A bit about me

I work at the intersection of data engineering, cloud platforms, and real-time systems.

Most days: designing predictable pipelines, taming messy data, tuning Spark jobs, or thinking about how systems behave at scale.

I value systems that are:

  • predictable
  • observable
  • easy to maintain & reason about

Silence helps me focus. Clean logs make me happy. Calm infrastructure > shiny complexity.


🧠 Skills & Technologies

Core Data Stack (daily drivers)

  • Python, SQL, PySpark
  • MySQL / PostgreSQL / Snowflake
  • Pandas & data wrangling
  • AWS (S3, Glue, Lambda, EMR), Azure (Data Factory, Databricks)
  • Airflow / dbt (orchestration & transformation)

Streaming & Scale

  • Kafka / Event Hubs
  • Spark / Databricks (exploring internals & perf tuning)
  • Docker & basic infra automation

When needed

  • FastAPI / Streamlit for data apps
  • React / TS / JS for quick UIs

🔭 What I’m working on right now

  • Streamlit apps connected to AWS S3 for quick data exploration
  • Batch & near-real-time pipelines with Python + SQL
  • Diving deeper into Azure Databricks, PySpark, Kafka → event-driven processing
  • Building cleaner, more maintainable data workflows

🌱 Currently learning / leveling up

  • Advanced data modeling for analytics & warehouse workloads
  • Spark internals, partitioning, broadcast joins, memory tuning
  • Event-driven architectures & reliable message queues
  • Writing better docs & diagrams for data systems (Mermaid / Excalidraw)

🤝 Connect with me

     


Building calm data systems • One reliable pipeline at a time • 2026

Pinned

  1. image_resize Public

    This project implements an event-driven, serverless image processing pipeline on AWS. Images uploaded to Amazon S3 are automatically resized using AWS Lambda and Pillow, stored in a destination buc…

    Python • 4 stars • 1 fork

  2. streamlit_s3_pipeline Public

    The system supports real-world resumes (PDF, DOCX, TXT), handles noisy formats, and follows industry-grade data engineering practices.

    Python • 3 stars • 1 fork

  3. Login_Cognito Public

    This repo shows how to use AWS Cognito, a fully managed service, with a Streamlit application.

    Python • 2 stars • 2 forks

  4. flink-kafka Public

    This repository covers running Flink with Docker: data is ingested from Kafka and processed in real time on Flink.

  5. kafka-project Public

    This repository contains one project that reads data from a CSV file and stores it in output.csv: Python produces the stream of records and Kafka streams them.

    Python

  6. mlflow_pipeline Public

    An end-to-end Machine Learning + MLOps project that predicts a student’s final performance percentage using a production-ready ML pipeline, MLflow model registry, FastAPI inference service, and Doc…

    Python