
Gowtham Venkat Eathamokkala

Data Engineer · Data Analyst · ML Practitioner

📍 Denton, Texas, USA  |  📧 GowthamVenkatEathamokkala@my.unt.edu  |  📞 +1-940-536-4494

LinkedIn · GitHub · ORCID · Google Scholar


👋 About Me

I'm a Data Engineer and Data Analyst with 2+ years of industry experience and an M.S. in Advanced Data Analytics (GPA: 3.64) from the University of North Texas. I specialize in building scalable data pipelines, transforming complex datasets into actionable business insights, and delivering clean, reliable data to analytics and ML teams.

I have hands-on experience across the full data stack — from streaming ingestion with Apache Kafka and ETL design in SQL/Python, to dashboards in Power BI and Tableau and ML workflows in scikit-learn. I enjoy working at the intersection of engineering rigor and analytical storytelling.

I am currently a Ph.D. student in Information Science at UNT (Fall 2026) and am actively seeking RA / TA / GRA opportunities there.


💼 Professional Experience

Data Engineer — Vsion Technologies, Austin TX

Sep 2024 – Present

  • Designed streaming pipelines processing 10M+ financial records/day using Apache Kafka and PostgreSQL; reduced query latency by 40% through optimized SQL views and indexing.
  • Built layered analytical data models delivering reliable, clean data to analytics and reporting teams with full documentation.
  • Implemented modular ETL transformations achieving 100% data accuracy for downstream applications.
  • Collaborated with cross-functional stakeholders to translate business requirements into scalable, maintainable data solutions.

Data Analyst — Zetatek Technologies Pvt Ltd, Hyderabad, India

Jan 2022 – Dec 2022

  • Built automated Power BI dashboards with dynamic filtering; reduced report generation time by 80% and saved 20+ hours/month in recurring financial workflows.
  • Analyzed 500k+ operational and financial records using SQL Server / SSIS; improved resource allocation efficiency by 15%.
  • Applied statistical analysis and predictive modeling to forecast trends, improving budget planning and operational efficiency.

🎓 Education

| Degree | Institution | Year | GPA |
| --- | --- | --- | --- |
| Ph.D. Information Science (Concentration: Data Science) | University of North Texas | 2026– | |
| M.S. Advanced Data Analytics | University of North Texas | 2024 | 3.64 / 4.00 |
| B.Tech. Electronics Engineering | GRIET, Hyderabad, India | 2022 | 3.36 / 4.00 |

🛠️ Technical Skills

Languages & Data Science: Python · SQL · R · PySpark · Pandas · NumPy · scikit-learn · PyTorch · TensorFlow

Data Engineering & Cloud: Apache Kafka · Apache Spark · Snowflake · GCP · BigQuery · Dataflow · Pub/Sub · Vertex AI · PostgreSQL · MySQL · Microsoft SQL Server

Visualization & BI: Power BI · Tableau · Looker Studio · Excel · Matplotlib · Seaborn

ML & Analytics: Regression · Classification · Clustering · Random Forest · SVM · Time Series Forecasting · PCA · Model Evaluation · Uncertainty Quantification

Tools: Git · Jupyter Notebook · LaTeX · FastAPI


📁 Projects & Analysis

| Repository | Description | Tech |
| --- | --- | --- |
| IPL-Analysis | A deep-dive analysis of Indian Premier League cricket (2008–2023) using custom analytical metrics | Python |
| superstore-profitability-analysis | Analyzed a retail superstore's 4-year sales dataset (2014–2017) to uncover profitability challenges and recommend actionable strategies | Python · Power BI |
| trisql-framework | 3-stage Text-to-SQL pipeline (TriSQL architecture): semantic schema selector, structure-aware SQL generator, and complexity-aware refiner; 70% execution accuracy on the Spider benchmark, 100% executability, zero GPU or API costs | Python · FastAPI · Ollama · SQLite · sentence-transformers |
| TDSP-Transportation_Data_Science_Project | End-to-end spatiotemporal analysis of 200k+ NYC crash records: time-series decomposition, geospatial hotspot clustering, and anomaly detection; findings presented at the USDOT Federal Highway Administration | Python · Jupyter · GeoPandas |
| Data-Analysis-using-python | Collection of data analysis workflows covering EDA, data cleaning, feature engineering, and statistical visualizations across real-world datasets | Python · Pandas · Jupyter |
| Machine-Learning-using-Python | Foundational to intermediate ML implementations: regression, classification, clustering, and model evaluation with real datasets | Python · scikit-learn · Jupyter |
| Power-BI-Projects | Business intelligence dashboards with dynamic filtering, KPI cards, and drill-through views for financial and operational reporting | Power BI |
| MSSQL_Queries | T-SQL query library covering complex joins, window functions, CTEs, stored procedures, and query performance tuning | T-SQL · SQL Server |
| Pizza-Sales-Excel-Project | Sales analytics dashboard built with Pivot Tables, dynamic charts, and slicers to surface revenue trends and product performance | Excel |
| PySpark-in-DataBricks | Practice work with PySpark inside Databricks, focusing on data manipulation, transformations, and analytics at scale | PySpark |
| SQL-for-Data-Engineering | SQL applied to real-world data engineering workflows, plus practice queries | PL/pgSQL |
| Vsion-Technologies-CaseStudy | Transformed raw data streams from Kafka topics into clean, business-ready datasets by designing optimized SQL views on PostgreSQL staging tables | Kafka · PostgreSQL |
| Zetatek-DataAnalysis-CaseStudy | Analyzed operational and financial datasets to uncover business trends, optimize resource allocation, and streamline reporting processes for leadership | SSIS · MSSQL · Power BI · Excel |

🔬 Research & Publications

  • Al-Edhari, A., Eathamokkala, G. V., & Rahouti, M. (2026). Response drift across frontier large language models. Manuscript under review at Nature Machine Intelligence.
  • B. V. Kumar et al. (2022). Analysis of an IoT based Water Quality Monitoring System. IEEE I-SMAC 2022. DOI: 10.1109/I-SMAC55078.2022.9987360

🏅 Certifications

| Certification | Provider | Year |
| --- | --- | --- |
| Data Engineering & ML Specialization | Google Cloud | 2024 |
| Advanced Data Analytics Professional | Google | 2024 |
| Power BI for Data Analysts | Microsoft | 2024 |
| Advanced SQL for Data Engineering | LinkedIn | 2024 |

🌐 Languages

English: Professional · Telugu: Native · Hindi: Conversational · Tamil: Conversational

Pinned

  1. IPL-Analysis (Public)

    A deep dive into 16 seasons of Indian Premier League cricket using custom analytical metrics

    HTML

  2. superstore-profitability-analysis (Public)

    Comprehensive Power BI analysis identifying $325K+ profit opportunities in retail operations through data transformation, DAX measures, and scenario planning.

  3. trisql-framework (Public)

    Three-stage Text-to-SQL framework achieving 70% execution accuracy on the Spider benchmark; runs free on any laptop with no GPU required

    Python

  4. MSSQL_Queries (Public)

    T-SQL

  5. Power-BI-Projects (Public)

    Dashboards and visualizations built with Power BI.

  6. Pizza-Sales-Excel-Project (Public)

    An Excel workbook with pivot tables, charts, and dashboards built in Microsoft Excel.