📊 Data Analysis Portfolio

This constantly evolving repository brings together my projects, exercises, professional internships, and certifications in the field of data analysis. I am currently in the ninth semester of a Computer Science Engineering degree while completing my professional internship at Nidix Networks, which allows me to apply my knowledge in a real-world environment and complements my academic training.

The portfolio includes Python projects I have developed throughout my studies for various elective courses, such as Machine Learning and Data Science. It also covers other data analysis tools, including Excel, SQL, Power BI, and Minitab. Beyond showcasing my technical skills, the portfolio reflects my commitment to continuous learning and to good engineering practices such as documentation, version control, and the use of pipelines.

This repository will continue to grow progressively, incorporating the new tools, methodologies, and technologies I learn, especially now that I am exposed to real-world processes within the industry. I believe it's essential to stay up-to-date in a rapidly evolving technological environment; therefore, in addition to reinforcing what I've learned, I strive to integrate modern approaches that add value to any project or company I'm involved with.

1. 🏅 Certifications

Collection of certificates from courses and training programs I have completed or am about to complete (checklist in my main repository: Allan19k), including:

Kaggle Learn (Python, Pandas, Data Cleaning, Data Visualization, SQL, Machine Learning, Geospatial Analysis, etc.)
Santander Open Academy (Excel, ChatGPT Fundamentals, Power BI)

2. 📈 Excel Projects

Excel exercises applied to analysis and dashboarding (in progress), including:

Basic exercises from the Santander Open Academy Excel course
Interactive dashboards and conditional formatting
Simulation exercises and automated reports
Intermediate and advanced exercises using various Excel functions and tools

3. 🤖 Machine Learning Projects

Exercises and projects developed in the 7th semester for the Machine Learning elective, with guidance from Dr. Graciela María de Jesús Ramírez Alonso, along with complementary Kaggle Learn courses:

Algebra Review for Neural Networks (Algebra exercises using the NumPy library, focused on reinforcing essential knowledge for ML)
Hyperparameter Search (GridSearchCV with MLPClassifier on load_wine)
Time Series Prediction (RNN vs. LSTM for EUR/USD)
Transfer Learning with ResNet50 for image classification
Smart Dairy Farming: Milk Yield Classification App (final project for the course): A mobile application that uses a computer-vision classification model to predict the milk production level (high, medium, or low) from images. The work was published as a scientific article and demonstrates a practical application of machine learning in the agricultural sector.
Complementary Courses:

Intro to Machine Learning
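
The hyperparameter search listed above (GridSearchCV with MLPClassifier on load_wine) could be set up roughly as in the sketch below. The parameter grid, the scaling step, and the number of folds are illustrative assumptions, not the values used in the repository.

```python
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Load the wine dataset bundled with scikit-learn (178 samples, 13 features).
X, y = load_wine(return_X_y=True)

# Scale features before the MLP; the scaling step is an assumption of this sketch.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("mlp", MLPClassifier(max_iter=1000, random_state=0)),
])

# Hypothetical grid: two hidden-layer layouts and two L2 penalties.
param_grid = {
    "mlp__hidden_layer_sizes": [(16,), (32, 16)],
    "mlp__alpha": [1e-4, 1e-2],
}

# 3-fold cross-validation over every combination in the grid.
search = GridSearchCV(pipeline, param_grid, cv=3, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print("Best cross-validated accuracy:", round(search.best_score_, 3))
```

Keeping the scaler inside the pipeline means it is refit on each training fold, so the validation folds do not leak into the preprocessing.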

4. 📉 Minitab Projects

Applied Statistics exercises completed in the 5th semester under the guidance of Teacher Patricia Guadalupe Orpinel Ureña:

Chi-square Tests (goodness of fit and independence)
ANOVA (one-way and two-way)
Linear Regression (simple and multiple)
Formal conclusions, graphs, and validation of assumptions
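
As a cross-check outside Minitab, the goodness-of-fit statistic from the chi-square item above can be computed by hand. The sketch below uses made-up die-roll counts (not data from the course) and the tabulated critical value for 5 degrees of freedom at a 0.05 significance level.

```python
def chi_square_statistic(observed, expected):
    """Return the Pearson chi-square statistic: sum of (O - E)^2 / E."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Made-up example: 120 rolls of a die, tested against a fair-die expectation.
observed = [18, 22, 20, 25, 15, 20]
expected = [sum(observed) / len(observed)] * len(observed)  # 20 per face

statistic = chi_square_statistic(observed, expected)
degrees_of_freedom = len(observed) - 1

# 11.070 is the chi-square critical value for 5 degrees of freedom at alpha = 0.05.
CRITICAL_VALUE_DF5_ALPHA_05 = 11.070
reject_null = statistic > CRITICAL_VALUE_DF5_ALPHA_05

print(f"chi-square = {statistic:.2f}, df = {degrees_of_freedom}, reject H0: {reject_null}")
```

With these counts the statistic is well below the critical value, so the fair-die hypothesis is not rejected.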

5. 🐍 Python Projects

Projects and exercises in Python using various libraries for data analysis:

Python Fundamentals (syntax, functions, lists, conditionals…)
Statistical analysis using the statistics module
Generation of dummy data with Faker for export to CSV and XLSX
Kaggle Learn courses (Pandas, Data Cleaning, and Data Visualization)
Scraping and automation: automated data retrieval from various sources (APIs or web scraping), using threads to improve performance
Kaggle notebook adaptations
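
The threaded retrieval mentioned in the scraping item above can be illustrated with the standard library alone. This is a minimal sketch: `fetch_source` is a hypothetical stand-in that sleeps instead of calling a real API or downloading a page, and the source names are invented.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fetch_source(name):
    """Hypothetical stand-in for an API call or page download."""
    time.sleep(0.2)  # simulate network latency
    return {"source": name, "rows": len(name)}

# Invented source names, used only for illustration.
sources = ["prices", "weather", "news", "exchange_rates"]

start = time.perf_counter()
# executor.map returns results in the same order as the inputs,
# even though the calls run concurrently.
with ThreadPoolExecutor(max_workers=4) as executor:
    results = list(executor.map(fetch_source, sources))
elapsed = time.perf_counter() - start

print(results)
print(f"Fetched {len(results)} sources in {elapsed:.2f} s")
```

Threads help here because the work is I/O-bound: while one request waits on the network, the others can proceed.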

6. 💾 SQL Projects

Exercises from the Database Fundamentals course, completed in the 6th semester with Professor José Saúl de Lira Miramontes, plus other SQL exercises and projects:

Installation and configuration of Oracle 21c XE and HR schema
Basic queries: SELECT, WHERE, JOIN, GROUP BY, subqueries, DML, views
Exercises organized by topic with screenshots and explanations
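
The query types listed above can be tried without an Oracle installation. The sketch below uses Python's built-in sqlite3 module as a stand-in for Oracle 21c XE; the two tables loosely imitate the HR schema, and all rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle (assumption of this sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE departments (
        department_id INTEGER PRIMARY KEY,
        department_name TEXT
    );
    CREATE TABLE employees (
        employee_id INTEGER PRIMARY KEY,
        last_name TEXT,
        salary REAL,
        department_id INTEGER REFERENCES departments(department_id)
    );
    INSERT INTO departments VALUES (10, 'Administration'), (60, 'IT');
    INSERT INTO employees VALUES
        (1, 'Lopez', 4000, 10),
        (2, 'Garcia', 6000, 60),
        (3, 'Ruiz', 9000, 60),
        (4, 'Soto', 2500, 60);
""")

# SELECT + JOIN + WHERE + GROUP BY: headcount and average salary per
# department, counting only employees who earn more than 3000.
query = """
    SELECT d.department_name, COUNT(*) AS headcount, AVG(e.salary) AS avg_salary
    FROM employees e
    JOIN departments d ON d.department_id = e.department_id
    WHERE e.salary > 3000
    GROUP BY d.department_name
    ORDER BY d.department_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
conn.close()
```

The WHERE clause filters rows before grouping, so the employee earning 2500 is excluded from both the count and the average.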

7. 🦾 Intro to AI Ethics

Kaggle Learn course focused on the ethical principles of using AI. Through examples and real-world cases, key concepts such as algorithmic bias, privacy, fairness, and accountability in automated systems were explored.

I included this content in my portfolio because I consider it fundamental to understand the social impact of the tools we develop. In particular, I am interested in applying these principles within data analysis and artificial intelligence projects in an ethical and transparent manner.

8. 📊 Power BI Projects

Projects and exercises carried out with Microsoft Power BI as part of the Power BI Fundamentals course by Santander Open Academy, covering data import, interactive reports, filters, conditional formatting, and transformations with Power Query. I plan to add further exercises and more challenging projects to reinforce what I learned in the course.

9. 🥼 Data Science

Elective course that I took in the 8th semester under the guidance of Dr. Olanda Prieto Ordaz.

Practical Data Science course focused on the complete cycle of a Machine Learning project: data acquisition and cleaning, exploratory analysis, supervised and unsupervised modeling, validation, and deployment. This experience allowed me to apply ML techniques to real-world problems and build reproducible artifacts for my portfolio.

Social Network Analysis: Introductory exercise developed in Google Colab using the book Data Science from Scratch as a reference. The main objective was to apply Python data structures to answer questions about a small, fictitious social network of employees.
Web Scraping: Based on a YouTube tutorial on web scraping in Python, extended with an analysis stage that extracts mentions of open-source tools for Data Science and groups them by category (data management, integration, visualization, etc.). Graphs show the frequency of mentions per tool and per category.
Linear Regression: Car price prediction. Includes train/test split, EDA, preprocessing pipeline (imputation, encoding, scaling where applicable), linear regression baseline, hyperparameter search (grid/random), evaluation with RMSE, and final validation on the test set.
End-to-end project: Inspired by the handson-ml2 repository (Aurélien Géron). Complete workflow: EDA, preprocessing pipeline, model training (Linear Regression, Decision Tree, Random Forest), tuning with RandomizedSearchCV, comparison by RMSE, and local deployment of the best model with Streamlit (interface for making predictions).
Lung Cancer dataset projects: Implementation and comparison of multiple Machine Learning models on a Kaggle dataset about lung cancer.
Amazon Predictor: Complete system to predict the Adjusted Close (Adj Close) of Amazon stock using historical data: EDA, creation of lagged features, pipelines, and comparison of classical models and neural networks (Ridge, SVR, RandomForest, Voting, AdaBoost, GradientBoosting, XGBoost, MLP, DNN, LSTM). The best model was selected based on metrics (RMSE, MAE, MSE) and deployed locally using Streamlit for interactive prediction.
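
Two ideas recur in the projects above: lagged features for time series and RMSE as the comparison metric. The sketch below shows both in plain Python. The price list is made up (not real Amazon data), and the helper names `make_lagged_rows` and `rmse` are hypothetical, not taken from the repository.

```python
import math

def make_lagged_rows(series, n_lags):
    """Turn a 1-D series into (features, target) pairs using the previous n_lags values."""
    rows = []
    for t in range(n_lags, len(series)):
        features = series[t - n_lags:t]  # values at t-n_lags ... t-1
        rows.append((features, series[t]))
    return rows

def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    squared_errors = [(a - b) ** 2 for a, b in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up closing prices, used only for illustration.
prices = [100.0, 102.0, 101.0, 105.0, 107.0, 106.0]
rows = make_lagged_rows(prices, n_lags=2)

# Naive baseline: predict that the next value equals the most recent lag.
y_true = [target for _, target in rows]
y_pred = [features[-1] for features, _ in rows]
baseline_rmse = rmse(y_true, y_pred)

print(f"{len(rows)} training rows, naive-baseline RMSE = {baseline_rmse:.3f}")
```

A naive last-value baseline like this gives a reference RMSE that any trained model should beat before it is worth deploying.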

🚧 🧭 In Progress...

Complete all Kaggle courses to strengthen SQL, Machine Learning, and other related topics
Add new tools currently used in Data Analysis
Complete intermediate and advanced Excel exercises using custom or Kaggle datasets
Add more personal projects with real or simulated data
Continuously improve the documentation and design of the portfolio

Repositories and sections updated as of 25/01/2026.
