Hello! I'm Szymon Pająk, a third-year Automatics and Robotics student with a passion for programming. My main interests are Data Science and Machine Learning, as the projects in this portfolio show.
This portfolio showcases a selection of projects that I've worked on. Each project summary provides insights into the objectives, methodologies, and outcomes, giving you a glimpse of my capabilities.
Feel free to explore and dive into the details of each project to get a better understanding of my skills and contributions.
- Premier League Stats
- OR Final Project App
- Classification Mini Project
- Predict Bulldozer Price
- Heart Disease Project
- To be added soon*.
*Due to a current lack of time, some TensorFlow projects are still waiting for the final edits that will make them ready for public view. That will change soon ;)
This project aims to scrape Premier League statistics from the fbref.com website and store them in a MySQL database. It also provides interactive plots using the Plotly library.
- Objective: Gather and analyze Premier League team and player stats.
- Data Source: Scraped from fbref.com.
- Components:
  - main.ipynb: Scrapes and processes data, then creates the MySQL database.
  - visualization.ipynb: Generates interactive plots using Plotly.
NOTE: Plots made with Plotly aren't displayed in GitHub's notebook preview, which does not render interactive output. Run visualization.ipynb locally to view them.
- Data Scraping:
  - Uses web scraping techniques to extract Premier League stats.
  - Gathers both team and player data.
- MySQL Database:
  - Constructs a structured database for storing scraped data.
  - Includes tables for teams, players, matches, and other pertinent information.
- Interactive Plots:
  - Uses Plotly to create dynamic visualizations.
  - Lets users explore trends, compare teams, and analyze player performance.
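The scrape-then-store flow described above can be sketched roughly as follows. This is a minimal illustration, not the code from main.ipynb: the HTML fragment, the team names, the `teams` table layout, and the helper names `parse_rows` and `store_rows` are all hypothetical, and the standard-library `sqlite3` module stands in for MySQL so the example runs without a database server.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical HTML fragment standing in for a downloaded stats page.
SAMPLE_HTML = """
<table id="standings">
  <tr><th>Squad</th><th>MP</th><th>Pts</th></tr>
  <tr><td>Team A</td><td>38</td><td>89</td></tr>
  <tr><td>Team B</td><td>38</td><td>84</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collects the text of every <td> cell, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td":
            self._in_cell = True

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_cell = False
        elif tag == "tr":
            # Header rows contain only <th> cells, so their list stays empty.
            if self._row:
                self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell and self._row is not None:
            self._row.append(data.strip())


def parse_rows(html):
    """Turn the table into (squad, matches_played, points) tuples."""
    parser = TableParser()
    parser.feed(html)
    return [(squad, int(mp), int(pts)) for squad, mp, pts in parser.rows]


def store_rows(conn, rows):
    """Create the (hypothetical) teams table if needed and upsert the rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS teams ("
        "squad TEXT PRIMARY KEY, matches_played INTEGER, points INTEGER)"
    )
    conn.executemany("INSERT OR REPLACE INTO teams VALUES (?, ?, ?)", rows)
    conn.commit()


# sqlite3 is used here only as a stand-in for MySQL.
conn = sqlite3.connect(":memory:")
store_rows(conn, parse_rows(SAMPLE_HTML))
top = conn.execute(
    "SELECT squad, points FROM teams ORDER BY points DESC"
).fetchall()
print(top)  # [('Team A', 89), ('Team B', 84)]
```

Keeping parsing and storage in separate functions means the database layer can be swapped for a MySQL connection without touching the scraping code.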
This repository contains the code and files for optimizing the Premier League schedule using Operations Research techniques. It was created as the final project for an Operations Research course and developed jointly by Szymon Pająk and Klaudiusz Grobelski.
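The optimization model and objective used in the project are not described here. As background only, the sketch below builds a plain double round-robin fixture list with the circle method, which is the kind of feasible baseline a schedule optimizer would start from or improve on. The function name `double_round_robin` and the team labels are hypothetical.

```python
def double_round_robin(teams):
    """Return a list of rounds; each round is a list of (home, away) pairs."""
    teams = list(teams)
    if len(teams) % 2:
        teams.append(None)  # placeholder opponent: a bye for odd team counts
    n = len(teams)

    first_half = []
    for r in range(n - 1):
        pairs = []
        for i in range(n // 2):
            a, b = teams[i], teams[n - 1 - i]
            if a is not None and b is not None:
                # Flip home/away on odd rounds as a crude balance.
                pairs.append((a, b) if r % 2 == 0 else (b, a))
        first_half.append(pairs)
        # Circle method: keep the first team fixed and rotate the others.
        teams = [teams[0], teams[-1]] + teams[1:-1]

    # The second half mirrors the first with home and away swapped.
    second_half = [[(away, home) for home, away in rnd] for rnd in first_half]
    return first_half + second_half


rounds = double_round_robin(["A", "B", "C", "D"])
for number, fixtures in enumerate(rounds, start=1):
    print(number, fixtures)
```

Every team meets every other team once at home and once away, but nothing here accounts for travel, rest days, or broadcast constraints, which is where an optimization model would come in.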
An example screenshot of the app:
This repository showcases the implementation of a neural network for multiclass classification using TensorFlow. The project leverages TensorFlow's Fashion MNIST dataset, which contains 28x28 grayscale images of fashion items categorized into 10 classes:
- T-shirt/top
- Trouser
- Pullover
- Dress
- Coat
- Sandal
- Shirt
- Sneaker
- Bag
- Ankle boot
The goal is to train a neural network model that accurately classifies these clothing items into their respective categories. The model architecture includes input, hidden, and output layers, designed to handle the complexity of the Fashion MNIST dataset and make precise predictions for multiclass classification tasks.
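To make the classification step concrete, here is a small sketch of how ten output scores can be turned into one of the class names listed above. It assumes the output layer produces one score per class and that softmax is applied to them, which the description does not state explicitly. The scores in `logits` are made up, and `predict_label` is a hypothetical helper rather than part of the project.

```python
import math

# Fashion MNIST label order: index 0 is T-shirt/top, index 9 is Ankle boot.
CLASS_NAMES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)
    # Subtracting the maximum keeps exp() from overflowing on large scores.
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_label(logits):
    """Return the most likely class name and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]


# Hypothetical output scores for a single image.
logits = [0.1, 0.0, 0.3, 0.2, 0.1, 0.0, 0.4, 2.5, 0.1, 0.9]
label, confidence = predict_label(logits)
print(label, round(confidence, 3))
```

With these made-up scores the largest value sits at index 7, so the predicted label is Sneaker.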
The goal of this project is to predict the future sale prices of bulldozers based on their characteristics and historical data. The dataset is sourced from the Kaggle Bluebook for Bulldozers competition, which includes training, validation, and test sets.
The model's performance is evaluated using the RMSLE (Root Mean Squared Log Error) metric, which penalizes relative rather than absolute differences between predicted and actual sale prices.
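For reference, RMSLE can be computed as in the sketch below, using the usual definition: the square root of the mean squared difference between `log(1 + predicted)` and `log(1 + actual)`. The function name `rmsle` and the sample prices are illustrative only and are not taken from the project notebook.

```python
import math


def rmsle(y_true, y_pred):
    """Root Mean Squared Log Error for non-negative targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_log_errors = [
        (math.log1p(pred) - math.log1p(true)) ** 2
        for true, pred in zip(y_true, y_pred)
    ]
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical prices: both predictions are off by a factor of two, so each
# contributes nearly the same log error despite the different absolute gaps.
actual = [10_000, 100_000]
predicted = [20_000, 200_000]
print(rmsle(actual, predicted))
```

Because the error is measured on a log scale, overestimating a cheap machine by a factor of two costs about as much as overestimating an expensive one by the same factor.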
This project aims to build a machine learning model using Python libraries to predict whether an individual has heart disease based on their medical attributes.
The dataset used for this project is sourced from the UCI Machine Learning Repository and Kaggle. It includes features such as age, gender, chest pain type, blood pressure, cholesterol level, and more.
The model's performance will be evaluated based on its accuracy in predicting the presence or absence of heart disease using these medical features.
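Accuracy here means the share of individuals whose predicted label matches the true one. A minimal sketch follows, with made-up labels where 1 stands for heart disease and 0 for no heart disease; the label encoding and the helper name `accuracy` are assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(true == pred for true, pred in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical labels: 1 = heart disease, 0 = no heart disease.
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))  # 0.8
```

Four of the five made-up predictions match, which gives 0.8.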
