This repository helps TVET educators explore practical machine learning concepts, following the structure of the RTB curriculum. It will be updated with new learning outcomes soon, so keep checking for updates.
This Jupyter notebook demonstrates key data preprocessing techniques using a small synthetic dataset. It is designed as a learning tool for teachers, students, and beginners in machine learning.
You are welcome to contribute to this repo if you are passionate about helping others, especially educators.
- Handling missing data
- Feature encoding:
  - Label Encoding
  - Binary Encoding
  - Target Encoding (with and without smoothing)
- Feature scaling (MinMax Scaler, Standard Scaler)
- Date/time feature extraction and transformation, including cyclical encoding (a sine and cosine representation that captures how close December (12) is to January (1))
- Correlation analysis and interpretation
- Normality testing (e.g., Kolmogorov–Smirnov test)
- Dropping irrelevant features
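
The cyclical encoding item above can be illustrated with a minimal sketch. It is not taken from the notebook: the function names `encode_month` and `cyclic_distance` are hypothetical, and the choice to map January to angle 0 is an assumption. The idea is to place the 12 months on a circle so that December and January end up as neighbours, which the raw numbers 12 and 1 do not express.

```python
import math


def encode_month(month):
    """Map a month number (1-12) to a (sin, cos) point on the unit circle.

    Assumption for this sketch: January sits at angle 0 and each month
    advances the angle by one twelfth of a full turn.
    """
    angle = 2 * math.pi * (month - 1) / 12
    return math.sin(angle), math.cos(angle)


def cyclic_distance(month_a, month_b):
    """Euclidean distance between two months after sin/cos encoding."""
    sin_a, cos_a = encode_month(month_a)
    sin_b, cos_b = encode_month(month_b)
    return math.hypot(sin_a - sin_b, cos_a - cos_b)


# On the raw scale, December (12) and January (1) look 11 units apart.
raw_gap = abs(12 - 1)

# After encoding, December-January is as close as any two adjacent months,
# while December-June (half a year apart) is the largest possible distance.
dec_jan = cyclic_distance(12, 1)
jan_feb = cyclic_distance(1, 2)
dec_jun = cyclic_distance(12, 6)

print(f"raw gap Dec-Jan: {raw_gap}")
print(f"encoded Dec-Jan: {dec_jan:.4f}")
print(f"encoded Jan-Feb: {jan_feb:.4f}")
print(f"encoded Dec-Jun: {dec_jun:.4f}")
```

The same transformation can be applied to a whole pandas column by swapping `math.sin` and `math.cos` for their NumPy equivalents; both resulting columns should be kept, since a single sine or cosine value alone is ambiguous between two months.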
- data cleaning with syntetic data.ipynb – Main notebook with all examples and explanations.
- Data.csv – Small dataset created for teaching purposes.
Because we learn by doing real projects, this banknote classification project focuses on classifying banknotes as FAKE or REAL using machine learning algorithms. It uses a dataset of features extracted from images of banknotes, including statistical properties such as variance, skewness, kurtosis, and entropy.
🧠 Models Used:
- K-Nearest Neighbor (KNN)
- Logistic Regression
- Choosing the best value of K for KNN using K-fold cross-validation
- Training the KNN model
- Training the Logistic Regression model
- Evaluation Metrics:
  - Accuracy
  - Precision
  - Recall
  - F1 score
  - Confusion Matrix
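
The workflow listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: it uses synthetic data from `make_classification` as a stand-in for the banknote features, and the candidate K values, the 5 folds, the 75/25 split, and the scaling step are all choices made for this example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic stand-in with 4 numeric features and a binary label,
# mimicking the shape of the banknote data (not the real dataset).
X, y = make_classification(n_samples=400, n_features=4, n_informative=3,
                           n_redundant=1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Step 1: choose K by 5-fold cross-validation on the training set only.
candidate_ks = list(range(1, 22, 2))  # odd values avoid ties in binary voting
cv_scores = {}
for k in candidate_ks:
    knn = make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=k))
    cv_scores[k] = cross_val_score(knn, X_train, y_train, cv=5,
                                   scoring="accuracy").mean()
best_k = max(cv_scores, key=cv_scores.get)
print(f"best K by cross-validation: {best_k}")

# Steps 2 and 3: train KNN (with the chosen K) and Logistic Regression.
models = {
    "KNN": make_pipeline(StandardScaler(),
                         KNeighborsClassifier(n_neighbors=best_k)),
    "Logistic Regression": make_pipeline(StandardScaler(),
                                         LogisticRegression()),
}

# Step 4: evaluate both models on the held-out test set.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "recall": recall_score(y_test, y_pred),
        "f1": f1_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    print(name, results[name])
```

Wrapping the scaler and the classifier in one pipeline means the scaler is refit inside each cross-validation fold, so no information from the validation fold leaks into training. To try this on the real data, the synthetic `X` and `y` would be replaced by the feature columns and label column loaded from the dataset file.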
- BANK NOTE CLASSIFICATION – Main notebook with all examples and explanations.
- BankNote_Authentication.xls – Small dataset taken from the Kaggle platform.
Make sure you have Python 3.x and pip installed. You can install Python from the official website.
git clone https://github.com/your-username/your-repo-name.git
cd your-repo-name
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
pandas
numpy
scikit-learn
category_encoders
matplotlib
seaborn
scipy
NIYONSHUTI Yves
Assistant Lecturer – Rwanda Polytechnic, Tumba College
Founder & CEO – Mpuza Inc.
📧 yniyonshuti@rp.ac.rw 📧 info@mpuza.com
https://mpuza.com https://www.linkedin.com/company/mpuza/?viewAsMember=true
📞 +250 786 397 515
Contact me for any further explanation and technical support.
This repository serves as a teaching resource to help learners understand and practice data preprocessing techniques before moving on to real-world data science problems.
Feel free to use and share this notebook with attribution. The content is intended for educational purposes.