A machine learning project that predicts Ford car prices with linear regression, reaching an R² score of 0.82 on the test set.
- Source: ford.csv
- Features: year, mileage, tax, mpg, engineSize, model, transmission, fuelType
- Target: price
- Analyzed dataset shape and structure
- Examined data types and statistical summaries
- Verified data quality (no null values)
- Generated correlation heatmap
- Removed duplicate records
- Retained all original features
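
The inspection and cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: the four rows are hypothetical stand-ins for `ford.csv` (which would normally be loaded with `pd.read_csv("ford.csv")`), and the variable names are assumptions.

```python
import pandas as pd

# Hypothetical rows standing in for ford.csv; only a few columns are shown.
df = pd.DataFrame({
    "model": ["Fiesta", "Focus", "Fiesta", "Kuga"],
    "year": [2017, 2018, 2017, 2019],
    "price": [12000, 14000, 12000, 19500],
    "mileage": [15944, 9083, 15944, 5000],
})

print(df.shape)        # number of rows and columns
print(df.dtypes)       # data type of each column
print(df.describe())   # statistical summary of the numeric columns

# Total count of missing values across the whole frame.
null_count = int(df.isnull().sum().sum())

# Rows that are exact copies of an earlier row.
duplicate_count = int(df.duplicated().sum())

# Drop exact duplicates while keeping every original column.
cleaned = df.drop_duplicates().reset_index(drop=True)
```

`drop_duplicates()` with no arguments only removes rows that match on every column, so two different cars that happen to share a price are kept.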
- Histograms with KDE for numerical features
- Line plots for price trends by year and model
- Violin and box plots for categorical analysis
- Correlation heatmap
- One-Hot Encoding: Converted categorical variables (model, transmission, fuelType)
- Feature Scaling: Applied StandardScaler to numerical features (mileage, tax, mpg, engineSize, year)
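
One possible way to implement the two preprocessing steps above is shown below. It is a sketch under assumptions: the sample rows are hypothetical, `pd.get_dummies` is used as one common way to one-hot encode (the project may have used a different encoder), and the scaler is fitted on the full frame for brevity. Fitting it on the training split only is a common alternative that avoids leaking test-set statistics.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical rows with the same columns as the dataset described above.
df = pd.DataFrame({
    "model": ["Fiesta", "Focus", "Kuga", "Fiesta"],
    "year": [2017, 2018, 2019, 2016],
    "price": [12000, 14000, 19500, 9000],
    "transmission": ["Manual", "Automatic", "Manual", "Manual"],
    "mileage": [15944, 9083, 5000, 30000],
    "fuelType": ["Petrol", "Diesel", "Diesel", "Petrol"],
    "tax": [150, 145, 145, 30],
    "mpg": [57.7, 60.1, 50.4, 65.7],
    "engineSize": [1.0, 1.5, 2.0, 1.2],
})

categorical = ["model", "transmission", "fuelType"]
numerical = ["mileage", "tax", "mpg", "engineSize", "year"]

X = df.drop(columns=["price"])   # features
y = df["price"]                  # target

# One-hot encode: each category becomes its own indicator column.
X = pd.get_dummies(X, columns=categorical)

# Standardize numeric columns to zero mean and unit variance.
scaler = StandardScaler()
X[numerical] = scaler.fit_transform(X[numerical].astype(float))
```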
- Training set: 80%
- Testing set: 20%
- Random state: 42
- Algorithm: Linear Regression
- Trained on the preprocessed (encoded and scaled) features
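
The split and training settings listed above map onto scikit-learn roughly as follows. This is a sketch on synthetic data, not the project's code: the feature matrix, coefficients, and noise level are made up for illustration, and only `test_size=0.2` and `random_state=42` come from the settings above.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the preprocessed features and the price target.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([3.0, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

# 80% training, 20% testing, reproducible via the fixed random state.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Ordinary least squares fit on the training split only.
model = LinearRegression()
model.fit(X_train, y_train)

# Predictions for the held-out rows, used later for evaluation.
y_pred = model.predict(X_test)
```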
- R² Score: 0.82 (82% of the variance in price explained)
- Adjusted R² Score: Computed from the test set's sample count and number of features
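
For reference, adjusted R² is conventionally defined as 1 - (1 - R²)(n - 1)/(n - p - 1), where n is the number of samples and p the number of features. A minimal sketch follows; the function name `adjusted_r2` and the sample sizes in the example call are hypothetical, since the actual test-set dimensions are not stated here.

```python
def adjusted_r2(r2: float, n_samples: int, n_features: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_features - 1
    if dof <= 0:
        raise ValueError("n_samples must exceed n_features + 1")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Hypothetical dimensions: 1000 test rows, 10 features after encoding.
example = adjusted_r2(0.82, n_samples=1000, n_features=10)
print(round(example, 4))
```

Because the penalty grows with the number of features, one-hot encoding a column with many categories (such as `model`) lowers the adjusted score relative to plain R².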
The model explains 82% of the variance in Ford car prices on the held-out test set, indicating a reasonably strong fit for a linear model.