An AI-driven agricultural analytics system that predicts crop yield based on historical farming data and market indicators. The project helps farmers, researchers, and policymakers make data-driven decisions to improve agricultural productivity and resource planning.
Crop Yield Prediction
- Predicts crop yield using historical agricultural data
- Uses a Random Forest Regression model for accurate predictions
Data Preprocessing & Cleaning
- Handles missing values and feature scaling
- Encodes categorical variables such as crop type and season
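The preprocessing steps above could be sketched with scikit-learn's `ColumnTransformer`, roughly as follows. The column names come from the Crop Yield Dataset fields listed in this README; the sample rows, the median imputation strategy, and the use of one-hot encoding are assumptions made for illustration and may differ from what the project's script actually does.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up sample; the values are illustrative only.
df = pd.DataFrame({
    "Crop": ["Rice", "Wheat", "Rice", "Maize"],
    "Season": ["Kharif", "Rabi", "Kharif", "Kharif"],
    "Area": [1200.0, 800.0, np.nan, 450.0],
    "Rainfall": [1100.0, 650.0, 1250.0, np.nan],
})

numeric_cols = ["Area", "Rainfall"]
categorical_cols = ["Crop", "Season"]

# Numeric columns: fill missing values (assumed: median), then standardize.
numeric_steps = Pipeline([
    ("impute", SimpleImputer(strategy="median")),
    ("scale", StandardScaler()),
])

# Categorical columns: one-hot encode (assumed encoding scheme).
preprocessor = ColumnTransformer(
    [
        ("num", numeric_steps, numeric_cols),
        ("cat", OneHotEncoder(handle_unknown="ignore"), categorical_cols),
    ],
    sparse_threshold=0,  # always return a dense NumPy array
)

features = preprocessor.fit_transform(df)
print(features.shape)
```

Tree-based models such as Random Forest do not strictly need feature scaling, so the scaling step mostly matters if other model types are tried later.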
Model Evaluation
Performance measured using:
- RMSE (Root Mean Squared Error)
- R² Score (Coefficient of Determination)
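To make the two metrics concrete, here is a minimal sketch that computes them by hand on a few made-up values. The helper names `rmse` and `r_squared` are hypothetical; the project itself may rely on the equivalent functions in `sklearn.metrics`.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Made-up yields, for illustration only.
actual = [2.0, 3.0, 4.0, 5.0]
predicted = [2.5, 3.0, 3.5, 5.0]

print(round(rmse(actual, predicted), 4))       # 0.3536
print(round(r_squared(actual, predicted), 4))  # 0.9
```

RMSE is expressed in the same units as the yield, while R² is unitless, so the two are best read together.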
Visualization & Analytics
- Actual vs Predicted Yield comparison
- Feature importance analysis
- Statistical insights into agricultural factors
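As an illustration of the feature importance analysis, the sketch below ranks features using a Random Forest's `feature_importances_` attribute. The data is synthetic and constructed so that Rainfall dominates; it is not drawn from the project's dataset, and the hyperparameters are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
feature_names = ["Area", "Rainfall", "Fertilizer", "Pesticide"]

# Synthetic features in [0, 1]; the target depends mostly on Rainfall by construction.
X = rng.uniform(0.0, 1.0, size=(200, 4))
y = 5.0 * X[:, 1] + 1.0 * X[:, 2] + rng.normal(0.0, 0.1, size=200)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X, y)

# Impurity-based importances sum to 1; sort from most to least important.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The ranked pairs can be passed directly to a bar chart to produce the feature importance plot.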
Integrates state-level market indicators (min, max, modal prices)
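One plausible way to attach state-level prices to the yield records is a left join on the `State` column, sketched below. The price column names (`min_price`, `max_price`, `modal_price`) and all values are hypothetical; the actual market dataset may use different names and may need aggregation per state first.

```python
import pandas as pd

# Made-up yield records, using column names from the Crop Yield Dataset.
yield_df = pd.DataFrame({
    "State": ["Karnataka", "Punjab", "Karnataka"],
    "Crop": ["Rice", "Wheat", "Maize"],
    "Yield": [2.8, 4.5, 3.1],
})

# Made-up state-level price indicators (hypothetical column names).
market_df = pd.DataFrame({
    "State": ["Karnataka", "Punjab"],
    "min_price": [1800.0, 1900.0],
    "max_price": [2400.0, 2300.0],
    "modal_price": [2100.0, 2150.0],
})

# Left join keeps every yield record; validate guards against duplicate states
# in the market table silently multiplying rows.
merged = yield_df.merge(market_df, on="State", how="left", validate="many_to_one")
print(merged)
```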
Analyzes impact of:
- Rainfall
- Fertilizer usage
- Pesticide usage
- Cultivation area
Soil data is analyzed independently for future integration.
- Dataset Collection (Crop Yield & Market Data)
- Data Cleaning & Feature Engineering
- Train-Test Split
- Model Training (Random Forest)
- Model Evaluation
- Visualization & Result Interpretation
- Model Saving for Deployment
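The workflow steps from the train-test split onward might look roughly like the sketch below. It uses synthetic data rather than the project's dataset, and the 80/20 split, the hyperparameters, and the file name `crop_yield_model.joblib` are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the cleaned feature matrix and yield target.
rng = np.random.default_rng(42)
X = rng.uniform(0.0, 1.0, size=(300, 4))
y = 3.0 * X[:, 0] + 2.0 * X[:, 1] + rng.normal(0.0, 0.05, size=300)

# Train-test split (assumed 80/20).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Model training.
model = RandomForestRegressor(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Model evaluation on the held-out test set.
predictions = model.predict(X_test)
rmse = float(np.sqrt(mean_squared_error(y_test, predictions)))
r2 = r2_score(y_test, predictions)
print(f"RMSE: {rmse:.3f}, R2: {r2:.3f}")

# Model saving and reloading with joblib (written to a temporary directory here).
model_path = os.path.join(tempfile.mkdtemp(), "crop_yield_model.joblib")
joblib.dump(model, model_path)
reloaded = joblib.load(model_path)
```

Evaluating on data the model never saw during training is what makes the reported RMSE and R² meaningful as estimates of real-world performance.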
- Python 3.8+
- Pandas & NumPy – Data processing
- scikit-learn – Machine learning models
- Matplotlib & Seaborn – Data visualization
- Joblib – Model persistence
- R² Score: ~0.98
- RMSE: Low error indicating strong predictive performance
Minor deviations between actual and predicted values are expected with real-world data and suggest that the model is not simply memorizing the training set.
Crop Yield Dataset
- Crop
- Crop Year
- Season
- State
- Area
- Rainfall
- Fertilizer
- Pesticide
- Yield
Market Indicators Dataset
- State-wise agricultural pricing data
Soil nutrient data is analyzed separately due to spatial granularity limitations.
- Actual vs Predicted Crop Yield Graph
- Feature Importance Bar Chart
- Statistical Distribution Plots
- User provides historical crop data
- Model learns patterns from agricultural and market indicators
- Trained model predicts expected crop yield
- Results are visualized and evaluated for accuracy
- Integration of real-time weather APIs
- District-level soil data mapping
- Deep learning models (LSTM for time-series yield prediction)
- Web interface using Flask or Streamlit
- Farmer-friendly mobile dashboard
git clone https://github.com/Pooja0629/CropYieldPrediction
cd CropYieldPrediction
pip install -r requirements.txt
python crop_yield_prediction.py
Or run the Jupyter Notebook:
jupyter notebook
This project is licensed under the MIT License – see the LICENSE file for details.
- Kaggle for agricultural datasets
- scikit-learn for machine learning tools
- Open-source community for continuous support
- Author: Pooja S
- Email: poojashree2266@gmail.com
- GitHub: Pooja0629
- Repository: https://github.com/Pooja0629/CropYieldPrediction
- Issues: https://github.com/Pooja0629/CropYieldPrediction/issues
Empowering agriculture with data-driven insights for smarter and sustainable crop yield decisions.