AutoYield-AI is an autonomous wafer quality pipeline that combines computer vision inference, explainability, drift monitoring, and GenAI-assisted root-cause analysis. It is structured as a clean full-stack monorepo containing a FastAPI backend, a React/Vite operations dashboard, and a Streamlit demo UI.
The more we studied semiconductor fabrication, the more one fact stood out:
By the time a defect is discovered, the damage is already done.
A wafer has already passed through dozens of processing stages. It has consumed significant energy, expensive chemicals, and thousands of liters of ultra-pure water. Engineers must then pause production, investigate logs, analyze inspection images, and trace the root cause across an extremely complex manufacturing process.
We kept asking ourselves:
- What if this investigation could begin before the wafer fails?
- What if inspection data could actively help engineers prevent waste instead of simply explaining it afterward?
- What if engineers could interact with an intelligent assistant that understands manufacturing history and process knowledge?
- What if they could test corrective actions in a virtual environment before applying them on the production floor?
AutoYield AI was born from the desire to reduce engineering effort, financial losses, and environmental waste simultaneously.
AutoYield AI analyzes wafer inspection images to detect defect patterns at an early stage and assist engineers in understanding what might go wrong next.
The platform combines computer vision, continual learning, Digital Twin technology, and a Large Language Model backed by Retrieval-Augmented Generation (RAG) to create an intelligent yield optimization ecosystem.
The core idea is simple: the system does not remain static after deployment.
Whenever AutoYield encounters a defect it is uncertain about, it learns from that experience. Over time, the platform becomes increasingly capable of recognizing rare, evolving, and previously unseen defect patterns.
Rather than treating inspection data as historical information, AutoYield transforms it into a continuously improving decision-support system for yield engineers.
We designed AutoYield using a practical two-model architecture:
ConvNeXt-Small (Base Model)
- High-accuracy foundation model
- Trained on curated wafer defect datasets
- Achieves approximately 92% classification accuracy
- Provides robust baseline defect detection
EfficientNet (Adaptive Model)
- Lightweight and computationally efficient
- Retrains quickly on newly discovered defects
- Continuously adapts to changing manufacturing conditions
Rare defects are often difficult to learn due to limited training samples.
To address this challenge, AutoYield employs a Generative Adversarial Network (GAN) that generates realistic synthetic defect variations.
These generated samples help the system:
- Improve generalization
- Handle class imbalance
- Learn emerging defect patterns faster
- Increase detection robustness
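As a rough illustration of the class-imbalance point, the sketch below computes how many synthetic images each defect class would need in order to match the largest class. The function name, the balancing rule, and the label counts are assumptions made for this example. The GAN itself is not shown.

```python
from collections import Counter


def synthetic_sample_budget(labels, target_per_class=None):
    """Return how many synthetic images each class needs to reach the target.

    `labels` is an iterable of class names for the real training images.
    If `target_per_class` is None, the size of the largest class is used.
    """
    counts = Counter(labels)
    if not counts:
        return {}
    target = target_per_class if target_per_class is not None else max(counts.values())
    # Classes already at or above the target need no synthetic samples.
    return {name: max(0, target - n) for name, n in counts.items()}


# Hypothetical label distribution, for illustration only.
labels = ["scratch"] * 50 + ["ring"] * 8 + ["edge"] * 20
budget = synthetic_sample_budget(labels)
print(budget)  # {'scratch': 0, 'ring': 42, 'edge': 30}
```

Balancing to the largest class is only one possible policy. A fixed `target_per_class` may be preferable when the majority class is very large.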
Whenever the system detects a low-confidence defect:
- The image is reviewed and correctly labeled.
- GAN-generated defect variations are created.
- The EfficientNet model is retrained using the expanded dataset.
- Future predictions improve automatically.
This creates a self-learning feedback loop without requiring retraining of the entire base model, significantly reducing computational cost and deployment complexity.
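The first step of this loop, deciding which predictions count as low-confidence, can be sketched in plain Python. The 0.6 threshold, the class names, and the logits are assumptions for illustration. The actual pipeline may use a different confidence measure.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def route_prediction(logits, class_names, threshold=0.6):
    """Return (label, confidence, needs_review) for one image.

    `threshold` is an assumed cut-off; the project does not state one.
    """
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    confidence = probs[best]
    return class_names[best], confidence, confidence < threshold


classes = ["scratch", "ring", "edge"]  # hypothetical class names
review_queue = []
for image_id, logits in [("w001", [4.0, 0.5, 0.2]), ("w002", [1.0, 0.9, 0.8])]:
    label, conf, needs_review = route_prediction(logits, classes)
    if needs_review:
        # Low-confidence images wait for a human label before retraining.
        review_queue.append(image_id)

print(review_queue)  # ['w002']
```

Images in `review_queue` would then be labeled, augmented, and fed to the adaptive model's retraining step.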
AutoYield incorporates a Digital Twin of the semiconductor manufacturing process.
The Digital Twin acts as a virtual replica of the fabrication environment, continuously updated using real inspection data, process parameters, and defect observations.
Before engineers implement corrective actions, they can:
- Simulate process parameter changes
- Predict potential yield impacts
- Evaluate defect propagation risks
- Compare multiple optimization strategies
- Reduce costly trial-and-error experiments
This allows manufacturers to make informed decisions before affecting physical wafers, reducing waste and improving production efficiency.
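To make the "compare multiple optimization strategies" idea concrete, here is a toy what-if comparison. The surrogate `toy_yield_model`, its constants, and the parameter names are entirely hypothetical. A real Digital Twin would replace it with a model fitted to fab data.

```python
def toy_yield_model(temperature_c, pressure_torr):
    """Hypothetical surrogate: yield peaks at 350 C and 2.0 Torr.

    The shape and constants are made up for illustration only.
    """
    penalty = 0.0004 * (temperature_c - 350.0) ** 2 + 0.05 * (pressure_torr - 2.0) ** 2
    return max(0.0, 0.95 - penalty)


def rank_strategies(model, strategies):
    """Evaluate each candidate setting and sort by predicted yield, best first."""
    scored = [(name, model(**params)) for name, params in strategies.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Candidate parameter sets an engineer might want to compare (made up).
candidates = {
    "current": {"temperature_c": 365.0, "pressure_torr": 2.4},
    "lower_temp": {"temperature_c": 352.0, "pressure_torr": 2.4},
    "lower_pressure": {"temperature_c": 365.0, "pressure_torr": 2.0},
}
ranking = rank_strategies(toy_yield_model, candidates)
for name, predicted in ranking:
    print(f"{name}: {predicted:.3f}")
```

Because `rank_strategies` only needs a callable, the same comparison works unchanged once a calibrated simulator is available.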
AutoYield includes a Large Language Model powered by Retrieval-Augmented Generation (RAG) and designed specifically for semiconductor manufacturing.
The assistant retrieves information from:
- Historical defect cases
- Process documentation
- Engineering reports
- Equipment logs
- Standard operating procedures
- Internal knowledge bases
Engineers can ask questions such as:
- "Why are scratch defects increasing in Lot A?"
- "What process steps are commonly associated with ring defects?"
- "Have we observed similar failures before?"
- "Which corrective action produced the best results historically?"
Instead of generating generic answers, the RAG system grounds responses in actual factory knowledge and historical evidence, providing explainable and actionable recommendations.
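The grounding step can be illustrated with a minimal retrieve-then-prompt sketch. It uses bag-of-words cosine similarity instead of a real embedding index, and the knowledge-base entries, ids, and prompt wording are made up for the example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(question, documents, top_k=2):
    """Return the top_k (doc_id, score) pairs most similar to the question."""
    q = Counter(tokenize(question))
    scored = [(doc_id, cosine(q, Counter(tokenize(text)))) for doc_id, text in documents.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:top_k]


def build_prompt(question, documents, hits):
    """Assemble a prompt that cites the retrieved passages by id."""
    context = "\n".join(f"[{doc_id}] {documents[doc_id]}" for doc_id, _ in hits)
    return f"Answer using only the context below.\n\n{context}\n\nQuestion: {question}"


# Made-up knowledge-base entries, for illustration only.
docs = {
    "case-17": "Scratch defects increased after the polishing pad was replaced late.",
    "sop-04": "Ring defects are commonly associated with spin coating speed.",
    "log-88": "Chamber pressure alarm cleared without intervention.",
}
question = "What process steps are commonly associated with ring defects?"
hits = retrieve(question, docs)
prompt = build_prompt(question, docs, hits)
print(prompt)
```

Keeping the document ids in the prompt lets the model's answer point back to the specific case or procedure it relied on.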
A key aspect of AutoYield AI is its focus on sustainable manufacturing.
By detecting defects earlier, predicting failures, and enabling virtual experimentation through the Digital Twin, manufacturers can avoid unnecessary processing and rework.
This helps reduce:
- Material waste
- Energy consumption
- Water usage
- Chemical usage
- Carbon emissions
The platform also provides estimates of:
- Energy saved
- Water conserved
- Material waste prevented
- Carbon emissions avoided
These estimates let organizations track both operational and environmental benefits.
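One simple way such estimates could be computed is to scale a per-wafer resource footprint by the number of wafers stopped early. The sketch below assumes that approach. The class, field names, and numbers are placeholders, not measured values or the platform's actual formula.

```python
from dataclasses import dataclass


@dataclass
class PerWaferFootprint:
    """Resources one wafer would consume in the steps it no longer goes through.

    All fields are inputs supplied by the fab; nothing here is measured data.
    """
    energy_kwh: float
    water_liters: float
    co2_kg_per_kwh: float


def estimate_savings(wafers_stopped_early, footprint):
    """Scale the per-wafer footprint by the number of wafers caught early."""
    energy = wafers_stopped_early * footprint.energy_kwh
    return {
        "energy_kwh": energy,
        "water_liters": wafers_stopped_early * footprint.water_liters,
        "co2_kg": energy * footprint.co2_kg_per_kwh,
    }


# Placeholder numbers chosen only to make the arithmetic easy to follow.
footprint = PerWaferFootprint(energy_kwh=10.0, water_liters=100.0, co2_kg_per_kwh=0.5)
savings = estimate_savings(wafers_stopped_early=20, footprint=footprint)
print(savings)  # {'energy_kwh': 200.0, 'water_liters': 2000.0, 'co2_kg': 100.0}
```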
Our vision is to transform wafer inspection from a passive quality-control step into an intelligent, self-improving decision system.
By combining advanced computer vision, continual learning, Digital Twin simulation, and RAG-powered engineering intelligence, AutoYield AI helps semiconductor manufacturers improve yield, reduce costs, accelerate root-cause analysis, and move toward a more sustainable future.
Tech Stack
- Backend: FastAPI, PyTorch, torchvision, OpenCV, NumPy, scikit-learn
- Frontend: React, Vite, React Router
- Optional GenAI: Google Gemini via google-generativeai
AutoYeildAI/ (Root Workspace)
├── client/ (Frontend Application - React + Vite)
├── server/ (Backend Application - FastAPI + ML Pipelines)
└── docs/ (Overall system architecture and documentation)
# Navigate to the server folder
cd server
# Install dependencies (use virtual environment if desired)
pip install -r requirements.txt
pip install torch torchvision opencv-python pillow
# Create server/.env from server/.env.example and set MONGO_URI
uvicorn api.app:app --reload --port 8000
The API exposes:
- POST /api/analyze for image analysis
- GET /api/history for recent inspections
- GET /api/metrics for dashboard summary
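For a quick check from Python, the two GET routes can be queried with the standard library alone. This is a minimal sketch that assumes the server runs on localhost:8000 and returns JSON. The helper names are ours, and the response fields are not assumed. The upload format for POST /api/analyze is not documented here, so it is left out.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

API_BASE = "http://localhost:8000"  # assumed to match the uvicorn port


def endpoint(path, base=API_BASE):
    """Join the API base URL and a route such as /api/metrics."""
    return urljoin(base.rstrip("/") + "/", path.lstrip("/"))


def get_json(path, timeout=5.0):
    """GET a route and decode the JSON body (the response shape is not assumed)."""
    with urlopen(endpoint(path), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


metrics_url = endpoint("/api/metrics")
history_url = endpoint("/api/history")
print(metrics_url)
print(history_url)
```

With the backend running, `get_json("/api/metrics")` and `get_json("/api/history")` return the decoded responses.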
# Navigate to the client folder
cd client
# Install dependencies
npm install
# Create client/.env from client/.env.example if needed
npm run dev
The UI expects the API at http://localhost:8000.
# Navigate to the server folder
cd server
# Run the Streamlit dashboard
streamlit run ui/dashboard.py
Server environment variables (server/.env):
MONGO_URI=mongodb+srv://YOUR_DB_USER:YOUR_DB_PASSWORD@cluster0.0dwbwfe.mongodb.net/?retryWrites=true&w=majority&appName=Cluster0
MONGO_DB_NAME=autoyield
MONGO_SERVER_SELECTION_TIMEOUT_MS=5000
CORS_ORIGINS=http://localhost:5173,http://127.0.0.1:5173
If your MongoDB password contains special characters (@, :, /, ?, #, %), URL-encode it in MONGO_URI.
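If you are unsure how to encode the password, Python's standard library can do it. The sketch below uses `urllib.parse.quote_plus`. The helper name, credentials, and host are hypothetical.

```python
from urllib.parse import quote_plus


def build_mongo_uri(user, password, host, options="retryWrites=true&w=majority"):
    """Build a mongodb+srv URI with percent-encoded credentials."""
    return f"mongodb+srv://{quote_plus(user)}:{quote_plus(password)}@{host}/?{options}"


# Hypothetical credentials and host, for illustration only.
uri = build_mongo_uri("fab_user", "p@ss:w/rd?#%", "cluster0.example.mongodb.net")
print(uri)
# mongodb+srv://fab_user:p%40ss%3Aw%2Frd%3F%23%25@cluster0.example.mongodb.net/?retryWrites=true&w=majority
```

Paste the printed value into MONGO_URI. Do not encode the rest of the URI, only the username and password.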
Client environment variables (client/.env):
VITE_API_BASE_URL=http://localhost:8000
Optional Gemini settings:
GEMINI_API_KEY=your_key_here
GEMINI_MODEL=gemini-1.5-flash
Without these Gemini settings, the system uses fallback rules.
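The fallback decision might look something like the sketch below. It is an assumption about how the switch could be written, not the project's actual code: `select_reasoner`, the returned dictionary, the default model, and the treatment of the placeholder key as unset are all ours.

```python
import os


def select_reasoner(env=None):
    """Pick "gemini" when GEMINI_API_KEY is set, otherwise "fallback_rules"."""
    env = os.environ if env is None else env
    key = env.get("GEMINI_API_KEY", "").strip()
    # Assumption: the template placeholder counts as "no key configured".
    if not key or key == "your_key_here":
        return {"backend": "fallback_rules", "model": None}
    # Assumption: default to the model named in the example settings.
    return {"backend": "gemini", "model": env.get("GEMINI_MODEL", "gemini-1.5-flash")}


print(select_reasoner({}))
print(select_reasoner({"GEMINI_API_KEY": "abc123", "GEMINI_MODEL": "gemini-1.5-flash"}))
```

Passing an explicit dictionary instead of reading `os.environ` makes the selection easy to unit test.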
client/ React/Vite frontend dashboard
server/ Backend workspace
├── api/ FastAPI service
├── src/ Core ML pipeline (inference, explainability, drift, reasoning)
├── scripts/ RAG ingestion/indexing scripts
├── config/ YAML configs and prompt templates
├── models/ Model checkpoints
├── data/ Raw and processed wafer datasets
├── ui/ Streamlit demo
├── tests/ Unit and integration tests
├── outputs/ Local inference/run-time outputs (uploads, heatmaps, synthetics)
├── reports/ Generated PDF/HTML reports
└── runlogs/ Execution logs
docs/ Overall system architecture and documentation
This repository includes large datasets and model files under server/data/, server/outputs/, and server/models/. If you plan to push to GitHub, consider using Git LFS or moving large assets to external storage.