This project builds an automated artificial intelligence system to identify Ukrainian children who were abducted or forcibly resettled in Russia. The system uses computer vision techniques to analyze and match facial features with state-of-the-art models from the DeepFace library.
We collected data from three main sources:
- Russian data: 37,022 photos from usynovite.ru (Russian adoption website) downloaded on November 17, 2024 and March 30, 2025
- Belarusian data: 4,017 photos from dadomu.by (Belarusian adoption website) collected on March 30, 2025
- Ukrainian data: 758 photos from childrenofwar.gov.ua (Ukrainian government database for displaced children) downloaded on November 17, 2024 and March 30, 2025
Scrapers and downloaded data are available in a separate repository: missing-children-scrapers.
- Clone the repository:
    git clone https://github.com/yourusername/missing-children-ai-search.git
- Navigate to the project directory:
    cd missing-children-ai-search
- Create and activate a virtual environment:
    python3 -m venv env
    source env/bin/activate
- Install dependencies:
    pip install -r requirements.txt

Run the preprocessing notebook to prepare the dataset:
    jupyter notebook preprocessing.ipynb

Run the face recognition script to identify matches:
    python face_recognition.py

After running the face recognition script, automatically identified matches are saved to the results directory and require manual review to confirm each match.
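The hand-off from automatic matching to manual review can be pictured with a minimal sketch. It is not the code in face_recognition.py: the function names, the CSV columns, and the threshold passed in are all hypothetical, and the embeddings are assumed to be plain lists of floats that have already been computed.

```python
import csv
import math
from pathlib import Path
from typing import Dict, List, Sequence, Tuple

Embedding = Sequence[float]
Candidate = Tuple[str, str, float]


def cosine_distance(a: Embedding, b: Embedding) -> float:
    """Cosine distance: 0.0 for identical directions, 1.0 for orthogonal vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_a * norm_b)


def find_candidates(
    missing: Dict[str, Embedding],
    listed: Dict[str, Embedding],
    threshold: float,
) -> List[Candidate]:
    """Return (missing_id, listed_id, distance) pairs below the threshold, closest first."""
    pairs = [
        (m_id, l_id, cosine_distance(m_vec, l_vec))
        for m_id, m_vec in missing.items()
        for l_id, l_vec in listed.items()
    ]
    kept = [pair for pair in pairs if pair[2] < threshold]
    return sorted(kept, key=lambda pair: pair[2])


def write_review_sheet(candidates: List[Candidate], path: Path) -> None:
    """Write candidates to a CSV file with an empty column for the reviewer's decision."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["missing_id", "listed_id", "distance", "reviewer_decision"])
        for m_id, l_id, dist in candidates:
            writer.writerow([m_id, l_id, f"{dist:.4f}", ""])
```

Sorting by distance puts the strongest candidates at the top of the sheet, and the empty reviewer_decision column keeps the human confirmation separate from the automatic score.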
Note: Using a GPU significantly speeds up face recognition processing on large datasets.
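Before starting a long run, it may help to check whether a GPU is visible at all. The sketch below assumes the pipeline runs on DeepFace's TensorFlow backend; the helper name gpu_available is hypothetical and not part of this repository.

```python
def gpu_available() -> bool:
    """Return True if TensorFlow is installed and can see at least one GPU."""
    try:
        import tensorflow as tf  # assumption: DeepFace is running on TensorFlow
    except ImportError:
        return False
    return len(tf.config.list_physical_devices("GPU")) > 0


print("GPU visible to TensorFlow:", gpu_available())
```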
missing-children-ai-search/
├── data/ # Preprocessed dataset files
├── model_selection/ # Face detection and recognition benchmarks
├── results/ # Output files
├── preprocessing.ipynb # Data preprocessing notebook
├── face_recognition.py # Main recognition script
├── requirements.txt # Python dependencies
└── README.md # Project documentation
We used the DeepFace library for facial detection and recognition. After benchmarking multiple detector-model combinations, RetinaFace with ArcFace achieved the highest accuracy on our validation dataset.
Read more: model_selection/Detector_Comparison.md
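To try the RetinaFace and ArcFace combination on a single pair of photos, DeepFace's verify function accepts both choices as arguments. The wrapper below is a minimal sketch rather than the project's own code: compare_photos is a hypothetical name, and the cosine distance metric is an assumption, since the benchmark settings are not stated here.

```python
from typing import Any, Dict


def compare_photos(photo_a: str, photo_b: str) -> Dict[str, Any]:
    """Compare two photos using RetinaFace detection and ArcFace embeddings."""
    # Imported inside the function because deepface loads TensorFlow on import.
    from deepface import DeepFace

    return DeepFace.verify(
        img1_path=photo_a,
        img2_path=photo_b,
        model_name="ArcFace",
        detector_backend="retinaface",
        distance_metric="cosine",  # assumption: metric not specified in this README
    )
```

Calling compare_photos with two image paths returns a dictionary whose verified, distance, and threshold entries show whether the pair fell under DeepFace's default threshold for the chosen model. By default DeepFace raises an error when no face is detected in either image.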