Real-time object detection in your browser — upload a photo or use your webcam.
Upload any image (or snap from your webcam) and the app instantly identifies objects using YOLOv8 — one of the fastest object-detection models available. Results include annotated bounding boxes, a confidence score per object, and a count table you can filter with a threshold slider.
- Image upload + webcam — two input modes in one UI
- Live confidence slider — filter out weak detections on the fly (0.10 – 0.90)
- Bounding-box annotations — color-coded per class with label + confidence %
- Detection table — class name, mean confidence, count; sorted by confidence
- Inference timer — reports exact milliseconds per detection run
- 3 sample images — pre-loaded street / kitchen / office scenes to try instantly
- CPU-only — no GPU or CUDA required
- Auto model download — YOLOv8n weights fetched automatically on first run
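
The confidence slider and detection table described above can be sketched in a few lines of plain Python. This is an illustration only, not the code in `detector.py`: the `summarize` function, the `(class_name, confidence)` record shape, and the default threshold of 0.25 are all assumptions made for the example.

```python
from collections import defaultdict


def summarize(detections, threshold=0.25):
    """Filter detections by confidence, then build per-class table rows.

    `detections` is a list of (class_name, confidence) pairs. This record
    shape is hypothetical; the real app may store detections differently.
    """
    grouped = defaultdict(list)
    for name, conf in detections:
        # Keep only detections at or above the slider value.
        if conf >= threshold:
            grouped[name].append(conf)

    rows = [
        {
            "class": name,
            "mean_confidence": sum(confs) / len(confs),
            "count": len(confs),
        }
        for name, confs in grouped.items()
    ]
    # Highest mean confidence first, matching the table description.
    rows.sort(key=lambda row: row["mean_confidence"], reverse=True)
    return rows


if __name__ == "__main__":
    sample = [("person", 0.92), ("person", 0.58), ("car", 0.81), ("dog", 0.12)]
    for row in summarize(sample, threshold=0.50):
        print(f'{row["class"]:<8} {row["mean_confidence"]:.2f} {row["count"]}')
```

Raising the threshold removes weak detections before aggregation, so both the count and the mean confidence of each class change as the slider moves.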
# 1. Install dependencies
pip install -r requirements.txt
# 2. Generate sample images (one-time)
python generate_samples.py
# 3. Launch the app
python app.py

Then open http://localhost:7860 in your browser.
YOLOv8 passes the entire image through a convolutional neural network once; it never crops and re-classifies regions the way older two-stage detectors do. The network predicts bounding-box coordinates and class probabilities simultaneously across a grid of cells, and a post-processing step then applies Non-Maximum Suppression (NMS) to collapse overlapping boxes into clean detections. That single forward pass is what makes the nano model (YOLOv8n) fast enough to run on a laptop CPU, typically in under 100 ms per image depending on hardware and image size.
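
To make the Non-Maximum Suppression step concrete, here is a minimal greedy NMS sketch in plain Python. It is a simplified illustration, not the routine YOLOv8 actually runs: the `iou` and `nms` helper names, the `(x1, y1, x2, y2)` box format, and the 0.5 IoU threshold are assumptions, and production implementations usually run NMS per class on batched tensors.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width and height of the overlap are clamped at zero for disjoint boxes.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes to keep, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        # Keep a box only if it does not overlap too much with any
        # higher-scoring box that has already been kept.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

In the example, the second box overlaps the first with an IoU of about 0.68, so it is suppressed, while the distant third box is kept.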
| Category | Examples |
|---|---|
| People | person |
| Vehicles | car, truck, bus, motorcycle, bicycle, airplane, boat, train |
| Animals | dog, cat, bird, horse, cow, elephant, bear, zebra, giraffe, sheep |
| Electronics | laptop, TV, cell phone, keyboard, mouse, remote |
| Kitchen | bottle, cup, bowl, fork, knife, spoon, wine glass |
| Food | pizza, sandwich, banana, apple, orange, cake, donut, hot dog |
| Furniture | chair, couch, bed, dining table, toilet, potted plant |
| Outdoor | traffic light, stop sign, fire hydrant, bench, parking meter |
| Sports | sports ball, frisbee, skateboard, surfboard, tennis racket, kite |
| Accessories | backpack, handbag, umbrella, tie, suitcase |
| Appliances | microwave, oven, refrigerator, sink, toaster |
| Misc | book, clock, vase, scissors, toothbrush, teddy bear |
Add a screen-recording GIF here after your first run.
[GIF placeholder]
To convert a screen recording to a GIF with ffmpeg:
ffmpeg -i demo.mov -vf "fps=12,scale=960:-1:flags=lanczos,split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse" -loop 0 demo.gif

object-detection-app/
├── app.py # Gradio Blocks UI (3 tabs)
├── detector.py # YOLOv8 inference + box drawing
├── utils.py # Image conversion, resize, display helpers
├── generate_samples.py # One-time script to create sample images
├── requirements.txt
└── sample_images/
    ├── street.jpg
    ├── kitchen.jpg
    └── office.jpg
MIT © 2024