Add AI to describe each image in a single sentence

Integrating https://huggingface.co/vikhyatk/moondream2 into Damselfly would be great. When given a prompt such as:

> Describe this image and its style in a very detailed manner, follow the format of describing: what, who, where, when, how. You don't need to fill in all if they are irrelevant. Please remove What, Who, Where, When, How prefixes and make it one sentence.

... and fed a photo, you might get back:

> A woman with blonde hair walks along a beach, her back to the camera, with the ocean and mountains visible in the background.

You then output a JSON file with the descriptions, tags and URLs of the photos, and submit that to an agentic AI with the prompt:

> Examine the photos containing women with blonde hair taken in the year 2025 by downloading and inspecting the image linked per entry. List only those entries containing a billboard with the text "Coca Cola" on them and where a red setter dog was present.
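To make the idea concrete, here is a rough sketch of the export and pre-filter step. It is not based on Damselfly's code: the `PhotoEntry` fields, the `describe` callable (a stand-in for whatever actually runs moondream2) and the shape of the input dicts are all assumptions for illustration.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Dict, Iterable, List


@dataclass
class PhotoEntry:
    """One record in the exported JSON file (hypothetical field names)."""
    url: str
    taken: str          # ISO date string, e.g. "2025-06-14"
    tags: List[str]
    description: str    # single-sentence caption from the vision model


def build_manifest(photos: Iterable[Dict],
                   describe: Callable[[str], str]) -> List[PhotoEntry]:
    """Caption every photo once. `describe` is a placeholder for the model call."""
    return [
        PhotoEntry(url=p["url"], taken=p["taken"], tags=list(p["tags"]),
                   description=describe(p["url"]))
        for p in photos
    ]


def prefilter(entries: Iterable[PhotoEntry], year: int,
              keywords: Iterable[str]) -> List[PhotoEntry]:
    """Cheap text match that shrinks the set before the agentic AI downloads images."""
    wanted = [k.lower() for k in keywords]
    return [
        e for e in entries
        if e.taken.startswith(f"{year}-")
        and all(k in e.description.lower() for k in wanted)
    ]


def to_json(entries: Iterable[PhotoEntry]) -> str:
    """Serialise the entries as the JSON document handed to the agent."""
    return json.dumps([asdict(e) for e in entries], indent=2)


if __name__ == "__main__":
    # Fake captioner so the sketch runs without the model installed.
    def fake_describe(url: str) -> str:
        return "A woman with blonde hair walks along a beach."

    photos = [{"url": "https://example.com/1.jpg", "taken": "2025-06-14",
               "tags": ["beach"]}]
    manifest = build_manifest(photos, fake_describe)
    print(to_json(prefilter(manifest, 2025, ["woman", "blonde hair"])))
```

The agent then only has to download and inspect the entries that survive the text filter, which is where the saving comes from on a large collection.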

I'm sure you can see the utility when you have a very large photo collection and need to reduce your search space. If you have access to a Mac, the 'Draw Things' app lets you run 'Investigate' AIs on images so you can test this for yourself.

Moondream2 needs about 4 GB of RAM to run. It will be a lot more costly than any of the AI you use so far, but if you leave it running long enough it'll get there, and it's a one-time processing cost.

Add AI to describe each image in a single sentence #553
