MVP design

_by @volkyeth_

A quick first draft of the architecture for this MVP.

In this context, a document is any piece of content (a tweet, a blog post, etc.).

- We'll use [Gemini 2.0 Flash](https://ai.google.dev/gemini-api/docs/models#gemini-2.0-flash) for semantic chunking of the documents: extracting the main ideas of each document and generating standalone excerpts that carry all the required context.
It's fast, cheap, and performs well, and it has one of the largest context windows available (1M tokens).

- We'll use [Gemini Embeddings](https://developers.googleblog.com/en/gemini-embedding-text-model-now-available-gemini-api/), a newly released state-of-the-art (SOTA) embeddings model, to generate the embeddings.

- We plan to use SQLite to keep the local setup simple.

- Instead of rolling our own UI, we'll start with [Nomic Atlas](https://atlas.nomic.ai/), which lets us browse, query, and enrich the dataset. This is the fastest way to start putting the semantic engine to use and to discover how we can improve it.

Defender will be able to use it to improve his workflow and start connecting more researchers working in similar fields. He'll also be able to manually add tags, investigate patterns, and figure out our blind spots, so we can keep iterating on and improving the semantic engine.
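To make the semantic chunking step more concrete, here is a minimal Python sketch of how it might look. Everything in it is an assumption for illustration: the prompt wording, the `Excerpt` fields, and the `chunk_document` helper are hypothetical, and `generate` stands in for the actual Gemini 2.0 Flash call (any function that takes a prompt and returns the model's text reply).

```python
import json
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical prompt; the real wording would need iteration.
CHUNKING_PROMPT = (
    "Extract the main ideas of the document below. For each idea, write a "
    "standalone excerpt that includes all the context needed to understand "
    "it on its own. Respond with a JSON array of strings and nothing else."
    "\n\nDOCUMENT:\n{document}"
)


@dataclass
class Excerpt:
    document_id: str
    position: int  # order of the excerpt within its document
    text: str


def chunk_document(
    document_id: str,
    document: str,
    generate: Callable[[str], str],
) -> List[Excerpt]:
    """Ask the model for standalone excerpts and parse its JSON reply.

    `generate` is a placeholder for the model call (assumed interface:
    prompt string in, response text out).
    """
    raw = generate(CHUNKING_PROMPT.format(document=document))

    # The reply may carry extra text around the JSON array, so parse only
    # the span between the first "[" and the last "]".
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end < start:
        raise ValueError("model reply does not contain a JSON array")
    items = json.loads(raw[start : end + 1])

    excerpts: List[Excerpt] = []
    for item in items:
        # Skip anything that is not a non-empty string.
        if isinstance(item, str) and item.strip():
            excerpts.append(Excerpt(document_id, len(excerpts), item.strip()))
    return excerpts
```

Passing the model call in as a function keeps the parsing logic testable without network access, and makes it easy to swap models later if needed.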

![Image](https://github.com/user-attachments/assets/7fa4f653-2bd1-4e65-8c1d-e28830ed2cf1)
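As a rough idea of what the local SQLite setup could look like, the sketch below stores each excerpt next to its embedding and does a brute-force cosine similarity search. The table layout, the function names, and the choice to pack embeddings as float32 blobs are all assumptions, not a settled design; the embeddings themselves are assumed to come from the embeddings model.

```python
import math
import sqlite3
from array import array
from typing import List, Sequence, Tuple

# Hypothetical schema: one row per excerpt, embedding stored as raw float32.
SCHEMA = """
CREATE TABLE IF NOT EXISTS excerpts (
    id          INTEGER PRIMARY KEY,
    document_id TEXT NOT NULL,
    text        TEXT NOT NULL,
    embedding   BLOB NOT NULL
)
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_excerpt(
    conn: sqlite3.Connection,
    document_id: str,
    text: str,
    embedding: Sequence[float],
) -> int:
    """Insert one excerpt and return its row id."""
    blob = array("f", embedding).tobytes()
    cur = conn.execute(
        "INSERT INTO excerpts (document_id, text, embedding) VALUES (?, ?, ?)",
        (document_id, text, blob),
    )
    conn.commit()
    return cur.lastrowid


def _cosine(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(
    conn: sqlite3.Connection,
    query_embedding: Sequence[float],
    limit: int = 5,
) -> List[Tuple[float, str]]:
    """Return the `limit` most similar excerpts as (score, text) pairs."""
    scored = []
    for text, blob in conn.execute("SELECT text, embedding FROM excerpts"):
        vec = array("f")
        vec.frombytes(blob)  # unpack the float32 blob
        scored.append((_cosine(query_embedding, vec), text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]
```

A full table scan like this is only reasonable for a small local dataset; a larger one would likely need a proper vector index.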
