MVP design #2

@DefenderOfBasic

by @volkyeth

quick first draft of the architecture for this MVP

In this context, a document is any piece of content (a tweet, a blog post, etc.)

  • We'll use Gemini 2.0 Flash for semantic chunking of the documents: extracting the main ideas of each document and generating standalone excerpts that carry all the context they need
    It's fast and cheap, performs well, and has one of the biggest context windows out there (1M tokens)

  • We'll use Gemini Embeddings, a newly released SOTA embedding model with all the bells and whistles

  • We plan to use SQLite to keep the local setup simple

  • Instead of rolling our own UI, we'll start with Nomic Atlas, which lets us browse, query and enrich the dataset. This will be the fastest way to start putting the semantic engine to use and to discover how we can improve it.
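
To make the chunk, embed and store flow above more concrete, here is a rough sketch of how the local SQLite side could look. Everything named here is an assumption for illustration: the table layout, `open_store`, `ingest`, and the `fake_chunker` / `fake_embedder` stand-ins are hypothetical. In the real pipeline the two callables would wrap the Gemini chunking and embedding calls, which are deliberately not shown.

```python
import sqlite3
import struct
from typing import Callable, List, Sequence

# Hypothetical interfaces: the real versions would call Gemini.
Chunker = Callable[[str], List[str]]
Embedder = Callable[[str], Sequence[float]]

# Assumed schema: one row per document, one row per standalone excerpt.
SCHEMA = """
CREATE TABLE IF NOT EXISTS documents (
    id     INTEGER PRIMARY KEY,
    source TEXT NOT NULL,  -- e.g. 'tweet' or 'blog_post'
    body   TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS chunks (
    id          INTEGER PRIMARY KEY,
    document_id INTEGER NOT NULL REFERENCES documents(id),
    excerpt     TEXT NOT NULL,
    embedding   BLOB NOT NULL  -- little-endian float32 values
);
"""


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the local store and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def pack_vector(vec: Sequence[float]) -> bytes:
    """Serialize an embedding as float32 bytes for the BLOB column."""
    return struct.pack(f"<{len(vec)}f", *vec)


def unpack_vector(blob: bytes) -> List[float]:
    """Inverse of pack_vector."""
    return list(struct.unpack(f"<{len(blob) // 4}f", blob))


def ingest(conn: sqlite3.Connection, source: str, body: str,
           chunker: Chunker, embedder: Embedder) -> int:
    """Store a document, then chunk it and store one embedding per excerpt."""
    cur = conn.execute(
        "INSERT INTO documents (source, body) VALUES (?, ?)", (source, body)
    )
    doc_id = cur.lastrowid
    for excerpt in chunker(body):
        conn.execute(
            "INSERT INTO chunks (document_id, excerpt, embedding) VALUES (?, ?, ?)",
            (doc_id, excerpt, pack_vector(embedder(excerpt))),
        )
    conn.commit()
    return doc_id


# Stand-ins so the sketch runs offline; NOT the real chunking or embedding.
def fake_chunker(text: str) -> List[str]:
    return [part.strip() for part in text.split(".") if part.strip()]


def fake_embedder(text: str) -> List[float]:
    return [float(len(text)), float(text.count(" "))]


if __name__ == "__main__":
    store = open_store()
    new_id = ingest(store, "tweet", "First idea. Second idea.",
                    fake_chunker, fake_embedder)
    for excerpt, blob in store.execute(
        "SELECT excerpt, embedding FROM chunks WHERE document_id = ?", (new_id,)
    ):
        print(excerpt, unpack_vector(blob))
```

Keeping the chunker and embedder as plain callables means the models can be swapped without touching the storage code, and the rows in `chunks` are what would later be exported for browsing.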

Defender will be able to use it to improve his workflow and start connecting more researchers working in similar fields. He'll also be able to manually add tags, investigate patterns and figure out our blind spots, so we can keep iterating on and improving the semantic engine.
