Vectorize Docs for RAG Chatbot #320

Closed
patriciaflutterflow wants to merge 2 commits into main from patricia/VectorizeDocsRAGChatbot

@patriciaflutterflow (Contributor) commented Apr 17, 2025

Link to RAG Chatbot PRD

There are two BigQuery tables that hold documentation for the RAG Chatbot:

  1. doc_text (doc id + doc text)
  2. doc_text_vector (doc id + chunk id + vectorized chunk + chunk text)

The code here processes each documentation page and writes the results to these tables.
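
For readers who want a concrete picture of the two row shapes, here is a minimal sketch. The field names (`doc_id`, `doc_text`, `chunk_id`, `chunk_vector`, `chunk_text`) and types are assumptions based on the description above, not the actual BigQuery schema, and the sample values are made up.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DocTextRow:
    """One row of doc_text: a whole documentation page (assumed field names)."""
    doc_id: str
    doc_text: str


@dataclass
class DocTextVectorRow:
    """One row of doc_text_vector: one embedded chunk of a page (assumed field names)."""
    doc_id: str                # joins back to DocTextRow.doc_id
    chunk_id: int              # position of the chunk within the page
    chunk_vector: List[float]  # embedding of chunk_text
    chunk_text: str


# Illustrative values only; the id format and vector size are hypothetical.
page = DocTextRow(doc_id="docs/intro.md", doc_text="# Intro\n\nHello.")
chunk = DocTextVectorRow(page.doc_id, 0, [0.1, 0.2, 0.3], "# Intro")
```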

What happens:

  • A GitHub Action (vectorize_docs.yaml) runs every time a markdown file in docs changes; it calls process_single_file in vectorize.py.
  • The markdown file is chunked and each chunk is vectorized.
  • The results are written to BigQuery.
  • There is also a function that backfills all existing docs.
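
The chunk-then-vectorize step can be sketched roughly as below. This is not the code in vectorize.py: `chunk_markdown`, `build_rows`, and `fake_embed` are hypothetical names, splitting on headings with a character cap is just one possible chunking strategy, and `fake_embed` is a deterministic placeholder standing in for a real embedding call. The BigQuery write is left out.

```python
import hashlib
import re
from typing import Callable, Dict, List


def chunk_markdown(text: str, max_chars: int = 1000) -> List[str]:
    """Split at markdown headings, then cap each piece at max_chars characters."""
    # Zero-width split just before every line that starts a heading.
    sections = re.split(r"(?m)^(?=#{1,6} )", text)
    chunks: List[str] = []
    for section in sections:
        section = section.strip()
        # An empty section produces an empty range, so it is skipped.
        for start in range(0, len(section), max_chars):
            chunks.append(section[start:start + max_chars])
    return chunks


def fake_embed(chunk: str) -> List[float]:
    """Placeholder for a real embedding call; returns 4 floats in [0, 1]."""
    digest = hashlib.sha256(chunk.encode("utf-8")).digest()
    return [b / 255 for b in digest[:4]]


def build_rows(
    doc_id: str,
    text: str,
    embed: Callable[[str], List[float]] = fake_embed,
) -> List[Dict[str, object]]:
    """Produce one doc_text_vector-style row per chunk (assumed column names)."""
    return [
        {"doc_id": doc_id, "chunk_id": i, "chunk_vector": embed(c), "chunk_text": c}
        for i, c in enumerate(chunk_markdown(text))
    ]


rows = build_rows("docs/intro.md", "# Intro\nHello.\n## Setup\nInstall it.")
```

Passing the embedding function in as a parameter keeps the chunking logic testable without network access or quota usage.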

ToDo: I sometimes get a 429 "quota exceeded" error after vectorizing. Check what the quota limits are and how to work around them.
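
One common way to handle intermittent 429s is to retry with exponential backoff and jitter. The sketch below is an assumption about how that could look, not a confirmed fix for this quota: `QuotaExceeded` is a hypothetical stand-in for whatever exception the embedding client raises on HTTP 429, and `with_backoff` is an illustrative helper name.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExceeded(Exception):
    """Hypothetical stand-in for the client's HTTP 429 error."""


def with_backoff(
    call: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run call(), retrying on QuotaExceeded with exponentially growing delays."""
    for attempt in range(max_attempts):
        try:
            return call()
        except QuotaExceeded:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt; jitter avoids synchronized retries.
            sleep(base_delay * (2 ** attempt) + random.uniform(0, base_delay))
    raise RuntimeError("max_attempts must be at least 1")


# Usage: a call that fails twice with a quota error, then succeeds.
delays = []
calls = {"n": 0}


def flaky() -> str:
    calls["n"] += 1
    if calls["n"] < 3:
        raise QuotaExceeded()
    return "ok"


result = with_backoff(flaky, sleep=delays.append)
```

Backoff only helps with per-minute rate limits. If the 429 comes from a daily quota, the job would need batching or a quota increase instead.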

@github-actions github-actions Bot requested a review from PoojaB26 April 17, 2025 06:38
@PoojaB26 (Collaborator)

Is there a ticket for this, or maybe an explanation of what it does? @patriciaflutterflow

@patriciaflutterflow changed the title from "added logic to vectorize docs and github action" to "Vectorize Docs for RAG Chatbot" on Apr 17, 2025
@patriciaflutterflow (Contributor, Author)

> Is there a ticket for this, or maybe an explanation of what it does? @patriciaflutterflow

Hello @PoojaB26! Yes, I added a brief overview at the top.

@PoojaB26 (Collaborator)

Can we close this ticket? @patriciaflutterflow

@patriciaflutterflow (Contributor, Author)

@PoojaB26 if it's alright with you, can I leave it open for now? I may create a new branch for this that reuses a lot of the existing code here. Basically, this is to update the chatbot's database automatically every time new documentation is added or existing documentation is changed.

@PoojaB26 (Collaborator)

@patriciaflutterflow I mean the branch can be open, but will this PR be merged anytime soon? We can just close the PR and keep the branch for you to compare code?

@patriciaflutterflow (Contributor, Author)

Ohhhh I gotchu sounds good I'll close it :)
