Fix: Shorten embedding filenames using model_id_to_filename (#122) #123
Rahul29999 wants to merge 1 commit into IBM:main from
Signed-off-by: Rahulkumarsharma01 <20je0749@iitism.ac.in>
7437f29 to b332bf1
Hi @cassiasamp, I noticed the acknowledgment on the issue thread 👍. Just checking in here on the PR (#123) to see if it’s ready for merge or if you’d like me to make any further changes.
Hi @Rahul29999, in the notebook I reviewed there was no change regarding the JSON filenames. Just to be sure, am I looking at the correct notebook? There is also a specification in the issue that I am probably going to fix, because it might have been confusing, but the issue mainly has to do with the generated and translated JSON files. I believe there can also be an update to the
Hi @Rahul29999, I'm temporarily unassigning the #122 issue due to doubts regarding this PR. If there are any updates, I can reassign it later ✌️
Hi @cassiasamp, I have updated the logic in populate_embeddings.ipynb to use model_id_to_filename for shortened filenames. I have also ensured the commit is signed off, and the DCO check is now green. Please review the changes and reassign issue #122 to me.
Description
Fixes Issue #122 by shortening embedding output filenames using a simplified function, model_id_to_filename(), which extracts the final part of the HuggingFace model ID.
Changes Made
model_id_to_filename() function to extract short model names
cookbook/populate_embeddings.ipynb
Outcome
This resolves the issue of long file names like:
prompt_sentences-sentence-transformers-all-MiniLM-L6-v2.json
and converts them to shorter ones like:
prompt_sentences-all-minilm-l6-v2.json
Let me know if further refinements are needed!
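For reviewers, here is a minimal sketch of the behavior described above. It is not copied from the notebook. It assumes the function keeps only the text after the last "/" in the model ID and lowercases it, which is what the example filenames suggest. The embeddings_filename helper and its prefix parameter are hypothetical and exist only to show how the output name might be assembled.

```python
def model_id_to_filename(model_id: str) -> str:
    """Return a short, lowercase name taken from a HuggingFace model ID.

    Assumption: only the segment after the last "/" is kept,
    e.g. "org/model-name" -> "model-name".
    """
    short_name = model_id.rstrip("/").split("/")[-1]
    return short_name.lower()


def embeddings_filename(model_id: str, prefix: str = "prompt_sentences") -> str:
    # Hypothetical helper (not from the PR): builds the JSON output name
    # from a fixed prefix and the shortened model name.
    return f"{prefix}-{model_id_to_filename(model_id)}.json"


if __name__ == "__main__":
    print(embeddings_filename("sentence-transformers/all-MiniLM-L6-v2"))
    # prompt_sentences-all-minilm-l6-v2.json
```

One consequence of this approach worth noting: two models from different organizations that share the same final segment would map to the same filename, so the actual implementation may need to guard against collisions.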