The vocab.txt file is not provided in the repo, so I created one myself. On my first try, I put every word from a dictionary into the file, but the vocabulary was so large that computation slowed down badly and my laptop crashed. On my second try, I used a vocab file containing the top 10,000 English words, but many words in the training dataset are not in that list and so are not recognised, which gives suboptimal results. Could you please explain how to create the vocab.txt file, or share the vocab.txt file you used? Thanks a lot!
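One common way around both problems is to build the vocabulary from the training data itself rather than from a general word list, keeping only the most frequent tokens. Below is a minimal sketch of that idea, not the repo's actual method. It assumes whitespace tokenization, lowercasing, one token per line in vocab.txt, and placeholder special tokens `<pad>` and `<unk>`. The function names, the size cap, and the file format are all hypothetical, since the format this repo expects is not stated.

```python
from collections import Counter
from typing import Iterable, List, Sequence


def build_vocab(
    lines: Iterable[str],
    max_size: int = 30000,                          # assumed cap, tune to your memory budget
    min_freq: int = 1,                              # drop tokens rarer than this
    specials: Sequence[str] = ("<pad>", "<unk>"),   # hypothetical special tokens
) -> List[str]:
    """Return special tokens followed by the most frequent training tokens."""
    counts: Counter = Counter()
    for line in lines:
        # Assumption: simple lowercase + whitespace tokenization.
        counts.update(line.lower().split())

    # Sort by descending frequency; break ties alphabetically so the
    # output is deterministic across runs.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    words = [w for w, c in ranked if c >= min_freq and w not in specials]

    # Reserve room for the special tokens inside the size cap.
    return list(specials) + words[: max(0, max_size - len(specials))]


def write_vocab(vocab: Sequence[str], path: str) -> None:
    """Write one token per line (assumed vocab.txt layout)."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(vocab) + "\n")
```

Usage would look something like `write_vocab(build_vocab(open("train.txt", encoding="utf-8")), "vocab.txt")`, where `train.txt` is a stand-in for the actual training file. Because the vocabulary comes from the training set, coverage of training words is controlled by `max_size` and `min_freq` instead of by an unrelated word list, and anything left out can be mapped to the unknown token if the model supports one.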