Hi all,
The README has python -m nltk.downloader all, but grepping the code for nltk gives:
./lexmapr/predefined_resources/README.txt:import nltk
./lexmapr/predefined_resources/README.txt:from nltk.tokenize import sent_tokenize, word_tokenize
./lexmapr/predefined_resources/README.txt:from nltk import pos_tag, ne_chunk
./lexmapr/pipeline_classification.py:from nltk import word_tokenize
./lexmapr/pipeline.py:from nltk.tokenize import word_tokenize
./lexmapr/pipeline_resources.py:from nltk import word_tokenize
./lexmapr/pipeline_helpers.py:from nltk.tokenize import word_tokenize
./lexmapr/pipeline_helpers.py:from nltk.tokenize.treebank import TreebankWordDetokenizer
./lexmapr/pipeline_helpers.py:from nltk import pos_tag
It looks like you are using only a small subset of the ~100 data items NLTK provides. It would be nice if the README listed only the NLTK data items that are actually necessary.
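For example, judging from the imports above (word_tokenize, pos_tag, ne_chunk), a minimal download might look something like the following. This is only a sketch: the exact package names would need to be verified against the NLTK version pinned by the project, since some resources were renamed in newer releases (e.g. punkt vs punkt_tab).

```shell
# Hypothetical minimal set for the tokenize/pos_tag/ne_chunk usage seen above:
#   punkt                        -> word_tokenize / sent_tokenize
#   averaged_perceptron_tagger   -> pos_tag
#   maxent_ne_chunker + words    -> ne_chunk
python -m nltk.downloader punkt averaged_perceptron_tagger maxent_ne_chunker words
```

That would replace the blanket "all" download with just the resources the code paths exercise.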