Open
Description
Having started to use this package coming from quanteda in R, I feel some convenience features could greatly streamline common workflows:
- Top features in a corpus, a subset of a corpus, or a single document
- Filtering DTMs by e.g. the proportion of documents that contain a word, or by a regex pattern
- Clearer documentation around `DocumentTermMatrix` vs `dtm()` functions, e.g. on the ordering of terms
- Coding a document-term matrix with e.g. a sentiment dictionary (cf. the `quanteda.dictionaries` package in R)
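
To make the requested behaviour concrete, here is a rough, language-agnostic sketch in Python of three of the items above (top features, filtering by document proportion or regex, and dictionary coding). It does not use this package's API: the toy corpus, the function names (`top_features`, `filter_by_doc_proportion`, `filter_by_pattern`, `apply_dictionary`) and the tiny category dictionary are all made up for illustration, and the DTM is modelled simply as one `Counter` per document.

```python
import re
from collections import Counter

# Toy corpus (hypothetical); each DTM row is a Counter of term counts.
docs = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "a good dog is a happy dog",
]
dtm = [Counter(doc.split()) for doc in docs]


def top_features(rows, n=3):
    """Most frequent terms across the given rows (whole corpus or a subset)."""
    total = Counter()
    for row in rows:
        total.update(row)
    # Sort by count (descending), then alphabetically so ties are deterministic.
    return sorted(total.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def filter_by_doc_proportion(rows, min_prop=0.0, max_prop=1.0):
    """Keep terms whose share of documents lies in [min_prop, max_prop]."""
    doc_freq = Counter(term for row in rows for term in row)
    keep = {t for t, df in doc_freq.items()
            if min_prop <= df / len(rows) <= max_prop}
    return [Counter({t: c for t, c in row.items() if t in keep}) for row in rows]


def filter_by_pattern(rows, pattern):
    """Keep only terms that match the regular expression."""
    rx = re.compile(pattern)
    return [Counter({t: c for t, c in row.items() if rx.search(t)}) for row in rows]


def apply_dictionary(rows, dictionary):
    """Collapse term counts into per-document counts for each category."""
    return [
        {cat: sum(row[t] for t in terms) for cat, terms in dictionary.items()}
        for row in rows
    ]


# Hypothetical category dictionary, standing in for a real sentiment lexicon.
categories = {"positive": ["good", "happy"], "animal": ["cat", "dog"]}

print(top_features(dtm, n=3))
print(filter_by_doc_proportion(dtm, min_prop=0.5))
print(filter_by_pattern(dtm, r"^d"))
print(apply_dictionary(dtm, categories))
```

Passing a slice such as `dtm[:1]` to `top_features` covers the single-document and subset cases with the same function, which is roughly the convenience I have in mind.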
I'd be happy to contribute PRs for these, if any/all are desired functionality. That said, I'm quite new to the ecosystem so if I'm missing something do let me know!