Use tidyverse style and testthat 3e#87
Conversation
Made changes as per [this vignette](https://testthat.r-lib.org/articles/third-edition.html). Ran all tests after making changes and they passed.
Otherwise building the vignettes fails.
Reformat code and documentation to not exceed 80 characters
Also added in aliases to the old camel case versions of the code that have an appropriate deprecation notice.
Ensured that lines do not (for the most part) exceed 80 characters. And, that function names and variable names are snake case rather than camel case. Also added deprecation notices for old function aliases.
To new snake_case versions
Ensured that lines do not exceed 80 characters. Changed function names and (most) variables to snake case. Added in aliases for old functions with deprecation notices.
Previous commit accidentally deleted part of the Dtm2DocsC function call.
According to tidyverse style commented out code should be removed.
As per tidyverse style
Also added alias functions that provide a deprecation notice.
And some changes to update.lda_topic_model to make the function declaration fit in the 80 character limit.
Primarily not defning which package provides certain functions and using `seq_len` instead of `1:n`
Also added an alias function to keep code backwards compatible for now
Including snake case, 80 char limits. Also added aliases of the old names.
This reverts commit d1ff54f. Shouldn't have edited that by hand...
executed (at leas that I could find). Instead we just keep the old names as alternatives, rather than trying to be too fancy yet.
which function we are using and which package it's from.
|
Ah. So it’s that moved into a different (forthcoming) package. https://tommyjones/tidylda
Not opposed to adding more tidy in textmineR, but I do want to preserve the API’s backwards compatibility.
Once tidylda hits CRAN, I’m going to have textmineR wrap it for LDA.
Tommy
…On Jul 3, 2021, 6:22 PM -0400, Mattias ***@***.***>, wrote:
I could have sworn there was a project or similar on the main repo that said about making this more in-line with the tidyverse style. I may have imagined it, but in case that is still relevant then these changes are primarily formatting changes to the code, using a combination of styler and manual work (there was probably a cleverer way of doing this).
Major changes (i.e. not just formatting) include the change of user-facing function names to be closer to the tidyverse style, argument names have not been changed to maintain backwards compatibility. Aliases with the old function names have also been added to make sure that previous code won't suddenly break.
There may be something to add as.character(sys.call(sys.parent()))[1L])as.character(sys.call(sys.parent()))[1L]) in the call to check if the old name is being used and displaying a warning or message if that's the case.
I'll keep this as a draft for now, in case the whole tidyverse style isn't relevant anymore though.
You can view, comment on, or merge this pull request online at:
#87
Commit Summary
• Updated tests and DESCRIPTION to use testthat 3e
• Adding markdown to suggests
• Changes made from running styler::style_pkg()
• Updated corpus_functions to tidyverse style line limit
• Updated corpus_functions to be snake case instead of camel case
• Rebuild documentation
• Updated evaluation_metrics to tidyverse styleguide
• Changed calls to CalcProbCoherence and CalcTopicModelR2 calls
• Updated topic_modeling_utilities to conform to tidyverse style
• Built documentation
• Updated examples to use new snake case variables
• Updated internal call to Dtm2Tcm from camel to snake version
• Fixed spelling misstake
• Added @param ... to old camel case alias versions.
• Removed commented out code
• Rebuilt documentation
• Removed some unneeded whitespace
• Made tropic_modeling_core not exceed 80 characters
• Renamed topic_modeling_core functions to be snake_case
• Some more 80 character finicking with the documentation
• Removed some commented code and removed some superflous whitespaces
• Reformatted Matrix call to fit in the 80 characters.
• Silencing some lintr errors.
• Renamed R calls to C functions to be snake_case
• Updated other_utilites to closer follow the tidyverse style
• Updated calls to tm_parallel_apply to the new name
• Updated distance_functions to be tidyverse compliant
• Revert "Renamed R calls to C functions to be snake_case"
• Removed aliases as .Deprecated doesn't allow the function to be...
• Rebuilt documentation
• Fixing the C calls to be back to orignal, tests now pass again.
• Rebuilt documentation
• Added examples to seperate files and linked them in the roxygen docs
• Included textmineR:: specification to be more specific about...
• Exceeded the 80 character limit, so reformatted this call
• Moved examples folder into the R directory
• Fixed call to Hellinger_cpp in distance_functions
• Use the snake_case in the example for calc_hellinger_dist
File Changes
• M DESCRIPTION (6)
• M NAMESPACE (22)
• M R/corpus_functions.R (595)
• M R/distance_functions.R (148)
• M R/evaluation_metrics.R (528)
• A R/examples/calc_gamma.R (8)
• A R/examples/calc_hellinger_dist.R (6)
• A R/examples/calc_js_divergence.R (6)
• A R/examples/calc_likelihood.R (12)
• A R/examples/calc_prob_coherence.R (6)
• A R/examples/calc_topic_model_r2.R (13)
• A R/examples/cluster_2_topic_model.R (10)
• A R/examples/create_dtm.R (17)
• A R/examples/create_tcm.R (17)
• A R/examples/dtm_2_docs.R (11)
• A R/examples/dtm_2_lexicon.R (9)
• A R/examples/dtm_2_tcm.R (3)
• A R/examples/fit_ctm_model.R (31)
• A R/examples/fit_lda_model.R (23)
• A R/examples/fit_lsa_model.R (14)
• A R/examples/get_probable_terms.R (15)
• A R/examples/get_top_terms.R (6)
• A R/examples/label_topics.R (13)
• A R/examples/posterior.lda_topic_model.R (9)
• A R/examples/predict.ctm_topic_model.R (12)
• A R/examples/predict.lda_topic_model.R (26)
• A R/examples/predict.lsa_topic_model.R (15)
• A R/examples/summarize_topics.R (3)
• A R/examples/term_doc_freq.R (8)
• A R/examples/tm_parallel_apply.R (5)
• A R/examples/update.lda_topic_model.R (50)
• M R/other_utilities.R (96)
• M R/textmineR-deprecated.R (4)
• M R/textmineR.R (2)
• M R/topic_modeling_core.R (1369)
• M R/topic_modeling_utilities.R (312)
• R man/calc_gamma.Rd (29)
• R man/calc_hellinger_dist.Rd (27)
• R man/calc_js_divergence.Rd (27)
• R man/calc_likelihood.Rd (41)
• R man/calc_prob_coherence.Rd (20)
• R man/calc_topic_model_r2.Rd (42)
• R man/cluster_2_topic_model.Rd (26)
• R man/create_dtm.Rd (87)
• R man/create_tcm.Rd (104)
• R man/dtm_2_docs.Rd (29)
• R man/dtm_2_lexicon.Rd (23)
• R man/dtm_2_tcm.Rd (18)
• R man/fit_ctm_model.Rd (70)
• R man/fit_lda_model.Rd (74)
• R man/fit_lsa_model.Rd (28)
• R man/get_probable_terms.Rd (36)
• R man/get_top_terms.Rd (21)
• R man/label_topics.Rd (22)
• M man/posterior.lda_topic_model.Rd (19)
• M man/predict.ctm_topic_model.Rd (15)
• M man/predict.lda_topic_model.Rd (40)
• M man/predict.lsa_topic_model.Rd (11)
• R man/summarize_topics.Rd (47)
• R man/term_doc_freq.Rd (18)
• M man/textmineR.Rd (2)
• R man/tm_parallel_apply.Rd (26)
• M man/update.lda_topic_model.Rd (104)
• M tests/spelling.R (9)
• M tests/testthat.R (2)
• M tests/testthat/test-corpus_functions.R (181)
• M tests/testthat/test-distance_functions.R (50)
• M tests/testthat/test-topic_modeling_core.R (497)
• M tests/testthat/test-topic_modeling_utilities.R (54)
Patch Links:
• https://github.com/TommyJones/textmineR/pull/87.patch
• https://github.com/TommyJones/textmineR/pull/87.diff
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
|
Appreciate the contributions, BTW. You’re motivating me to write down plans to get them out of my head. 😅
Tommy
…On Jul 3, 2021, 6:22 PM -0400, Mattias ***@***.***>, wrote:
I could have sworn there was a project or similar on the main repo that said about making this more in-line with the tidyverse style. I may have imagined it, but in case that is still relevant then these changes are primarily formatting changes to the code, using a combination of styler and manual work (there was probably a cleverer way of doing this).
Major changes (i.e. not just formatting) include the change of user-facing function names to be closer to the tidyverse style, argument names have not been changed to maintain backwards compatibility. Aliases with the old function names have also been added to make sure that previous code won't suddenly break.
There may be something to add as.character(sys.call(sys.parent()))[1L])as.character(sys.call(sys.parent()))[1L]) in the call to check if the old name is being used and displaying a warning or message if that's the case.
I'll keep this as a draft for now, in case the whole tidyverse style isn't relevant anymore though.
You can view, comment on, or merge this pull request online at:
#87
Commit Summary
• Updated tests and DESCRIPTION to use testthat 3e
• Adding markdown to suggests
• Changes made from running styler::style_pkg()
• Updated corpus_functions to tidyverse style line limit
• Updated corpus_functions to be snake case instead of camel case
• Rebuild documentation
• Updated evaluation_metrics to tidyverse styleguide
• Changed calls to CalcProbCoherence and CalcTopicModelR2 calls
• Updated topic_modeling_utilities to conform to tidyverse style
• Built documentation
• Updated examples to use new snake case variables
• Updated internal call to Dtm2Tcm from camel to snake version
• Fixed spelling misstake
• Added @param ... to old camel case alias versions.
• Removed commented out code
• Rebuilt documentation
• Removed some unneeded whitespace
• Made tropic_modeling_core not exceed 80 characters
• Renamed topic_modeling_core functions to be snake_case
• Some more 80 character finicking with the documentation
• Removed some commented code and removed some superflous whitespaces
• Reformatted Matrix call to fit in the 80 characters.
• Silencing some lintr errors.
• Renamed R calls to C functions to be snake_case
• Updated other_utilites to closer follow the tidyverse style
• Updated calls to tm_parallel_apply to the new name
• Updated distance_functions to be tidyverse compliant
• Revert "Renamed R calls to C functions to be snake_case"
• Removed aliases as .Deprecated doesn't allow the function to be...
• Rebuilt documentation
• Fixing the C calls to be back to orignal, tests now pass again.
• Rebuilt documentation
• Added examples to seperate files and linked them in the roxygen docs
• Included textmineR:: specification to be more specific about...
• Exceeded the 80 character limit, so reformatted this call
• Moved examples folder into the R directory
• Fixed call to Hellinger_cpp in distance_functions
• Use the snake_case in the example for calc_hellinger_dist
File Changes
• M DESCRIPTION (6)
• M NAMESPACE (22)
• M R/corpus_functions.R (595)
• M R/distance_functions.R (148)
• M R/evaluation_metrics.R (528)
• A R/examples/calc_gamma.R (8)
• A R/examples/calc_hellinger_dist.R (6)
• A R/examples/calc_js_divergence.R (6)
• A R/examples/calc_likelihood.R (12)
• A R/examples/calc_prob_coherence.R (6)
• A R/examples/calc_topic_model_r2.R (13)
• A R/examples/cluster_2_topic_model.R (10)
• A R/examples/create_dtm.R (17)
• A R/examples/create_tcm.R (17)
• A R/examples/dtm_2_docs.R (11)
• A R/examples/dtm_2_lexicon.R (9)
• A R/examples/dtm_2_tcm.R (3)
• A R/examples/fit_ctm_model.R (31)
• A R/examples/fit_lda_model.R (23)
• A R/examples/fit_lsa_model.R (14)
• A R/examples/get_probable_terms.R (15)
• A R/examples/get_top_terms.R (6)
• A R/examples/label_topics.R (13)
• A R/examples/posterior.lda_topic_model.R (9)
• A R/examples/predict.ctm_topic_model.R (12)
• A R/examples/predict.lda_topic_model.R (26)
• A R/examples/predict.lsa_topic_model.R (15)
• A R/examples/summarize_topics.R (3)
• A R/examples/term_doc_freq.R (8)
• A R/examples/tm_parallel_apply.R (5)
• A R/examples/update.lda_topic_model.R (50)
• M R/other_utilities.R (96)
• M R/textmineR-deprecated.R (4)
• M R/textmineR.R (2)
• M R/topic_modeling_core.R (1369)
• M R/topic_modeling_utilities.R (312)
• R man/calc_gamma.Rd (29)
• R man/calc_hellinger_dist.Rd (27)
• R man/calc_js_divergence.Rd (27)
• R man/calc_likelihood.Rd (41)
• R man/calc_prob_coherence.Rd (20)
• R man/calc_topic_model_r2.Rd (42)
• R man/cluster_2_topic_model.Rd (26)
• R man/create_dtm.Rd (87)
• R man/create_tcm.Rd (104)
• R man/dtm_2_docs.Rd (29)
• R man/dtm_2_lexicon.Rd (23)
• R man/dtm_2_tcm.Rd (18)
• R man/fit_ctm_model.Rd (70)
• R man/fit_lda_model.Rd (74)
• R man/fit_lsa_model.Rd (28)
• R man/get_probable_terms.Rd (36)
• R man/get_top_terms.Rd (21)
• R man/label_topics.Rd (22)
• M man/posterior.lda_topic_model.Rd (19)
• M man/predict.ctm_topic_model.Rd (15)
• M man/predict.lda_topic_model.Rd (40)
• M man/predict.lsa_topic_model.Rd (11)
• R man/summarize_topics.Rd (47)
• R man/term_doc_freq.Rd (18)
• M man/textmineR.Rd (2)
• R man/tm_parallel_apply.Rd (26)
• M man/update.lda_topic_model.Rd (104)
• M tests/spelling.R (9)
• M tests/testthat.R (2)
• M tests/testthat/test-corpus_functions.R (181)
• M tests/testthat/test-distance_functions.R (50)
• M tests/testthat/test-topic_modeling_core.R (497)
• M tests/testthat/test-topic_modeling_utilities.R (54)
Patch Links:
• https://github.com/TommyJones/textmineR/pull/87.patch
• https://github.com/TommyJones/textmineR/pull/87.diff
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Ah I see! Looks like I got a bit carried away on this part 😄.
Happy to! It's a really useful package and probably the main reason I started using R for text analysis / mining and topic modelling, so glad I can give back a bit. Looks like the checks passed so I'm gonna mark as ready for review; again, all previous API calls have been preserved, I did initially go down the route of marking them as |
|
I've been meaning to pull together a textmineR 4.0 branch. I think it's a good idea to put your two PRs in there. IMO, that's the best way forward. A major version update like that will require some telegraphing through incremental CRAN updates to give users time to prepare for major changes. I'll try to do that this weekend. Realistically, it might take a little while. (Sorry I haven't gotten to your PRs yet.) This is a holiday weekend in the States and I'm getting ready for a couple weeks holiday. So I've been in a sprint at work to prepare. But I'll get there. I do appreciate the motivation to improve textmineR. :) |
|
I'm happy with them going into the v4 branch; the Happy fourth of July btw and hope you enjoy your holiday! |
|
Ok. Back to work! Some news: tidylda is now on CRAN. So there is a clear path (I believe) for most of what I hope to do with a textmineR v4.0 release. I owe you a written down plan and will figure out how to move this PR and the mvrsquared PR to a v4.0 branch. Thanks again for the leg work you've put in! |
|
Woo! Welcome back, hope you had a good holiday :) That sounds fantastic, guess I'll have to dive through tidylda now as well; been meaning to get more familiar with tidymodels as well so this is a great excuse. Best, |
I could have sworn there was a project or similar on the main repo that said about making this more in-line with the tidyverse style. I may have imagined it, but in case that is still relevant then these changes are primarily formatting changes to the code, using a combination of
stylerand manual work (there was probably a cleverer way of doing this).Major changes (i.e. not just formatting) include the change of user-facing function names to be closer to the tidyverse style, argument names have not been changed to maintain backwards compatibility. Aliases with the old function names have also been added to make sure that previous code won't suddenly break.
There may be something to add
as.character(sys.call(sys.parent()))[1L])as.character(sys.call(sys.parent()))[1L])in the call to check if the old name is being used and displaying a warning or message if that's the case.I'll keep this as a draft for now, in case the whole tidyverse style isn't relevant anymore though.Looks like the checks are run anyway so might as well not have it as a draft...