Skip to content

Use tidyverse style and testthat 3e#87

Open
Matt-Int wants to merge 38 commits intoTommyJones:mainfrom
Matt-Int:use_tidyverse_style
Open

Use tidyverse style and testthat 3e#87
Matt-Int wants to merge 38 commits intoTommyJones:mainfrom
Matt-Int:use_tidyverse_style

Conversation

@Matt-Int
Copy link
Copy Markdown
Contributor

@Matt-Int Matt-Int commented Jul 3, 2021

I could have sworn there was a project or similar on the main repo that said about making this more in-line with the tidyverse style. I may have imagined it, but in case that is still relevant then these changes are primarily formatting changes to the code, using a combination of styler and manual work (there was probably a cleverer way of doing this).

Major changes (i.e. not just formatting) include the change of user-facing function names to be closer to the tidyverse style, argument names have not been changed to maintain backwards compatibility. Aliases with the old function names have also been added to make sure that previous code won't suddenly break.

There may be something to add as.character(sys.call(sys.parent()))[1L])as.character(sys.call(sys.parent()))[1L]) in the call to check if the old name is being used and displaying a warning or message if that's the case.

I'll keep this as a draft for now, in case the whole tidyverse style isn't relevant anymore though. Looks like the checks are run anyway so might as well not have it as a draft...

Matt-Int added 30 commits July 1, 2021 22:21
Made changes as per [this vignette](https://testthat.r-lib.org/articles/third-edition.html).

Ran all tests after making changes and they passed.
Otherwise building the vignettes fails.
Reformat code and documentation to not exceed 80 characters
Also added in aliases to the old camel case versions of the code that have an appropriate deprecation notice.
Ensured that lines do not (for the most part) exceed 80 characters. And, that function names and variable names are
snake case rather than camel case. Also added deprecation notices for old function aliases.
Ensured that lines do not exceed 80 characters.

Changed function names and (most) variables to snake case. Added in aliases for old functions with
deprecation notices.
Previous commit accidentally deleted part of the Dtm2DocsC function call.
According to tidyverse style commented out code should be removed.
Also added alias functions that provide a deprecation notice.
And some changes to update.lda_topic_model to make the function declaration fit in the 80 character limit.
Primarily not defning which package provides certain functions and using `seq_len` instead of `1:n`
Also added an alias function to keep code backwards compatible for now
Including snake case, 80 char limits.

Also added aliases of the old names.
This reverts commit d1ff54f.

Shouldn't have edited that by hand...
executed (at leas that I could find).

Instead we just keep the old names as alternatives, rather than trying to be too fancy yet.
@TommyJones
Copy link
Copy Markdown
Owner

TommyJones commented Jul 3, 2021 via email

@TommyJones
Copy link
Copy Markdown
Owner

TommyJones commented Jul 3, 2021 via email

@Matt-Int
Copy link
Copy Markdown
Contributor Author

Matt-Int commented Jul 3, 2021

Ah. So it’s that moved into a different (forthcoming) package. https://tommyjones/tidylda
Not opposed to adding more tidy in textmineR, but I do want to preserve the API’s backwards compatibility.
Once tidylda hits CRAN, I’m going to have textmineR wrap it for LDA.

Ah I see!

Looks like I got a bit carried away on this part 😄.

Appreciate the contributions, BTW. You’re motivating me to write down plans to get them out of my head. 😅

Happy to! It's a really useful package and probably the main reason I started using R for text analysis / mining and topic modelling, so glad I can give back a bit.

Looks like the checks passed so I'm gonna mark as ready for review; again, all previous API calls have been preserved, I did initially go down the route of marking them as .Deprecated but went with a simple alias instead, so no previous code should be adversely affected.

@Matt-Int Matt-Int marked this pull request as ready for review July 3, 2021 22:43
@TommyJones
Copy link
Copy Markdown
Owner

I've been meaning to pull together a textmineR 4.0 branch. I think it's a good idea to put your two PRs in there. IMO, that's the best way forward. A major version update like that will require some telegraphing through incremental CRAN updates to give users time to prepare for major changes.

I'll try to do that this weekend. Realistically, it might take a little while. (Sorry I haven't gotten to your PRs yet.) This is a holiday weekend in the States and I'm getting ready for a couple weeks holiday. So I've been in a sprint at work to prepare. But I'll get there. I do appreciate the motivation to improve textmineR. :)

@Matt-Int
Copy link
Copy Markdown
Contributor Author

Matt-Int commented Jul 4, 2021

I'm happy with them going into the v4 branch; the mvrsquared one is pretty non-invasive though, especially when compared to this one! 😉

Happy fourth of July btw and hope you enjoy your holiday!

@TommyJones
Copy link
Copy Markdown
Owner

Ok. Back to work! Some news: tidylda is now on CRAN. So there is a clear path (I believe) for most of what I hope to do with a textmineR v4.0 release. I owe you a written down plan and will figure out how to move this PR and the mvrsquared PR to a v4.0 branch. Thanks again for the leg work you've put in!

@Matt-Int
Copy link
Copy Markdown
Contributor Author

Woo! Welcome back, hope you had a good holiday :)

That sounds fantastic, guess I'll have to dive through tidylda now as well; been meaning to get more familiar with tidymodels as well so this is a great excuse.

Best,
Mattias

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants