Conversation
331456d to
8133017
Compare
Unit Tests Summary 1 files 274 suites 14m 40s ⏱️ Results for commit 3826377. ♻️ This comment has been updated with latest results. |
Unit Test Performance DifferenceTest suite performance difference
Additional test case details
Results for commit 4d40a27 ♻️ This comment has been updated with latest results. |
|
hi @munoztd0 , thanks for submitting the PR, i would suggest to snapshot the ascii tables instead of rtf, the table is much more readable for comparing on github, rtf is less ideal in this case. i am guessing you would like to use rtf due to output formatting? |
@shajoezhu I 100% agree but you're also right on your guess. If that is a blocker I could put it up the food chain but I have no hope for that. |
|
New error on tsfae13 appeared but only on the CRAN pass, still need to investigate |
|
@munoztd0 https://github.com/insightsengineering/scda.test/blob/main/tests/testthat/test-table_jnj_tbl1.R |
|
i suggest we not include rtf files here. otherwise it is going to fail always if minor changes which is not efficent, and we wont be able to test for anything. i think just create simple static snapshot is good. thanks guys! |
I could do that but that would be quite a mess and also not have the same names that in internal, but lets discuss it. |
There was a problem hiding this comment.
Thanks for the PR! I would like to echo Joe about rtf files. It seems a bit of a fragile test considering that not everyone has the same rtf software on their machine. I strongly suggest to snapshot-test the final table and not the rtf file. Decoration details can be tested as strings if not included directly in the table.
It is important to have readable snapshots as possible changes are easier to understand and review that way. Furthermore, simple changes may have a large line changes count that would add to the .git file size. Reducing that size down after some time would be a non-trivial task.
I 100% agree, I will try to discuss that internally and see if I can make a case for it.. Changing the PR to draft for now, |
|
Hi @shajoezhu @Melkiades, The rtf files are snapshotted as text, rather than as binary files. Specifically with Note this prevents issues issues with OS-specific line endings (guess why this is how we test them...). This should prevent the most egregious case of Furthermore, these files are not generated by third party libraries with OS-specific behaviors, but rather by concatenation of strings in {tidytlg} (this has its own issues, and I have ... feelings about it, but exact text brittleness is not one of them). The issue with just testing the table is that the because J&J doesn't use ASCII for their submissions, the table is insufficient to prevent regressions from our submission-requirement specs. If I had my way this wouldn't be an issue but it's multiple orders of magnitude above my pay grade and simply put, 100% not going to change. I'd like to provisionally include the rtf files in the tests and then if we are getting (unexpected) changes we can take the full snapshots out and fall back to an imperfect backup plan. My issue with doing string checking for the (unfortunately for us) mandatory decorations/formatting is that that is likely to be substantially more brittle and prone to false negatives (ie tests passing when they shouldn't) than the snapshot tests. We can keep this in our pocket as a back up, as I alluded to above, but I'd like to try the better solution first, with the expectation that it won't have the Let me know what you think. I'll also discuss this with you, Joe, when we meet next if desired. |
|
hi @gmbecker and @munoztd0 , I would like to request the following test, and better understand the limitation (if there is any) using rtf and see how much impact that the future result I would like to propose several tests and branch off from this jnj_templates_scripts, and propose merge request, and see the differences of changes
or above should be in 4 seperate branches, so it would be eaiser for us to investigate |
|
hi @munoztd0 , thanks for the PRs, I can see you have put in a lot of efforts into these test cases. i was wodnering if you can update with your rtf snapshot changes please? |
|
Hi @gmbecker and @shajoezhu, thanks for the detailed set of proposed tests. First off: sorry for the mess/noise and all the notifications. Managing multiple PRs plus the CI/CD runs ended up being a bit of a nightmare 😅, but I’ve now completed the requested investigations and kept each experiment isolated to make review easier. PRs created (one per test)1) Template-level rounding change decimals
2)
|
|
hi @munoztd0 , this is the final PR i think? can you please fix the style. thanks |
8008d1a to
c6d3c9f
Compare
Yes indeed, I rebased on main, cleaned up commits history and reran the pipelines, everything good now @gmbecker @shajoezhu @Melkiades |
As discussed we want to integrate JnJ innovative medecine newly release open-source TLG catalog scripts.
Key finding
Across these experiments, we confirmed that RTF snapshot diffs are highly sensitive to computed column widths, resulting in a lot of false positives even when the intended change is only numeric formatting.
Mitigation implemented
To reduce these false positives, I computed the column widths ahead of time and hardcode them into our tests, so width calculation doesn’t fluctuate between runs/changes.