Fix path dependent weighting and accept data frames#31
Merged
Conversation
There was a problem hiding this comment.
Pull Request Overview
This PR updates glex to version 0.5.1 by adding data frame input support, improving the path-dependent weighting algorithm, and expanding tests and documentation.
- Add support for data frame inputs in
glex.xgb.Boosterandglex.ranger, with factor-column checks and matrix conversion - Compute node covers dynamically in C++ (
calculate_data_driven_covers) and pass them torecursePathDependent - Add tests for data frame inputs and update documentation, NEWS, and README
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/testthat/test-glex-dataframe.R | New tests verifying that glex() yields identical results on matrix vs. data frame inputs |
| src/path_dependent_explain_tree.cpp | Implemented calculate_data_driven_covers, updated recursePathDependent to use covers |
| R/glex.R | Added input checks for factor columns and conversion of data frames to matrices |
| README.md | Added reference to new paper on fast PD estimation |
| NEWS.md | Bumped version to 0.5.1 and added release notes |
| DESCRIPTION | Updated package version from 0.5.0 to 0.5.1 |
Comments suppressed due to low confidence (1)
src/path_dependent_explain_tree.cpp:95
- The function is still invoked as a template (
recursePathDependent<...>) but its definition no longer has atemplate<typename Comparison>declaration. Re-add the template header before the function signature or remove the template arguments in calls to ensure it compiles.
NumericMatrix recursePathDependent(
e342afb to
4952eea
Compare
4952eea to
1fb8e9b
Compare
jemus42
requested changes
Jun 5, 2025
Collaborator
jemus42
left a comment
There was a problem hiding this comment.
👍🏻 looks good aside from the README.Rmd thing
Collaborator
There was a problem hiding this comment.
Please don't modify README.md directly - it is generated from README.Rmd. Usually there also should be a pre-commit hook to make it impossible to only ever commit one of the two files but I guess that's not installed automatically.
MHiabu
approved these changes
Jun 6, 2025
MHiabu
approved these changes
Jun 6, 2025
…umns in data frames and update example usage in documentation.
94ff343 to
6003c42
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
This pull request introduces a new version (0.5.1) of the
glexpackage, focusing on improving functionality, addressing input handling, and enhancing the algorithm's robustness. Key changes include support for data frame inputs, stricter input validation, and a more robust path-dependent algorithm. Additionally, new tests ensure the correctness of these updates.Version Update and Documentation:
DESCRIPTIONto reflect the new version (0.5.1).NEWS.md, highlighting the fixes to the path-dependent algorithm and support for data frame inputs inglex().Input Handling Improvements:
glex.xgb.Boosterandglex.rangerfunctions inR/glex.Rto accept data frames as input, converting them to matrices when necessary. Added validation to ensure no factor columns are present, with clear error messages. [1] [2]Algorithm Enhancements:
src/path_dependent_explain_tree.cppto compute node covers manually, replacing reliance on precomputed "Cover" values. This ensures correctness and adaptability to different data scenarios. [1] [2]recursePathDependentto use the newly computed node covers for more accurate weight calculations.Testing:
tests/testthat/test-glex-dataframe.Rto verify thatglex()works correctly with both matrix and data frame inputs for XGBoost and Ranger models.Miscellaneous:
README.mdto include references to an additional paper on partial dependence functions.