…, split_decay_rate, delete_leaves ) and different behaviour of split_try
Some enhancements/modifications.

Three new parameters added

max_candidates (default=50): After many splits, the number of split options for which the resulting loss has to be calculated explodes. The number of options considered is now changed from (t_try * possible options) to max(max_candidates, t_try * possible options).
With this change the splits parameter can be set much higher, because the computational cost now grows only linearly instead of quadratically.

split_decay_rate (default=0.1): Possible splits are initialised with age=0. Whenever a possible split becomes a split_candidate (i.e. it has been drawn when drawing max(max_candidates, t_try * possible options) times), its age increases by 1. The age of the single split_candidate that actually has minimal loss is reset to zero. A higher decay rate means faster aging.
split_decay_rate=0 results in no aging and therefore mimics the old behaviour.

delete_leaves (default=1): Originally, if a leaf was split with respect to a variable that is already part of the leaf, the leaf was deleted, i.e. replaced by the newly created children. This is still the default behaviour.
Additionally, one can choose delete_leaves=0. In this case leaves are never deleted, i.e. a split always results in two new leaves while the parent is kept.

New behaviour of split_try

Before, split_try determined how many split points to try out in each leaf of every split_candidate. The number of splits tried out in a split_candidate was therefore (number of leaves in split_candidate) * split_try.
With this change, split_try combinations of leaves and split points are chosen in every split_candidate. Hence the number of splits tried out in a split_candidate is reduced to just split_try.
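
To make the change in split_try concrete, here is a minimal Python sketch that is not taken from the package. It assumes, purely for illustration, that each leaf is a one-dimensional interval (lo, hi) and that leaves and split points are drawn uniformly at random; the function names draw_splits_old and draw_splits_new are hypothetical.

```python
import random


def draw_splits_old(leaves, split_try, rng):
    """Old behaviour: try split_try split points in every leaf."""
    return [
        (leaf_index, rng.uniform(lo, hi))
        for leaf_index, (lo, hi) in enumerate(leaves)
        for _ in range(split_try)
    ]


def draw_splits_new(leaves, split_try, rng):
    """New behaviour: draw split_try (leaf, split point) combinations in total."""
    drawn = []
    for _ in range(split_try):
        # Assumption: the leaf is picked uniformly at random.
        leaf_index = rng.randrange(len(leaves))
        lo, hi = leaves[leaf_index]
        drawn.append((leaf_index, rng.uniform(lo, hi)))
    return drawn


rng = random.Random(0)
leaves = [(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)]  # hypothetical leaves of one split_candidate

old = draw_splits_old(leaves, split_try=5, rng=rng)
new = draw_splits_new(leaves, split_try=5, rng=rng)
print(len(old), len(new))  # (number of leaves) * split_try versus split_try
```

With three leaves and split_try=5, the old scheme evaluates 15 splits per split_candidate and the new scheme evaluates 5, independent of the number of leaves.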
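
The aging rule described for split_decay_rate can be sketched in Python as follows. This is only an illustration under stated assumptions: the description does not say how age enters the drawing, so the sketch assumes a sampling weight of exp(-split_decay_rate * age), draws with replacement, and takes the losses as given. The function name draw_and_age is hypothetical.

```python
import math
import random


def draw_and_age(ages, losses, n_draws, split_decay_rate, rng):
    """Draw candidates, age them, and reset the age of the best one.

    ages is modified in place. Returns (best_index, sorted drawn indices).
    """
    # Assumption: older possible splits are less likely to be drawn.
    # With split_decay_rate=0 every weight is 1, so age has no effect.
    weights = [math.exp(-split_decay_rate * age) for age in ages]
    drawn = set(rng.choices(range(len(ages)), weights=weights, k=n_draws))

    # Every possible split that became a split_candidate ages by +1.
    for index in drawn:
        ages[index] += 1

    # The candidate with minimal loss is used and its age is reset to zero.
    best = min(drawn, key=lambda index: losses[index])
    ages[best] = 0
    return best, sorted(drawn)


rng = random.Random(0)
ages = [0, 0, 0, 0]
losses = [3.0, 1.0, 2.0, 4.0]  # hypothetical, precomputed losses per possible split
best, drawn = draw_and_age(ages, losses, n_draws=3, split_decay_rate=0.1, rng=rng)
print(best, drawn, ages)
```

Under this weighting, possible splits that are drawn repeatedly without ever being the best one accumulate age and are drawn less often, while split_decay_rate=0 keeps all weights equal.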