fix portfolio_learning_strategies#794
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #794 +/- ##
==========================================
+ Coverage 80.51% 82.01% +1.49%
==========================================
Files 56 56
Lines 9013 9033 +20
==========================================
+ Hits 7257 7408 +151
+ Misses 1756 1625 -131
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
chiara-fb
left a comment
There was a problem hiding this comment.
Consider substituting line 453 of EnergyLearningStrategy from:
bid_prices = actions * self.max_bid_price
to:
bid_prices = min_max_scale(actions, -1,1, self.min_bid_price, self.max_bid_price)
to allow for more flexibility.
kim-mskw
left a comment
There was a problem hiding this comment.
This will break inital exploration mode
| # actions are in the range [-1,1], we need to transform them into actual bids | ||
| # we can use our domain knowledge to guide the bid formulation | ||
| bid_prices = actions * self.max_bid_price | ||
| bid_prices = min_max_scale( |
There was a problem hiding this comment.
If we want to change this, we need to adjust the intial exploration as well, because it now relies on the fact that it is symmetric around zero see:
marginal_cost = next_observation[
-1
].detach() # ensure no gradients flow through
# Add marginal cost to the action directly for initial random exploration
curr_action += marginal_cost
There was a problem hiding this comment.
and because both are scaled with max_bid_price only beforehand
There was a problem hiding this comment.
correct. This is not the case in the portfolio strategy, which uses the current load status to proxy for the level of price markup that the operator can apply, see:
curr_action = 2 * next_observation[2] - 1 + noise
Possibly the initial experience could be shifted to be centered around (max_bid_price - min_bid_price) / 2 + noise? That would make it independent from marginal costs though ...
There was a problem hiding this comment.
Yeah, you're completely right @kim-mskw, I missed that.
To achieve sensible exploration around the marginal costs, we could change the scaled mc in the observation space to the [min_bid_price, max_bid_price range]?
There was a problem hiding this comment.
Changing the action space scaling for the other RL strategies needs further thinking. My proposition: revert it for normal RL Strategies and push the rest regarding the portfolio learning strategy. Ok @chiara-fb, @kim-mskw?
|
We will merge the initial PR to fix the missing function and later work on the improvements @mthede |
this function was omitted in the addition of portfolio strategies
… min_max_rescale by switching bounds
…) deprecation warning
7831c29 to
f87c81c
Compare
Removed changes under discussion and only kept the bugfix.
Related Issue
Closes #784
Description
Improvements:
min_max_scalefunction: Added a test for the portfolio learning strategy andmin_max_rescalefunction to ensure its functionality and stability.Bug Fixes:
min_max_rescalefunction was missing fromutils.py, causing anImportErrorinportfolio_learning_strategies.py. Resolved by extendingmin_max_scaleto cover the rescaling use case. And fix minor construction bug for observation space.Checklist
docfolder updates etc.)