Skip to content

fix portfolio_learning_strategies#794

Open
maurerle wants to merge 12 commits into
mainfrom
minmaxrescale
Open

fix portfolio_learning_strategies#794
maurerle wants to merge 12 commits into
mainfrom
minmaxrescale

Conversation

@maurerle
Copy link
Copy Markdown
Member

@maurerle maurerle commented Apr 15, 2026

Related Issue

Closes #784

Description

Improvements:

  • Add test for portfolio learning strategy and min_max_scale function: Added a test for the portfolio learning strategy and min_max_rescale function to ensure its functionality and stability.

Bug Fixes:

  • Fix errors in portfolio learning strategies: The min_max_rescale function was missing from utils.py, causing an ImportError in portfolio_learning_strategies.py. Resolved by extending min_max_scale to cover the rescaling use case. And fix minor construction bug for observation space.

Checklist

  • Documentation updated (docstrings, READMEs, user guides, inline comments, doc folder updates etc.)
  • New unit/integration tests added (if applicable)
  • Changes noted in release notes (if any)
  • Consent to release this PR's code under the GNU Affero General Public License v3.0

@codecov
Copy link
Copy Markdown

codecov Bot commented Apr 15, 2026

Codecov Report

❌ Patch coverage is 82.50000% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 82.01%. Comparing base (4998487) to head (4410a34).

Files with missing lines Patch % Lines
assume/reinforcement_learning/learning_utils.py 60.00% 4 Missing ⚠️
assume/common/fast_pandas.py 62.50% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #794      +/-   ##
==========================================
+ Coverage   80.51%   82.01%   +1.49%     
==========================================
  Files          56       56              
  Lines        9013     9033      +20     
==========================================
+ Hits         7257     7408     +151     
+ Misses       1756     1625     -131     
Flag Coverage Δ
pytest 82.01% <82.50%> (+1.49%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@maurerle maurerle requested a review from mthede April 15, 2026 16:20
@mthede mthede requested a review from chiara-fb April 17, 2026 13:54
Copy link
Copy Markdown
Collaborator

@chiara-fb chiara-fb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider substituting line 453 of EnergyLearningStrategy from:
bid_prices = actions * self.max_bid_price
to:
bid_prices = min_max_scale(actions, -1,1, self.min_bid_price, self.max_bid_price)

to allow for more flexibility.

@mthede mthede requested a review from kim-mskw April 29, 2026 15:06
@mthede mthede changed the title utils: add min_max_rescale fix portfolio_learning_strategies and reuse improvements for other learning strategies Apr 29, 2026
@mthede mthede requested a review from chiara-fb April 29, 2026 15:56
Copy link
Copy Markdown
Contributor

@kim-mskw kim-mskw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will break inital exploration mode

# actions are in the range [-1,1], we need to transform them into actual bids
# we can use our domain knowledge to guide the bid formulation
bid_prices = actions * self.max_bid_price
bid_prices = min_max_scale(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to change this, we need to adjust the intial exploration as well, because it now relies on the fact that it is symmetric around zero see:

                marginal_cost = next_observation[
                    -1
                ].detach()  # ensure no gradients flow through
                # Add marginal cost to the action directly for initial random exploration
                curr_action += marginal_cost

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and because both are scaled with max_bid_price only beforehand

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct. This is not the case in the portfolio strategy, which uses the current load status to proxy for the level of price markup that the operator can apply, see:

curr_action = 2 * next_observation[2] - 1 + noise

Possibly the initial experience could be shifted to be centered around (max_bid_price - min_bid_price) / 2 + noise? That would make it independent from marginal costs though ...

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, you're completely right @kim-mskw, I missed that.

To achieve sensible exploration around the marginal costs, we could change the scaled mc in the observation space to the [min_bid_price, max_bid_price range]?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing the action space scaling for the other RL strategies needs further thinking. My proposition: revert it for normal RL Strategies and push the rest regarding the portfolio learning strategy. Ok @chiara-fb, @kim-mskw?

@maurerle
Copy link
Copy Markdown
Member Author

maurerle commented May 13, 2026

We will merge the initial PR to fix the missing function and later work on the improvements @mthede

@maurerle maurerle force-pushed the minmaxrescale branch 2 times, most recently from 7831c29 to f87c81c Compare May 13, 2026 09:56
@mthede mthede changed the title fix portfolio_learning_strategies and reuse improvements for other learning strategies fix portfolio_learning_strategies May 13, 2026
@mthede mthede dismissed kim-mskw’s stale review May 15, 2026 12:29

Removed changes under discussion and only kept the bugfix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

missing function 'min_max_rescale' in portfolio_learning_strategies.py

4 participants