
[API 2]: Desparsified Lasso #381

Merged
jpaillard merged 92 commits into mind-inria:main from lionelkusch:PR_desparsified_lasso
Nov 7, 2025

Conversation

@lionelkusch
Collaborator

No description provided.

@codecov

codecov Bot commented Sep 8, 2025

Codecov Report

❌ Patch coverage is 97.84483% with 5 lines in your changes missing coverage. Please review.
✅ Project coverage is 98.94%. Comparing base (f1cf295) to head (48f28bc).
⚠️ Report is 1 commit behind head on main.

Files with missing lines Patch % Lines
src/hidimstat/desparsified_lasso.py 98.04% 4 Missing ⚠️
src/hidimstat/ensemble_clustered_inference.py 94.73% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #381      +/-   ##
==========================================
- Coverage   99.19%   98.94%   -0.25%     
==========================================
  Files          24       23       -1     
  Lines        1364     1424      +60     
==========================================
+ Hits         1353     1409      +56     
- Misses         11       15       +4     

☔ View full report in Codecov by Sentry.

@lionelkusch lionelkusch marked this pull request as ready for review September 8, 2025 15:37
Collaborator

@bthirion bthirion left a comment


Looks good overall. I left a few suggestions.

Comment thread examples/plot_2D_simulation_example.py Outdated
Comment thread examples/plot_2D_simulation_example.py
Comment thread src/hidimstat/ensemble_clustered_inference.py Outdated
Comment thread test/test_desparsified_lasso.py Outdated
Comment thread src/hidimstat/desparsified_lasso.py Outdated
Comment thread src/hidimstat/desparsified_lasso.py Outdated
Comment thread src/hidimstat/desparsified_lasso.py
Comment thread src/hidimstat/desparsified_lasso.py Outdated
Comment thread src/hidimstat/desparsified_lasso.py
Comment thread src/hidimstat/desparsified_lasso.py Outdated
Comment thread src/hidimstat/desparsified_lasso.py Outdated
Comment on lines +366 to +371
clf=clone(self.lasso).set_params(
    alpha=alphas[i],
    precompute=np.delete(np.delete(gram, i, axis=0), i, axis=1),
    random_state=np.random.RandomState(
        np.random.default_rng(streams[i]).bit_generator
    ),
Collaborator


It will not be intuitive for the user that the parameters of the model are overwritten inside the method. IMO, the model's parameters should be left unchanged. The _alpha_max trick could be shown in an example as a way to build the list of alphas, but it should not be hidden in the function.
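For reference, the _alpha_max trick discussed here is the smallest penalty for which the Lasso solution is identically zero. A minimal sketch of building the alpha grid outside the method, as the review suggests (variable names are illustrative, not hidimstat's API; fit_intercept=False is assumed so the closed form applies directly):

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 10))
y = X @ rng.standard_normal(10) + rng.standard_normal(50)

# Smallest penalty for which the Lasso solution is identically zero
# (sklearn scales the squared loss by 1 / (2 * n_samples)).
alpha_max = np.max(np.abs(X.T @ y)) / X.shape[0]

# The caller builds the grid explicitly and passes it to the estimator,
# instead of the method silently overwriting `alpha` internally.
alphas = alpha_max * np.logspace(-0.5, -2, 10)
lasso = Lasso(alpha=alphas[0], fit_intercept=False).fit(X, y)
```

Passing an explicit grid this way leaves the estimator's parameters untouched, which is the behaviour the review asks for.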

Collaborator Author


@bthirion What do you think?

Collaborator


I don't see a problem with working with a given random generator or seed, since the sub-problems are different.
I agree that we should not hide parameter settings inside functions/methods.

Collaborator Author


I added an option to change alphas.
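The `streams[i]` objects in the quoted diff are presumably per-problem seed streams. A hedged sketch of that pattern with NumPy's `SeedSequence.spawn` (names here are a stand-in, not hidimstat's code):

```python
import numpy as np

# One independent, reproducible stream per sub-problem, all derived
# from a single top-level seed (illustrative stand-in for `streams`).
streams = np.random.SeedSequence(42).spawn(3)

# Each stream seeds its own Generator; estimators that expect a legacy
# RandomState can wrap the generator's bit generator, as in the diff.
states = [
    np.random.RandomState(np.random.default_rng(s).bit_generator)
    for s in streams
]
draws = [rs.rand() for rs in states]
```

Spawned streams are statistically independent, so each sub-problem gets reproducible randomness without sharing a generator.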

Comment thread src/hidimstat/desparsified_lasso.py Outdated
@jpaillard jpaillard changed the title from "API 2: Desparsified Lasso" to "[API 2]: Desparsified Lasso" on Oct 30, 2025
@jpaillard
Collaborator

I made a pass to resolve the last discussion points.

I also added an example using EnCluDL on the MNIST dataset. While writing it, I got a bit confused by the output of ensemble_clustered_inference_pvalue. The docstring indicates that the p-values, derived from a two-sided test on the DL coefficients $\hat\beta$, have a sign corresponding to the sign of $\hat\beta$. However, as illustrated in the example, it is not very clear what is actually computed.

@jpaillard jpaillard requested a review from bthirion October 31, 2025 17:04
Collaborator

@bthirion bthirion left a comment


There is mostly the one_minus_pvalue_ thing to fix. It's looking good otherwise.

Comment thread examples/plot_2D_simulation_example.py Outdated
 selected_dl = np.logical_or(
-    pval_corr < fwer_target / 2, one_minus_pval_corr < fwer_target / 2
+    desparsified_lasso.pvalues_corr_ < fwer_target / 2,
+    1 - desparsified_lasso.pvalues_corr_ < fwer_target / 2,
Collaborator


No, this is incorrect I'm afraid: because of the correction, one_minus_pval_corr is not 1 - desparsified_lasso.pvalues_corr_.
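A minimal numeric sketch of why the two corrected quantities stop being complementary, assuming a Bonferroni-style correction on each tail (a simplification; the correction hidimstat actually applies may differ):

```python
m = 100      # number of tests
p = 0.001    # raw p-value for one coefficient

# Correct each tail separately, clipping at 1:
p_corr = min(1.0, m * p)                    # 0.1
one_minus_p_corr = min(1.0, m * (1.0 - p))  # clipped to 1.0

# Because of the clipping, 1 - p_corr (0.9) != one_minus_p_corr (1.0),
# so 1 - pvalues_corr_ cannot substitute for one_minus_pval_corr.
```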

Comment thread examples/plot_digits.py Outdated
)

beta_hat_1_7, selected_ecdl_1_7 = ensemble_clustered_inference_pvalue(
    X_1_7.shape[0], False, ward_1_7, desparsified_lassos_1_7, fdr=fdr
Collaborator


I'd rather use FWER control than FDR control. I think it makes more sense for EnCluDL.

Comment thread examples/plot_digits.py Outdated
# %%
# References
# ----------
# TODO
Collaborator


Can you add it?

interior_support = support_size - margin_size
extended_support = support_size + margin_size
n_bootstraps = 4
fdr = 0.3
Collaborator


This is a bit lenient as a control.
And we should rather check that FWER control is rigorous.

Collaborator


Is it possible to change this?

Comment thread examples/plot_digits.py Outdated
from hidimstat.ensemble_clustered_inference import ensemble_clustered_inference_pvalue

n_jobs = 5
fdr = 0.5
Collaborator


sounds too weak: "on average, half of your discoveries are garbage"
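To make "too weak" concrete, a small Benjamini-Hochberg selection sketch (hypothetical p-values; the selection procedure hidimstat uses internally may differ):

```python
import numpy as np

def benjamini_hochberg(pvals, fdr):
    """Boolean selection mask controlling the FDR at level `fdr`."""
    pvals = np.asarray(pvals)
    m = pvals.size
    order = np.argsort(pvals)
    below = pvals[order] <= fdr * np.arange(1, m + 1) / m
    selected = np.zeros(m, dtype=bool)
    if below.any():
        k = np.max(np.nonzero(below)[0])   # largest passing rank
        selected[order[: k + 1]] = True
    return selected

pvals = np.array([0.001, 0.008, 0.039, 0.041, 0.2, 0.6])
n_strict = benjamini_hochberg(pvals, fdr=0.05).sum()   # 2 discoveries
n_lenient = benjamini_hochberg(pvals, fdr=0.5).sum()   # 5 discoveries
```

At fdr = 0.05 only the two smallest p-values survive, while fdr = 0.5 admits five of the six, including p = 0.2.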

Comment thread src/hidimstat/desparsified_lasso.py Outdated
Comment thread src/hidimstat/desparsified_lasso.py
@jpaillard
Collaborator

  • I fixed the 1 - pvalues

  • I removed the "digits" example for now. It's nice to look at the coefficient maps, but the selection is much less clear. The "background cluster" is always selected, probably because the text between images is not aligned. We can discuss it in the PR about EnCluDL.

  • Same for the tests on EnCluDL: I suggest making a pass on all the EnCluDL tests in the dedicated PR.

  • In addition, it is strange that ensemble_clustered_inference_pvalue doesn't actually return a p-value but a selection mask. We can also discuss this in the next PR.

@jpaillard jpaillard requested a review from bthirion November 5, 2025 15:15
Collaborator

@bthirion bthirion left a comment


LGTM, thx. I have 2 relatively minor comments pending.

Comment thread src/hidimstat/desparsified_lasso.py
interior_support = support_size - margin_size
extended_support = support_size + margin_size
n_bootstraps = 4
fdr = 0.3
Collaborator


Is it possible to change this?

@jpaillard
Collaborator

Done.
I also added the reproducibility/randomness tests for consistency, as stated in the documentation.

@jpaillard jpaillard requested a review from bthirion November 6, 2025 08:48
Collaborator

@bthirion bthirion left a comment


LGTM

@jpaillard jpaillard merged commit 1f97f5b into mind-inria:main Nov 7, 2025
23 of 24 checks passed
@jpaillard jpaillard deleted the PR_desparsified_lasso branch November 7, 2025 08:33
@jpaillard jpaillard mentioned this pull request Dec 8, 2025


Development

Successfully merging this pull request may close these issues.

desparsified functions

3 participants