Commit 4a9b086

chore: handle gated HF dataset and add CI authentication support
- Improved [src/spatial_transcript_former/data/download.py] to catch 401 Unauthorized errors from Hugging Face and output clear instructions for accepting dataset terms and authenticating.
- Updated [.github/workflows/ci.yml] to utilize the `HF_TOKEN` secret during data download and test execution steps.
1 parent 63ccf91 commit 4a9b086

2 files changed

Lines changed: 38 additions & 0 deletions

.github/workflows/ci.yml

Lines changed: 4 additions & 0 deletions
@@ -31,9 +31,13 @@ jobs:
         black --check .
 
     - name: Download test data
+      env:
+        HF_TOKEN: ${{ secrets.HF_TOKEN }}
       run: |
         python scripts/download_hest.py --id TENX29 --skip-wsis --skip-patches --yes
 
     - name: Test with pytest
+      env:
+        HF_TOKEN: ${{ secrets.HF_TOKEN }}
       run: |
         pytest tests/
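
One caveat with the workflow change above: GitHub Actions does not pass repository secrets to workflows triggered from forked repositories, so `HF_TOKEN` can arrive as an empty string in those runs. A minimal sketch of a pre-flight check follows, assuming the download script or a test fixture wants to detect that case early. The helper name `resolve_hf_token` is hypothetical and is not part of this commit.

```python
import os
from typing import Mapping, Optional


def resolve_hf_token(env: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Return a usable token from the environment, or None if unset or blank.

    Hypothetical helper (not in the commit). An undefined secret expands to an
    empty string in a workflow, so blank values are treated the same as missing.
    """
    env = os.environ if env is None else env
    token = env.get("HF_TOKEN", "").strip()
    return token or None
```

A caller could log a warning, or skip tests that need the gated dataset, whenever `resolve_hf_token()` returns `None`, rather than waiting for the download to fail with a 401.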

src/spatial_transcript_former/data/download.py

Lines changed: 34 additions & 0 deletions
@@ -42,6 +42,23 @@ def download_metadata(local_dir: str, force: bool = False) -> str:
         logger.info(f"Metadata downloaded to {path}")
         return path
     except Exception as e:
+        if "401" in str(e) or "Unauthorized" in str(e):
+            logger.error("\n" + "=" * 60)
+            logger.error("AUTHENTICATION REQUIRED: The HEST dataset is gated.")
+            logger.error(
+                "1. Accept the dataset terms at: https://huggingface.co/datasets/MahmoodLab/hest"
+            )
+            logger.error(
+                "2. Get an access token from: https://huggingface.co/settings/tokens"
+            )
+            logger.error(
+                "3. Run 'huggingface-cli login' OR set the HF_TOKEN environment variable."
+            )
+            logger.error("=" * 60 + "\n")
+            import sys
+
+            sys.exit(1)
+
         logger.error(f"Failed to download metadata: {e}")
         raise
 

@@ -129,6 +146,23 @@ def download_hest_subset(
         )
         logger.info("Download completed successfully.")
     except Exception as e:
+        if "401" in str(e) or "Unauthorized" in str(e):
+            logger.error("\n" + "=" * 60)
+            logger.error("AUTHENTICATION REQUIRED: The HEST dataset is gated.")
+            logger.error(
+                "1. Accept the dataset terms at: https://huggingface.co/datasets/MahmoodLab/hest"
+            )
+            logger.error(
+                "2. Get an access token from: https://huggingface.co/settings/tokens"
+            )
+            logger.error(
+                "3. Run 'huggingface-cli login' OR set the HF_TOKEN environment variable."
+            )
+            logger.error("=" * 60 + "\n")
+            import sys
+
+            sys.exit(1)
+
         logger.error(f"Download failed: {e}")
         raise
 
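
The same authentication block is added to both `download_metadata` and `download_hest_subset`. A minimal sketch of how the check and message could be shared is shown below. The names `is_auth_error`, `auth_help_lines`, and `exit_if_auth_error` are hypothetical and do not appear in the commit. The sketch keeps the commit's string heuristic, which could also match unrelated messages that happen to contain "401".

```python
import logging
import sys
from typing import List

logger = logging.getLogger(__name__)

# URLs copied from the messages in the commit.
HEST_DATASET_URL = "https://huggingface.co/datasets/MahmoodLab/hest"
HF_TOKENS_URL = "https://huggingface.co/settings/tokens"


def is_auth_error(exc: BaseException) -> bool:
    """Same heuristic as the commit: look for 401/Unauthorized in the message."""
    message = str(exc)
    return "401" in message or "Unauthorized" in message


def auth_help_lines() -> List[str]:
    """Build the instruction banner once so both call sites stay in sync."""
    rule = "=" * 60
    return [
        rule,
        "AUTHENTICATION REQUIRED: The HEST dataset is gated.",
        f"1. Accept the dataset terms at: {HEST_DATASET_URL}",
        f"2. Get an access token from: {HF_TOKENS_URL}",
        "3. Run 'huggingface-cli login' OR set the HF_TOKEN environment variable.",
        rule,
    ]


def exit_if_auth_error(exc: BaseException) -> None:
    """Log the banner and exit with status 1 on an auth error; otherwise return."""
    if not is_auth_error(exc):
        return
    for line in auth_help_lines():
        logger.error(line)
    sys.exit(1)
```

Each `except Exception as e:` block would then call `exit_if_auth_error(e)` before its existing `logger.error(...)` and `raise`, so non-authentication failures keep their current behavior.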
