Objective
Download the APOGEE allStar file and create data-loading infrastructure with quality filtering.
Dependencies
None - this is the foundation phase.
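Tasks
- `src/data/apogee_loader.py` with `load_apogee_allstar()` function
- `src/data/quality_filters.py` for quality flag filtering
- `configs/data_config.yaml` with data paths
- `notebooks/01_data_exploration.ipynb` for initial EDA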
Files to Create
| File | Purpose |
|------|---------|
| `src/__init__.py` | Package init |
| `src/data/__init__.py` | Subpackage init |
| `src/data/apogee_loader.py` | Load APOGEE data |
| `src/data/quality_filters.py` | Filter bad data |
| `configs/data_config.yaml` | Data paths |
Starter Code
# src/data/apogee_loader.py
"""APOGEE DR17 data loader."""
from pathlib import Path

import pandas as pd
from astropy.io import fits
from astropy.table import Table

DEFAULT_COLUMNS = [
    "APOGEE_ID", "RA", "DEC",
    "TEFF", "TEFF_ERR", "LOGG", "LOGG_ERR",
    "FE_H", "FE_H_ERR", "ALPHA_M", "ALPHA_M_ERR",
    "ASPCAPFLAG", "STARFLAG", "SNR", "J", "H", "K",
]


def load_apogee_allstar(
    filepath: str,
    columns: list[str] | None = None,
) -> pd.DataFrame:
    """Load APOGEE allStar FITS file into DataFrame."""
    filepath = Path(filepath)
    if not filepath.exists():
        raise FileNotFoundError(f"FITS file not found: {filepath}")

    # Fall back to the default column set when none is requested
    cols_to_load = columns or DEFAULT_COLUMNS

    # Read the table from HDU 1, then keep only the requested columns
    with fits.open(filepath, memmap=True) as hdul:
        table = Table.read(hdul[1])
        df = table[cols_to_load].to_pandas()
    return df
Definition of Done
Technical Notes
- FITS file can be >1GB; use `memmap=True` for memory efficiency
- Quality flags are bitmasks - use bitwise AND to check specific flags
- Missing values encoded as -9999 in APOGEE data
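
The last two notes could be sketched for `src/data/quality_filters.py` roughly as follows. This is a minimal sketch, not a definitive implementation: the helper names (`flag_is_set`, `mask_missing`, `drop_flagged`) are hypothetical, and the bit positions in the usage example are placeholders that should be checked against the SDSS bitmask documentation before use.

```python
import numpy as np
import pandas as pd

# Sentinel for missing values, as stated in the technical notes.
MISSING_SENTINEL = -9999


def flag_is_set(flags: pd.Series, bit: int) -> pd.Series:
    """Return True where bit number `bit` is set in an integer flag column."""
    return (flags.astype("int64") & (1 << bit)) != 0


def mask_missing(df: pd.DataFrame, columns: list[str]) -> pd.DataFrame:
    """Return a copy with the missing-value sentinel replaced by NaN."""
    out = df.copy()
    out[columns] = out[columns].mask(out[columns] == MISSING_SENTINEL)
    return out


def drop_flagged(df: pd.DataFrame, flag_column: str, bad_bits: list[int]) -> pd.DataFrame:
    """Drop rows where any of `bad_bits` is set in `flag_column`."""
    bad = pd.Series(False, index=df.index)
    for bit in bad_bits:
        bad |= flag_is_set(df[flag_column], bit)
    return df[~bad]


if __name__ == "__main__":
    # Synthetic rows for illustration only; bits 3 and 23 are placeholders.
    demo = pd.DataFrame({
        "TEFF": [4800.0, -9999.0, 5100.0],
        "ASPCAPFLAG": [0, 1 << 23, 1 << 3],
    })
    cleaned = mask_missing(demo, ["TEFF"])
    kept = drop_flagged(cleaned, "ASPCAPFLAG", bad_bits=[23])
    print(kept)
```

Taking the bit positions as a parameter keeps the flag definitions out of the filtering logic, so they can live in `configs/data_config.yaml` or a constants module once confirmed.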
References
Part of #1 (Meta Issue)