[META] Stellar Classification Implementation Tracker #1

@Sakeeb91

Overview

This issue tracks the overall implementation progress for the Stellar Classification and Atmospheric Parameters project.

Objective

Build an ML system for classifying stars and predicting atmospheric parameters (Teff, log g, [Fe/H], [alpha/Fe]) from spectroscopic survey data.

Implementation Phases

  • Phase 1: Data Foundation - APOGEE data loading and quality filtering
  • Phase 2: Preprocessing Pipeline - Data cleaning, normalization, feature selection
  • Phase 3: Baseline Classification - Stellar type classifier (>85% accuracy target)
  • Phase 4: Parameter Regression - Predict Teff, log g, [Fe/H], [alpha/Fe]
  • Phase 5: Cross-Survey Validation - Validate on GALAH overlap
  • Phase 6: Production Pipeline - End-to-end pipeline with CLI
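As a rough illustration of the quality filtering in Phase 1, the sketch below drops rows with low signal-to-noise or missing labels. The column names (`SNR`, `TEFF`, `LOGG`, `FE_H`, `ALPHA_M`), the `-9999` sentinel, and the S/N cut of 50 are assumptions made for the example, not values specified in this issue; the actual criteria belong in docs/IMPLEMENTATION_PLAN.md.

```python
import pandas as pd

# Assumed column names and thresholds; none of these come from the issue.
SENTINEL = -9999.0   # assumed placeholder value for missing measurements
MIN_SNR = 50.0       # hypothetical signal-to-noise cut
LABELS = ["TEFF", "LOGG", "FE_H", "ALPHA_M"]


def quality_filter(df: pd.DataFrame, min_snr: float = MIN_SNR) -> pd.DataFrame:
    """Keep rows with adequate S/N and a complete set of labels."""
    labels = df[LABELS]
    # A label is usable when it is not NaN and lies above the sentinel.
    has_labels = (labels.notna() & (labels > SENTINEL)).all(axis=1)
    good_snr = df["SNR"] >= min_snr
    return df.loc[has_labels & good_snr].reset_index(drop=True)


# Tiny made-up catalog: only the first row should survive the filter.
catalog = pd.DataFrame({
    "SNR": [120.0, 30.0, 85.0, 200.0],
    "TEFF": [4800.0, 5100.0, -9999.0, 6100.0],
    "LOGG": [2.5, 3.1, 2.9, float("nan")],
    "FE_H": [-0.3, 0.1, -0.5, 0.0],
    "ALPHA_M": [0.10, 0.05, 0.20, 0.02],
})
clean = quality_filter(catalog)
print(len(clean), "of", len(catalog), "rows kept")
```

Keeping the thresholds as module-level constants makes it easy to log them alongside results once the real cuts are chosen.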

Key Metrics

| Metric | Target |
| --- | --- |
| Teff | MAE < 100 K |
| log g | MAE < 0.2 dex |
| [Fe/H] | MAE < 0.1 dex |
| Classification | > 85% accuracy |
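One way to check regression results against these targets is sketched below. The thresholds are the ones listed above; the function names and the sample values are hypothetical and exist only to show the comparison.

```python
from statistics import fmean

# MAE targets taken from the Key Metrics list above.
MAE_TARGETS = {"Teff": 100.0, "log g": 0.2, "[Fe/H]": 0.1}


def mean_absolute_error(y_true, y_pred):
    """Mean of |true - predicted| over paired values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return fmean(abs(t - p) for t, p in zip(y_true, y_pred))


def check_targets(y_true, y_pred, targets=MAE_TARGETS):
    """Return {parameter: (mae, meets_target)} for each tracked parameter."""
    report = {}
    for name, limit in targets.items():
        mae = mean_absolute_error(y_true[name], y_pred[name])
        report[name] = (mae, mae < limit)
    return report


# Made-up values for illustration only; these are not project results.
truth = {
    "Teff": [4800.0, 5200.0, 6000.0],
    "log g": [2.5, 3.0, 4.4],
    "[Fe/H]": [-0.3, 0.0, 0.2],
}
preds = {
    "Teff": [4750.0, 5290.0, 6010.0],
    "log g": [2.8, 3.3, 4.1],
    "[Fe/H]": [-0.25, 0.05, 0.15],
}
report = check_targets(truth, preds)
for name, (mae, ok) in report.items():
    print(f"{name}: MAE={mae:.3f} {'meets target' if ok else 'misses target'}")
```

In this toy data Teff and [Fe/H] fall within their limits while log g does not, which is the kind of per-parameter breakdown the tracker would need at the end of Phase 4.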

Data Sources

  • APOGEE DR17: ~650,000 stars with high-resolution IR spectra
  • GALAH DR3: ~600,000 stars for cross-survey validation
  • LAMOST DR7: Millions of stars for scale testing

Technical Stack

  • Python 3.10+, scikit-learn, XGBoost
  • astropy for FITS file handling
  • pandas, numpy for data processing

Documentation

See docs/IMPLEMENTATION_PLAN.md for full implementation details.
