[Feat] Reflect multi-base rolling window training structure in train.py #12
Open
ohchanju3 wants to merge 35 commits into
Feat/loader multibase
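Overview
Reworked train.py throughout to match the multi-base + rolling window pipeline design. Airline settings are split out so that switching airlines requires changing only one line in config.py. The PR also reflects the feat/ip-multibase encoder changes and implements Phase 2 CG dual feedback.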
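As a rough illustration of the one-line airline switch described above, the config could look something like the sketch below. The airline codes, base airports, the dict shape of AIRLINE_BASES, and the active_bases() helper are all placeholders assumed for illustration; the actual layout in config.py may differ.

```python
# config.py (illustrative sketch; codes and bases below are placeholders)
from typing import Dict, List

AIRLINE = "AIRLINE_A"  # the single line to edit when switching airlines

# Assumed shape: airline code -> list of crew base airports.
AIRLINE_BASES: Dict[str, List[str]] = {
    "AIRLINE_A": ["AAA", "BBB"],
    "AIRLINE_B": ["CCC"],
}


def active_bases(airline: str = AIRLINE) -> List[str]:
    """Return a copy of the base list for the given airline (hypothetical helper)."""
    if airline not in AIRLINE_BASES:
        raise ValueError(f"unknown airline: {airline!r}")
    return list(AIRLINE_BASES[airline])
```

With this layout, train.py would only ever read the bases through one lookup, so nothing else needs to change when AIRLINE is edited.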
Changes
train.py
Replaced loader/environment interfaces
- load_flights_multiday → build_airport_map + bases_to_ids + load_flights_rolling
- Removed step_end_duty(); consolidated into a single step()
Constant cleanup
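- PAIRING_COST, BASE_PENALTY, etc. moved to config.py; duplicates removed
- OVERNIGHT_PENALTY, LEG_BONUS removed, since the revised reward design no longer uses them
flights_to_tensors() changes
- Normalization now based on window_days * 24
- Added fly_times: matches the encoder input_dim of airport_emb×2+3 (reflects feat/ip-multibase)
constraint_to_tensor() normalization added
- Values are divided by config.CONSTRAINT_NORMS to normalize them to [0,1]
state_to_vec() changes (38 → 71 dimensions)
- Added base_airport_emb(32): lets the model recognize the episode's target base
- duty_elapsed → current_time - duty_start_time: actual elapsed time per the FAA definition
- Added rest_remaining scalar
run_episode() changes
run_curriculum_stage() signature change
- Replaced with a flight_sampler function
train() rewritten
- flight_sampler(): randomizes base + offset_days each episode and skips windows with no flights departing from the base
- max_pairing_days → capped at WINDOW_DAYS (4) to prevent deadheads outside the window
- (see config.STAGE3_CONSTRAINT_RANGES)
- bases, window_days, max_time are additionally saved
Phase 2: CG dual feedback implementation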
- _rollout_with_pairings(): collects pairing structs (legs, cost), modeled on evaluate_ip.py
- _collect_pool(): stochastic×30 + greedy×1 rollouts → deduplicated pool
- run_episode_with_dual(): when a flight f is assigned, reward += π[f] (LP dual variable feedback)
- run_phase2(): every LP interval (10 episodes), collect the pool → solve_lp_relaxation() → update dual_vars
config.py
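- AIRLINE, AIRLINE_BASES: switching airlines only requires editing the AIRLINE line
- CONSTRAINT_NORMS: reference values for normalizing the FiLM inputs
- STAGE3_CONSTRAINT_RANGES: Stage 3 augmentation ranges (TODO: replace once the constraints are finalized)
- PHASE2_POOL_ROLLOUTS=30, PHASE2_LP_INTERVAL=10, PHASE2_N_EPISODES=1000
Needs discussion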
- Whether WINDOW_DAYS = 4 is final
- The STAGE3_CONSTRAINT_RANGES values need to be replaced once the constraints are finalized
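To make the WINDOW_DAYS discussion more concrete, here is a minimal sketch of the per-episode sampling behaviour described for flight_sampler() above: pick a random base and offset_days, then skip windows that contain no departure from that base. The flight tuple layout, make_flight_sampler, horizon_days, and max_tries are assumptions made for illustration and are not taken from the code in this PR.

```python
import random
from typing import Callable, List, Optional, Sequence, Tuple

WINDOW_DAYS = 4  # value proposed in this PR; not yet confirmed

# Assumed flight record: (origin, destination, departure hour since schedule start).
Flight = Tuple[str, str, float]


def make_flight_sampler(
    flights: Sequence[Flight],
    bases: Sequence[str],
    horizon_days: int,
    window_days: int = WINDOW_DAYS,
    rng: Optional[random.Random] = None,
    max_tries: int = 1000,
) -> Callable[[], Tuple[str, int, List[Flight]]]:
    """Build a sampler returning (base, offset_days, flights_in_window)."""
    rng = rng or random.Random()
    max_offset = horizon_days - window_days
    if max_offset < 0:
        raise ValueError("horizon_days must be >= window_days")

    def sampler() -> Tuple[str, int, List[Flight]]:
        for _ in range(max_tries):
            base = rng.choice(list(bases))
            offset_days = rng.randint(0, max_offset)  # inclusive on both ends
            start = offset_days * 24.0
            end = start + window_days * 24.0
            window = [f for f in flights if start <= f[2] < end]
            # Skip windows that have no departure from the chosen base.
            if any(origin == base for origin, _, _ in window):
                return base, offset_days, window
        raise RuntimeError("no window with a base departure found")

    return sampler
```

In this sketch a larger window_days shrinks the range of valid offsets (horizon_days - window_days) while making it more likely that a sampled window contains a base departure, which is one of the trade-offs to weigh when fixing the value.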