diff --git a/CHANGELOG.md b/CHANGELOG.md
index 4b4c4fa..e127a95 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -5,6 +5,8 @@
 
 ## [Unreleased]
 
+## [0.7.0] - 2026-04-11
+
 ### Added
 
 - ホーム設定 overlay と `config.toml` に `audio_voice` を追加し、macOS / Windows の local voice を選択して次回起動後も使い続けられるようにしました。
@@ -14,6 +16,8 @@
 
 - `eitango review [choice|write]` は due が 0 件のとき、確認後に過去に出題済み語だけを使う reviewed-only ランダム復習へ入れるようになりました。
 - reviewed-only fallback 復習では SRS の interval / due を更新せず、choice / write とも feedback は `Enter` で次へ進むだけの練習フローになりました。
+- bundled core 辞書の `dict_version` 更新時は、既存 core 語の SRS 進捗を保持したまま `lemma/pos` ベースで差分同期し、新語だけを追加するようにしました。
+- bundled core から外れた旧語は履歴参照用に保持しつつ新規セッション計画から外すようにし、`reset --reseed` だけを明示的な全リセット経路として維持しました。
 
 ### Fixed
 
@@ -23,6 +27,7 @@
 - reviewed-only fallback セッションは通常 review と別ラベルで表示されるようになり、ホーム画面や確認オーバーレイで区別が失われないようにしました。
 - reviewed-only fallback の choice feedback で、ヘルプ文言と狭幅レイアウト判定を Enter-only フローに合わせて修正しました。
 - 中断された reviewed-only fallback セッションは再開せず破棄するようにし、通常の SRS review として誤って復元されないようにしました。
+- bundled core の version bump 直前に active session が残っていた場合はその session を `abandoned` にし、辞書同期後に設問文面や choice 候補が drift した状態で再開されないようにしました。
 
 ## [0.6.1] - 2026-04-09
 
@@ -183,7 +188,8 @@
 - 通知不要時に古い update tag が画面に残る問題を修正しました。
 - `dev` など非 semver の build でも update availability を正しく判定するようにしました。
 
-[Unreleased]: https://github.com/harumiWeb/eitango/compare/v0.6.1...HEAD
+[Unreleased]: https://github.com/harumiWeb/eitango/compare/v0.7.0...HEAD
+[0.7.0]: https://github.com/harumiWeb/eitango/compare/v0.6.1...v0.7.0
 [0.6.1]: https://github.com/harumiWeb/eitango/compare/v0.6.0...v0.6.1
 [0.6.0]: https://github.com/harumiWeb/eitango/compare/v0.5.2...v0.6.0
 [0.5.2]: https://github.com/harumiWeb/eitango/compare/v0.5.1...v0.5.2
diff --git a/assets/migrations/006_words_is_active.sql b/assets/migrations/006_words_is_active.sql
new file mode 100644
index 0000000..057fef0
--- /dev/null
+++ b/assets/migrations/006_words_is_active.sql
@@ -0,0 +1,4 @@
+ALTER TABLE words ADD COLUMN is_active INTEGER NOT NULL DEFAULT 1;
+
+CREATE INDEX IF NOT EXISTS idx_words_source_active ON words(source, is_active);
+CREATE INDEX IF NOT EXISTS idx_words_active_pos_rank ON words(is_active, pos, frequency_rank, id);
diff --git a/docs/adr/0002-bundled-core-dictionary-lifecycle.md b/docs/adr/0002-bundled-core-dictionary-lifecycle.md
index e79bc5f..929e717 100644
--- a/docs/adr/0002-bundled-core-dictionary-lifecycle.md
+++ b/docs/adr/0002-bundled-core-dictionary-lifecycle.md
@@ -13,16 +13,19 @@
 - `assets/words_core.jsonl` を bundled core の正規 runtime asset とし、起動時は embed された内容を読み込んで利用する。
 - bundled core は `dict.LoadCoreWords()` で parse と validation を通したうえで seed する。core では `lemma`, `meaning_ja`, `pos`, `level`, `frequency_rank`, `distractor_group` を必須とし、`(lemma, pos)` と `frequency_rank` の重複を許さず、各 `distractor_group` は最低 4 語を要求する。
 - bundled core の版は `dict.CoreWordsVersion` で管理し、DB 内では `app_meta.dict_version` と `source = "core"` を使って現在の seed 状態を追跡する。
-- 初回 seed で core 語彙を投入し、`dict_version` が変わった場合と `reset --reseed` 実行時は core source の語彙を置き換え、学習履歴テーブルもリセットする。
+- 初回 seed で core 語彙を投入し、`dict_version` が変わった場合は `source = "core"` の既存 row を `normalized(lemma, pos)` 単位で差分同期する。matched row は同じ `word_id` を維持したまま metadata を更新し、新語だけを追加し、消えた旧 core row は inactive として退役させる。
+- `reset --reseed` は bundled core の明示的な破壊的再投入導線として残し、core source を全置換したうえで学習履歴テーブルもリセットする。
 - import 語彙は `import:*` source として core から分離し、`core` は予約済み source として扱う。
 - raw の Leipzig / Japanese WordNet 入力は配布物へ含めず、再生成条件は `scripts/vocab/source_manifest.json` と repository tooling に固定する。
 
 ## Consequences
 
 - 学習時に参照する core 辞書の品質と整合性を、アプリ起動時 validation と DB metadata の両方で担保できる。
-- core 更新時に `dict_version` と reseed 導線が揃うため、古い core と新しい進捗が半端に混ざる状態を避けられる。
+- core 更新時も matched row の `word_id` と SRS 履歴を保持できるため、bundled core の語彙追加や軽微な metadata 更新を学習進捗の全消去なしで配布できる。
+- 退役した core 語彙を inactive row として残すため、過去 session・review・export の参照整合性を壊さずに新規計画対象からだけ除外できる。
 - core と import を source で分けるため、標準辞書の更新とユーザー追加データの保守方針を分離できる。
-- core の版更新は学習履歴のリセットを伴うため、辞書差し替えは軽微な見た目変更として扱わず、意図的に実施する必要がある。
+- core の版更新は `normalized(lemma, pos)` を軸にした互換性契約を伴うため、同一 key の語は同じ学習対象として扱える範囲の更新に留める必要がある。
+- 破壊的な全リセットは `reset --reseed` に限定されるため、CLI とテストの両方でこの導線を明示的に維持し続ける必要がある。
 - データ由来と再配布条件を repository 内で維持し続ける責務が残る。
 
 ## Rationale
diff --git a/internal/app/cmds_test.go b/internal/app/cmds_test.go
index 8f601e3..536756b 100644
--- a/internal/app/cmds_test.go
+++ b/internal/app/cmds_test.go
@@ -541,6 +541,7 @@ func TestSessionStartErrMsgReloadsHomeEvenWhenStatsReloadFails(t *testing.T) {
 			example_en TEXT,
 			example_ja TEXT,
 			source TEXT NOT NULL,
+			is_active INTEGER NOT NULL DEFAULT 1,
 			created_at TEXT NOT NULL
 		)`,
 		`CREATE TABLE progress (
diff --git a/internal/store/doctor.go b/internal/store/doctor.go
index 0691f74..d4b546f 100644
--- a/internal/store/doctor.go
+++ b/internal/store/doctor.go
@@ -155,6 +155,11 @@ func (s *Store) checkDictionary(ctx context.Context) DiagnosticCheck {
 		return diagnosticCheckError("dictionary", "dict_version could not be read", err.Error())
 	}
 
+	hasIsActive, err := s.tableHasColumn(ctx, "words", "is_active")
+	if err != nil {
+		return diagnosticCheckError("dictionary", "word schema could not be inspected", err.Error())
+	}
+
 	wordCount, err := s.wordCount(ctx)
 	if err != nil {
 		return diagnosticCheckError("dictionary", "word count could not be read", err.Error())
@@ -164,16 +169,38 @@ func (s *Store) checkDictionary(ctx context.Context) DiagnosticCheck {
 		return diagnosticCheckError("dictionary", "core word count could not be read", err.Error())
 	}
 	importWordCount := wordCount - coreWordCount
+	activeCoreWordCount := coreWordCount
+	retiredCoreWordCount := 0
+	if hasIsActive {
+		activeCoreWordCount, err = s.countWordsBySourceActive(ctx, WordSourceCore, true)
+		if err != nil {
+			return diagnosticCheckError("dictionary", "active core word count could not be read", err.Error())
+		}
+		retiredCoreWordCount = coreWordCount - activeCoreWordCount
+	}
 
 	switch {
-	case version == "" && coreWordCount == 0:
+	case version == "" && activeCoreWordCount == 0:
 		return diagnosticCheckError("dictionary", "core words are not seeded", fmt.Sprintf("expected dict_version %q", dict.CoreWordsVersion))
 	case version == "":
-		return diagnosticCheckError("dictionary", "dict_version is missing", fmt.Sprintf("core words: %d", coreWordCount), fmt.Sprintf("imported words: %d", importWordCount))
-	case coreWordCount == 0:
-		return diagnosticCheckError("dictionary", "dict_version exists but core words are missing", fmt.Sprintf("dict_version: %s", version), fmt.Sprintf("imported words: %d", importWordCount))
+		details := []string{fmt.Sprintf("active core words: %d", activeCoreWordCount)}
+		if hasIsActive && retiredCoreWordCount > 0 {
+			details = append(details, fmt.Sprintf("retired core words: %d", retiredCoreWordCount))
+		}
+		details = append(details, fmt.Sprintf("imported words: %d", importWordCount))
+		return diagnosticCheckError("dictionary", "dict_version is missing", details...)
+	case activeCoreWordCount == 0:
+		details := []string{fmt.Sprintf("dict_version: %s", version)}
+		if hasIsActive && retiredCoreWordCount > 0 {
+			details = append(details, fmt.Sprintf("retired core words: %d", retiredCoreWordCount))
+		}
+		details = append(details, fmt.Sprintf("imported words: %d", importWordCount))
+		return diagnosticCheckError("dictionary", "dict_version exists but active core words are missing", details...)
 	case version != dict.CoreWordsVersion:
-		details := []string{fmt.Sprintf("core words: %d", coreWordCount)}
+		details := []string{fmt.Sprintf("active core words: %d", activeCoreWordCount)}
+		if hasIsActive && retiredCoreWordCount > 0 {
+			details = append(details, fmt.Sprintf("retired core words: %d", retiredCoreWordCount))
+		}
 		if importWordCount > 0 {
 			details = append(details, fmt.Sprintf("imported words: %d", importWordCount))
 		}
@@ -183,7 +210,10 @@ func (s *Store) checkDictionary(ctx context.Context) DiagnosticCheck {
 			details...,
 		)
 	default:
-		summary := fmt.Sprintf("%d core words seeded at %s", coreWordCount, version)
+		summary := fmt.Sprintf("%d active core words seeded at %s", activeCoreWordCount, version)
+		if hasIsActive && retiredCoreWordCount > 0 {
+			summary += fmt.Sprintf(" (+%d retired core)", retiredCoreWordCount)
+		}
 		if importWordCount > 0 {
 			summary += fmt.Sprintf(" (+%d imported)", importWordCount)
 		}
@@ -338,43 +368,127 @@ LIMIT ?
 }
 
 func (s *Store) checkWordMetadata(ctx context.Context) DiagnosticCheck {
+	hasIsActive, err := s.tableHasColumn(ctx, "words", "is_active")
+	if err != nil {
+		return diagnosticCheckError("word metadata", "word schema could not be inspected", err.Error())
+	}
+
 	type metadataIssue struct {
-		label string
-		count int
-		query string
+		label             string
+		count             int
+		countQuery        string
+		activeCountQuery  string
+		sampleQuery       string
+		activeSampleQuery string
 	}
 
 	issues := []metadataIssue{
 		{
 			label: "missing pos",
-			query: `
+			countQuery: `
+SELECT COUNT(*)
+FROM words
+WHERE TRIM(COALESCE(pos, '')) = ''
+`,
+			activeCountQuery: `
 SELECT COUNT(*)
 FROM words
+WHERE is_active = 1 AND TRIM(COALESCE(pos, '')) = ''
+`,
+			sampleQuery: `
+SELECT lemma
+FROM words
 WHERE TRIM(COALESCE(pos, '')) = ''
+ORDER BY id ASC
+LIMIT ?
+`,
+			activeSampleQuery: `
+SELECT lemma
+FROM words
+WHERE is_active = 1 AND TRIM(COALESCE(pos, '')) = ''
+ORDER BY id ASC
+LIMIT ?
 `,
 		},
 		{
 			label: "missing level",
-			query: `
+			countQuery: `
+SELECT COUNT(*)
+FROM words
+WHERE TRIM(COALESCE(level, '')) = ''
+`,
+			activeCountQuery: `
 SELECT COUNT(*)
 FROM words
+WHERE is_active = 1 AND TRIM(COALESCE(level, '')) = ''
+`,
+			sampleQuery: `
+SELECT lemma
+FROM words
 WHERE TRIM(COALESCE(level, '')) = ''
+ORDER BY id ASC
+LIMIT ?
+`,
+			activeSampleQuery: `
+SELECT lemma
+FROM words
+WHERE is_active = 1 AND TRIM(COALESCE(level, '')) = ''
+ORDER BY id ASC
+LIMIT ?
 `,
 		},
 		{
 			label: "missing frequency rank",
-			query: `
+			countQuery: `
+SELECT COUNT(*)
+FROM words
+WHERE COALESCE(frequency_rank, 0) <= 0
+`,
+			activeCountQuery: `
 SELECT COUNT(*)
 FROM words
+WHERE is_active = 1 AND COALESCE(frequency_rank, 0) <= 0
+`,
+			sampleQuery: `
+SELECT lemma
+FROM words
 WHERE COALESCE(frequency_rank, 0) <= 0
+ORDER BY id ASC
+LIMIT ?
+`,
+			activeSampleQuery: `
+SELECT lemma
+FROM words
+WHERE is_active = 1 AND COALESCE(frequency_rank, 0) <= 0
+ORDER BY id ASC
+LIMIT ?
 `,
 		},
 		{
 			label: "missing distractor group",
-			query: `
+			countQuery: `
 SELECT COUNT(*)
 FROM words
 WHERE TRIM(COALESCE(distractor_group, '')) = ''
+`,
+			activeCountQuery: `
+SELECT COUNT(*)
+FROM words
+WHERE is_active = 1 AND TRIM(COALESCE(distractor_group, '')) = ''
+`,
+			sampleQuery: `
+SELECT lemma
+FROM words
+WHERE TRIM(COALESCE(distractor_group, '')) = ''
+ORDER BY id ASC
+LIMIT ?
+`,
+			activeSampleQuery: `
+SELECT lemma
+FROM words
+WHERE is_active = 1 AND TRIM(COALESCE(distractor_group, '')) = ''
+ORDER BY id ASC
+LIMIT ?
 `,
 		},
 	}
@@ -383,7 +497,14 @@ WHERE TRIM(COALESCE(distractor_group, '')) = ''
 	totalIssueCount := 0
 
 	for i := range issues {
-		count, err := s.countRows(ctx, issues[i].query)
+		countQuery := issues[i].countQuery
+		sampleQuery := issues[i].sampleQuery
+		if hasIsActive {
+			countQuery = issues[i].activeCountQuery
+			sampleQuery = issues[i].activeSampleQuery
+		}
+
+		count, err := s.countRows(ctx, countQuery)
 		if err != nil {
 			return diagnosticCheckError("word metadata", fmt.Sprintf("%s could not be counted", issues[i].label), err.Error())
 		}
@@ -393,20 +514,14 @@ WHERE TRIM(COALESCE(distractor_group, '')) = ''
 		}
 		totalIssueCount += count
 
-		samples, err := s.sampleStringRows(ctx, fmt.Sprintf(`
-SELECT lemma
-FROM words
-WHERE %s
-ORDER BY id ASC
-LIMIT ?
-`, metadataConditionForLabel(issues[i].label)), doctorSampleLimit)
+		samples, err := s.sampleStringRows(ctx, sampleQuery, doctorSampleLimit)
 		if err != nil {
 			return diagnosticCheckError("word metadata", fmt.Sprintf("%s samples could not be loaded", issues[i].label), err.Error())
 		}
 		details = append(details, formatStringSamples(issues[i].label, count, samples))
 	}
 
-	duplicateRankCount, err := s.countRows(ctx, `
+	duplicateRankCountQuery := `
 SELECT COUNT(*)
 FROM (
   SELECT source, frequency_rank
@@ -415,13 +530,8 @@ FROM (
   GROUP BY source, frequency_rank
   HAVING COUNT(*) > 1
 )
-`)
-	if err != nil {
-		return diagnosticCheckError("word metadata", "duplicate frequency ranks could not be counted", err.Error())
-	}
-	if duplicateRankCount > 0 {
-		totalIssueCount += duplicateRankCount
-		rows, err := s.db.QueryContext(ctx, `
+`
+	duplicateRankSampleQuery := `
 SELECT source, frequency_rank, GROUP_CONCAT(lemma, ', ')
 FROM (
   SELECT source, frequency_rank, lemma
@@ -433,7 +543,40 @@ GROUP BY source, frequency_rank
 HAVING COUNT(*) > 1
 ORDER BY source ASC, frequency_rank ASC
 LIMIT ?
-`, doctorSampleLimit)
+`
+	if hasIsActive {
+		duplicateRankCountQuery = `
+SELECT COUNT(*)
+FROM (
+  SELECT source, frequency_rank
+  FROM words
+  WHERE is_active = 1 AND frequency_rank IS NOT NULL
+  GROUP BY source, frequency_rank
+  HAVING COUNT(*) > 1
+)
+`
+		duplicateRankSampleQuery = `
+SELECT source, frequency_rank, GROUP_CONCAT(lemma, ', ')
+FROM (
+  SELECT source, frequency_rank, lemma
+  FROM words
+  WHERE is_active = 1 AND frequency_rank IS NOT NULL
+  ORDER BY source ASC, frequency_rank ASC, lemma ASC
+)
+GROUP BY source, frequency_rank
+HAVING COUNT(*) > 1
+ORDER BY source ASC, frequency_rank ASC
+LIMIT ?
+`
+	}
+
+	duplicateRankCount, err := s.countRows(ctx, duplicateRankCountQuery)
+	if err != nil {
+		return diagnosticCheckError("word metadata", "duplicate frequency ranks could not be counted", err.Error())
+	}
+	if duplicateRankCount > 0 {
+		totalIssueCount += duplicateRankCount
+		rows, err := s.db.QueryContext(ctx, duplicateRankSampleQuery, doctorSampleLimit)
 		if err != nil {
 			return diagnosticCheckError("word metadata", "duplicate frequency rank samples could not be loaded", err.Error())
 		}
@@ -464,21 +607,6 @@ LIMIT ?
 	return diagnosticCheckOK("word metadata", "all words have metadata needed for ranking and distractors")
 }
 
-func metadataConditionForLabel(label string) string {
-	switch label {
-	case "missing pos":
-		return "TRIM(COALESCE(pos, '')) = ''"
-	case "missing level":
-		return "TRIM(COALESCE(level, '')) = ''"
-	case "missing frequency rank":
-		return "COALESCE(frequency_rank, 0) <= 0"
-	case "missing distractor group":
-		return "TRIM(COALESCE(distractor_group, '')) = ''"
-	default:
-		panic("unsupported metadata label: " + label)
-	}
-}
-
 func (s *Store) checkOrphanProgress(ctx context.Context) DiagnosticCheck {
 	count, err := s.countRows(ctx, `
 SELECT COUNT(*)
@@ -779,6 +907,8 @@ func doctorTableInfoQuery(tableName string) (string, error) {
 	switch tableName {
 	case "sessions":
 		return "PRAGMA table_info(sessions)", nil
+	case "words":
+		return "PRAGMA table_info(words)", nil
 	default:
 		return "", fmt.Errorf("unsupported table %q", tableName)
 	}
@@ -821,7 +951,16 @@ func (s *Store) tableHasColumn(ctx context.Context, tableName, columnName string
 }
 
 func (s *Store) checkQuizability(ctx context.Context) DiagnosticCheck {
-	wordCount, err := s.wordCount(ctx)
+	hasIsActive, err := s.tableHasColumn(ctx, "words", "is_active")
+	if err != nil {
+		return diagnosticCheckError("quizability", "word schema could not be inspected", err.Error())
+	}
+
+	wordCountQuery := `SELECT COUNT(*) FROM words`
+	if hasIsActive {
+		wordCountQuery = `SELECT COUNT(*) FROM words WHERE is_active = 1`
+	}
+	wordCount, err := s.countRows(ctx, wordCountQuery)
 	if err != nil {
 		return diagnosticCheckError("quizability", "words could not be loaded", err.Error())
 	}
@@ -835,32 +974,65 @@ func (s *Store) checkQuizability(ctx context.Context) DiagnosticCheck {
 	// least four distinct meanings. That keeps doctor fast on CI-sized datasets,
 	// but it can miss edge cases from the runtime distractor filters
 	// (distractor_group, level, frequency proximity, excluded IDs).
-	const quizabilityCountsCTE = `
+	failureCountQuery := `
 WITH pos_meaning_counts AS (
   SELECT IFNULL(pos, '') AS pos_key, COUNT(DISTINCT meaning_ja) AS distinct_meaning_count
   FROM words
   GROUP BY IFNULL(pos, '')
 )
-`
-
-	failureCount, err := s.countRows(ctx, quizabilityCountsCTE+`
 SELECT COUNT(*)
 FROM words w
 LEFT JOIN pos_meaning_counts pmc ON IFNULL(w.pos, '') = pmc.pos_key
 WHERE COALESCE(pmc.distinct_meaning_count, 0) < ?
-`, doctorQuizChoiceSize)
-	if err != nil {
-		return diagnosticCheckError("quizability", "same-pos distractor meanings could not be counted", err.Error())
-	}
-	if failureCount > 0 {
-		rows, err := s.db.QueryContext(ctx, quizabilityCountsCTE+`
+`
+	failureSampleQuery := `
+WITH pos_meaning_counts AS (
+  SELECT IFNULL(pos, '') AS pos_key, COUNT(DISTINCT meaning_ja) AS distinct_meaning_count
+  FROM words
+  GROUP BY IFNULL(pos, '')
+)
 SELECT w.lemma, w.pos
 FROM words w
 LEFT JOIN pos_meaning_counts pmc ON IFNULL(w.pos, '') = pmc.pos_key
 WHERE COALESCE(pmc.distinct_meaning_count, 0) < ?
 ORDER BY COALESCE(w.frequency_rank, 999999) ASC, w.id ASC
 LIMIT ?
-`, doctorQuizChoiceSize, doctorSampleLimit)
+`
+	if hasIsActive {
+		failureCountQuery = `
+WITH pos_meaning_counts AS (
+  SELECT IFNULL(pos, '') AS pos_key, COUNT(DISTINCT meaning_ja) AS distinct_meaning_count
+  FROM words
+  WHERE is_active = 1
+  GROUP BY IFNULL(pos, '')
+)
+SELECT COUNT(*)
+FROM words w
+LEFT JOIN pos_meaning_counts pmc ON IFNULL(w.pos, '') = pmc.pos_key
+WHERE w.is_active = 1 AND COALESCE(pmc.distinct_meaning_count, 0) < ?
+`
+		failureSampleQuery = `
+WITH pos_meaning_counts AS (
+  SELECT IFNULL(pos, '') AS pos_key, COUNT(DISTINCT meaning_ja) AS distinct_meaning_count
+  FROM words
+  WHERE is_active = 1
+  GROUP BY IFNULL(pos, '')
+)
+SELECT w.lemma, w.pos
+FROM words w
+LEFT JOIN pos_meaning_counts pmc ON IFNULL(w.pos, '') = pmc.pos_key
+WHERE w.is_active = 1 AND COALESCE(pmc.distinct_meaning_count, 0) < ?
+ORDER BY COALESCE(w.frequency_rank, 999999) ASC, w.id ASC
+LIMIT ?
+`
+	}
+
+	failureCount, err := s.countRows(ctx, failureCountQuery, doctorQuizChoiceSize)
+	if err != nil {
+		return diagnosticCheckError("quizability", "same-pos distractor meanings could not be counted", err.Error())
+	}
+	if failureCount > 0 {
+		rows, err := s.db.QueryContext(ctx, failureSampleQuery, doctorQuizChoiceSize, doctorSampleLimit)
 		if err != nil {
 			return diagnosticCheckError("quizability", "unquizzable word samples could not be loaded", err.Error())
 		}
diff --git a/internal/store/doctor_test.go b/internal/store/doctor_test.go
index 8dd6c2f..e127979 100644
--- a/internal/store/doctor_test.go
+++ b/internal/store/doctor_test.go
@@ -16,8 +16,21 @@ func TestRunDiagnosticsHealthyStore(t *testing.T) {
 
 	ctx := context.Background()
 	st := newTestStore(t)
-	if err := st.SeedWords(ctx, doctorTestEntries(), dict.CoreWordsVersion); err != nil {
-		t.Fatalf("SeedWords() error = %v", err)
+	if _, err := st.db.ExecContext(ctx, `
+INSERT INTO words (id, lemma, pos, meaning_ja, level, frequency_rank, distractor_group, source)
+VALUES
+  (1, 'adopt', 'verb', '採用する', 'core-1', 100, 'basic-verb-action', ?),
+  (2, 'apply', 'verb', '応募する', 'core-1', 120, 'basic-verb-action', ?),
+  (3, 'cancel', 'verb', '取り消す', 'core-1', 140, 'basic-verb-action', ?),
+  (4, 'deliver', 'verb', '届ける', 'core-1', 160, 'basic-verb-action', ?)
+`, WordSourceCore, WordSourceCore, WordSourceCore, WordSourceCore); err != nil {
+		t.Fatalf("insert legacy words: %v", err)
+	}
+	if _, err := st.db.ExecContext(ctx, `
+INSERT INTO app_meta (key, value)
+VALUES ('dict_version', ?)
+`, dict.CoreWordsVersion); err != nil {
+		t.Fatalf("insert legacy dict_version: %v", err)
 	}
 
 	report := st.RunDiagnostics(ctx)
@@ -57,15 +70,102 @@ func TestRunDiagnosticsDetectsDictionaryVersionMismatch(t *testing.T) {
 	}
 }
 
-func TestRunDiagnosticsDetectsOrphanRows(t *testing.T) {
+func TestRunDiagnosticsIgnoresRetiredCoreWordsForMetadataAndQuizability(t *testing.T) {
 	t.Parallel()
 
 	ctx := context.Background()
 	st := newTestStore(t)
-	if err := st.SeedWords(ctx, doctorTestEntries(), dict.CoreWordsVersion); err != nil {
+	if err := st.SeedWords(ctx, doctorTestEntries(), "test-v1"); err != nil {
 		t.Fatalf("SeedWords() error = %v", err)
 	}
 
+	rows, err := st.db.QueryContext(ctx, `
+SELECT id, lemma
+FROM words
+WHERE source = ?
+ORDER BY frequency_rank ASC, id ASC
+`, WordSourceCore)
+	if err != nil {
+		t.Fatalf("query core words: %v", err)
+	}
+
+	var retiredWordID int64
+	for rows.Next() {
+		var (
+			id    int64
+			lemma string
+		)
+		if err := rows.Scan(&id, &lemma); err != nil {
+			t.Fatalf("scan core word: %v", err)
+		}
+		if lemma == "cancel" {
+			retiredWordID = id
+			break
+		}
+	}
+	if err := rows.Err(); err != nil {
+		t.Fatalf("iterate core words: %v", err)
+	}
+	if err := rows.Close(); err != nil {
+		t.Fatalf("close core words rows: %v", err)
+	}
+	if retiredWordID == 0 {
+		t.Fatal("cancel word id not found")
+	}
+
+	nextEntries := []dict.Entry{
+		doctorTestEntries()[0],
+		doctorTestEntries()[1],
+		doctorTestEntries()[3],
+		{
+			Lemma:           "coach",
+			Pos:             "verb",
+			MeaningJA:       "指導する",
+			Level:           "core-1",
+			FrequencyRank:   180,
+			DistractorGroup: "basic-verb-action",
+		},
+	}
+	if err := st.SeedWords(ctx, nextEntries, dict.CoreWordsVersion); err != nil {
+		t.Fatalf("SeedWords() version bump error = %v", err)
+	}
+
+	if _, err := st.db.ExecContext(ctx, `
+UPDATE words
+SET pos = '', level = '', frequency_rank = 0, distractor_group = ''
+WHERE id = ?
+`, retiredWordID); err != nil {
+		t.Fatalf("corrupt retired word metadata: %v", err)
+	}
+
+	report := st.RunDiagnostics(ctx)
+	if report.HasIssues() {
+		t.Fatalf("RunDiagnostics() reported issues with only retired-word corruption: %+v", report.Checks)
+	}
+}
+
+func TestRunDiagnosticsDetectsOrphanRows(t *testing.T) {
+	t.Parallel()
+
+	ctx := context.Background()
+	st := newTestStore(t)
+	if _, err := st.db.ExecContext(ctx, `
+INSERT INTO words (id, lemma, pos, meaning_ja, level, frequency_rank, distractor_group, source)
+VALUES
+  (1, 'adopt', 'verb', '採用する', 'core-1', 100, 'basic-verb-action', ?),
+  (2, 'apply', 'verb', '応募する', 'core-1', 120, 'basic-verb-action', ?),
+  (3, 'cancel', 'verb', '取り消す', 'core-1', 140, 'basic-verb-action', ?),
+  (4, 'deliver', 'verb', '届ける', 'core-1', 160, 'basic-verb-action', ?)
+`, WordSourceCore, WordSourceCore, WordSourceCore, WordSourceCore); err != nil {
+		t.Fatalf("insert legacy words: %v", err)
+	}
+	if _, err := st.db.ExecContext(ctx, `
+INSERT INTO app_meta (key, value)
+VALUES ('dict_version', ?)
+`, dict.CoreWordsVersion); err != nil {
+		t.Fatalf("insert legacy dict_version: %v", err)
+	}
+
 	if _, err := st.db.ExecContext(ctx, `PRAGMA foreign_keys = OFF;`); err != nil {
 		t.Fatalf("disable foreign keys: %v", err)
 	}
@@ -440,16 +540,32 @@ applied_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
 		}
 	}
 
-	if err := st.SeedWords(ctx, doctorTestEntries(), dict.CoreWordsVersion); err != nil {
-		t.Fatalf("SeedWords() error = %v", err)
+	if _, err := st.db.ExecContext(ctx, `
+INSERT INTO words (id, lemma, pos, meaning_ja, level, frequency_rank, distractor_group, source)
+VALUES
+  (1, 'adopt', 'verb', '採用する', 'core-1', 100, 'basic-verb-action', ?),
+  (2, 'apply', 'verb', '応募する', 'core-1', 120, 'basic-verb-action', ?),
+  (3, 'cancel', 'verb', '取り消す', 'core-1', 140, 'basic-verb-action', ?),
+  (4, 'deliver', 'verb', '届ける', 'core-1', 160, 'basic-verb-action', ?)
+`, WordSourceCore, WordSourceCore, WordSourceCore, WordSourceCore); err != nil {
+		t.Fatalf("insert legacy words: %v", err)
 	}
-
-	words, err := st.ListWordsByPOS(ctx, "verb", 1, nil)
-	if err != nil {
-		t.Fatalf("ListWordsByPOS() error = %v", err)
+	if _, err := st.db.ExecContext(ctx, `
+INSERT INTO app_meta (key, value)
+VALUES ('dict_version', ?)
+`, dict.CoreWordsVersion); err != nil {
+		t.Fatalf("insert legacy dict_version: %v", err)
 	}
-	if len(words) == 0 {
-		t.Fatal("ListWordsByPOS() returned no words")
+
+	var wordID int64
+	if err := st.db.QueryRowContext(ctx, `
+SELECT id
+FROM words
+WHERE pos = 'verb'
+ORDER BY frequency_rank ASC, id ASC
+LIMIT 1
+`).Scan(&wordID); err != nil {
+		t.Fatalf("load legacy word id: %v", err)
 	}
 
 	if _, err := st.db.ExecContext(ctx, `
@@ -461,7 +577,7 @@ VALUES (?, CURRENT_TIMESTAMP, ?, 1, 0, ?)
 	if _, err := st.db.ExecContext(ctx, `
 INSERT INTO session_items (session_id, ordinal, word_id, kind, status)
 VALUES (?, 1, ?, ?, ?)
-`, "legacy-active", words[0].ID, ItemKindNew, ItemStatusPending); err != nil {
+`, "legacy-active", wordID, ItemKindNew, ItemStatusPending); err != nil {
 		t.Fatalf("insert legacy session item: %v", err)
 	}
 
diff --git a/internal/store/migrate.go b/internal/store/migrate.go
index abdb12d..a0b34ab 100644
--- a/internal/store/migrate.go
+++ b/internal/store/migrate.go
@@ -5,6 +5,7 @@ import (
 	"database/sql"
 	"fmt"
 	"sort"
+	"time"
 
 	projectassets "github.com/harumiWeb/eitango/assets"
 	"github.com/harumiWeb/eitango/internal/dict"
@@ -87,20 +88,22 @@ func (s *Store) SeedWords(ctx context.Context, entries []dict.Entry, version str
 		return fmt.Errorf("begin seed words: %w", err)
 	}
 
+	var upsertErr error
 	if coreWordCount > 0 && currentVersion != version {
-		if err := resetLearningTablesTx(ctx, tx, nil); err != nil {
+		if err := abandonActiveSessionsTx(ctx, tx, time.Now().UTC()); err != nil {
 			_ = tx.Rollback()
-			return fmt.Errorf("reset before seed: %w", err)
+			return fmt.Errorf("abandon active sessions before core sync: %w", err)
 		}
-		if _, err := deleteWordsBySourceTx(ctx, tx, WordSourceCore); err != nil {
+		if _, err := syncCoreWordsTx(ctx, tx, entries); err != nil {
 			_ = tx.Rollback()
-			return fmt.Errorf("replace core words before seed: %w", err)
+			return fmt.Errorf("sync core words before seed: %w", err)
 		}
+	} else {
+		_, upsertErr = upsertWordsTx(ctx, tx, WordSourceCore, entries)
 	}
-
-	if _, err := upsertWordsTx(ctx, tx, WordSourceCore, entries); err != nil {
+	if upsertErr != nil {
 		_ = tx.Rollback()
-		return err
+		return upsertErr
 	}
 
 	if err := s.setMetaTx(ctx, tx, "dict_version", version); err != nil {
diff --git a/internal/store/store_test.go b/internal/store/store_test.go
index 70baca1..2d52505 100644
--- a/internal/store/store_test.go
+++ b/internal/store/store_test.go
@@ -595,7 +595,7 @@ func TestLoadStatsSnapshotCountsConsecutiveReviewDays(t *testing.T) {
 	}
 }
 
-func TestSeedWordsVersionChangeResetsUserData(t *testing.T) {
+func TestSeedWordsVersionChangePreservesMatchedCoreProgress(t *testing.T) {
 	t.Parallel()
 
 	ctx := context.Background()
@@ -642,16 +642,31 @@ func TestSeedWordsVersionChangeResetsUserData(t *testing.T) {
 		t.Fatalf("progress before reseed = %d, want 1", got)
 	}
 
-	nextEntries := append(testEntries(), dict.Entry{
-		Lemma:           "coach",
-		Pos:             "verb",
-		MeaningJA:       "指導する",
-		Level:           "core-1",
-		FrequencyRank:   400,
-		DistractorGroup: "basic-verb-action",
-		ExampleEN:       "They coach the team every weekend.",
-		ExampleJA:       "彼らは毎週末チームを指導する。",
-	})
+	trackedWordID := words[0].ID
+	nextEntries := []dict.Entry{
+		{
+			Lemma:           words[0].Lemma,
+			Pos:             words[0].Pos,
+			MeaningJA:       "採用する(更新)",
+			Level:           "core-1",
+			FrequencyRank:   100,
+			DistractorGroup: "basic-verb-action",
+			ExampleEN:       "They adopt the updated plan.",
+			ExampleJA:       "彼らは更新後の計画を採用する。",
+		},
+		testEntries()[1],
+		testEntries()[2],
+		{
+			Lemma:           "coach",
+			Pos:             "verb",
+			MeaningJA:       "指導する",
+			Level:           "core-1",
+			FrequencyRank:   400,
+			DistractorGroup: "basic-verb-action",
+			ExampleEN:       "They coach the team every weekend.",
+			ExampleJA:       "彼らは毎週末チームを指導する。",
+		},
+	}
 	if err := st.SeedWords(ctx, nextEntries, "test-v2"); err != nil {
 		t.Fatalf("SeedWords() version bump error = %v", err)
 	}
@@ -659,17 +674,17 @@ func TestSeedWordsVersionChangeResetsUserData(t *testing.T) {
 	if got := mustCountRows(t, st, "words"); got != len(nextEntries) {
 		t.Fatalf("words after version bump = %d, want %d", got, len(nextEntries))
 	}
-	if got := mustCountRows(t, st, "sessions"); got != 0 {
-		t.Fatalf("sessions after version bump = %d, want 0", got)
+	if got := mustCountRows(t, st, "sessions"); got != 1 {
+		t.Fatalf("sessions after version bump = %d, want 1", got)
 	}
-	if got := mustCountRows(t, st, "session_items"); got != 0 {
-		t.Fatalf("session_items after version bump = %d, want 0", got)
+	if got := mustCountRows(t, st, "session_items"); got != 1 {
+		t.Fatalf("session_items after version bump = %d, want 1", got)
 	}
-	if got := mustCountRows(t, st, "reviews"); got != 0 {
-		t.Fatalf("reviews after version bump = %d, want 0", got)
+	if got := mustCountRows(t, st, "reviews"); got != 1 {
+		t.Fatalf("reviews after version bump = %d, want 1", got)
 	}
-	if got := mustCountRows(t, st, "progress"); got != 0 {
-		t.Fatalf("progress after version bump = %d, want 0", got)
+	if got := mustCountRows(t, st, "progress"); got != 1 {
+		t.Fatalf("progress after version bump = %d, want 1", got)
 	}
 
 	version, err := st.metaValue(ctx, "dict_version")
@@ -680,18 +695,236 @@ func TestSeedWordsVersionChangeResetsUserData(t *testing.T) {
 		t.Fatalf("dict_version = %q, want test-v2", version)
 	}
 
+	trackedWord := mustLoadWordByID(t, st, trackedWordID)
+	if trackedWord.Lemma != words[0].Lemma {
+		t.Fatalf("tracked word lemma = %q, want %q", trackedWord.Lemma, words[0].Lemma)
+	}
+	if trackedWord.MeaningJA != "採用する(更新)" {
+		t.Fatalf("tracked word meaning = %q, want updated meaning", trackedWord.MeaningJA)
+	}
+	if !mustWordIsActive(t, st, trackedWordID) {
+		t.Fatalf("tracked word should remain active after version bump")
+	}
+
+	progress := mustLoadProgress(t, st, trackedWordID)
+	if progress.State != "review" {
+		t.Fatalf("progress state after version bump = %q, want review", progress.State)
+	}
+	if progress.TotalCorrect != 1 || progress.TotalWrong != 0 {
+		t.Fatalf("progress totals after version bump = %+v, want one correct answer preserved", progress)
+	}
+
 	snapshot, err := st.LoadHomeSnapshot(ctx)
 	if err != nil {
 		t.Fatalf("LoadHomeSnapshot() error = %v", err)
 	}
-	if snapshot.NewCount != len(nextEntries) {
-		t.Fatalf("NewCount after version bump = %d, want %d", snapshot.NewCount, len(nextEntries))
+	if snapshot.NewCount != len(nextEntries)-1 {
+		t.Fatalf("NewCount after version bump = %d, want %d", snapshot.NewCount, len(nextEntries)-1)
 	}
 	if snapshot.ActiveSession != nil {
 		t.Fatalf("ActiveSession after version bump = %+v, want nil", snapshot.ActiveSession)
 	}
 }
 
+func TestSeedWordsVersionChangeAbandonsActiveSession(t *testing.T) {
+	t.Parallel()
+
+	ctx := context.Background()
+	st := newTestStore(t)
+
+	if err := st.SeedWords(ctx, testEntries(), "test-v1"); err != nil {
+		t.Fatalf("SeedWords() error = %v", err)
+	}
+
+	words, err := st.ListNewWords(ctx, 10, nil)
+	if err != nil {
+		t.Fatalf("ListNewWords() error = %v", err)
+	}
+	if len(words) == 0 {
+		t.Fatal("ListNewWords() returned no words")
+	}
+
+	record, items, err := st.CreateSession(ctx, ModeLearn, AnswerModeChoice, []SessionItemPlan{
+		{WordID: words[0].ID, Kind: ItemKindNew},
+	})
+	if err != nil {
+		t.Fatalf("CreateSession() error = %v", err)
+	}
+	if len(items) != 1 || items[0].Status != ItemStatusPending {
+		t.Fatalf("created items = %+v, want one pending item", items)
+	}
+
+	nextEntries := []dict.Entry{
+		{
+			Lemma:           words[0].Lemma,
+			Pos:             words[0].Pos,
+			MeaningJA:       "採用する(更新)",
+			Level:           "core-1",
+			FrequencyRank:   100,
+			DistractorGroup: "basic-verb-action",
+			ExampleEN:       "They adopt the updated plan.",
+			ExampleJA:       "彼らは更新後の計画を採用する。",
+		},
+		testEntries()[1],
+		testEntries()[2],
+	}
+	if err := st.SeedWords(ctx, nextEntries, "test-v2"); err != nil {
+		t.Fatalf("SeedWords() version bump error = %v", err)
+	}
+
+	loaded, err := st.LoadSession(ctx, record.ID)
+	if err != nil {
+		t.Fatalf("LoadSession() error = %v", err)
+	}
+	if loaded.Status != SessionStatusAbandoned {
+		t.Fatalf("session status after version bump = %q, want %q", loaded.Status, SessionStatusAbandoned)
+	}
+	if loaded.FinishedAt == nil {
+		t.Fatal("abandoned session finished_at is nil")
+	}
+
+	activeRecord, activeItems, err := st.LoadActiveRuntime(ctx)
+	if err != nil {
+		t.Fatalf("LoadActiveRuntime() error = %v", err)
+	}
+	if activeRecord != nil || activeItems != nil {
+		t.Fatalf("active runtime after version bump = %+v / %+v, want nil", activeRecord, activeItems)
+	}
+
+	snapshot, err := st.LoadHomeSnapshot(ctx)
+	if err != nil {
+		t.Fatalf("LoadHomeSnapshot() error = %v", err)
+	}
+	if snapshot.ActiveSession != nil {
+		t.Fatalf("ActiveSession after abandoning version-bumped session = %+v, want nil", snapshot.ActiveSession)
+	}
+}
+
+func TestSeedWordsVersionChangeRetiresRemovedCoreWordsFromPlanning(t *testing.T) {
+	t.Parallel()
+
+	ctx := context.Background()
+	st := newTestStore(t)
+
+	if err := st.SeedWords(ctx, testEntries(), "test-v1"); err != nil {
+		t.Fatalf("SeedWords() error = %v", err)
+	}
+
+	words, err := st.ListNewWords(ctx, 10, nil)
+	if err != nil {
+		t.Fatalf("ListNewWords() error = %v", err)
+	}
+	if len(words) < 2 {
+		t.Fatalf("ListNewWords() returned %d words, want at least 2", len(words))
+	}
+
+	keptWordID := words[0].ID
+	retiredWordID := words[1].ID
+	recordReviewInMode(t, st, keptWordID, AnswerModeChoice, time.Now().UTC().Add(-2*time.Hour))
+	recordReviewInMode(t, st, retiredWordID, AnswerModeChoice, time.Now().UTC().Add(-90*time.Minute))
+	if _, err := st.db.ExecContext(ctx, `
+UPDATE progress
+SET due_at = ?
+WHERE word_id = ?
+`, formatTime(time.Now().UTC().Add(-time.Hour)), retiredWordID); err != nil {
+		t.Fatalf("update retired word due_at: %v", err)
+	}
+
+	retiredWordBefore := mustLoadWordByID(t, st, retiredWordID)
+	nextEntries := []dict.Entry{
+		{
+			Lemma:           words[0].Lemma,
+			Pos:             words[0].Pos,
+			MeaningJA:       words[0].MeaningJA,
+			Level:           words[0].Level,
+			FrequencyRank:   words[0].FrequencyRank,
+			DistractorGroup: words[0].DistractorGroup,
+			ExampleEN:       "They adopt the plan.",
+			ExampleJA:       "彼らはその計画を採用する。",
+		},
+		testEntries()[2],
+		{
+			Lemma:           "coach",
+			Pos:             "verb",
+			MeaningJA:       "指導する",
+			Level:           "core-1",
+			FrequencyRank:   400,
+			DistractorGroup: "basic-verb-action",
+			ExampleEN:       "They coach the team every weekend.",
+			ExampleJA:       "彼らは毎週末チームを指導する。",
+		},
+	}
+	if err := st.SeedWords(ctx, nextEntries, "test-v2"); err != nil {
+		t.Fatalf("SeedWords() version bump error = %v", err)
+	}
+
+	if mustWordIsActive(t, st, retiredWordID) {
+		t.Fatalf("retired word %d should be inactive after version bump", retiredWordID)
+	}
+	if keptWord := mustLoadWordByID(t, st, keptWordID); keptWord.ID != keptWordID {
+		t.Fatalf("kept word id changed: got %d want %d", keptWord.ID, keptWordID)
+	}
+	if retiredWord := mustLoadWordByID(t, st, retiredWordID); retiredWord.Lemma != retiredWordBefore.Lemma {
+		t.Fatalf("retired word history read failed: got %q want %q", retiredWord.Lemma, retiredWordBefore.Lemma)
+	}
+
+	if dueWords, err := st.ListDueWords(ctx, 10); err != nil {
+		t.Fatalf("ListDueWords() error = %v", err)
+	} else if len(dueWords) != 0 {
+		t.Fatalf("ListDueWords() returned retired due words: %+v", dueWords)
+	}
+
+	newWords, err := st.ListNewWords(ctx, 10, nil)
+	if err != nil {
+		t.Fatalf("ListNewWords() error after version bump = %v", err)
+	}
+	for _, word := range newWords {
+		if word.ID == retiredWordID {
+			t.Fatalf("retired word appeared in ListNewWords(): %+v", word)
+		}
+	}
+
+	reviewedWords, err := st.ListReviewedWordsRandom(ctx, 10)
+	if err != nil {
+		t.Fatalf("ListReviewedWordsRandom() error = %v", err)
+	}
+	if len(reviewedWords) != 1 || reviewedWords[0].ID != keptWordID {
+		t.Fatalf("ListReviewedWordsRandom() = %+v, want kept reviewed word only", reviewedWords)
+	}
+
+	verbWords, err := st.ListWordsByPOS(ctx, "verb", 10, nil)
+	if err != nil {
+		t.Fatalf("ListWordsByPOS() error = %v", err)
+	}
+	for _, word := range verbWords {
+		if word.ID == retiredWordID {
+			t.Fatalf("retired word appeared in ListWordsByPOS(): %+v", word)
+		}
+	}
+
+	correct := mustLoadWordByID(t, st, keptWordID)
+	distractors, err := st.ListDistractorCandidates(ctx, correct, 10, []int64{correct.ID})
+	if err != nil {
+		t.Fatalf("ListDistractorCandidates() error = %v", err)
+	}
+	for _, word := range distractors {
+		if word.ID == retiredWordID {
+			t.Fatalf("retired word appeared in ListDistractorCandidates(): %+v", word)
+		}
+	}
+
+	snapshot, err := st.LoadHomeSnapshot(ctx)
+	if err != nil {
+		t.Fatalf("LoadHomeSnapshot() error = %v", err)
+	}
+	if snapshot.DueCount != 0 {
+		t.Fatalf("DueCount after retirement = %d, want 0", snapshot.DueCount)
+	}
+	if snapshot.NewCount != 2 {
+		t.Fatalf("NewCount after retirement = %d, want 2", snapshot.NewCount)
+	}
+}
+
 func TestResetProgressClearsLearningHistoryOnly(t *testing.T) {
 	t.Parallel()
 
@@ -948,6 +1181,26 @@ func mustLoadProgress(t *testing.T, st *Store, wordID int64) Progress {
 	return progress
 }
 
+func mustLoadWordByID(t *testing.T, st *Store, wordID int64) Word {
+	t.Helper()
+
+	word, err := st.GetWord(context.Background(), wordID)
+	if err != nil {
+		t.Fatalf("GetWord(%d) error = %v", wordID, err)
+	}
+	return word
+}
+
+func mustWordIsActive(t *testing.T, st *Store, wordID int64) bool {
+	t.Helper()
+
+	var isActive int
+	if err := st.db.QueryRowContext(context.Background(), `SELECT is_active FROM words WHERE id = ?`, wordID).Scan(&isActive); err != nil {
+		t.Fatalf("query words.is_active for %d: %v", wordID, err)
+	}
+	return isActive != 0
+}
+
 func stableUTCNoon() time.Time {
 	now := time.Now().UTC()
 	return
time.Date(now.Year(), now.Month(), now.Day(), 12, 0, 0, 0, time.UTC) diff --git a/internal/store/word_repo.go b/internal/store/word_repo.go index ef91048..1213e08 100644 --- a/internal/store/word_repo.go +++ b/internal/store/word_repo.go @@ -114,7 +114,8 @@ SELECT w.id, w.lemma, w.pos, w.meaning_ja, w.level, w.frequency_rank, w.distractor_group, w.example_en, w.example_ja, w.source, w.created_at FROM words w JOIN progress p ON p.word_id = w.id -WHERE p.due_at IS NOT NULL AND p.due_at <= ? +WHERE w.is_active = 1 + AND p.due_at IS NOT NULL AND p.due_at <= ? ORDER BY p.due_at ASC, COALESCE(w.frequency_rank, 999999) ASC, w.id ASC LIMIT ? `, formatTime(time.Now().UTC()), limit) @@ -134,7 +135,8 @@ SELECT w.id, w.lemma, w.pos, w.meaning_ja, w.level, w.frequency_rank, w.distractor_group, w.example_en, w.example_ja, w.source, w.created_at FROM words w LEFT JOIN progress p ON p.word_id = w.id -WHERE (p.word_id IS NULL OR p.state = 'new') +WHERE w.is_active = 1 + AND (p.word_id IS NULL OR p.state = 'new') ` args := make([]any, 0, len(excludeIDs)+1) if len(excludeIDs) > 0 { @@ -170,7 +172,8 @@ FROM ( WHERE r.word_id = w.id AND r.answer_mode = ? ) THEN 1 ELSE 0 END AS write_seen FROM words w - WHERE EXISTS ( + WHERE w.is_active = 1 + AND EXISTS ( SELECT 1 FROM reviews r WHERE r.word_id = w.id AND r.answer_mode = ? @@ -213,7 +216,8 @@ func (s *Store) ListReviewedWordsRandom(ctx context.Context, limit int) ([]Word, idRows, err := s.db.QueryContext(ctx, ` SELECT DISTINCT r.word_id FROM reviews r -WHERE r.word_id IS NOT NULL +JOIN words w ON w.id = r.word_id +WHERE r.word_id IS NOT NULL AND w.is_active = 1 ORDER BY RANDOM() LIMIT ? `, limit) @@ -276,7 +280,7 @@ func (s *Store) ListWordsByPOS(ctx context.Context, pos string, limit int, exclu SELECT id, lemma, pos, meaning_ja, level, frequency_rank, distractor_group, example_en, example_ja, source, created_at FROM words -WHERE pos = ? +WHERE is_active = 1 AND pos = ? 
` args := make([]any, 0, len(excludeIDs)+2) args = append(args, pos) @@ -305,7 +309,7 @@ func (s *Store) ListDistractorCandidates(ctx context.Context, correct Word, limi SELECT id, lemma, pos, meaning_ja, level, frequency_rank, distractor_group, example_en, example_ja, source, created_at FROM words -WHERE pos = ? +WHERE is_active = 1 AND pos = ? AND meaning_ja != ? ` args := make([]any, 0, len(excludeIDs)+8) @@ -353,14 +357,17 @@ LIMIT ? } func (s *Store) AbandonActiveSession(ctx context.Context) error { - _, err := s.db.ExecContext(ctx, ` -UPDATE sessions -SET status = ?, finished_at = ? -WHERE status = ? -`, SessionStatusAbandoned, formatTime(time.Now().UTC()), SessionStatusActive) + tx, err := s.db.BeginTx(ctx, nil) if err != nil { + return fmt.Errorf("begin abandon active session: %w", err) + } + if err := abandonActiveSessionsTx(ctx, tx, time.Now().UTC()); err != nil { + _ = tx.Rollback() return fmt.Errorf("abandon active session: %w", err) } + if err := tx.Commit(); err != nil { + return fmt.Errorf("commit abandon active session: %w", err) + } return nil } @@ -376,6 +383,17 @@ WHERE status = ? AND mode = ? return nil } +func abandonActiveSessionsTx(ctx context.Context, tx *sql.Tx, finishedAt time.Time) error { + if _, err := tx.ExecContext(ctx, ` +UPDATE sessions +SET status = ?, finished_at = ? +WHERE status = ? +`, SessionStatusAbandoned, formatTime(finishedAt), SessionStatusActive); err != nil { + return fmt.Errorf("update active sessions to abandoned: %w", err) + } + return nil +} + func (s *Store) CreateSession(ctx context.Context, mode, answerMode string, items []SessionItemPlan) (SessionRecord, []SessionItem, error) { if len(items) == 0 { return SessionRecord{}, nil, fmt.Errorf("create session: no session items") @@ -757,8 +775,10 @@ func (s *Store) countDueWords(ctx context.Context) (int, error) { var count int if err := s.db.QueryRowContext(ctx, ` SELECT COUNT(*) -FROM progress -WHERE due_at IS NOT NULL AND due_at <= ? 
+FROM progress p +JOIN words w ON w.id = p.word_id +WHERE w.is_active = 1 + AND p.due_at IS NOT NULL AND p.due_at <= ? `, formatTime(time.Now().UTC())).Scan(&count); err != nil { return 0, fmt.Errorf("count due words: %w", err) } @@ -771,7 +791,8 @@ func (s *Store) countNewWords(ctx context.Context) (int, error) { SELECT COUNT(*) FROM words w LEFT JOIN progress p ON p.word_id = w.id -WHERE p.word_id IS NULL OR p.state = 'new' +WHERE w.is_active = 1 + AND (p.word_id IS NULL OR p.state = 'new') `).Scan(&count); err != nil { return 0, fmt.Errorf("count new words: %w", err) } diff --git a/internal/store/word_write.go b/internal/store/word_write.go index b25f922..0634287 100644 --- a/internal/store/word_write.go +++ b/internal/store/word_write.go @@ -20,6 +20,17 @@ type upsertWordCounts struct { updated int } +type existingWordRow struct { + id int64 + isActive bool +} + +type syncCoreWordCounts struct { + inserted int + updated int + retired int +} + func NormalizeImportSource(name string) (string, error) { sourceName := strings.TrimSpace(name) if sourceName == "" { @@ -87,6 +98,19 @@ func (s *Store) countWordsBySource(ctx context.Context, source string) (int, err return count, nil } +func (s *Store) countWordsBySourceActive(ctx context.Context, source string, isActive bool) (int, error) { + activeValue := 0 + if isActive { + activeValue = 1 + } + + var count int + if err := s.db.QueryRowContext(ctx, `SELECT COUNT(*) FROM words WHERE source = ? 
AND is_active = ?`, source, activeValue).Scan(&count); err != nil { + return 0, fmt.Errorf("count words for source %q active=%t: %w", source, isActive, err) + } + return count, nil +} + func countWordsBySourceTx(ctx context.Context, tx *sql.Tx, source string) (int, error) { var count int if err := tx.QueryRowContext(ctx, `SELECT COUNT(*) FROM words WHERE source = ?`, source).Scan(&count); err != nil { @@ -110,7 +134,11 @@ func deleteWordsBySourceTx(ctx context.Context, tx *sql.Tx, source string) (int, } func upsertWordsTx(ctx context.Context, tx *sql.Tx, source string, entries []dict.Entry) (upsertWordCounts, error) { - existingIDs, err := listExistingWordIDsBySourceTx(ctx, tx, source) + if err := validateEntryKeys(entries, source); err != nil { + return upsertWordCounts{}, err + } + + existingRows, err := listExistingWordRowsBySourceTx(ctx, tx, source) if err != nil { return upsertWordCounts{}, err } @@ -125,8 +153,9 @@ frequency_rank, distractor_group, example_en, example_ja, -source -) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?) +source, +is_active +) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) `) if err != nil { return upsertWordCounts{}, fmt.Errorf("prepare word insert for source %q: %w", source, err) @@ -137,12 +166,16 @@ source updateStmt, err := tx.PrepareContext(ctx, ` UPDATE words -SET meaning_ja = ?, +SET lemma = ?, + pos = ?, + meaning_ja = ?, level = ?, frequency_rank = ?, distractor_group = ?, example_en = ?, - example_ja = ? + example_ja = ?, + source = ?, + is_active = ? WHERE id = ? `) if err != nil { @@ -154,15 +187,15 @@ WHERE id = ? 
counts := upsertWordCounts{} for _, entry := range entries { - existingID, exists := existingIDs[wordKey(entry)] + existing, exists := existingRows[wordKey(entry)] if exists { - if err := updateWordTx(ctx, updateStmt, existingID, entry); err != nil { + if err := updateWordTx(ctx, updateStmt, existing.id, source, true, entry); err != nil { return upsertWordCounts{}, err } counts.updated++ continue } - if err := insertWordTx(ctx, insertStmt, source, entry); err != nil { + if err := insertWordTx(ctx, insertStmt, source, true, entry); err != nil { return upsertWordCounts{}, err } counts.inserted++ @@ -171,9 +204,94 @@ WHERE id = ? return counts, nil } -func listExistingWordIDsBySourceTx(ctx context.Context, tx *sql.Tx, source string) (map[string]int64, error) { +func syncCoreWordsTx(ctx context.Context, tx *sql.Tx, entries []dict.Entry) (syncCoreWordCounts, error) { + if err := validateEntryKeys(entries, WordSourceCore); err != nil { + return syncCoreWordCounts{}, err + } + + existingRows, err := listExistingWordRowsBySourceTx(ctx, tx, WordSourceCore) + if err != nil { + return syncCoreWordCounts{}, err + } + + insertStmt, err := tx.PrepareContext(ctx, ` +INSERT INTO words ( +lemma, +pos, +meaning_ja, +level, +frequency_rank, +distractor_group, +example_en, +example_ja, +source, +is_active +) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?) +`) + if err != nil { + return syncCoreWordCounts{}, fmt.Errorf("prepare core word insert: %w", err) + } + defer func() { + _ = insertStmt.Close() + }() + + updateStmt, err := tx.PrepareContext(ctx, ` +UPDATE words +SET lemma = ?, + pos = ?, + meaning_ja = ?, + level = ?, + frequency_rank = ?, + distractor_group = ?, + example_en = ?, + example_ja = ?, + source = ?, + is_active = ? +WHERE id = ? 
+`) + if err != nil { + return syncCoreWordCounts{}, fmt.Errorf("prepare core word update: %w", err) + } + defer func() { + _ = updateStmt.Close() + }() + + counts := syncCoreWordCounts{} + for _, entry := range entries { + key := wordKey(entry) + existing, exists := existingRows[key] + if exists { + if err := updateWordTx(ctx, updateStmt, existing.id, WordSourceCore, true, entry); err != nil { + return syncCoreWordCounts{}, err + } + delete(existingRows, key) + counts.updated++ + continue + } + if err := insertWordTx(ctx, insertStmt, WordSourceCore, true, entry); err != nil { + return syncCoreWordCounts{}, err + } + counts.inserted++ + } + + for _, existing := range existingRows { + if !existing.isActive { + continue + } + // False positive: the SQL is static and the id stays parameterized. + // nosemgrep + if _, err := tx.ExecContext(ctx, `UPDATE words SET is_active = 0 WHERE id = ?`, existing.id); err != nil { + return syncCoreWordCounts{}, fmt.Errorf("retire core word %d: %w", existing.id, err) + } + counts.retired++ + } + + return counts, nil +} + +func listExistingWordRowsBySourceTx(ctx context.Context, tx *sql.Tx, source string) (map[string]existingWordRow, error) { rows, err := tx.QueryContext(ctx, ` -SELECT id, lemma, IFNULL(pos, '') +SELECT id, lemma, IFNULL(pos, ''), is_active FROM words WHERE source = ? `, source) @@ -184,33 +302,60 @@ WHERE source = ? 
_ = rows.Close() }() - existingIDs := make(map[string]int64) + existingRows := make(map[string]existingWordRow) for rows.Next() { var ( - id int64 - lemma string - pos string + id int64 + lemma string + pos string + isActive int ) - if err := rows.Scan(&id, &lemma, &pos); err != nil { + if err := rows.Scan(&id, &lemma, &pos, &isActive); err != nil { return nil, fmt.Errorf("scan existing word for source %q: %w", source, err) } key := strings.ToLower(strings.TrimSpace(lemma) + "\x00" + strings.TrimSpace(pos)) - if _, exists := existingIDs[key]; !exists { - existingIDs[key] = id + if _, exists := existingRows[key]; exists { + return nil, fmt.Errorf("duplicate word key %s already exists in source %q", formatWordKeyLabel(lemma, pos), source) } + existingRows[key] = existingWordRow{id: id, isActive: isActive != 0} } if err := rows.Err(); err != nil { return nil, fmt.Errorf("iterate existing words for source %q: %w", source, err) } - return existingIDs, nil + return existingRows, nil } func wordKey(entry dict.Entry) string { return strings.ToLower(strings.TrimSpace(entry.Lemma) + "\x00" + strings.TrimSpace(entry.Pos)) } -func insertWordTx(ctx context.Context, stmt *sql.Stmt, source string, entry dict.Entry) error { +func validateEntryKeys(entries []dict.Entry, source string) error { + seen := make(map[string]struct{}, len(entries)) + for _, entry := range entries { + key := wordKey(entry) + if _, exists := seen[key]; exists { + return fmt.Errorf("duplicate word key %s in source %q", formatWordKeyLabel(entry.Lemma, entry.Pos), source) + } + seen[key] = struct{}{} + } + return nil +} + +func formatWordKeyLabel(lemma, pos string) string { + trimmedLemma := strings.TrimSpace(lemma) + trimmedPos := strings.TrimSpace(pos) + if trimmedPos == "" { + trimmedPos = "no-pos" + } + return fmt.Sprintf("%q [%s]", trimmedLemma, trimmedPos) +} + +func insertWordTx(ctx context.Context, stmt *sql.Stmt, source string, isActive bool, entry dict.Entry) error { rank := 
nullableFrequencyRank(entry.FrequencyRank) + activeValue := 0 + if isActive { + activeValue = 1 + } if _, err := stmt.ExecContext( ctx, nullableString(entry.Lemma), @@ -222,21 +367,30 @@ func insertWordTx(ctx context.Context, stmt *sql.Stmt, source string, entry dict nullableString(entry.ExampleEN), nullableString(entry.ExampleJA), source, + activeValue, ); err != nil { return fmt.Errorf("insert word %s for source %q: %w", entry.Lemma, source, err) } return nil } -func updateWordTx(ctx context.Context, stmt *sql.Stmt, id int64, entry dict.Entry) error { +func updateWordTx(ctx context.Context, stmt *sql.Stmt, id int64, source string, isActive bool, entry dict.Entry) error { + activeValue := 0 + if isActive { + activeValue = 1 + } if _, err := stmt.ExecContext( ctx, + nullableString(entry.Lemma), + nullableString(entry.Pos), nullableString(entry.MeaningJA), nullableString(entry.Level), nullableFrequencyRank(entry.FrequencyRank), nullableString(entry.DistractorGroup), nullableString(entry.ExampleEN), nullableString(entry.ExampleJA), + source, + activeValue, id, ); err != nil { return fmt.Errorf("update word %s (%d): %w", entry.Lemma, id, err) diff --git a/tasks/feature_spec.md b/tasks/feature_spec.md index 764245f..d5ea681 100644 --- a/tasks/feature_spec.md +++ b/tasks/feature_spec.md @@ -620,6 +620,37 @@ - Simplified per-screen layouts - Comprehensive handling of data-dependent long-text overflow +--- + +## 2026-04-11 issue #49: Preserving SRS progress across bundled core updates + +### Goal + +- Even when a `dict_version` update replaces the bundled core vocabulary, add only the new words while preserving the `word_id` and SRS progress of existing core words. +- Keep `reset --reseed` as the only explicit destructive reset, and make the core sync at normal startup non-destructive. + +### Required Behavior + +- Core word identity is determined by `strings.ToLower(strings.TrimSpace(lemma) + "\x00" + strings.TrimSpace(pos))`, matching the current store implementation. +- Add `is_active` to `words`; during the core sync on a version bump, update matching rows in place under the same `id`, insert only new words, and set `is_active = 0` on old core words that disappeared. +- When updating metadata, align `lemma` / `pos` to the canonical values from the embedded core as well, in addition to `meaning_ja` / `level` / `frequency_rank` / `distractor_group` / `example_*`.
+- `SeedWords()` keeps the initial seed when core has not been loaded yet and the no-op for an unchanged version, and on a version bump alone performs a core diff sync instead of a destructive reset. +- If an active session exists at the time of a version bump, mark it `abandoned` inside the sync transaction, because the current design does not snapshot the question payload and the question text and distractors would drift on resume. +- `reset --reseed` still clears all learning tables, deletes every `source='core'` row regardless of active/inactive, and reseeds the bundled core. +- Queries used for future planning target active core words only. This covers `ListDueWords` / `ListNewWords` / `ListWriteBasicCandidates` / `ListReviewedWordsRandom` / `ListWordsByPOS` / `ListDistractorCandidates` / `countDueWords` / `countNewWords`. +- `GetWord()`, export, session summary, and history lookups can still read inactive core words, so past sessions and the review history of retired words are not broken. +- `doctor` does not treat retired core words as dictionary corruption and reports active and retired core words separately. For a read-only DB on a pre-006 schema, it falls back via schema introspection so that it can report the migration drift alone. +- No DB-level unique constraint is added this time. Same-source duplicates are detected by validation at sync time and by the duplicate check in `doctor`. + +### Acceptance + +- After a version bump, core words with the same normalized `lemma/pos` keep their `word_id`, and `progress` / `reviews` and the completed/abandoned session history are not lost. +- A session that was active at the time of the version bump becomes `abandoned` and is no longer eligible for resume. +- Only newly added core words are added as `new` candidates, and retired core words no longer appear in new session planning. +- Existing session items, review history, and exports that reference retired core words are not broken. +- `reset --reseed` continues to clear all learning tables and fully reseed the bundled core. +- `go test ./internal/store ./internal/quiz ./cmd/eitango` and `go test ./...` pass. + ### Required Behavior - When `RootModel.width` is known and falls below the minimum width for the current screen/overlay, switch to the narrow message instead of drawing the normal UI. diff --git a/tasks/lessons.md b/tasks/lessons.md index 10efedd..f411c83 100644 --- a/tasks/lessons.md +++ b/tasks/lessons.md @@ -39,6 +39,7 @@ - When adding or changing a panel's outer margin, verify in the same test not only the appearance but also that it fits within the terminal width. Do not treat a margin as done once it is increased; check that it stays consistent with the width calculation. - Treat a panel's horizontal spacing as two separate things: the outer margin and the padding inside the border. Do not increase just one of them without confirming which one the user is asking for.
- When adding a derived session for fallback or practice, do not blur it by reusing the same mode name as a normal session. Store whether SRS is updated and the feedback contract in the mode or an equivalent persistent state, so that the behavior can be reproduced after resume. +- Even if a non-destructive sync of the dictionary or metadata preserves `word_id`, question consistency is not preserved when the active session reloads live `words` rows and distractor candidates. For sessions without a question snapshot, either mark them `abandoned` on a version bump or pin the prompt/choices needed for resume on the session side. - When adjusting UI spacing, keep the outer margin at 0 by default and satisfy visual spacing requests with padding inside the border first. Touch the outer margin only when the layout intent is explicit. - Treat `docs/specs/tui-layout.md` as the source of truth for the adaptive TUI rendering contract, and do not duplicate spec text such as width tiers, omission targets, or the list of legacy fallback screens into `tasks/lessons.md`. - When touching the adaptive TUI, check wrapping of primary information, the `width == 0` legacy fallback, and regression comparisons including `results` / `stats` / `keymap editor` against `docs/specs/tui-layout.md`. @@ -54,3 +55,5 @@ - Spot check `status=approved` in the parallel review `approved_slice_*.tsv` files before merging. If you only filter the rows and leave status as `candidate`, `merge_parallel_reviews.py` fails immediately and you cannot proceed to apply even though review and audit are complete. - `merge_parallel_reviews.py` broadly picks up `approved_slice*.tsv` in the slice dir, so do not place a retry TSV in the same directory under a name like `approved_slice_XX_retry.tsv`. If you create a re-review for comparison, put it in a separate directory or always move it out before merging. - For polysemous core vocabulary such as `touchy` / `eject` / `decorator`, do not drift toward an easier derived sense; prefer the first sense in a learner dictionary. Classify a person's enduring traits under `quality-adjective`, as with `obedient` / `meek`, and do not treat them as temporary states. +- In code covered by the Codacy / Semgrep SQL injection rules, do not build SQL with `fmt.Sprintf` or string concatenation, even for branches that only take internal constants. For branches such as `is_active`, select a static query with switch/if so that the string passed to the DB call is itself fixed. +- If Codacy / Semgrep false positives remain after switching to static queries and bind parameters, do not fall back to file-level exclusion; add `// nosemgrep` and a reason comment to the affected line only, keeping the suppression local. diff --git a/tasks/todo.md b/tasks/todo.md index 983a03d..6357961 100644 --- a/tasks/todo.md +++ b/tasks/todo.md @@ -7,6 +7,18 @@ - [x] Audit the representative `verb` translations and `distractor_group` drift in the newly added band and apply the needed corrections - [x] Verify consistency with `validate --embedded-core` and `doctor` +# 2026-04-11 issue #49: Preserving SRS progress across bundled core updates +
+- [x] Add the non-destructive core sync spec and verification conditions to `tasks/feature_spec.md` +- [x] Add a migration for `words.is_active` and implement store helpers for core sync +- [x] Replace the destructive reseed in `SeedWords()` on version bump with a diff sync +- [x] Update the future planning queries and count aggregations to active words only +- [x] Make `doctor` handle retired core words and the pre-006 schema fallback +- [x] Keep the regression test that preserves the destructive contract of `reset --reseed`, and switch the version bump test to one that checks SRS preservation +- [x] Update ADR-0002 for the new core dictionary lifecycle +- [x] Get `go test ./internal/store ./internal/quiz ./cmd/eitango` and `go test ./...` passing +- [x] As a review follow-up, update the issue #49 spec notes to match the regression test and implementation that mark the active session `abandoned` on version bump + # 5k initial release TODO This file tracks only the active backlog for the initial OSS release.