feat: Add comprehensive autocorrection system with batch import and m…#95
Draft
pzauner wants to merge 2 commits intopalsoftware:mainfrom
…ulti-language support
This PR introduces a complete autocorrection system with automatic loading of
accent/umlaut replacement rules and a user-friendly batch import interface.
## 🎯 Core Features
### 1. Automatic Autocorrection Loading (51,537 rules across 6 languages)
**MOST IMPORTANT**: Autocorrection rules are now **automatically loaded** from
`assets/common/autocorrect/` at app startup - no user action required!
Generated rules are immediately active after installation:
- **German (DE)**: 7,911 rules (restore ä, ö, ü, ß in words typed with ae, oe, ue, ss)
- **French (FR)**: 15,449 rules (restore é, è, à, ç, etc. in words typed without accents)
- **Spanish (ES)**: 10,683 rules (restore á, ñ, etc. in words typed without accents)
- **English (EN)**: 121 rules (preserves 51 existing manual rules)
- **Italian (IT)**: 1,816 rules (preserves 18 existing manual rules)
- **Polish (PL)**: 15,472 rules
**How it works:**
```
App Launch → AutoCorrector.loadCorrections()
→ Loads assets/common/autocorrect/auto_corrections_{lang}.json
→ User types "ueber" → automatically becomes "über" ✨
```
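The lookup step in the flow above can be sketched in a few lines. This is a Python illustration only, not the app's Kotlin code: `load_rules` and `correct_word` are hypothetical names, and the flat `{"from": "to"}` file layout is assumed from the Notes section.

```python
import json


def load_rules(json_text: str) -> dict[str, str]:
    """Parse a flat {"from": "to"} rule file, keeping only string values."""
    rules = json.loads(json_text)
    return {key: value for key, value in rules.items() if isinstance(value, str)}


def correct_word(word: str, rules: dict[str, str]) -> str:
    """Return the replacement for a typed word, or the word itself if no rule matches."""
    return rules.get(word, word)


rules = load_rules('{"ueber": "\\u00fcber", "fuer": "f\\u00fcr"}')
print(correct_word("ueber", rules))
print(correct_word("Haus", rules))
```

Words without a matching rule pass through unchanged, which is also what happens if a rule file is empty.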
### 2. Batch Import UI (AutoCorrectionImportActivity)
User-friendly interface for importing custom autocorrection rules:
**Features:**
- File picker integration (JSON import)
- Language selector: Choose target language before import
- Real-time preview: filename, rule count, language code/name
- Validation with detailed error messages
- Progress indicator and success/error feedback
- Automatic activation: Imported rules are enabled immediately
**Access:** Settings → Auto-Correction → "Regeln Batch-Import" (German UI label, "Batch-import rules")
**JSON Format:**
```json
{
"language": "de",
"name": "Deutsch",
"rules": {
"ueber": "über",
"fuer": "für"
}
}
```
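As a rough illustration of the checks a validator for this format might run, here is a Python sketch. It is not the code in `AutoCorrectionImportActivity`; the function name `parse_rule_file` and the exact error messages are assumptions.

```python
import json


def parse_rule_file(text: str) -> tuple[str, str, dict[str, str]]:
    """Validate an import file and return (language, name, rules)."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc.msg}") from exc
    if not isinstance(data, dict):
        raise ValueError("top level must be a JSON object")

    language = data.get("language")
    if not isinstance(language, str) or not language:
        raise ValueError('"language" must be a non-empty string')

    rules = data.get("rules")
    if not isinstance(rules, dict) or not rules:
        raise ValueError('"rules" must be a non-empty object')
    for key, value in rules.items():
        if not key or not isinstance(value, str):
            raise ValueError(f"rule {key!r} must map a non-empty key to a string")

    # Assumption: "name" is optional and falls back to the language code.
    name = data.get("name")
    if not isinstance(name, str) or not name:
        name = language
    return language, name, rules
```

Raising one `ValueError` per problem keeps the message specific enough to show directly in an error dialog.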
### 3. Delete All Feature (AutoCorrectEditScreen)
Safely delete all autocorrection rules for a specific language:
**Features:**
- Delete button (🗑️ icon) next to Add button in top bar
- Only visible when rules exist
- Confirmation dialog showing rule count and language
- Warning: "This action cannot be undone"
- Immediate reload after deletion
**Use Cases:**
- Undo incorrect batch imports
- Switch between different rule sets
- Testing and development
### 4. Performance Optimizations
**Problem:** UI freeze when displaying 7,000+ rules
**Solution:** Replaced `Column + forEach` with `LazyColumn + items()`
**Results:**
- Smooth scrolling for 7,000+ rules
- Instant load time
- Virtualized rendering (only visible items rendered)
**Changed Files:**
- `AutoCorrectEditScreen.kt`: LazyColumn implementation
### 5. Universal Autocorrection Generator (generate_autocorrections.py)
Python script for generating autocorrection rules from base dictionaries.
**Features:**
- Multi-language support: de, fr, es, en, it, pl
- Language-specific transformations:
  - German: ä→ae, ö→oe, ü→ue, ß→ss
  - Others: Generic accent removal (NFD normalization)
- Preserves manually defined rules by default
- Outputs directly to `assets/common/autocorrect/`
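The two transformations listed above can be sketched as follows. This is an illustration under assumptions, not the contents of `generate_autocorrections.py`: the helper names are hypothetical, and the uppercase umlaut mappings are an addition not stated in this PR.

```python
import unicodedata

# German substitutions from the list above; uppercase variants are an assumption.
GERMAN_MAP = {
    "\u00e4": "ae",  # ä
    "\u00f6": "oe",  # ö
    "\u00fc": "ue",  # ü
    "\u00df": "ss",  # ß
    "\u00c4": "Ae",  # Ä
    "\u00d6": "Oe",  # Ö
    "\u00dc": "Ue",  # Ü
}


def strip_german(word: str) -> str:
    """Replace umlauts and ß with their ASCII digraphs."""
    composed = unicodedata.normalize("NFC", word)
    return "".join(GERMAN_MAP.get(ch, ch) for ch in composed)


def strip_accents(word: str) -> str:
    """Decompose with NFD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def make_rule(word: str, lang: str):
    """Return (typed_form, corrected_word), or None if the word has no accents."""
    plain = strip_german(word) if lang == "de" else strip_accents(word)
    return (plain, word) if plain != word else None
```

One caveat with the generic path: NFD only helps for letters that have a canonical decomposition. Polish `ł`, for example, has none, so an NFD-only approach leaves it untouched and would need an explicit mapping.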
**Usage:**
```bash
# Generate for all supported languages
python3 scripts/generate_autocorrections.py
# Generate for specific languages only
python3 scripts/generate_autocorrections.py de fr
# Overwrite existing rules (don't preserve manual edits)
python3 scripts/generate_autocorrections.py --no-preserve
```
**Output:**
```
DE: 7,911 rules → auto_corrections_de.json (254 KB)
FR: 15,449 rules → auto_corrections_fr.json (439 KB)
ES: 10,683 rules → auto_corrections_es.json (299 KB)
EN: 121 rules → auto_corrections_en.json (3 KB)
IT: 1,816 rules → auto_corrections_it.json (48 KB)
PL: 15,472 rules → auto_corrections_pl.json (440 KB)
```
## 📋 Technical Changes
### New Files
- `app/src/main/java/.../AutoCorrectionImportActivity.kt` (469 lines)
  - Batch import UI with file picker, validation, language selector
- `scripts/generate_autocorrections.py` (228 lines)
  - Universal generator for multi-language autocorrection rules
- `scripts/convert_dictionaries.py` (moved from root)
  - Dictionary format converter (organization cleanup)
### Modified Files
- `app/src/main/AndroidManifest.xml`
  - Registered `AutoCorrectionImportActivity`
- `app/src/main/java/.../AutoCorrectionCategoryScreen.kt`
  - Added "Regeln Batch-Import" button with cloud upload icon
  - Links to new import activity
- `app/src/main/java/.../AutoCorrectEditScreen.kt`
  - **Performance**: Replaced `Column + verticalScroll + forEach` with `LazyColumn + items()`
  - **Feature**: Added "Delete All" button with confirmation dialog
  - **Import**: Added the `DeleteSweep` icon, shown in the error color
- `app/src/main/assets/common/autocorrect/auto_corrections_*.json` (6 files)
  - Populated with generated rules (51,537 total)
  - Preserved existing manual rules where applicable
- `scripts/README.md`
  - Added documentation for `generate_autocorrections.py`
  - Added documentation for `convert_dictionaries.py`
  - Organized into sections: Main Scripts, Legacy Scripts
- `.gitignore`
  - Added `.idea/deploymentTargetSelector.xml`
  - Added `app/build.properties`
  - Prevents IDE-specific files from being committed
## 🚀 How to Use (For Developers)
### Generate Autocorrection Rules
```bash
# Generate for all languages (recommended after dictionary updates)
cd /path/to/project
python3 scripts/generate_autocorrections.py
# Or for specific languages only
python3 scripts/generate_autocorrections.py de fr
```
### Regenerate After Dictionary Changes
```bash
# When base dictionaries are updated:
python3 scripts/generate_autocorrections.py --no-preserve
```
This overwrites existing rules. Use with caution if manual rules exist.
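The difference between the default run and `--no-preserve` can be pictured as a dictionary merge. The sketch below is an assumption about the semantics, not the script's actual code: `merge_rules` is a hypothetical name, and it assumes a preserved manual rule wins when it shares a key with a generated rule.

```python
def merge_rules(generated: dict[str, str], manual: dict[str, str],
                preserve: bool = True) -> dict[str, str]:
    """Combine generated and manual rules.

    preserve=True  -> manual rules are kept and win on key conflicts (assumed).
    preserve=False -> only generated rules survive, as with --no-preserve.
    """
    if not preserve:
        return dict(generated)
    merged = dict(generated)
    merged.update(manual)  # manual entries overwrite generated ones
    return merged
```

Under this reading, a manual rule that is not backed by the base dictionary is lost after a `--no-preserve` run, which is why the flag deserves caution.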
## 📱 How to Use (For End Users)
### Option 1: Automatic (Default)
**No action required!** Autocorrection rules are automatically active:
1. Install/update Pastiera
2. Start typing: "ueber" → "über", "cafe" → "café"
3. Works immediately for all 6 supported languages
### Option 2: Batch Import (Custom Rules)
1. Open Pastiera Settings
2. Navigate to: **Settings → Auto-Correction**
3. Tap: **"Regeln Batch-Import"** ("Batch-import rules")
4. Select JSON file from device
5. (Optional) Change target language
6. Tap: **"Alle Regeln importieren"** ("Import all rules")
7. Done! Custom rules override defaults
**JSON Format Example:**
```json
{
"language": "de",
"name": "Meine Regeln",
"rules": {
"hallo": "Hallo",
"danke": "Danke!"
}
}
```
### Option 3: Delete All Rules (Per Language)
1. Settings → Auto-Correction → Select language (e.g., Deutsch)
2. Tap 🗑️ icon in top-right (next to + button)
3. Confirm deletion in dialog
4. All rules for that language are removed
## 🔄 System Architecture
### Loading Priority
```
1. Custom Rules (from Batch Import)
↓ (if exists, skip step 2)
2. Asset Rules (automatic, built-in)
↓ (loaded from assets/common/autocorrect/)
3. Runtime Application
```
**Key Point:** Custom imports override asset files. This allows users to:
- Customize built-in rules
- Add new languages
- Test different rule sets
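The priority order can be sketched as a small resolver. This is a Python illustration, not the logic in `AutoCorrector`: plain dictionaries stand in for SharedPreferences and the asset folder, `resolve_rules` is a hypothetical name, and treating an empty custom set as "not present" is an assumption.

```python
def resolve_rules(custom_store: dict[str, dict[str, str]],
                  asset_rules: dict[str, dict[str, str]],
                  language: str) -> dict[str, str]:
    """Pick the rule set for a language: custom import first, assets otherwise."""
    custom = custom_store.get(f"auto_correct_custom_{language}")
    if custom:
        # A custom import replaces the built-in set entirely (step 2 is skipped).
        return dict(custom)
    # Fall back to built-in rules; a language without an asset file gets no rules.
    return dict(asset_rules.get(language, {}))
```

Because the custom set replaces rather than extends the built-in one, a user who imports a two-rule file for German would lose the generated German rules until the custom set is deleted.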
### File Locations
```
Built-in Rules (Automatic):
└─ app/src/main/assets/common/autocorrect/
├─ auto_corrections_de.json
├─ auto_corrections_fr.json
└─ ...
Custom Rules (User Imports):
└─ SharedPreferences
└─ "auto_correct_custom_{language}"
```
## 🧪 Testing
### Manual Testing Checklist
- [ ] Install APK
- [ ] Type "ueber" in any text field → Should become "über"
- [ ] Type "cafe" → Should become "café"
- [ ] Settings → Auto-Correction → Batch Import
- [ ] Import test JSON file
- [ ] Verify rules are applied immediately
- [ ] Delete all rules for a language
- [ ] Confirm rules are removed
### Test JSON File
Create `test_rules.json`:
```json
{
"language": "de",
"name": "Test",
"rules": {
"test": "TEST",
"hallo": "HALLO"
}
}
```
## 📊 Statistics
- **14 files changed**
- **52,401 insertions**, 218 deletions
- **51,537 autocorrection rules** generated
- **469 lines** of new UI code (AutoCorrectionImportActivity)
- **228 lines** of Python generation code
- **6 languages** supported out of the box
## 🎯 Benefits
### For Users
✅ **Faster typing**: No need to access special characters
✅ **Multi-language**: Works for German umlauts, French accents, etc.
✅ **Automatic**: No setup required, works immediately
✅ **Customizable**: Import custom rules via JSON
✅ **Safe**: Delete all feature with confirmation
### For Developers
✅ **Maintainable**: Single script generates all languages
✅ **Extensible**: Easy to add new languages
✅ **Preserved**: Manual rules are kept by default
✅ **Documented**: Complete README in scripts/
✅ **Organized**: All scripts in scripts/ folder
## 🔍 Breaking Changes
None. This is a new feature with backward compatibility:
- Existing custom rules are preserved
- App works without autocorrection files (fallback)
- No changes to existing autocorrection behavior
## Notes
- Asset files are loaded first at app startup (see `AutoCorrector.loadCorrections()`)
- Custom imports take precedence over asset files
- Generated rules preserve existing manual edits by default
- All autocorrection files use simple JSON: `{"from": "to"}`
- Language codes follow standard: de, en, fr, es, it, pl
## Acknowledgments
- Preserves existing manual rules in FR (40), EN (51), IT (18)
- Generator script respects frequency data for collision resolution
- UI follows Material Design 3 guidelines
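Collisions arise when two dictionary words reduce to the same unaccented form, for example French "côte" and "côté", which both become "cote". One plausible way to resolve them by frequency is sketched below. The function name, the input shape, and the tie-breaking rule (first word seen wins) are assumptions, not a description of the actual script.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Decompose with NFD, then drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def build_rules(words_with_freq: list[tuple[str, int]]) -> dict[str, str]:
    """Map each unaccented form to its most frequent accented word."""
    best: dict[str, tuple[int, str]] = {}  # plain form -> (frequency, word)
    for word, freq in words_with_freq:
        plain = strip_accents(word)
        if plain == word:
            continue  # no accents, so no rule is needed
        if plain not in best or freq > best[plain][0]:
            best[plain] = (freq, word)
    return {plain: word for plain, (_, word) in best.items()}
```

With made-up frequencies of 120 for "côte" and 300 for "côté", the only rule emitted for "cote" would point to "côté".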