63 changes: 63 additions & 0 deletions TESTING.md
@@ -34,6 +34,69 @@ Reports at: `app/build/reports/tests/testDebugUnitTest/`

Requires a connected device or running emulator. Reports at: `app/build/reports/androidTests/connected/`

### OCR Fixture Regression

Labelled OCR fixtures live under `app/src/test/resources/images/`, with expectations in
`app/src/test/resources/images/labels.md`.

Run the fixture-backed OCR regression on a connected device or emulator with:

```bash
./gradlew connectedDebugAndroidTest \
  -Pandroid.testInstrumentationRunnerArguments.class=com.receiptscanner.data.ocr.OcrFixtureRegressionTest#extractedDataMatchesLabelledReceipts
```

The suite exercises the real on-device OCR pipeline end to end:

1. Load receipt images from test assets
2. Run ML Kit text recognition on-device
3. Parse store, total, date, and card-last-four
4. Compare extracted data to the labelled expectations
5. Fail only if baseline accuracy thresholds regress
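
Steps 4 and 5 boil down to comparing each extracted record with its label and turning the matches into per-field accuracies. The following is only a minimal sketch of that idea: `ReceiptFields`, `Scorecard`, and `score` are hypothetical names introduced for illustration, and plain string equality stands in for whatever normalization the real test applies.

```kotlin
// Hypothetical record type for this sketch; the project's real classes may differ.
data class ReceiptFields(
    val store: String?,
    val total: String?,
    val date: String?,
    val cardLastFour: String?,
)

// Hypothetical container for the five accuracies, each in the range 0.0..1.0.
data class Scorecard(
    val store: Double,
    val total: Double,
    val date: Double,
    val cardLastFour: Double,
    val exactRecord: Double,
)

// Each pair is (labelled expectation, extracted result) for one fixture image.
fun score(pairs: List<Pair<ReceiptFields, ReceiptFields>>): Scorecard {
    require(pairs.isNotEmpty()) { "No fixtures to score" }

    // Fraction of fixtures for which the given comparison holds.
    fun accuracy(matches: (ReceiptFields, ReceiptFields) -> Boolean): Double =
        pairs.count { (expected, actual) -> matches(expected, actual) }.toDouble() / pairs.size

    return Scorecard(
        store = accuracy { e, a -> e.store == a.store },
        total = accuracy { e, a -> e.total == a.total },
        date = accuracy { e, a -> e.date == a.date },
        cardLastFour = accuracy { e, a -> e.cardLastFour == a.cardLastFour },
        // A record counts as exact only when all four fields match.
        exactRecord = accuracy { e, a -> e == a },
    )
}

fun main() {
    val label = ReceiptFields("Acme", "12.34", "2024-01-05", "1234")
    val pairs = listOf(
        label to label,                       // perfect extraction
        label to label.copy(total = "12.84"), // one misread digit in the total
    )
    println(score(pairs))
}
```

Exact record accuracy is the strictest metric because one wrong field fails the whole record, which fits its gate being the lowest of the five.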

Current baseline gates:

- store accuracy >= 60%
- total accuracy >= 65%
- date accuracy >= 65%
- card last-four accuracy >= 75%
- exact record accuracy >= 20%
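
As a rough illustration of how gates like these could be enforced, the sketch below checks measured accuracies against the minimums listed above. The threshold values come from this list; the map keys, `baselineGates`, and `gateFailures` are assumptions made for the example and are not taken from the test source.

```kotlin
// Thresholds copied from the list above; the metric keys are invented for this sketch.
val baselineGates: Map<String, Double> = mapOf(
    "store" to 0.60,
    "total" to 0.65,
    "date" to 0.65,
    "cardLastFour" to 0.75,
    "exactRecord" to 0.20,
)

// Returns one message per metric that is missing or below its gate.
// An empty list means the run passes.
fun gateFailures(
    measured: Map<String, Double>,
    gates: Map<String, Double> = baselineGates,
): List<String> =
    gates.mapNotNull { (metric, minimum) ->
        val actual = measured[metric] ?: return@mapNotNull "$metric: no measurement"
        if (actual < minimum) "$metric: $actual < $minimum" else null
    }

fun main() {
    val measured = mapOf(
        "store" to 0.72,
        "total" to 0.61, // below the 0.65 gate
        "date" to 0.70,
        "cardLastFour" to 0.80,
        "exactRecord" to 0.25,
    )
    gateFailures(measured).forEach(::println)
}
```

Collecting every failing metric before reporting, rather than stopping at the first, keeps the output useful when several fields regress at once.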

On failure, the test prints an OCR fixture scorecard with per-image mismatches so parser and
normalization regressions can be tuned incrementally.

#### Dumping OCR Results for Offline Iteration

The on-device test can serialize ML Kit's raw OCR output to JSON files. This lets you iterate
on `ReceiptParser` heuristics using the fast JVM-only test (seconds) instead of re-running the
full on-device pipeline (minutes).

**Step 1 — Dump OCR results from device:**

```bash
./gradlew connectedDebugAndroidTest \
  -Pandroid.testInstrumentationRunnerArguments.class=com.receiptscanner.data.ocr.OcrFixtureRegressionTest#extractedDataMatchesLabelledReceipts \
  -Pandroid.testInstrumentationRunnerArguments.ocrFixture.dumpOcrResults=true \
  -Pandroid.testInstrumentationRunnerArguments.ocrFixture.enforceThresholds=false
```

**Step 2 — Pull OCR cache to your machine:**

```bash
adb pull /sdcard/Android/data/com.receiptscanner/files/ocr-cache/ app/src/test/resources/ocr-cache/
```

**Step 3 — Iterate on parser heuristics using the JVM test:**

```bash
./gradlew testDebugUnitTest --tests "com.receiptscanner.data.ocr.ReceiptParserFixtureTest"
```

This test replays the cached ML Kit output through `ReceiptParser` and prints a scorecard.
Change parser logic → re-run → check scores → repeat. No emulator required.

Re-dump OCR results (Steps 1 and 2) only when you change image preprocessing.
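
The two `ocrFixture.*` arguments in Step 1 reach the test as instrumentation runner arguments. The sketch below shows one way such flags could be parsed. `OcrFixtureOptions` and `parseOptions` are hypothetical names, and the defaults (no dump, thresholds enforced) are an assumption inferred from the commands in this section rather than confirmed from the test source.

```kotlin
// Hypothetical holder for the two runner arguments used in Step 1.
data class OcrFixtureOptions(
    val dumpOcrResults: Boolean,
    val enforceThresholds: Boolean,
)

// `args` stands in for the instrumentation arguments. In an instrumented test the lookup
// could be backed by InstrumentationRegistry.getArguments().getString(key).
fun parseOptions(args: (String) -> String?): OcrFixtureOptions = OcrFixtureOptions(
    // Assumed default: do not dump unless explicitly requested.
    dumpOcrResults = args("ocrFixture.dumpOcrResults")?.toBoolean() ?: false,
    // Assumed default: enforce the baseline gates unless explicitly disabled.
    enforceThresholds = args("ocrFixture.enforceThresholds")?.toBoolean() ?: true,
)

fun main() {
    // Mirrors the flags passed in Step 1.
    val stepOneArgs = mapOf(
        "ocrFixture.dumpOcrResults" to "true",
        "ocrFixture.enforceThresholds" to "false",
    )
    println(parseOptions { key -> stepOneArgs[key] })

    // With no arguments, the assumed defaults apply.
    println(parseOptions { null })
}
```

Taking the lookup as a function keeps the parsing logic testable on the JVM without an Android `Bundle`.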

### All Tests

```bash
26 changes: 26 additions & 0 deletions app/build.gradle.kts
@@ -1,3 +1,5 @@
import org.gradle.api.tasks.Sync

plugins {
    alias(libs.plugins.android.application)
    alias(libs.plugins.kotlin.android)
@@ -6,6 +8,12 @@ plugins {
    alias(libs.plugins.ksp)
}

val sharedTestAssetsDir = layout.buildDirectory.dir("generated/shared-test-assets")
val syncSharedTestAssets by tasks.registering(Sync::class) {
    from("src/test/resources")
    into(sharedTestAssetsDir)
}

android {
    namespace = "com.receiptscanner"
    compileSdk = 35
@@ -44,6 +52,18 @@ android {
    packaging {
        resources {
            excludes += "/META-INF/{AL2.0,LGPL2.1}"
            excludes += "/META-INF/LICENSE.md"
            excludes += "/META-INF/LICENSE-notice.md"
        }
    }

    sourceSets {
        getByName("test") {
            java.srcDir("src/sharedTest/kotlin")
        }
        getByName("androidTest") {
            java.srcDir("src/sharedTest/kotlin")
            assets.srcDir(sharedTestAssetsDir.get().asFile)
        }
    }
}
@@ -116,6 +136,7 @@ dependencies {
    // Unit Testing
    testImplementation(libs.bundles.testing)
    testRuntimeOnly(libs.junit5.engine)
    testRuntimeOnly(libs.junit5.platform.launcher)

    // Android / Instrumentation Testing
    androidTestImplementation(libs.bundles.android.testing)
@@ -124,3 +145,8 @@ dependencies {
tasks.withType<Test> {
    useJUnitPlatform()
}

tasks.matching { it.name.startsWith("merge") && it.name.endsWith("AndroidTestAssets") }
    .configureEach {
        dependsOn(syncSharedTestAssets)
    }
@@ -0,0 +1,34 @@
package com.receiptscanner.data.ocr

import android.content.Context
import com.receiptscanner.testing.receiptfixtures.ReceiptFixture
import com.receiptscanner.testing.receiptfixtures.ReceiptFixtureLabelsParser
import java.io.File

object OcrFixtureImageLoader {

    fun loadFixturesFromAssets(context: Context): List<ReceiptFixture> {
        val labels = context.assets.open("images/labels.md")
            .bufferedReader()
            .use { it.readText() }
        val fixtures = ReceiptFixtureLabelsParser.parse(labels)
        val availableAssets = context.assets.list("images")?.toSet().orEmpty()

        fixtures.forEach { fixture ->
            require(availableAssets.contains(fixture.imageName)) {
                "Missing fixture image asset: ${fixture.imageName}"
            }
        }

        return fixtures
    }

    fun copyImageToCache(assetContext: Context, storageContext: Context, imageName: String): File {
        val fixtureDir = File(storageContext.cacheDir, "ocr-fixtures").apply { mkdirs() }
        val tempFile = File(fixtureDir, imageName)
        assetContext.assets.open("images/$imageName").use { input ->
            tempFile.outputStream().use { output -> input.copyTo(output) }
        }
        return tempFile
    }
}