This file provides guidance to WARP (warp.dev) when working with code in this repository.
matchy-java is a Java wrapper for matchy, providing JNA bindings to the native matchy library for fast IoC (Indicator of Compromise) matching.
Status: ✅ Core Features Complete
- JNA bindings implemented (NativeLoader, MatchyLibrary, NativeStructs)
- Core wrapper classes implemented (QueryResult, DatabaseStats, OpenOptions, MatchyException)
- Database and DatabaseBuilder classes implemented
- Extractor API implemented (Extractor, ExtractedMatch, ExtractFlags, ItemType)
- Unit tests for Database, DatabaseBuilder, and Extractor
matchy-java/
├── java/ # Maven project
│ ├── pom.xml # Maven configuration (Java 11, JNA 5.14.0, Gson)
│ └── src/main/java/com/matchylabs/matchy/
│ ├── jna/ # Package-private JNA bindings layer
│ │ ├── NativeLoader.java # Platform detection & native library loading
│ │ ├── MatchyLibrary.java # JNA interface to matchy C API
│ │ └── NativeStructs.java # JNA structure mappings (MatchyResult, etc.)
│ ├── Database.java # Main public API for querying
│ ├── DatabaseBuilder.java # Database builder API
│ ├── Extractor.java # IoC extraction from text
│ ├── ExtractedMatch.java # Single extracted match
│ ├── ExtractFlags.java # Extraction type flags
│ ├── ItemType.java # Enum of extractable item types
│ ├── QueryResult.java # Query result wrapper
│ ├── DatabaseStats.java # Database statistics
│ ├── OpenOptions.java # Database open configuration
│ └── MatchyException.java # Exception type
├── native/matchy/ # Git submodule to matchy core (Rust)
└── examples/ # Usage examples (empty, coming soon)
- Clean Java API: JNA bindings are package-private; users interact with idiomatic Java classes
- Resource safety: Database handles and native resources must be properly managed
- Builder pattern: Use builders for configuration (OpenOptions, DatabaseBuilder)
- Exception handling: Convert C error codes to MatchyException with descriptive messages
- Platform independence: NativeLoader handles Windows/macOS/Linux + x86_64/aarch64 detection
- Java: JDK 11+ (configured in pom.xml)
- Maven: 3.6+ for building
- Rust: Required to build the native matchy library (see native/matchy/WARP.md)
- Git submodules: initialize with git submodule update --init --recursive
The Java wrapper requires the compiled native matchy library. Build it from the submodule:
# Build native matchy library (release mode for production)
cd native/matchy
cargo build --release
# The library will be at:
# - macOS: native/matchy/target/release/libmatchy.dylib
# - Linux: native/matchy/target/release/libmatchy.so
# - Windows: native/matchy/target/release/matchy.dll
For development of the native library, see native/matchy/WARP.md.
cd java
# Compile Java code
mvn compile
# Run tests
mvn test
# Package JAR (includes sources and javadoc)
mvn package
# Install to local Maven repository
mvn install
# Clean build artifacts
mvn clean
# Run specific test class
mvn test -Dtest=DatabaseTest
# Run specific test method
mvn test -Dtest=DatabaseTest#testQuery
# Run with debug output
mvn test -X
# Format code (only if the Spotless plugin is configured in pom.xml)
mvn spotless:apply
# Generate Javadoc
mvn javadoc:javadoc
# Open generated docs (open is macOS-specific; use xdg-open on Linux)
open java/target/site/apidocs/index.html
# Check for dependency updates
mvn versions:display-dependency-updates
The NativeLoader class handles platform-specific library loading:
- Platform detection: Detects OS (linux/macos/windows) and architecture (x86_64/aarch64)
- Resource lookup: Searches for library at /native/{platform}/{libname} in JAR
- Temporary extraction: Extracts library to temp file for System.load()
- One-time initialization: Library loads once on first MatchyLibrary.INSTANCE access
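The detection and resource-lookup steps above can be sketched in plain Java. This is an illustration only, not the actual NativeLoader code: the class name PlatformSketch, the normalization rules, and the assumption that {platform} takes the form {os}-{arch} are all hypothetical. Only the OS and architecture names and the /native/{platform}/{libname} layout come from the description above.

```java
import java.util.Locale;

// Hypothetical sketch; the real NativeLoader may normalize names differently.
class PlatformSketch {
    /** Maps an os.name value to linux, macos, or windows. */
    static String normalizeOs(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("linux")) return "linux";
        // Check macOS before Windows: "darwin" also contains "win".
        if (os.contains("mac") || os.contains("darwin")) return "macos";
        if (os.contains("win")) return "windows";
        throw new UnsupportedOperationException("Unsupported OS: " + osName);
    }

    /** Maps an os.arch value to x86_64 or aarch64. */
    static String normalizeArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "x86_64";
        if (arch.equals("aarch64") || arch.equals("arm64")) return "aarch64";
        throw new UnsupportedOperationException("Unsupported architecture: " + osArch);
    }

    /** Library file name per OS, matching the cargo build outputs. */
    static String libraryFileName(String os) {
        switch (os) {
            case "macos": return "libmatchy.dylib";
            case "windows": return "matchy.dll";
            default: return "libmatchy.so";
        }
    }

    /** Builds the classpath resource path, ASSUMING {platform} is "{os}-{arch}". */
    static String resourcePath(String osName, String osArch) {
        String os = normalizeOs(osName);
        return "/native/" + os + "-" + normalizeArch(osArch) + "/" + libraryFileName(os);
    }

    public static void main(String[] args) {
        System.out.println(resourcePath(
                System.getProperty("os.name"), System.getProperty("os.arch")));
    }
}
```

Passing the system properties in as arguments keeps the mapping testable on a single machine, which helps when debugging platform detection for an OS you do not have at hand.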
For development, the native library must be available in one of these ways:
- In JAR resources at src/main/resources/native/{platform}/
- In system library path (LD_LIBRARY_PATH, DYLD_LIBRARY_PATH, PATH)
- Specified via -Djna.library.path=path/to/native/libs
Example for development:
# Set library path for testing
export LD_LIBRARY_PATH=$PWD/native/matchy/target/release:$LD_LIBRARY_PATH # Linux
export DYLD_LIBRARY_PATH=$PWD/native/matchy/target/release:$DYLD_LIBRARY_PATH # macOS
# Or use Maven property
mvn test -Djna.library.path=../native/matchy/target/release
JNA structures in NativeStructs.java must match C struct layouts exactly:
// Must match C struct field order and types
@Structure.FieldOrder({"found", "prefix_len", "_data_cache", "_db_ref"})
static class MatchyResult extends Structure {
    // C: bool (1 byte). JNA maps Java boolean to a 4-byte int by default,
    // so a 1-byte C bool is declared as byte (nonzero means true).
    public byte found;
    public byte prefix_len;     // C: uint8_t
    public Pointer _data_cache; // C: void*
    public Pointer _db_ref;     // C: void*
}
Critical: If the C struct changes in matchy.h, update the corresponding Java structure immediately. Field order, types, and padding must match exactly.
Convert C error codes to Java exceptions with descriptive messages:
// In wrapper class (Database.java)
Pointer dbPtr = MatchyLibrary.INSTANCE.matchy_open(path);
if (dbPtr == null) {
    throw new MatchyException("Failed to open database: " + path);
}

// For functions returning error codes
int result = MatchyLibrary.INSTANCE.matchy_builder_add(builder, key, data);
if (result != MatchyLibrary.MATCHY_SUCCESS) {
    throw new MatchyException("Failed to add entry: error code " + result);
}
Native resources must be explicitly freed. Use try-with-resources when Database implements AutoCloseable:
// Database.java should implement AutoCloseable
public class Database implements AutoCloseable {
    private Pointer nativeHandle; // set to null once closed

    @Override
    public synchronized void close() {
        // Idempotent: a second close() must not free the handle twice
        if (nativeHandle != null) {
            MatchyLibrary.INSTANCE.matchy_close(nativeHandle);
            nativeHandle = null;
        }
    }
}

// User code
try (Database db = Database.open("threats.mxy")) {
    QueryResult result = db.query("192.168.1.1");
    // ...
} // Automatically closed
Use builders for complex configuration:
Database db = Database.builder()
    .path("threats.mxy")
    .cacheCapacity(100_000)
    .autoReload(true)
    .build();

// Or with OpenOptions
OpenOptions options = OpenOptions.defaults()
    .cacheCapacity(50_000)
    .noCache(); // Fluent API
Database db = Database.open("threats.mxy", options);
Use the Extractor to find IoCs (Indicators of Compromise) in text:
// Extract all supported types
try (Extractor extractor = Extractor.create(ExtractFlags.ALL)) {
    List<ExtractedMatch> matches = extractor.extract(
        "Contact user@example.com at 192.168.1.1 about evil.com");
    for (ExtractedMatch match : matches) {
        System.out.println(match.getItemType() + ": " + match.getValue());
    }
}

// Extract only specific types
int flags = ExtractFlags.DOMAINS | ExtractFlags.IPV4 | ExtractFlags.IPV6;
try (Extractor extractor = Extractor.create(flags)) {
    List<ExtractedMatch> matches = extractor.extract(text);
    // Only domains and IPs are extracted
}
Supported extraction types (see ExtractFlags):
- DOMAINS - domain names (e.g., "example.com")
- EMAILS - email addresses
- IPV4 / IPV6 - IP addresses
- HASHES - file hashes (MD5, SHA1, SHA256, SHA384, SHA512)
- BITCOIN / ETHEREUM / MONERO - cryptocurrency addresses
- ALL - extract everything
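Because the flags are combined with |, they behave as a bitmask. The sketch below shows how such flags compose and how membership is tested. The class FlagSketch and its bit values are hypothetical and chosen only for illustration; the real constants in ExtractFlags may use different values.

```java
// Hypothetical bit values for illustration only; the real ExtractFlags constants may differ.
class FlagSketch {
    static final int DOMAINS = 1;      // bit 0
    static final int EMAILS  = 1 << 1; // bit 1
    static final int IPV4    = 1 << 2; // bit 2

    /** True when every bit of wanted is set in flags. */
    static boolean has(int flags, int wanted) {
        return (flags & wanted) == wanted;
    }

    public static void main(String[] args) {
        int flags = DOMAINS | IPV4;
        System.out.println(has(flags, DOMAINS)); // prints "true"
        System.out.println(has(flags, EMAILS));  // prints "false"
    }
}
```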
Based on README.md, these are the next implementation priorities:
Create .github/workflows/ci.yml:
- Build native library (Rust) for Linux/macOS/Windows
- Build Java wrapper (Maven)
- Run tests
- Generate Javadoc
- Create release artifacts with platform-specific native libraries
Create example programs in examples/:
- BasicQuery.java - simple query example
- BuildDatabase.java - building database example
- BatchProcessing.java - processing files example
The native matchy library is a Git submodule:
# Initialize submodule (first time)
git submodule update --init --recursive
# Update submodule to latest upstream
cd native/matchy
git pull origin main
cd ../..
git add native/matchy
git commit -m "Update matchy submodule"
# Pull updates including submodules
git pull --recurse-submodules
Important: When making changes to matchy-java that depend on new matchy C API features, coordinate submodule updates carefully.
- Test each public method in isolation
- Mock native calls where possible (or use test databases)
- Test error conditions (invalid paths, null pointers, corrupt data)
- Test resource cleanup (no memory leaks)
- Test with real .mxy database files
- Test IP queries (IPv4, IPv6, CIDRs)
- Test string queries (exact matches, glob patterns)
- Test database building end-to-end
- Test multi-threaded access (thread safety of native library)
Store test databases in src/test/resources/:
- test-ips.mxy - IP address database
- test-strings.mxy - string/pattern database
- test-combined.mxy - mixed IP and string data
JNA automatically handles structure padding, but if queries return incorrect data, verify:
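- Field order matches C struct exactly (@Structure.FieldOrder)
- Java types match C types (a 1-byte C bool→byte, because JNA maps Java boolean to a 4-byte int by default; uint8_t→byte; size_t→a pointer-sized integer rather than NativeLong, which is only 32 bits on 64-bit Windows)
- Results are consistent across platforms (padding differs between architectures)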
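To see why a type mismatch produces wrong data rather than an obvious failure, it helps to compute field offsets by hand. The sketch below applies the usual natural-alignment rule (each field aligned to its own size) to the MatchyResult fields. LayoutSketch is a hypothetical helper, not JNA's implementation, and the 8-byte pointer size is an assumption for 64-bit platforms.

```java
import java.util.Arrays;

// Sketch of C struct layout under natural alignment; not JNA's actual algorithm.
class LayoutSketch {
    /** Rounds offset up to the next multiple of align (align must be a power of two). */
    static int alignUp(int offset, int align) {
        return (offset + align - 1) & -align;
    }

    /**
     * Returns each field's offset followed by the total struct size,
     * assuming every field's alignment equals its size.
     */
    static int[] layout(int... fieldSizes) {
        int[] out = new int[fieldSizes.length + 1];
        int offset = 0;
        int maxAlign = 1;
        for (int i = 0; i < fieldSizes.length; i++) {
            offset = alignUp(offset, fieldSizes[i]);
            out[i] = offset;
            offset += fieldSizes[i];
            maxAlign = Math.max(maxAlign, fieldSizes[i]);
        }
        out[fieldSizes.length] = alignUp(offset, maxAlign); // trailing padding
        return out;
    }

    public static void main(String[] args) {
        // bool (1), uint8_t (1), void* (8), void* (8)
        System.out.println(Arrays.toString(layout(1, 1, 8, 8))); // [0, 1, 8, 16, 24]
        // Same struct with the first field treated as a 4-byte int
        System.out.println(Arrays.toString(layout(4, 1, 8, 8))); // [0, 4, 8, 16, 24]
    }
}
```

In both layouts the total size is 24 bytes, but prefix_len moves from offset 1 to offset 4. A mismatch like this passes size checks and silently reads padding bytes instead of the real field.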
If you see UnsatisfiedLinkError:
- Check native library was built (ls native/matchy/target/release/)
- Verify library is in JAR resources or library path
- Check platform detection in NativeLoader (add debug logging)
- Use -Djna.library.path for development
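A quick way to run the first two checks from Java is to look for the platform-specific file name in each directory of the configured search path. The helper below is a hypothetical diagnostic, not part of matchy-java. It relies only on the JDK's System.mapLibraryName, which returns libmatchy.so, libmatchy.dylib, or matchy.dll depending on the OS.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of matchy-java.
class LibraryPathCheck {
    /** Returns the library file in every directory of searchPath that contains it. */
    static List<Path> findLibrary(String searchPath) {
        String fileName = System.mapLibraryName("matchy");
        List<Path> hits = new ArrayList<>();
        if (searchPath == null || searchPath.isEmpty()) {
            return hits;
        }
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) continue;
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                hits.add(candidate);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        String path = System.getProperty("jna.library.path");
        System.out.println("jna.library.path = " + path);
        System.out.println("found: " + findLibrary(path));
    }
}
```

An empty result means the directory list is wrong or the library was not built for the current platform.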
Native resources must be freed:
- Always call matchy_close() on database handles
- Call matchy_free_result() for result pointers (if used)
- Call matchy_free_string() for returned strings
- Implement AutoCloseable on Database for RAII-style cleanup
- Write tests that check resource cleanup
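One way to test cleanup without the native library is to replace the native free call with a counter and assert that it runs exactly once, even if close() is called twice. The sketch below is a stand-in, not the real Database class: CloseOnceSketch and its freeCalls counter are hypothetical, and the counter increment marks where matchy_close() would be called.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Stand-in resource: the "native free" is a counter so cleanup can be asserted.
class CloseOnceSketch implements AutoCloseable {
    final AtomicInteger freeCalls = new AtomicInteger();
    private final AtomicBoolean closed = new AtomicBoolean(false);

    @Override
    public void close() {
        // compareAndSet lets only the first close() reach the simulated free
        if (closed.compareAndSet(false, true)) {
            freeCalls.incrementAndGet(); // stands in for matchy_close(handle)
        }
    }

    boolean isClosed() {
        return closed.get();
    }

    public static void main(String[] args) {
        CloseOnceSketch res = new CloseOnceSketch();
        try (CloseOnceSketch r = res) {
            // use the resource here
        }
        res.close(); // second close must be a no-op
        System.out.println(res.freeCalls.get()); // prints "1"
    }
}
```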
The native matchy library is thread-safe for queries (read operations). Other operations differ:
- Database opening/closing must be synchronized
- Builder operations are NOT thread-safe
- Cache operations are thread-safe (internally synchronized in C)
Test multi-threaded access explicitly.
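A multi-threaded test can follow the pattern below: run the same query from several threads and count results that differ from the known answer. This is a sketch under assumptions. ConcurrentQuerySketch is a hypothetical helper, and the query function is a stand-in so the example runs without the native library; a real test would pass something like db::query together with an expected QueryResult.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Sketch of a concurrent-query check; the query function is a stand-in.
class ConcurrentQuerySketch {
    /** Runs the query from several threads; returns how many results differ from expected. */
    static <R> int mismatches(Function<String, R> query, String key, R expected,
                              int threads, int callsPerThread) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Integer>> tasks = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                tasks.add(() -> {
                    int bad = 0;
                    for (int i = 0; i < callsPerThread; i++) {
                        if (!expected.equals(query.apply(key))) bad++;
                    }
                    return bad;
                });
            }
            int total = 0;
            for (Future<Integer> f : pool.invokeAll(tasks)) {
                total += f.get(); // rethrows any exception raised in a worker
            }
            return total;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in lookup: string length instead of a database query.
        int bad = mismatches(k -> k.length(), "192.168.1.1", 11, 8, 1000);
        System.out.println("mismatches: " + bad);
    }
}
```

Collecting results through Future.get() matters here: an exception thrown inside a worker thread would otherwise be lost and the test would pass silently.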
The pom.xml is configured for Maven Central deployment:
- GPG signing plugin configured
- Source and javadoc JARs generated
- OSSRH repository configured
Before deploying:
- Create Sonatype JIRA account
- Request com.matchylabs groupId
- Configure GPG keys
- Set credentials in ~/.m2/settings.xml
- Deploy: mvn clean deploy
- Native matchy: See native/matchy/WARP.md for Rust library development
- C API Reference: See native/matchy/include/matchy.h for complete C API
- Matchy Book: https://matchylabs.github.io/matchy/ for user documentation
- Rust API docs: cd native/matchy && cargo doc --open
Apache-2.0 (matching native matchy library)