The current implementation provides exact match and wildcard match on label boundaries. Anything that matches, is discarded from downstream processing (local analyse/events).
The first feature enhancement is to add more types of matching. This is somewhat challenging when using the current DAWG library, since it doesn't support adding data to keys. Turning the DAWG into a "transducer" with data values adds significant overhead.
A suggested approach is to define multiple symbols as equivalent to '.' in domain names to indicate multiple types of wildcards.
A fork of Steve Hanov's DAWG library implements some of these ideas.
The current implementation provides exact match and wildcard match on label boundaries. Anything that matches, is discarded from downstream processing (local analyse/events).
The first feature enhancement is to add more types of matching. This is somewhat challenging when using the current DAWG library, since it doesn't support adding data to keys. Turning the DAWG into a "transducer" with data values adds significant overhead.
A suggested approach is to define multiple symbols as equivalent to '.' in domain names to indicate multiple types of wildcards.
A fork of Steve Hanov's DAWG library implements some of these ideas.