Skip to content

feature: support simple filtering operation#209

Open
Abolfazlrezaei93 wants to merge 3 commits intodevelopfrom
feature/support-simple-filtering-operation
Open

feature: support simple filtering operation#209
Abolfazlrezaei93 wants to merge 3 commits intodevelopfrom
feature/support-simple-filtering-operation

Conversation

@Abolfazlrezaei93
Copy link
Collaborator

🚀 feature: support simple filtering operation

Add GeoJsonFilterOperation for point-radius filtering of GeoJSON features + tests + DHN examples

📌 Summary

This PR adds a new CITYData operation to filter GeoJSON features by centroid distance from a target point (longitude, latitude) within a radius (meters). It also adds tests and runnable example queries for DHN roads/buildings.

🎯 Why

Current filtering is mostly string/JSON-key based.
For spatial workflows (roads/buildings), we needed a geospatial filter over GeoJSON FeatureCollection data.

✅ What Was Added

  • 🧠 New operation:
    • src/main/java/ca/concordia/encs/citydata/operations/GeoJsonFilterOperation.java
  • 🧪 New unit tests:
    • src/test/java/ca/concordia/encs/citydata/operations/GeoJsonFilterOperationTest.java
  • 🔎 Discovery test update (/operations/list assertions):
    • src/test/java/ca/concordia/encs/citydata/core/DiscoveryRoutesTest.java
  • 🗺️ New example queries:
    • docs/examples/queries/dhnRoadsGeoJsonFilterByPointRadius.json
    • docs/examples/queries/dhnBuildingsGeoJsonFilterByPointRadius.json

⚙️ How GeoJsonFilterOperation Works

  • Input type: ArrayList<String> (compatible with CKANProducer output)
  • Parses each string as JSON
  • If object is a GeoJSON FeatureCollection, iterates features
  • Computes feature centroid from geometry coordinates
  • Computes haversine distance to target point
  • Keeps only features where distance <= radiusMeters
  • Validates parameter presence and valid ranges
  • Supports nested coordinate structures + GeometryCollection

🧩 New Parameters

  • centerLongitude (Double, range [-180, 180])
  • centerLatitude (Double, range [-90, 90])
  • radiusMeters (Double, must be > 0)

🔄 Intended Workflow

  1. Use CKANMetadataProducer to find resourceId
  2. Use CKANProducer to fetch resource content
  3. Apply GeoJsonFilterOperation with point + radius
  4. Save filtered output as GeoJSON from API response

📝 Example Query Shape

{
  "use": "ca.concordia.encs.citydata.producers.CKANProducer",
  "withParams": [
    {"name":"url","value":"https://ngci.encs.concordia.ca/ckan/api/3"},
    {"name":"resourceId","value":"<resource-id>"}
  ],
  "apply": [
    {
      "name":"ca.concordia.encs.citydata.operations.GeoJsonFilterOperation",
      "withParams": [
        {"name":"centerLongitude","value":-73.74085},
        {"name":"centerLatitude","value":45.51895},
        {"name":"radiusMeters","value":200.0}
      ]
    }
  ]
}

🧪 Tests Added (Coverage)

  • testFilterByRadiusUsingFeatureCentroids
    • ✅ inside-radius feature kept
    • ✅ outside-radius feature removed
  • testMissingParametersThrowsError
    • ✅ missing required params throws IllegalArgumentException
  • Discovery assertions in DiscoveryRoutesTest
    • /operations/list includes:
      • GeoJsonFilterOperation
      • centerLongitude
      • centerLatitude
      • radiusMeters

⚠️ Known Limitation

  • Very large CKAN resources (e.g., DHN buildings ~907MB) may fail in current CKANProducer flow due to memory-heavy fetching.
  • This PR adds spatial filtering logic, not large-file streaming support.

🧱 Commit Breakdown

  • f1432a8feat(operations): add GeoJsonFilterOperation for centroid radius filtering
  • 9599bf1test(operations): add GeoJsonFilterOperation tests and discovery assertions
  • 6a1e9d1docs(queries): add DHN point-radius GeoJSON filter examples

@yann-gael
Copy link
Member

yann-gael commented Mar 5, 2026

Great description and contribution, thanks @Abolfazlrezaei93 !

The problem with large files and the CKANProducer is known, and we're working on a solution. The difficulty is that the whole chain of producers, operations, consumers, Hub, etc. must be "stream-ready".

@yann-gael
Copy link
Member

I like how you validate the parameters first with this.validateParameters();.
I like how you return early, like on Line 146 and others.
Is it really necessary to make a deep copy on Line 98 with final JsonObject output = inputObject.deepCopy();?
Although it would change the logic, instead of returning null on Lines 105, 112, and 118 with return null;, you could create a NullObject maybe.

PS. I would only work for data on Earth ;-)

Copy link
Collaborator

@MinetteMeyo MinetteMeyo Mar 13, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you provided a Unit Test. Unless you don't want any user to run it through an HTTP request, it's ok. But since you'll finally merge it into the project, I suggest you update the test into an integration test.

The issue: you instantiate and call the operation directly final GeoJsonFilterOperation operation = new GeoJsonFilterOperation(); final ArrayList<String> output = operation.apply(input);
insteal of wiring the producer's (CKANProducer here, according to your operation and queries) output through the SequentialRunner into the operation input as it is done in all the integration tests. This will provide a real query payload with CKANProducer as the data source and GeoJsonOperation as the operation to apply, and this is consistent with how the other operation is tested.

You can get inspired by MergeOperationTest.java

Copy link
Collaborator

@MinetteMeyo MinetteMeyo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great job, please consider Yann's comment to your code first, and update the test.

Globally, the code is fine, nothing wrong. You followed the same logic we are using in the codebase, which is great, and your PR could be merge as it is. However, I suggest you make the change your unit test to make it an integration test before I approve the PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

feature New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants