- Base Fuseki with Jena Commands
- GeoSPARQL
- RDF Delta Fuseki
- RDF Delta Server
The image is available as ghcr.io/kurrawong/fuseki:<version>, where <version> combines the Jena version with this container image's build number.
For example, ghcr.io/kurrawong/fuseki:5.6.0-0 is built on Jena Fuseki 5.6.0 and the 0 indicates the build number of this container image. If we release a new build that's still based on Jena 5.6.0, the build number will be incremented to 1 to form ghcr.io/kurrawong/fuseki:5.6.0-1.
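The tag scheme above can be split mechanically, for example when a deployment script needs to compare build numbers. The following is a minimal sketch, assuming tags always have the form <jena version>-<build number> with a three-part Jena version. The names ImageTag, parse_tag and next_build_tag are hypothetical helpers and not part of this repository.

```python
import re
from typing import NamedTuple


class ImageTag(NamedTuple):
    """Parsed form of an image tag such as '5.6.0-0' (hypothetical helper type)."""
    jena_version: str
    build: int


# Assumption: a three-part Jena version, a hyphen, then an integer build number.
TAG_PATTERN = re.compile(r"^(?P<jena>\d+\.\d+\.\d+)-(?P<build>\d+)$")


def parse_tag(tag: str) -> ImageTag:
    """Split a tag into the Jena version and the image build number."""
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"unexpected tag format: {tag!r}")
    return ImageTag(match["jena"], int(match["build"]))


def next_build_tag(tag: str) -> str:
    """Return the tag of the next build on the same Jena version."""
    parsed = parse_tag(tag)
    return f"{parsed.jena_version}-{parsed.build + 1}"
```

With these helpers, parse_tag("5.6.0-0") yields the Jena version 5.6.0 and build 0, and next_build_tag("5.6.0-0") yields 5.6.0-1, matching the example above.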
This image builds and runs on Java 21.
See the tagged images here.
To make loading and managing data easier, it is recommended to install the kurra CLI.
uv tool install kurra
task fuseki:build
task fuseki:up
This will enable the Fuseki UI at http://localhost:3030/
A test database is configured in testdata/config-geosparql.ttl. It has all features enabled by default. You can disable them by setting the following properties to false:
# some GeoSPARQL settings. See https://jena.apache.org/documentation/geosparql/geosparql-fuseki.html
geosparql:inference true ; # GeoSPARQL RDFS schema and inferencing (adds additional statements to the dataset)
geosparql:queryRewrite true ; # Simplifies queries, relies on applyDefaultGeometry
geosparql:applyDefaultGeometry true ; # Makes the dataset less dependent on one serialization. Adds additional geo:hasSerialization statements to the dataset
geosparql:indexEnabled true ; # Enable caching of re-usable data to improve query performance
geosparql:validateGeometryLiterals true ; # Logs warnings when adding invalid geometry
With Fuseki up and running, you can create this dataset using the following command:
kurra db create http://localhost:3030 --config ./testdata/config-geosparql.ttl
You'll see a warning in the docker logs of the fuseki service:
WARN GeoAssembler :: Dataset empty. Spatial Index not constructed. Server will require restarting after adding data and any updates to build Spatial Index.
We can add some data and restart the server:
kurra db upload ./testdata/data-geosparql.ttl http://localhost:3030/test-geosparql
task fuseki:restart
Now you should see that the spatial index was created:
SpatialIndex :: Saving Spatial Index - Completed: /fuseki/databases/test-geosparql/spatial.index
To verify that the dataset is working, go to http://localhost:3030/#/dataset/test-geosparql/query and try some GeoSPARQL queries.
Useful tools to construct and query WKT geometries: https://www.geometrymapper.com/ and https://wktmap.com/
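Queries can also be sent from a script instead of the UI. The sketch below uses only the Python standard library and the standard SPARQL 1.1 Protocol (a form-encoded POST with a query parameter). The endpoint URL http://localhost:3030/test-geosparql is an assumption: the actual endpoint name depends on the fuseki:Service definition in the dataset config. build_query_request and run_select are illustrative names.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the dataset accepts SPARQL queries at this URL; adjust it to
# match the endpoint configured for your fuseki:Service.
ENDPOINT = "http://localhost:3030/test-geosparql"


def build_query_request(endpoint: str, query: str) -> urllib.request.Request:
    """Build a SPARQL 1.1 Protocol request (query via URL-encoded POST)."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Accept": "application/sparql-results+json",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )


def run_select(endpoint: str, query: str) -> list:
    """Send a SELECT query and return the list of result bindings."""
    request = build_query_request(endpoint, query)
    with urllib.request.urlopen(request, timeout=30) as response:
        payload = json.load(response)
    return payload["results"]["bindings"]


# Usage (requires a running server):
#   for row in run_select(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 5"):
#       print(row)
```

Building the request separately from sending it keeps the protocol details easy to inspect without a running server.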
Find all addresses within a certain area:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
BIND("POLYGON ((152.685242 -27.161808, 152.698975 -27.829361, 153.492737 -27.829361, 153.435059 -27.178912, 152.685242 -27.161808))"^^geo:wktLiteral AS ?polygon)
?address geo:hasGeometry / geo:asWKT ?point .
FILTER(geof:sfWithin(?point, ?polygon))
}
# returns
# <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e>
# <https://linked.data.gov.au/dataset/qld-addr/address/beb30200-2988-5c0a-942b-36cd2138805a>
Note that thanks to the applyDefaultGeometry and inference options, the following also works:
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
BIND("POLYGON ((152.685242 -27.161808, 152.698975 -27.829361, 153.492737 -27.829361, 153.435059 -27.178912, 152.685242 -27.161808))"^^geo:wktLiteral AS ?polygon)
?address geo:hasDefaultGeometry / geo:hasSerialization ?point .
FILTER(geof:sfWithin(?point, ?polygon))
}
# returns
# <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e>
# <https://linked.data.gov.au/dataset/qld-addr/address/beb30200-2988-5c0a-942b-36cd2138805a>
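When the polygon comes from user input (for example, a shape drawn on a map), the WKT literal in the queries above can be generated rather than typed by hand. This is a minimal sketch, assuming the input is a sequence of (longitude, latitude) pairs in the same order as the literal above. polygon_wkt and within_polygon_query are hypothetical helper names.

```python
from string import Template
from typing import Iterable, Tuple

# Same query shape as above, with the polygon literal as a placeholder.
QUERY_TEMPLATE = Template("""\
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
  BIND("$polygon"^^geo:wktLiteral AS ?polygon)
  ?address geo:hasGeometry / geo:asWKT ?point .
  FILTER(geof:sfWithin(?point, ?polygon))
}
""")


def polygon_wkt(points: Iterable[Tuple[float, float]]) -> str:
    """Build a WKT POLYGON from (longitude, latitude) pairs, closing the ring."""
    # Converting to float rejects non-numeric input before it reaches the query.
    ring = [(float(lon), float(lat)) for lon, lat in points]
    if ring and ring[0] != ring[-1]:
        ring.append(ring[0])  # WKT rings must end on their first point
    if len(ring) < 4:
        raise ValueError("a polygon needs at least three distinct points")
    coords = ", ".join(f"{lon} {lat}" for lon, lat in ring)
    return f"POLYGON (({coords}))"


def within_polygon_query(points: Iterable[Tuple[float, float]]) -> str:
    """Return the 'addresses within polygon' query for the given ring."""
    return QUERY_TEMPLATE.substitute(polygon=polygon_wkt(points))
```

Passing the four corner points of the polygon used above reproduces the same literal, including the repeated first point that closes the ring.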
These queries are useful when dealing with dynamic, user-defined polygons. However, much more is possible when polygons are included in the dataset, and thus also in the spatial index.
The dataset also contains a broad bounding box of Australia, which then gets included in the spatial index.
Thanks to query rewriting, a much simpler query lists all addresses in Australia:
PREFIX addr: <https://linked.data.gov.au/def/addr/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?address
WHERE {
?address a addr:Address .
<https://example.org/australia> geo:sfContains ?address .
}
# returns all 4 addresses in the test dataset
Or in reverse, we can look up which country a certain address is located in:
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
SELECT DISTINCT ?country
WHERE {
?country a dbo:Country .
<https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e> geo:sfWithin ?country .
}
# returns <https://example.org/australia>
Note that there might be some confusion between the spatial property & filter functions in the Jena namespace (spatial: and spatialF:) and those specified in the standard GeoSPARQL ontology namespace (geo: and geof:).
In practice, none of the Non-topological Query Functions specified in the GeoSPARQL standard seem to work under the standard geof: namespace. Instead, Jena provides equivalent implementations in its own namespace, sometimes under a different name.
For example, geof:distance does not seem to work with Jena, whereas spatialF:distance does.
PREFIX spatialF: <http://jena.apache.org/function/spatial#>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
SELECT ?distance
WHERE {
<https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e> geo:hasDefaultGeometry / geo:hasSerialization ?point1 .
<https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6> geo:hasDefaultGeometry / geo:hasSerialization ?point2 .
BIND(xsd:decimal(spatialF:distance(?point1, ?point2, uom:kilometre)) AS ?distance) .
}
# returns "129.601686"^^<http://www.w3.org/2001/XMLSchema#decimal>
That means when migrating from other systems that do implement the GeoSPARQL standard as-is, some query rewriting might be required to ensure a seamless transition.
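As an illustration of such a rewrite, the sketch below swaps geof:distance for spatialF:distance and adds the Jena prefix when needed. It is a naive text substitution and not a SPARQL parser, and the mapping contains only the one function pair mentioned above; any further entries would need to be checked against the Jena documentation. FUNCTION_MAP and rewrite_for_jena are hypothetical names.

```python
import re

SPATIALF_PREFIX = "PREFIX spatialF: <http://jena.apache.org/function/spatial#>"

# Only the pair named in the text above; extend after checking the Jena docs.
FUNCTION_MAP = {"geof:distance": "spatialF:distance"}


def rewrite_for_jena(query: str) -> str:
    """Replace mapped function calls and add the spatialF prefix if needed.

    Naive substitution: it would also rewrite matches inside string
    literals or comments, so review the output before using it.
    """
    rewritten = query
    for standard_name, jena_name in FUNCTION_MAP.items():
        pattern = rf"\b{re.escape(standard_name)}\s*\("
        rewritten = re.sub(pattern, f"{jena_name}(", rewritten)
    if rewritten != query and "PREFIX spatialF:" not in rewritten:
        rewritten = SPATIALF_PREFIX + "\n" + rewritten
    return rewritten
```

Queries that do not use a mapped function are returned unchanged, so the helper can be applied to a whole set of queries during a migration.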
Jena supports property & filter functions as specified in the documentation: https://jena.apache.org/documentation/geosparql/index
For example, find addresses within 100 kilometres of a reference point at latitude -27.5 and longitude 152.5:
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX addr: <https://linked.data.gov.au/def/addr/>
SELECT DISTINCT ?address
WHERE {
?address a addr:Address ;
spatial:nearby(-27.5 152.5 100 uom:kilometre)
}
# returns
# <https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6>
# <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e>
Find all addresses north of that same point:
PREFIX spatial: <http://jena.apache.org/spatial#>
PREFIX uom: <http://www.opengis.net/def/uom/OGC/1.0/>
PREFIX addr: <https://linked.data.gov.au/def/addr/>
SELECT DISTINCT ?address
WHERE {
?address a addr:Address ;
spatial:north(-27.5 152.5)
}
# returns <https://linked.data.gov.au/dataset/qld-addr/address/2fd46078-88c0-5f30-b43e-d2908d9445b6>
When configuring a spatial dataset, combined with a Lucene index, it's important that the fuseki:dataset of the fuseki:Service points to the dataset with type text:TextDataset, and not to the geosparql:geosparqlDataset. Only then can we combine a spatial index with a full-text index. See testdata/config-geosparql.ttl for an example.
With the Lucene index enabled, the following queries are supported, according to the documentation:
?s text:query 'Queensland' # simplest query
?s text:query ('Queensland' 2) # with limit on results
?s text:query (rdfs:label 'Queensland') # query specific property
?s text:query (rdfs:label 'Queensland' 'lang:en') # restrict search to one language
(?s ?score) text:query 'Queensland' # include the score
(?s ?score ?literal) text:query 'Queensland' # include the original literal value
(?s ?score ?literal ?g) text:query 'Queensland' # include the graph
(?s ?score ?literal) text:query (rdfs:label "(Barbaralla AND Queensland)") # Boolean operators
(?s ?score ?literal) text:query (rdfs:label "(Queensla~)") # Fuzzy search
(?s ?sc ?lit) text:query ( "Queensland" "highlight:" ) # highlighting
(?s ?sc ?lit) text:query ( "Queensland" "highlight:s:<em class='hiLite'> | e:</em>" ) # highlighting with HTML
With both indexes in place, full-text search can be combined with the spatial index to find text matches within a given geographical area:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX geo: <http://www.opengis.net/ont/geosparql#>
PREFIX geof: <http://www.opengis.net/def/function/geosparql/>
PREFIX text: <http://jena.apache.org/text#>
SELECT DISTINCT ?address ?literal
WHERE {
BIND("POLYGON ((152.685242 -27.161808, 152.698975 -27.829361, 153.492737 -27.829361, 153.435059 -27.178912, 152.685242 -27.161808))"^^geo:wktLiteral AS ?polygon)
?address geo:hasGeometry / geo:asWKT ?point ;
rdfs:label ?addressLabel .
FILTER(geof:sfWithin(?point, ?polygon))
(?address ?score ?literal) text:query ( "Drive" "highlight:" ) .
}
# returns
# <https://linked.data.gov.au/dataset/qld-addr/address/65cb1e52-fc1d-5dee-a2d2-ea7882d12c7e> "32 Barbaralla ↦Drive↤, Springwood, Queensland, Australia"@en
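The ↦ and ↤ characters in the result above mark the start and end of the highlighted match. A client will usually want to convert them to markup or strip them before display. The following is a small sketch, assuming exactly these two markers as shown in the output above; the function names are illustrative.

```python
import html
import re

# Markers as they appear in the query result above.
START, END = "↦", "↤"
HIGHLIGHT = re.compile(f"{START}(.*?){END}")


def highlighted_terms(literal: str) -> list:
    """Return the text fragments enclosed by the highlight markers."""
    return HIGHLIGHT.findall(literal)


def to_html(literal: str) -> str:
    """Escape the literal for HTML and wrap highlighted parts in <em>."""
    escaped = html.escape(literal)
    return escaped.replace(START, "<em>").replace(END, "</em>")


def strip_markers(literal: str) -> str:
    """Return the literal without any highlight markers."""
    return literal.replace(START, "").replace(END, "")
```

Escaping the literal before inserting the tags keeps any markup characters in the data from being interpreted as HTML.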
See Taskfile.yml for local development commands.
We can build patches for Jena ourselves by developing on a specific version of the Jena source code, and including patches in /docker/patches.
A simple example of this is the addition of the GeoSPARQL dependency in /docker/patches/enable-geosparql.diff as inspired by the zazuko docker image.
For this repository's current setup, the only required change for a normal upstream bump is:
- update ARG JENA_VERSION=... in docker/Dockerfile
The GeoSPARQL dependency patch file (docker/patches/enable-geosparql.diff) is already applied by docker/Dockerfile; it does not need to be edited for normal version bumps.
Then verify locally with:
task fuseki:smoke
The smoke test is deterministic and will fail fast if the image/runtime behaviour changes.
Only follow this path when you need behaviour that is not available in upstream Jena:
- check out the target Jena tag from https://github.com/apache/jena (for example git checkout jena-5.6.0)
- make your changes in Jena source
- generate a patch with git diff > my-patch.diff
- add the patch to /docker/patches
- apply it from docker/Dockerfile in the builder stage (as done for enable-geosparql.diff)