Skip to content

Add the BerlinMOD nine-query three-form parity matrix for NebulaStream#15

Open
estebanzimanyi wants to merge 1 commit into
MobilityDB:mainfrom
estebanzimanyi:feat/berlinmod-streaming-forms
Open

Add the BerlinMOD nine-query three-form parity matrix for NebulaStream#15
estebanzimanyi wants to merge 1 commit into
MobilityDB:mainfrom
estebanzimanyi:feat/berlinmod-streaming-forms

Conversation

@estebanzimanyi
Copy link
Copy Markdown
Member

@estebanzimanyi estebanzimanyi commented May 21, 2026

Add the BerlinMOD nine-query parity matrix for NebulaStream across the three query forms (instantaneous, sequence, and full streaming), as the reference surface for measuring streaming MEOS parity. Base of the streaming-parity series.

@estebanzimanyi estebanzimanyi force-pushed the feat/berlinmod-streaming-forms branch from a5ff0f0 to 6d3f885 Compare May 21, 2026 06:07
@estebanzimanyi estebanzimanyi changed the title feat(berlinmod): scaffold BerlinMOD-Q1..Q4 × 3 forms (12 YAMLs, 12/27 cells) feat(berlinmod): scaffold BerlinMOD-Q1..Q4+Q7 × 3 forms (21 YAMLs, 15/27 cells) on NebulaStream May 21, 2026
… on NebulaStream (33 YAMLs, 27/27 cells)

Additive scaffold for the BerlinMOD-9 × 3 streaming-form parity contract
on MobilityNebula, sibling to the existing SNCB Q-series and matching
the MobilityFlink MobilityDB#3 / MobilityKafka MobilityDB#1 streaming-form definitions.

All 27 cells covered:

  Q1 'which vehicles have appeared'      — full (continuous + windowed + snapshot)
  Q2 'where is vehicle X at time T'      — full
  Q3 'vehicles within 5 km of P'         — full
  Q4 'vehicles inside region R (polygon)'— full
  Q5 'pairs of vehicles meeting near P'  — partial (emit per-vehicle trajectories near P; consumer joins)
  Q6 'cumulative distance per vehicle'   — partial (emit TEMPORAL_SEQUENCE; consumer computes length)
  Q7 'first passage of vehicle through POI' × {POI1, POI2, POI3}
                                          — full (per-POI fan-out)
  Q8 'vehicles within d of LINESTRING'   — full (edwithin_tgeo_geo with LINESTRING geometry)
  Q9 'distance between X and Y at time T'— partial (emit X and Y trajectories; consumer joins)

18 of 27 cells are FULL (the BerlinMOD-Q semantic is computed entirely
inside NebulaStream). 9 cells are PARTIAL — NebulaStream emits the
per-window inputs (trajectory, candidate vehicles) and a consumer
post-processes for the final BerlinMOD-Q answer. The partial pattern
is the natural expression of these queries in NebulaStream's current
SQL surface; the path to FULL is documented per-Q in
docs/berlinmod-streaming-forms.md (a stream-self-join for Q5/Q9, a
temporal_length scalar function for Q6).

Form mapping to NebulaStream windows:

  continuous: SLIDING(time_utc, SIZE 1 SEC, ADVANCE BY 1 SEC)
  windowed:   TUMBLING(time_utc, SIZE 10 SEC)
  snapshot:   TUMBLING(time_utc, SIZE 5 SEC)

MEOS-side surface consumed (already exposed by PR MobilityDB#14 + follow-ups):

  edwithin_tgeo_geo — Q3 (POINT predicate), Q4 (POLYGON, d=0.0),
                      Q5 (POINT predicate), Q7 (per-POI POINT),
                      Q8 (LINESTRING predicate)
  TEMPORAL_SEQUENCE — Q2 / Q5 / Q6 / Q9 (per-window per-vehicle trajectory)

No new MEOS PhysicalFunction classes added; no C++ changes; no SNCB
Q-series modifications. All 33 YAMLs are additive in a new
Queries/berlinmod/ subdirectory.

Add (additions):
  Queries/berlinmod/q1_{continuous,windowed,snapshot}.yaml          (3)
  Queries/berlinmod/q2_{continuous,windowed,snapshot}.yaml          (3)
  Queries/berlinmod/q3_{continuous,windowed,snapshot}.yaml          (3)
  Queries/berlinmod/q4_{continuous,windowed,snapshot}.yaml          (3)
  Queries/berlinmod/q5_{continuous,windowed,snapshot}.yaml          (3, partial)
  Queries/berlinmod/q6_{continuous,windowed,snapshot}.yaml          (3, partial)
  Queries/berlinmod/q7_poi{1,2,3}_{continuous,windowed,snapshot}.yaml (9, full via fan-out)
  Queries/berlinmod/q8_{continuous,windowed,snapshot}.yaml          (3, LINESTRING predicate)
  Queries/berlinmod/q9_{continuous,windowed,snapshot}.yaml          (3, partial)
  Input/input_berlinmod.csv  (sample data: 3 vehicles × 21 events, 14 simulated seconds)
  docs/berlinmod-streaming-forms.md

Validation: every YAML parses cleanly via python3 yaml.safe_load.
Runtime verification gated on the NebulaStream test harness.

Coverage: 27 of 27 cells (100 %), with 18 FULL and 9 PARTIAL annotated
explicitly per Q. Path to FULL for the 9 PARTIAL cells is one
MobilityNebula C++ PhysicalFunction class each (or a NebulaStream
upstream stream-self-join), documented in
docs/berlinmod-streaming-forms.md.
@estebanzimanyi estebanzimanyi force-pushed the feat/berlinmod-streaming-forms branch from 6d3f885 to 395e364 Compare May 21, 2026 06:40
@estebanzimanyi estebanzimanyi changed the title feat(berlinmod): scaffold BerlinMOD-Q1..Q4+Q7 × 3 forms (21 YAMLs, 15/27 cells) on NebulaStream feat(berlinmod): scaffold the full BerlinMOD-9 × 3-form parity matrix on NebulaStream (33 YAMLs, 27/27 cells) May 21, 2026
@marianaGarcez
Copy link
Copy Markdown
Contributor

These are great additions, happy to review when you're ready!

Copy link
Copy Markdown
Contributor

@marianaGarcez marianaGarcez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR looks good and adds new queries. I do not alter the existing code or add functions

@marianaGarcez marianaGarcez marked this pull request as ready for review May 29, 2026 10:14
@estebanzimanyi estebanzimanyi changed the title feat(berlinmod): scaffold the full BerlinMOD-9 × 3-form parity matrix on NebulaStream (33 YAMLs, 27/27 cells) Add the BerlinMOD nine-query three-form parity matrix for NebulaStream Jun 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants