Fix/cdc log processor handle #18

Open
divyagovindaiah wants to merge 3 commits into Sunbird-Knowlg:develop from divyagovindaiah:fix/cdc-log-processor-handle
@divyagovindaiah divyagovindaiah commented May 28, 2026

Please include a summary of the change and which issue is fixed. Please also include relevant motivation and context. List any dependencies that are required for this change.

Type of change

Please choose appropriate options.

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • This change requires a documentation update

How Has This Been Tested?

Please describe the tests that you ran to verify your changes in the below checkboxes. Provide instructions so we can reproduce. Please also list any relevant details for your test configuration

  • Ran Test A
  • Ran Test B

Test Configuration:

  • Software versions:
  • Hardware versions:

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

Summary by CodeRabbit

  • New Features

    • Introduced JanusGraph Change Data Capture (CDC) extension for capturing and processing transaction events with multiple converter options
    • Support for routing CDC events to logging or custom sinks
  • Documentation

    • Added comprehensive setup and verification guide for CDC Log Processor configuration, deployment, and troubleshooting

coderabbitai Bot commented May 28, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a01909ae-10a9-44ad-b0d1-a2c6afb79141

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

📝 Walkthrough

This PR introduces a complete JanusGraph CDC (Change Data Capture) extension module that captures vertex and edge changes from transaction logs, converts them to pluggable message formats, and routes events to configurable sinks.

Changes

JanusGraph CDC Extension Implementation

| Layer / File(s) | Summary |
|---|---|
| CDC Module Foundation & Core Framework<br>pom.xml, .gitignore, src/main/java/org/sunbird/janusgraph/cdc/EventSink.java, src/main/java/org/sunbird/janusgraph/cdc/MessageConverter.java, src/main/java/org/sunbird/janusgraph/cdc/GraphLogProcessor.java | Sets up Maven project metadata and Java 11 compiler settings. Defines EventSink and MessageConverter contracts. Implements GraphLogProcessor with start()/shutdown() lifecycle, transaction log registration, vertex/edge change classification (CREATE/DELETE/UPDATE), and event dispatch with helper methods for timestamp and property extraction from nested or flat structures. |
| Message Converter Implementations<br>src/main/java/org/sunbird/janusgraph/cdc/.../SimpleMessageConverter.java, src/main/java/org/sunbird/janusgraph/cdc/.../TelemetryMessageConverter.java, src/main/java/org/sunbird/janusgraph/cdc/.../SunbirdLegacyMessageConverter.java, src/main/resources/cdc-converter.conf | Provides three converter implementations: SimpleMessageConverter for basic event payloads; TelemetryMessageConverter with fixed telemetry schema and operation-specific property extraction; SunbirdLegacyMessageConverter with legacy metadata, {ov,nv} property diffs for UPDATE/DELETE, relation tracking across both edge directions, configurable string-only field preservation, and vertex/ROOT event filtering. |
| Event Sink & Logging Configuration<br>src/main/java/org/sunbird/janusgraph/cdc/LogFileEventSink.java, src/main/resources/cdc-log4j.properties, src/main/resources/log4j2-server.xml | Implements LogFileEventSink for SLF4J event routing. Includes sample Log4j properties configuration with rolling file appender and Log4j2 server config with console/rolling-file appenders, HBase logger suppression, and CDC sink isolation. |
| Bootstrap & Server Startup Scripts<br>scripts/register-cdc.groovy, scripts/empty-sample.groovy | Provides register-cdc.groovy for reflective GraphLogProcessor startup during server boot with dynamic class loading, Gremlin binding inspection, and structured error handling. Includes empty-sample.groovy with lifecycle hooks for graceful processor shutdown and default traversal binding. |
| Documentation & Test Coverage<br>README.md, src/test/java/org/sunbird/janusgraph/cdc/GraphLogProcessorTest.java | Comprehensive README covering Maven build, JAR deployment, JanusGraph transaction log backend configuration, bootstrap script setup, optional log4j2 logging, restart verification, application-side transaction logging with matching logIdentifier, end-to-end CDC validation, and troubleshooting. Tests validate event ordering via timestamp caching, ISO-8601 timestamp parsing, and property extraction from nested (transactionData.properties.lastUpdatedOn.nv) and flat (transactionData.properties.lastUpdatedOn) structures. |
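
The nested and flat lookups named in the last row can be illustrated with a minimal sketch. It is not the PR's code: the class `LastUpdatedExtractor` and its method names are hypothetical, the event is modelled as plain nested maps, and the timestamp is assumed to carry an ISO-8601 offset such as `Z` or `+05:30`.

```java
import java.time.OffsetDateTime;
import java.util.Map;

// Hypothetical helper, not taken from the PR: reads lastUpdatedOn from an
// event map in either the nested ({ov, nv}) or the flat shape.
final class LastUpdatedExtractor {
    private LastUpdatedExtractor() {}

    /** Returns the lastUpdatedOn value as a string, or null when it is absent. */
    static String lastUpdatedOn(Map<String, ?> event) {
        Object txData = event.get("transactionData");
        if (!(txData instanceof Map)) {
            return null;
        }
        Object props = ((Map<?, ?>) txData).get("properties");
        if (!(props instanceof Map)) {
            return null;
        }
        Object value = ((Map<?, ?>) props).get("lastUpdatedOn");
        if (value instanceof Map) {
            // Nested shape: transactionData.properties.lastUpdatedOn.nv
            value = ((Map<?, ?>) value).get("nv");
        }
        // Flat shape: the value is already the timestamp itself.
        return value == null ? null : value.toString();
    }

    /** Parses an ISO-8601 timestamp with an offset (assumed format) to epoch millis. */
    static long toEpochMillis(String isoTimestamp) {
        return OffsetDateTime.parse(isoTimestamp).toInstant().toEpochMilli();
    }
}
```

Checking for a map before falling back to the raw value lets one code path serve both shapes. A timestamp without an offset, or with a compact offset such as `+0530`, would need a custom `DateTimeFormatter`, which this sketch does not cover.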

Sequence Diagram

sequenceDiagram
  participant JanusGraph as JanusGraph<br/>Transaction Log
  participant GraphLogProcessor
  participant MessageConverter
  participant EventSink
  participant SinkImpl as LogFileEventSink/<br/>External Sink
  JanusGraph->>GraphLogProcessor: processChanges(vertices, relations)
  GraphLogProcessor->>GraphLogProcessor: classify CREATE/DELETE/UPDATE
  GraphLogProcessor->>MessageConverter: convert(vertex, changeState, operation)
  MessageConverter->>MessageConverter: extract properties & metadata
  MessageConverter-->>GraphLogProcessor: event map
  GraphLogProcessor->>EventSink: send(key, JSON payload)
  EventSink->>SinkImpl: route event
  SinkImpl-->>EventSink: acknowledged
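
The flow in the diagram can be sketched in plain Java under simplifying assumptions. Every name below (`Operation`, `MessageConverterSketch`, `EventSinkSketch`, `CdcPipelineSketch`) is hypothetical; the PR's actual EventSink and MessageConverter signatures are not shown on this page. Vertices are reduced to string IDs, the CREATE/DELETE/UPDATE rule is an assumed one based on added and removed ID sets, and the hand-rolled `toJson` stands in for a real JSON library.

```java
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical stand-ins for the PR's contracts; the real signatures may differ.
enum Operation { CREATE, UPDATE, DELETE }

interface MessageConverterSketch {
    Map<String, String> convert(String vertexId, Operation operation);
}

interface EventSinkSketch {
    void send(String key, String jsonPayload);
}

final class CdcPipelineSketch {
    private final MessageConverterSketch converter;
    private final EventSinkSketch sink;

    CdcPipelineSketch(MessageConverterSketch converter, EventSinkSketch sink) {
        this.converter = converter;
        this.sink = sink;
    }

    // Assumed rule: added means CREATE, removed means DELETE, anything else UPDATE.
    static Operation classify(String vertexId, Set<String> added, Set<String> removed) {
        if (added.contains(vertexId)) {
            return Operation.CREATE;
        }
        if (removed.contains(vertexId)) {
            return Operation.DELETE;
        }
        return Operation.UPDATE;
    }

    // Classify each changed vertex, convert it to an event map, and hand the
    // serialized payload to the sink, keyed by vertex ID.
    void processChanges(List<String> changed, Set<String> added, Set<String> removed) {
        for (String vertexId : changed) {
            Operation operation = classify(vertexId, added, removed);
            Map<String, String> event = converter.convert(vertexId, operation);
            sink.send(vertexId, toJson(event));
        }
    }

    // Minimal serializer for flat string maps; escapes only backslashes and quotes.
    static String toJson(Map<String, String> event) {
        return event.entrySet().stream()
                .map(e -> quote(e.getKey()) + ":" + quote(e.getValue()))
                .collect(Collectors.joining(",", "{", "}"));
    }

    private static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }
}
```

Keeping the converter and the sink behind separate interfaces is what would let the three converters and the log-file or external sinks be swapped independently of each other.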

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🐰 CDC Changes Dance

From graph to log, the vertices sing,
Three converters catch each data wing,
Legacy, simple, telemetry flow—
Events dispatch where sinks tell them to go! 📊
Bootstrap and bind through Gremlin's embrace,
Change capture now has its rightful place. ✨

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Description check | ⚠️ Warning | The PR description is entirely empty except for the template structure; all required sections lack substantive content, and no checkboxes were marked or details provided. | Complete the description by providing a clear summary of changes, selecting appropriate change type(s), documenting testing performed, and marking relevant checklist items. |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 21.28% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Fix/cdc log processor handle' is vague and incomplete, using generic language that does not clearly convey what the actual fix or change accomplishes. | Revise the title to be more descriptive and specific about the main change, such as 'Add CDC Log Processor with event sink and message converter implementations' or similar. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@divyagovindaiah divyagovindaiah changed the base branch from main to develop May 28, 2026 12:00