Merged
2 changes: 2 additions & 0 deletions .cspell.json
@@ -184,6 +184,8 @@
"paas",
"Pantothenic",
"parallelization",
"Partitioner",
"partitioner",
"pbcopy",
"pflag",
"pgoutput",
@@ -0,0 +1,75 @@
---
id: initial_load_full_refresh
title: Initial Load & Full Refresh Operations
sidebar_label: Initial Load & Full Refresh Operations
---

While evolving your Fast Data pipelines, you may need to re-ingest all messages previously ingested into the system.
For example, you may need to update filter logic to refine data subsets, restructure how aggregations are organized, optimize storage by pruning obsolete records, fix transformation bugs, or more generally evolve your Single View schema.

Especially in production environments, it is extremely important that Initial Load / Full Refresh processes do not break the **Near Real-Time (NRT) operational continuity** with what is changing on the data sources ingested by your Fast Data pipeline.

## Full Refresh architectural pattern

**To guarantee business continuity** despite the need for a full event re-ingestion, you can adopt the **Full Refresh Architecture** shown below (a screenshot of the **Control Plane UI**).

![Full Refresh Architecture](img/full-refresh-architecture.png)

As shown in the diagram, the messages from the _topic.input_ are consumed by two different flows:

- **NRT (Near-Real-Time) Layer**: the flow in the upper half of the pipeline shows a [Stream Processor service](/products/fast_data_v2/stream_processor/10_Overview.md), responsible for simply forwarding the messages to the next stage of the pipeline, thus ensuring business continuity
- **Backup Layer**: the flow in the lower half of the pipeline shows the processes responsible for backing up the messages in a backup store. In the example, the messages inside _topic.input_ are consumed by the [Kango service](/products/fast_data_v2/kango/10_Overview.md), which compacts them and generates MongoDB documents. These documents are then stored in a **MongoDB collection**, which serves **as the backup**. A [Mongezium service](/products/fast_data_v2/mongezium_cdc/10_Overview.md) is configured to read these MongoDB document changes and publish the corresponding Kafka messages to the _topic.backup_ topic, which can be read by a _Stream Processor_ that stays **paused and is activated only when you need to re-ingest messages** into the pipeline

These operations can be easily executed from the **Fast Data Control Plane UI**, which lets you govern and orchestrate every stage of **Initial Load** or **Full Refresh** operations with precision and zero friction.

:::note
Thanks to the backup layer and the full refresh architectural pattern, you can eliminate some critical operational constraints: instead of requesting full refreshes from external data sources or relying on infinite topic retention, you maintain a controlled backup flow managed entirely within your pipeline architecture, minimizing time loss, exposure to external systems, and organizational overhead.
:::

To configure this **routing pattern**, which enables the two flows representing the regular processing of the messages (upper flow) and the backup management (lower flow), the _Stream Processor_ services of both layers must be configured with the **Custom Partitioner** settings, so that each can produce messages on a segregated subset of the partitions of the _topic.merge_ topic. For more information about the custom partitioner settings, visit the dedicated [page](/products/fast_data_v2/stream_processor/20_Configuration.mdx).
By dedicating a set of topic partitions to the backup flow and the remaining ones to the regular flow, you achieve a clearer separation of the two layers and can better balance the re-ingestion speed of the backup messages against the ingestion speed of the regular flow.
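The partition-segregation idea can be sketched as follows. This is not the actual Custom Partitioner implementation (which is configured through the Stream Processor settings linked above); the partition counts, the `pick_partition` helper, and the use of an MD5 hash are illustrative assumptions only.

```python
import hashlib

# Hypothetical partition layout for topic.merge: partitions 0-7 serve the
# regular (NRT) flow, partitions 8-9 are reserved for the backup flow.
REGULAR_PARTITIONS = list(range(0, 8))
BACKUP_PARTITIONS = [8, 9]

def pick_partition(message_key: bytes, is_backup_flow: bool) -> int:
    """Deterministically map a message key to a partition inside the
    subset dedicated to its flow, keeping the two layers segregated."""
    subset = BACKUP_PARTITIONS if is_backup_flow else REGULAR_PARTITIONS
    digest = int(hashlib.md5(message_key).hexdigest(), 16)
    return subset[digest % len(subset)]
```

Hashing the message key keeps all events for the same identifier on the same partition within a flow, preserving per-key ordering while the two flows never collide.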

In the last Process step of the picture shown above, a _Stream Processor_ can include dedicated logic to further guard the system against messages that should no longer be introduced, e.g. messages from the backup flow that are now stale because the regular flow, still processing, has already produced newer messages for the same identifier in the output stream. This guard can be implemented, for example, by comparing the timestamp of the `createdAt` / `updatedAt` fields of the event coming from the source database with an internal cache maintained by the service.
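A minimal sketch of such a staleness guard is shown below. The field names (`id`, `updatedAt`) and the in-memory dictionary are assumptions for illustration; a real service would likely use a shared or persistent cache and the field names defined by its own schema.

```python
from datetime import datetime

# In-memory cache of the latest updatedAt timestamp seen per identifier.
latest_seen: dict[str, datetime] = {}

def should_forward(event: dict) -> bool:
    """Drop backup-flow events that are older than what the regular
    flow has already produced for the same identifier."""
    key = event["id"]
    ts = datetime.fromisoformat(event["updatedAt"])
    last = latest_seen.get(key)
    if last is not None and ts <= last:
        return False  # stale: a newer version already went downstream
    latest_seen[key] = ts
    return True
```

The guard is idempotent per key: replaying the same backup message twice, or replaying one older than the regular flow's latest output, produces no duplicate downstream events.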

Some final considerations:

- you can choose whether the backup store should contain messages already refined by a transformation logic layer, giving you a ready-to-use backup that is faster to re-ingest into the pipeline, or the raw messages, giving you a more complete backup that can be re-ingested even with different transformation logics
- you can opt for a faster **backup store based on a Kafka topic with infinite retention**, without the MongoDB persistence layer: re-ingestion of the messages is faster and Kafka itself handles retention and compaction, a good fit when you do not need a more efficient and durable storage layer

## Controlled Initialization

When performing an _Initial Load_ process, you can use the same architecture shown in the previous diagram.
During pipeline initialization, every Fast Data workload can be configured with a default **paused** runtime state. This is managed via the **`onCreate`** parameter within each microservice's **ConfigMap**. By initializing flows in a paused state, you ensure that no workload begins consuming data immediately after deployment, allowing for manual orchestration.
Then, start resuming the first execution steps: the NRT layer will start consuming messages from the input topic; the backup layer will start too, but remember to keep its final process in a paused state (it is not needed during a pipeline initialization).
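The `onCreate` parameter mentioned above could look like the fragment below. The exact key structure depends on each microservice's ConfigMap schema, so treat this as a hypothetical sketch rather than the definitive shape:

```json
{
  "onCreate": "paused"
}
```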

## Iterative Pipeline Activation

Whenever it is necessary to start the _Full Refresh_ process or an _Initial Load_, you can simply resume the consumption from the UI, allowing the messages in the backup topic to be reingested into the pipeline in a controlled way.
Typically, this first step involves executing transformation logic to ensure incoming data is compliant with Fast Data formats (e.g., casting, mapping, and data quality enhancements).
Once processed, these messages are produced into the output streams, ready for the subsequent stages of the pipeline.

You can monitor the flow of the pipeline from the UI, quickly identify bottlenecks or issues, and perform quick operations to fix them (e.g. pausing the regular flow to let the backup flow process its messages and catch up, before resuming the regular flow again).

## Ingestion and Lag Monitoring

Whether it is during the regular flow of the pipeline, or an _Initial Load_ or a _Full Refresh_ operation, you have full visibility of the state of the pipeline and full control of it.

Once the environment is ready, you can regulate message loading into the ingestion layer of your pipeline, pausing and resuming the consumption of topic messages in services. As the queues fill, the Control Plane provides real-time visibility into **Consumer Lag** across every pipeline edge, allowing you to monitor the volume of data awaiting processing.
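For reference, consumer lag is simply the distance between the log end offset and the committed offset, summed across partitions, which the Control Plane surfaces per pipeline edge. A minimal sketch of the computation (offset maps are illustrative inputs):

```python
def consumer_lag(end_offsets: dict[int, int], committed: dict[int, int]) -> int:
    """Total lag = sum over partitions of (log end offset - committed
    offset), i.e. the number of messages still awaiting processing.
    Partitions with no committed offset count from offset 0."""
    return sum(end_offsets[p] - committed.get(p, 0) for p in end_offsets)
```

A lag that steadily decreases after resuming a flow indicates the re-ingestion is catching up; a lag that plateaus suggests a downstream bottleneck worth pausing and inspecting.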

## Advanced Aggregation Management

When dealing with **Aggregate execution steps**, the **Aggregation Graph Canvas** provides a centralized strategic view. This interface is specifically designed to manage complex scenarios where multiple data streams must be merged.

**Best Practice: The Leaf-to-Head Strategy**
For efficient ingestion, it is recommended to resume consumption following a "bottom-up" approach:

1. **Start from the Leaves**: Resume consumption at the leaf nodes of the aggregation graph.
2. **Monitor Lag**: Observe the incremental decrease in consumer lag.
3. **Progression**: Once the lag approaches zero, move to the next level of the graph.
4. **Activate the Head Node**: Finally, resume the head node of the aggregation.
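The steps above amount to a post-order traversal of the aggregation graph: each node is resumed only after all of its dependencies have drained their lag. The sketch below illustrates the ordering only; the graph shape and the `children` mapping are hypothetical, and actual pause/resume actions are performed through the Control Plane UI.

```python
def leaf_to_head_order(children: dict[str, list[str]], head: str) -> list[str]:
    """Return the nodes of an aggregation graph from the leaves up to
    the head (post-order), i.e. the order in which consumption should
    be resumed under the Leaf-to-Head strategy."""
    order: list[str] = []

    def visit(node: str) -> None:
        # Resume every dependency (and wait for its lag to drain)
        # before resuming the node itself.
        for child in children.get(node, []):
            visit(child)
        order.append(node)

    visit(head)
    return order
```

Between each step of the returned order, you would monitor consumer lag and move on only once it approaches zero, as described above.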

:::note
By keeping the head node in a **Paused** state while the leaves process data, you prevent the production of premature events in the output stream. Once the head is resumed, it will produce the final aggregated output, significantly reducing redundant processing load on downstream stages.
:::

By combining real-time **Consumer Lag monitoring** with granular **runtime state control**, the Control Plane transforms complex Initial Load and Full Refresh operations into a manageable, transparent, and highly efficient process.
44 changes: 44 additions & 0 deletions docs/products/fast_data_v2/best_practices/overview.md
@@ -0,0 +1,44 @@
---
id: overview
title: Best Practices
sidebar_label: Overview
---

This section provides best practices and operational strategies for effectively designing and managing Fast Data v2 pipelines.

## How to navigate this section

The Fast Data v2 Best Practices are organized into three main areas to guide you through different stages of your data pipeline lifecycle:

### [Pipeline Development & Testing](/products/fast_data_v2/best_practices/pipeline_development_testing.md)

Start here during the development phase of your Fast Data pipelines. Learn how to:
- Visualize pipeline architecture as you build it
- Simulate performance scenarios with pause/resume controls
- Test system behavior under different load patterns before promoting to production

### [Initial Load & Full Refresh Operations](/products/fast_data_v2/best_practices/initial_load_full_refresh.md)

Master the operational strategies for managing data re-ingestion in production. Understand:
- How to maintain Near Real-Time operational continuity during complex pipeline changes
- The Full Refresh architectural pattern with NRT and Backup layers
- Controlled initialization and iterative pipeline activation
- Consumer lag monitoring and the Leaf-to-Head strategy for aggregations

### [System Optimization & Reliability](/products/fast_data_v2/best_practices/system_optimization_reliability.md)

Ensure your Fast Data infrastructure runs efficiently and reliably. Discover:
- Strategic resource allocation through granular runtime controls
- Performance optimization techniques
- Enhanced system reliability and fault isolation
- Maintenance strategies and graceful degradation patterns

---

## Key Concepts

**Runtime Control**: The ability to pause and resume message consumption at any pipeline stage, enabling precise orchestration of data flows without stopping the entire pipeline.

**Near Real-Time (NRT) Continuity**: Maintaining continuous processing of new incoming data while performing full refreshes or data reprocessing operations on historical data.

**Backup Layer**: A dedicated flow that maintains a controlled backup of your messages, enabling full refresh operations without requiring infinite topic retention or direct access to source databases.
@@ -0,0 +1,15 @@
---
id: pipeline_development_testing
title: Pipeline Development & Testing
sidebar_label: Pipeline Development & Testing
---

This section covers best practices for developing and testing Fast Data v2 pipelines during the development phase, where you can safely experiment and validate your architecture before promoting to production.

## Visualize Fast Data Pipelines while Building Them

During the Fast Data development phase, users can iteratively configure and continuously deploy new Fast Data pipeline steps in the development environment. The Control Plane UI renders the new architecture steps incrementally, offering immediate visual feedback as the pipeline evolves.

## Performance Testing and Simulation

During the Fast Data development phase, users can simulate different scenarios for performance testing by pausing and resuming message consumption along the pipeline. In this way, users can test system behavior under different load patterns before promoting to production.
@@ -0,0 +1,17 @@
---
id: system_optimization_reliability
title: System Optimization & Reliability
sidebar_label: System Optimization & Reliability
---

This section covers strategies for optimizing Fast Data v2 system performance and ensuring reliability through granular runtime controls and architectural best practices.

## Strategic Resource Allocation and Performance Optimization

By leveraging the ability to pause and resume message-consuming microservices in real time, and by verifying the lag of your topics and the stability of your services, the Control Plane ensures that computing power is strategically directed toward high-priority tasks during peak demand periods.
These granular runtime controls facilitate a balanced distribution of processing loads across every stage of the architecture, effectively mitigating bottlenecks and ensuring maximum resource utilization throughout your entire Fast Data v2 infrastructure.

## Enhanced System Reliability

When faced with scheduled maintenance or unforeseen anomalies, the Control Plane allows for precise intervention by pausing specific pipeline segments, ensuring that controlled troubleshooting occurs without compromising the broader system workflow.
This systematic approach extends into post-maintenance phases, where operations can be resumed gradually to verify stability and minimize recovery time. Beyond routine maintenance, these runtime controls facilitate effective fault isolation, enabling you to contain issues within localized segments to protect the integrity of the overall infrastructure. By implementing graceful degradation through precise shutdown and startup procedures, you ensure that your Fast Data v2 environment maintains absolute operational integrity even in challenging circumstances.
65 changes: 0 additions & 65 deletions docs/products/fast_data_v2/runtime_management/best_practices.md

This file was deleted.

@@ -156,7 +156,7 @@ The pipeline provides **Pause Data Consumption** and **Resume Data Consumption**
Pause and Resume buttons are available whenever you click on a pipeline step that supports runtime state control for specific data flows.
Additionally, for the Aggregate execution step, these same controls are also available directly within the Aggregation Graph Canvas, allowing for more efficient and optimized runtime control in Initial Load, Full Refresh, and other operational scenarios.

For more detailed operational strategies and best practices on using these runtime controls effectively, visit the [Best Practices documentation](/products/fast_data_v2/runtime_management/best_practices.md).
For more detailed operational strategies and best practices on using these runtime controls effectively, visit the [Best Practices documentation](/products/fast_data_v2/best_practices/overview.md).

## Navigating UI
