Description
Support automatic sharing of shared signal values between nodes in a cluster.
The solution should enable building a view with cluster-shared state along these lines:
@Route
public class ClickCount extends VerticalLayout {
    public ClickCount(@Autowired ClusteredSignalFactory factory) {
        SharedNumberSignal countSignal = factory.getNumber("clickCount");

        Button button = new Button();
        button.bindText(() -> "Click count: " + countSignal.get());
        button.addClickListener(click -> countSignal.incrementBy(1));

        add(button);
    }
}
Tier
Enterprise
License
Proprietary
Motivation
Background
Vaadin 25.1 introduces signals for reactive UI state management in Flow. One limitation in 25.1 is that it's not possible to share signal state between nodes in a cluster. To avoid causing surprises in production, the shared signal types throw an exception when serialized to make the developer realize that the use case is explicitly unsupported.
Problem
Signals can be a powerful mechanism for implementing collaborative functionality in a cluster by sharing UI state between collaborating users regardless of which nodes in the cluster those users are connected to. This requires new framework functionality for synchronizing signal values so that state can be shared between users who are connected to different nodes in a cluster.
Solution
The core idea is that multiple nodes in the cluster can have signal instances that are connected to each other. The application developer controls how instances are connected based on a string identifier provided when acquiring the instance. The state in instances that use the same identifier value is kept in sync whereas nothing is shared between instances using different identifiers.
AbstractSharedSignal and all subclasses are already designed for asynchronous operation. There's support for using a custom SignalTree implementation that would submit and confirm changes through an external system. Rather than manually managing SignalTree implementations, the application would configure a ClusteredSignalFactory instance as e.g. a Spring bean that can be injected into views that use collaborative features.
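As a single-node illustration of the factory contract (a hypothetical stand-in with illustrative names; the real ClusteredSignalFactory would connect instances across nodes through an event log rather than a local map):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical single-node stand-in for the factory contract: signals acquired
// with the same identifier share state, while different identifiers share
// nothing. The real ClusteredSignalFactory would connect instances across
// nodes instead of looking them up in a local map.
class LocalSignalFactory {
    private final Map<String, AtomicLong> numbers = new ConcurrentHashMap<>();

    AtomicLong getNumber(String identifier) {
        return numbers.computeIfAbsent(identifier, id -> new AtomicLong());
    }
}
```

Two views calling getNumber("clickCount") would observe the same value, while getNumber("otherCount") is fully independent.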
Strongly consistent event log
Synchronization between nodes happens through an event log abstraction. Each signal identifier corresponds to a separate event log which means that signals with different identifiers are completely independent from each other and might be hosted on different nodes in a distributed system.
Events in the log carry the signal modification operations that are already used by AbstractSharedSignal. The log must have strongly consistent ordering, meaning that there's no possibility that event order will change after a submitted event has been accepted by the log. Events can be distributed to clients in the cluster only when ordering is guaranteed to no longer change, e.g. through a consensus mechanism.
When a new signal instance is created with an identifier for which an event log already exists, it's necessary to re-apply all previous events from that log so that the signal reaches the same state as other signal instances using the same identifier. As the event log grows, this becomes increasingly expensive. This should be mitigated with state snapshots that are occasionally created and stored in the cluster. A new signal instance can then find the latest snapshot for a given event log and only replay events that were added to the log after the snapshot was created.
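The snapshot-and-replay idea can be sketched minimally, using plain long increments in place of real signal operations (all names here are illustrative, not a committed design):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of snapshot-based catch-up for one event log. Events
// here are plain long increments; real events would carry signal operations.
class NumberEventLog {
    private final List<Long> events = new ArrayList<>();
    private long snapshotValue = 0;   // state captured by the latest snapshot
    private int snapshotPosition = 0; // number of events covered by the snapshot

    synchronized void append(long delta) {
        events.add(delta);
    }

    // Occasionally fold confirmed events into a snapshot so that new
    // subscribers don't need to replay the whole log.
    synchronized void snapshot() {
        while (snapshotPosition < events.size()) {
            snapshotValue += events.get(snapshotPosition++);
        }
    }

    // A new signal instance starts from the snapshot and replays only the
    // tail of the log that was appended after the snapshot was taken.
    synchronized long replayForNewInstance() {
        long value = snapshotValue;
        for (int i = snapshotPosition; i < events.size(); i++) {
            value += events.get(i);
        }
        return value;
    }
}
```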
Cluster backends
We will not build our own distributed event log from scratch. Instead, we should integrate with existing clustering technology that has solved hard problems such as partitioning and consensus.
A wide range of existing systems can provide the building blocks that are needed.
- Actual event log systems such as Kafka or RabbitMQ (note that a traditional event queue is not enough since it doesn't preserve old events for later replay)
- Databases with a notification mechanism like PostgreSQL or MongoDB
- Distributed in-memory data grids like Hazelcast
- Caches like Redis or Ehcache
- Agent systems like Akka
We assume that a user who runs their application in a cluster already has an existing solution for communicating across that cluster and that they want to keep using that solution rather than deploying another one.
We need to find out which such systems would be most useful for typical users so that we can choose which ones to integrate with initially. Some promising candidates include Hazelcast thanks to the way it can be embedded in the application rather than run as a standalone system, and PostgreSQL thanks to its wide adoption.
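Whatever backend is chosen, the integration point could be a small SPI along these lines (names are illustrative, not a committed API); a single-node in-memory implementation would double as a test double:

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical SPI that a cluster backend integration (Hazelcast,
// PostgreSQL, ...) might implement.
interface EventLogBackend {
    // Append an event to the log identified by logId; the backend assigns the
    // final, strongly consistent position and must never reorder accepted events.
    long append(String logId, byte[] event);

    // Read all events at or after the given position, for replay.
    List<byte[]> readFrom(String logId, long position);

    // Deliver events to this node once their ordering can no longer change.
    void subscribe(String logId, Consumer<byte[]> listener);
}

// Single-node stand-in for a real backend, useful for tests and local development.
class InMemoryEventLogBackend implements EventLogBackend {
    private final Map<String, List<byte[]>> logs = new HashMap<>();
    private final Map<String, List<Consumer<byte[]>>> listeners = new HashMap<>();

    @Override
    public synchronized long append(String logId, byte[] event) {
        List<byte[]> log = logs.computeIfAbsent(logId, id -> new ArrayList<>());
        log.add(event);
        for (Consumer<byte[]> listener : listeners.getOrDefault(logId, List.of())) {
            listener.accept(event);
        }
        return log.size() - 1; // position assigned by the backend
    }

    @Override
    public synchronized List<byte[]> readFrom(String logId, long position) {
        List<byte[]> log = logs.getOrDefault(logId, List.of());
        return new ArrayList<>(log.subList((int) position, log.size()));
    }

    @Override
    public synchronized void subscribe(String logId, Consumer<byte[]> listener) {
        listeners.computeIfAbsent(logId, id -> new ArrayList<>()).add(listener);
    }
}
```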
Serialization
Vaadin applications running in a cluster often use session serialization for high availability or to migrate sessions from a node that is about to be shut down. Serializing a shared signal instance is problematic if that instance has listeners from other sessions, since that effectively means serializing multiple sessions at the same time. The serialized data might also no longer be up to date by the time it's deserialized.
Conceptually, what's needed is to serialize not the whole signal with all its data but only the metadata needed to reconnect to the same underlying event log. The main challenge is how to handle listeners. One potential solution might be to not share actual signal instances between sessions but instead let each user have their own signal and tree instances, sharing only the event log abstraction between sessions. This might be straightforward to implement but would impose application constraints that might be challenging to enforce even at runtime. Another alternative might be to collect only the data needed to recreate listeners from the current session while ignoring other listeners. Some research is needed on this topic before choosing an approach.
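The metadata-only approach maps naturally onto Java's writeReplace/readResolve mechanism. The following is a minimal sketch under that assumption: the identifier is all that gets serialized, and deserialization re-acquires a live instance through a per-node registry standing in for the cluster connection (all names are hypothetical):

```java
import java.io.Serializable;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of serializing only reconnection metadata: the signal's
// live state and listeners stay out of the session, and deserialization
// re-acquires a connected instance through a per-node registry.
class ClusteredNumber implements Serializable {
    // Per-node registry standing in for the real cluster connection.
    static final Map<String, AtomicLong> REGISTRY = new ConcurrentHashMap<>();

    private final String identifier;
    private transient AtomicLong value; // live state is never serialized

    ClusteredNumber(String identifier) {
        this.identifier = identifier;
        this.value = REGISTRY.computeIfAbsent(identifier, id -> new AtomicLong());
    }

    long get() {
        return value.get();
    }

    void incrementBy(long delta) {
        value.addAndGet(delta);
    }

    // Serialize only the identifier; readResolve reconnects on the target node.
    private Object writeReplace() {
        return new Handle(identifier);
    }

    private record Handle(String identifier) implements Serializable {
        private Object readResolve() {
            return new ClusteredNumber(identifier);
        }
    }
}
```

A deserialized copy then observes the same underlying state as the original instance on that node, which is the behavior a migrated session would expect.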
Notes
Similar functionality has been available as a preview feature in Collaboration Kit without gaining traction. It's assumed that this mainly reflects the positioning of Collaboration Kit in general, but there's also a risk that it indicates there simply isn't much demand among customers for this kind of solution.
Requirements
TBD
Nice-to-haves
TBD
Risks, limitations and breaking changes
Risks
- Correctness is the main concern. We need to carefully review and test the implementation to avoid different types of race conditions that lead to missed or duplicated events.
- Access control on the application level is the responsibility of the application developer by choosing which users get access to which signal instances. Access control on the infrastructure level is handled by the 3rd party cluster backend.
Limitations
N/A
Breaking changes
This would be the first case where we properly exercise some aspects of the asynchronous APIs in the shared signals, which might uncover a need for API adjustments. We might also need to change some internal-ish APIs related to listener registration so that a listener can be associated with a specific session, to help manage serialization.
Out of scope
No response
Materials
No response
Metrics
No response
Pre-implementation checklist
- Requirement 1
- Requirement 2
- Requirement 3
Pre-release checklist
- Documentation
- License check
- Feature flag (remove if not needed)
Security review
None