
Question: Why Virtual Objects?

How do you pass data from one workflow step to another, say from an asynchronous function to a synchronous function and vice versa?

Workflow state transitions:

  • async func --> async func (0,0)
  • async func --> sync func (0,1)
  • sync func --> async func (1,0)
  • sync func --> sync func (1,1)

Answer:


Imagine your workflow as a meticulous project manager, coordinating various tasks. Some tasks are quick (synchronous), like making a quick decision based on available info. Others are "waiting games" (asynchronous), like waiting for a client's approval or an external service to respond.

How does this project manager keep track of everything and ensure that information from one task is available for the next, even if there's a long wait in between, or if they take a coffee break (server restart)?

The answer in Restate is through Durable State, which acts like a persistent, shared notebook for your workflow.

The Project Manager's Notebook: Durable State

Every Restate Workflow (which is a special kind of Virtual Object) has its own private, durable memory. This "notebook" persists across server failures and long-running waits. When you need to pass data from one step to another, whether it's an immediate follow-up or a step that happens much later after an asynchronous wait, you simply write it into this durable state.

The key tools for interacting with this notebook are:

ctx.set("key", value):

To write data into the workflow's durable state. Think of this as writing a note on a specific page ("key") in the notebook.

ctx.get("key"):

To read data from the workflow's durable state. This is like looking up the note on that specific page ("key").

How it Bridges Asynchronous and Synchronous Steps

Synchronous to Asynchronous: If a synchronous part of your workflow calculates some data and then needs to kick off an asynchronous operation (like sending an email and waiting for verification), it will first ctx.set that data. Then, it proceeds with the asynchronous wait (e.g., ctx.promise or ctx.run for an external call). If the server crashes during the wait, the data is safe in the durable state.

Asynchronous to Synchronous: Once the asynchronous operation completes or an event (like a signal) arrives, the workflow resumes. It can then ctx.get the previously stored data to continue its synchronous processing, using the information that was saved before the wait.

This mechanism ensures that your workflow's memory is never lost, and data can flow seamlessly between any of its steps, regardless of how long the pauses are or if the underlying service instances restart.
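The bridging idea can be sketched without the Restate SDK. The program below is only a simulation under stated assumptions: `notebook`, `signupFlow`, and the channel standing in for a durable promise are hypothetical names, and an in-memory map does not survive a crash the way Restate's durable state does. It shows the shape of the pattern: write in one synchronous step, wait asynchronously, read back in the next step.

```go
package main

import (
	"fmt"
	"sync"
)

// notebook is a hypothetical stand-in for a workflow's key-value state.
// Real durable state is persisted by Restate; this map lives only in memory.
type notebook struct {
	mu   sync.Mutex
	data map[string]string
}

func newNotebook() *notebook { return &notebook{data: map[string]string{}} }

func (n *notebook) set(key, value string) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.data[key] = value
}

func (n *notebook) get(key string) (string, bool) {
	n.mu.Lock()
	defer n.mu.Unlock()
	v, ok := n.data[key]
	return v, ok
}

// signupFlow writes state in a synchronous step, waits for an asynchronous
// signal, then reads the state back in the following synchronous step.
func signupFlow(state *notebook, verified <-chan struct{}, email string) string {
	state.set("email", email) // synchronous step: record the data
	<-verified                // asynchronous wait: blocks until signalled
	stored, _ := state.get("email")
	return "verified " + stored
}

func main() {
	state := newNotebook()
	verified := make(chan struct{})
	done := make(chan string)

	go func() { done <- signupFlow(state, verified, "ada@example.com") }()

	close(verified) // the "click" signal arrives
	fmt.Println(<-done)
}
```

The step after the wait never receives the email as an argument; it only reads what the earlier step wrote, which is the same contract `ctx.set` and `ctx.get` provide.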

Let's illustrate with an example: a user signup workflow.

The sequence below traces the signup flow (participants: User, RestateClient, SignupWorkflow, EmailService, Database):

  1. Start Signup (name, email)
  2. call run(userId, userDetails)
  3. ctx.set("userDetails", userDetails): store user details in durable state
  4. Save initial user record; User record saved
  5. Send verification email(email, userId); Email sent confirmation
  6. ctx.promise("emailVerified"): workflow pauses, waiting for signal. State (userDetails) is durable.
  7. Clicks verification link
  8. Trigger email verified signal (userId)
  9. send click(userId) signal; the workflow receives the signal
  10. ctx.get("userDetails"): retrieves user details from durable state
  11. Update user status to 'verified'; Status updated
  12. Workflow completes, returns true
  13. Signup complete (true); Signup success!

Code Example (TypeScript)

Let's look at how this might appear in a TypeScript workflow for a user signup process.

First, your run handler takes the initial userDetails:

// Define the user details type
interface UserDetails {
  name: string;
  email: string;
}

const signupWorkflow = restate.workflow({
  name: "usersignup",
  handlers: {
    run: async (ctx: restate.WorkflowContext, user: UserDetails) => {
      const userId = ctx.key; // The unique ID for this workflow instance

      // 1. Store the user details in durable state immediately.
      // This data will be available even after long waits or restarts.
      await ctx.set("userDetails", user);
      // userDetails is now saved persistently.
      console.log(`[${userId}] Stored user details in durable state.`);

      // ... (imagine calling an external service to save to DB)
      await ctx.run(
        () => Promise.resolve(`User ${user.name} saved to DB.`)
      );
      console.log(`[${userId}] Initial user record saved.`);

      // ... (imagine calling an external service to send an email)
      await ctx.run(
        () => Promise.resolve(`Verification email sent to ${user.email}.`)
      );
      console.log(`[${userId}] Verification email sent.`);

      // 2. Now, we wait for an asynchronous event (the email verification signal).
      // The workflow will pause here. When it resumes, 'userDetails' will still be available.
      await ctx.promise("emailVerified");
      console.log(`[${userId}] Email verified signal received!`);

      // 3. The workflow resumes after verification. Now, retrieve the user details
      // from durable state to continue processing.
      const storedUserDetails = (await ctx.get<UserDetails>("userDetails"))!;
      console.log(`[${userId}] Retrieved details from state: ${storedUserDetails.name}`);

      // ... (imagine calling an external service to update user status)
      await ctx.run(
        () => Promise.resolve(`User ${storedUserDetails.name} marked as verified.`)
      );
      console.log(`[${userId}] User status updated to 'verified'.`);

      return true; // Workflow successfully completed
    },
    // ... (signal handler below)
  },
});

Here, ctx.set("userDetails", user) saves the initial user data. The workflow then enters an asynchronous wait (ctx.promise("emailVerified")). When the emailVerified signal arrives (potentially hours or days later), the workflow resumes, and ctx.get("userDetails") retrieves the very same data that was saved earlier, allowing the workflow to pick up exactly where it left off with all the necessary context.

Now, let's look at the click handler which sends the signal:

// ... (signupWorkflow definition continues)
    click: async (ctx: restate.WorkflowSharedContext, request: { secret: string }) => {
      // This handler receives a signal, e.g., when the user clicks a verification link.
      // It resolves the 'emailVerified' promise, allowing the 'run' method to continue.
      console.log(`[${ctx.key}] Received 'click' signal with secret: ${request.secret}`);
      await ctx.promise<string>("emailVerified").resolve(request.secret);
      // Note: The signal handler itself doesn't typically access/modify complex
      // state directly needed by the 'run' method, but rather signals its completion.
    },
// ... (end of signupWorkflow definition)

Resolving the promise with ctx.promise("emailVerified").resolve(...) in the click signal handler is what unblocks the await ctx.promise("emailVerified") in the run method, allowing it to continue. The userDetails are safely stored and retrieved using ctx.set and ctx.get, bridging the asynchronous wait.

In summary, by consistently using ctx.set to store data before asynchronous operations and ctx.get to retrieve it afterward, Restate's durable state acts as the reliable bridge for passing data between any steps in your workflow, whether they are synchronous or separated by long, asynchronous waits.

You may find these pages useful: https://code2tutorial.com/tutorial/78333062-f7cc-48e3-9f88-eabc1028dc3d/03_virtual_objects__stateful_actors__.md https://code2tutorial.com/tutorial/78333062-f7cc-48e3-9f88-eabc1028dc3d/04_workflows__long_running_processes__.md

For Golang

  1. Define Data Structures

First, we'll define the UserDetails struct and our SignupWorkflow struct, which will hold our workflow logic.

package main

import (
    "context"
    "log"
    "time"

    restate "github.com/restatedev/sdk-go"
    "github.com/restatedev/sdk-go/server"
)

// UserDetails represents the data we want to store durably.
type UserDetails struct {
    Name  string
    Email string
}

// SignupWorkflow is our durable workflow definition.
type SignupWorkflow struct{}

UserDetails: This is a simple Go struct to hold the user's name and email, just like the TypeScript interface UserDetails.

SignupWorkflow: This empty struct acts as the receiver for our workflow methods (Run and Click), telling Restate that these methods belong to this particular workflow.

  2. The Run Handler (Main Workflow Logic)

The Run method is the heart of your workflow. It defines the sequence of steps, including where to store data and where to pause for asynchronous events.

// Run is the main workflow handler that orchestrates the signup process.
func (SignupWorkflow) Run(ctx restate.WorkflowContext, user UserDetails) (bool, error) {
    // The unique ID for this workflow instance (e.g., a userId).
    userId := restate.Key(ctx)

    // 1. Store the user details in durable state immediately.
    // This data will be available even after long waits or restarts.
    restate.Set(ctx, "userDetails", user)
    log.Printf("[%s] Stored user details in durable state: %+v", userId, user)

    // ... (imagine calling an external service to save to DB)
    // We use restate.Run for external, non-deterministic operations.
    _, err := restate.Run(ctx, func(ctx restate.RunContext) (string, error) {
        // Simulate a database save operation
        time.Sleep(50 * time.Millisecond)
        return "User " + user.Name + " saved to DB.", nil
    })
    if err != nil {
        return false, err
    }
    log.Printf("[%s] Initial user record saved.", userId)

    // ... (imagine calling an external service to send an email)
    _, err = restate.Run(ctx, func(ctx restate.RunContext) (string, error) {
        // Simulate sending an email
        time.Sleep(100 * time.Millisecond)
        return "Verification email sent to " + user.Email + ".", nil
    })
    if err != nil {
        return false, err
    }
    log.Printf("[%s] Verification email sent.", userId)

    // 2. Now, we wait for an asynchronous event (the email verification signal).
    // The workflow will pause here efficiently. When it resumes, 'userDetails'
    // will still be available from durable state.
    log.Printf("[%s] Workflow pausing, waiting for 'emailVerified' signal...", userId)
    if _, err := restate.Promise[string](ctx, "emailVerified").Result(); err != nil {
        return false, err
    }
    log.Printf("[%s] Email verified signal received! Workflow resuming.", userId)

    // 3. The workflow resumes after verification. Now, retrieve the user details
    // from durable state to continue processing.
    storedUserDetails, err := restate.Get[UserDetails](ctx, "userDetails")
    if err != nil {
        return false, err
    }
    log.Printf("[%s] Retrieved details from state: %s", userId, storedUserDetails.Name)

    // ... (imagine calling an external service to update user status)
    _, err = restate.Run(ctx, func(ctx restate.RunContext) (string, error) {
        // Simulate updating user status
        time.Sleep(50 * time.Millisecond)
        return "User " + storedUserDetails.Name + " marked as verified.", nil
    })
    if err != nil {
        return false, err
    }
    log.Printf("[%s] User status updated to 'verified'.", userId)

    return true, nil // Workflow successfully completed
}

Go-specific notes for the Run handler:

restate.WorkflowContext: This is the Go counterpart of restate.WorkflowContext in TypeScript. It provides all the durable capabilities.

restate.Key(ctx): This gets the unique ID for the current workflow instance, just like ctx.key in TypeScript.

restate.Set(ctx, "userDetails", user): This durably stores the user struct under the key "userDetails". The value type is inferred from the argument.

restate.Run(ctx, func(ctx restate.RunContext) (string, error) { ... }): This wraps external, non-deterministic operations. Restate records the function's result, so the operation is not executed again when the handler is replayed. In our example, we're just simulating external calls with time.Sleep.

restate.Promise[string](ctx, "emailVerified").Result(): This refers to a durable promise named "emailVerified" and waits for its result. The workflow pauses here without consuming resources until the promise is resolved by a signal. If the service restarts, the promise's state is remembered. The type parameter [string] is the type of the value the promise is resolved with.

restate.Get[UserDetails](ctx, "userDetails"): After the workflow resumes, this retrieves the UserDetails struct previously saved under the key "userDetails". The type parameter [UserDetails] specifies the expected type.

  3. The Click Handler (Signal Receiver)

The Click method acts as a signal handler. It doesn't run sequentially within the Run method but can be called externally at any time to provide new information or unblock a waiting workflow.

// Click is a signal handler that resolves the email verification promise.
func (SignupWorkflow) Click(ctx restate.WorkflowSharedContext, request struct{ Secret string }) error {
    userId := restate.Key(ctx)
    log.Printf("[%s] Received 'click' signal with secret: %s", userId, request.Secret)

    // This resolves the "emailVerified" promise, allowing the Run method to continue.
    return restate.Promise[string](ctx, "emailVerified").Resolve(request.Secret)
}

Go-specific notes for the Click handler:

restate.WorkflowSharedContext: This is the Go counterpart of restate.WorkflowSharedContext in TypeScript. Signal handlers use this context type.

request struct{ Secret string }: In Go, you can define an anonymous struct directly for the input payload, similar to the request: { secret: string } type annotation in TypeScript.

restate.Promise[string](ctx, "emailVerified").Resolve(request.Secret): This is the crucial line that tells Restate that the "emailVerified" promise (that our Run handler is waiting on) is now fulfilled. This unblocks the Run handler and allows it to proceed.

  4. Serving the Workflow

Finally, you need a main function to register your SignupWorkflow and serve it so that the Restate server can invoke it.

func main() {
    // Create a service endpoint.
    // restate.Reflect registers our SignupWorkflow struct and its methods
    // as Restate handlers.
    endpoint := server.NewRestate().Bind(restate.Reflect(SignupWorkflow{}))

    // Serve the handlers on port 9080, the default port for service endpoints.
    log.Println("Starting service endpoint on :9080")
    if err := endpoint.Start(context.Background(), ":9080"); err != nil {
        log.Fatalf("Failed to start service endpoint: %v", err)
    }
}

How it Bridges Asynchronous and Synchronous Steps (Go perspective)

Let's quickly recap with a Go lens:

Durable State (restate.Set and restate.Get):

When your Run handler starts, it immediately uses restate.Set(ctx, "userDetails", user) to save the initial UserDetails. This is like writing down essential info in a durable notebook. Later, after the restate.Promise[string](ctx, "emailVerified").Result() call unblocks, the workflow uses restate.Get[UserDetails](ctx, "userDetails") to retrieve that same data. Even if your Go service crashes and restarts, or waits for hours, this data is reliably available.

Durable Promises (Promise(...).Result() and Promise(...).Resolve()):

restate.Promise[string](ctx, "emailVerified").Result() in the Run handler causes the workflow to pause. Restate handles this efficiently, freeing up resources while it waits. When the external Click signal handler is invoked, it calls Resolve on the same promise. This tells Restate that the promise is fulfilled, and Restate then transparently resumes the Run handler exactly where it left off.

This combination of durable state and durable promises is how Restate ensures your long-running workflows can pass data and coordinate between synchronous steps and potentially very long asynchronous waits, even across service restarts.

Question: Why port 8080 as ingress port of the restate server, is it hardcoded or can it be changed?

Answer:

Port 8080 is not hardcoded - it can be changed. It's the default ingress port for the Restate server.

You can configure it via the ingress.bind-address setting in the configuration file or by setting the environment variable RESTATE_INGRESS__BIND_ADDRESS. For example, to use port 9000 instead:

RESTATE_INGRESS__BIND_ADDRESS=0.0.0.0:9000 restate-server

Or in your TOML config file:

[ingress]
bind-address = "0.0.0.0:9000"


Question: In a saga what happens when the compensation func also failed?

Answer:

In a Saga pattern, when a compensation function itself fails, the system is designed to handle this gracefully to ensure that as many "undo" actions as possible are performed.

Think of it like a meticulous cleanup crew (the Saga orchestrator) trying to put things back to normal after a party (the main transaction) went wrong.

Here's what generally happens, based on the provided content:

The Compensation Error is Logged: When a specific "undo" step (compensation function) encounters an issue, this failure is recorded. It's like a cleanup crew member noting, "Couldn't put away the chairs because they're broken."

if _, compErr := compensation(); compErr != nil {
    // Log compensation errors, but don't stop other compensations
    log.Printf("Compensation failed: %v", compErr)
}

In this Go example, log.Printf("Compensation failed: %v", compErr) is used to record the failure.

Other Compensations Continue: Even though one compensation failed, the Saga orchestrator (the cleanup crew leader) doesn't just stop. It continues to try and execute all other necessary compensation functions. This is crucial because you want to revert as many changes as possible to prevent leaving the system in an even more inconsistent state.

The comment // Log compensation errors, but don't stop other compensations explicitly states this behavior.

Let's illustrate with our travel booking example:

Imagine the booking workflow tried to book a flight, a car, and a hotel. The hotel booking failed, so the Saga needs to compensate. In this scenario:

  • The Book Hotel step fails.
  • The BookingWorkflow starts compensating in reverse order.
  • Cancel Car succeeds.
  • Cancel Flight fails. The workflow logs this failure but does not stop.
  • If there were other compensations (e.g., "Cancel Travel Insurance"), it would attempt those as well.

The outcome is that while the system tried its best to undo all actions, the failure of a compensation means that part of the "undo" operation was unsuccessful. This often requires further monitoring, alerts, and potentially manual intervention or a more sophisticated retry mechanism for the failed compensation to fully resolve the inconsistency.

Question: In a saga do I have to retry all funcs within the saga?

Answer:

No, in a Saga implemented with Restate's Durable Execution, you generally do not have to manually retry all functions within the saga for transient failures. Restate handles this for you automatically!

Think of Restate's Durable Execution as a super-diligent assistant for your workflow. Here's how it works for both the main steps and the "undo" steps:

  1. Reliable Execution for Both Forward and Compensation Steps

Restate's core strength is its Durable Execution. This means that any operation you ask it to perform, whether it's a main "forward" step or a "compensating" (undo) step, is guaranteed to be reliably executed.

The website content explicitly states:

"Restate's Durable Execution is key here: it guarantees that both your forward steps and your compensating steps will always be reliably executed, even if your application or the Restate runtime crashes."

This guarantee includes automatic retries for transient errors. If a network blip occurs, or a service you're calling is temporarily unavailable, Restate's runtime will automatically retry that step until it succeeds or encounters a non-recoverable error.

For example, if a Book Flight call fails with a transient error, your SagaWorkflow doesn't need to implement for loops or try-catch blocks for retries; Restate handles the repeated attempts behind the scenes until the Book Flight operation eventually succeeds.

  2. When Retries Stop: The TerminalError

Retries by Restate continue until the operation genuinely succeeds or fails with a TerminalError.

Forward Steps: If one of your main booking steps (like Book Flight or Book Car) encounters a transient error, Restate will retry it. It only stops retrying that specific forward step if it hits a TerminalError (an error that means "this operation cannot succeed, stop trying"). At that point, the Saga then triggers the compensation (undo) steps for all previously successful actions.

A TerminalError is like the hotel saying, "We are permanently fully booked for those dates, trying again won't help."

Compensation Steps: Even the "undo" compensation steps benefit from Restate's Durable Execution and automatic retries. If Cancel Car encounters a transient network issue, Restate will retry it.

However, if a compensation function itself fails persistently (meaning it hits a TerminalError or keeps failing until the configured retries are exhausted), then:

  • The failure is logged.
  • Other compensations continue to be attempted.
  • This specific persistently failed compensation might require manual intervention or a separate, more sophisticated retry mechanism external to the immediate Saga execution to fully resolve the inconsistency.

This is what the log.Printf("Compensation failed: %v", compErr) snippet in the previous answer, and the note "Requires manual intervention or retry logic", were referring to.

if _, compErr := compensation(); compErr != nil {
    // Log compensation errors, but don't stop other compensations
    log.Printf("Compensation failed: %v", compErr)
}

This code shows that if a compensation fails (even after Restate's internal retries), it's logged, but the Saga continues to try other compensations.

In summary, for most day-to-day transient issues, Restate handles the retries automatically. You primarily focus on defining what a "terminal" failure looks like for your business logic, which then triggers the compensation flow.

What is the purpose of restate.sleep?

https://docs.restate.dev/develop/go/durable-timers

The restate.sleep() function (or ctx.sleep() within a handler) is used to pause the execution of a handler for a specified duration in a durable, fault-tolerant way. The key purposes and mechanisms are:

  • Durable Timers: Unlike a standard Thread.sleep(), which would cause the execution to fail or hang if the process/server crashes, restate.sleep() is a "durable timer". Restate tracks and manages these timers, so if a service fails during the sleep period, the Restate server ensures it resumes execution after the remaining duration has passed.
  • Resource Efficiency (FaaS): For function-as-a-service (FaaS) platforms (like AWS Lambda) that charge based on execution time, restate.sleep() is highly efficient. When this is called, Restate suspends the handler and frees up the resources/thread, and only wakes it up when the timer is done, saving compute costs.
  • Building Workflows and Timeouts: It allows developers to easily implement time-based logic, such as:
      • Pausing a workflow for a set period (e.g., waiting for an external system to process a request).
      • Setting timeouts for asynchronous operations or external calls.
      • Implementing patterns like cron jobs or delayed messages.
  • Reliable Execution: It is a core primitive of Restate's "durable execution" model. The sleep duration is recorded in a persistent journal, ensuring that the code always runs to completion reliably, even across failures and restarts.

In essence, restate.sleep() is a non-blocking, fault-tolerant way to introduce time delays in distributed applications and long-running workflows without consuming active resources.

Difference between ObjectContext and ObjectSharedContext

Per docs, ObjectContext and ObjectSharedContext are both used within Virtual Objects, but they provide different levels of access and concurrency guarantees.

ObjectContext

  • Access Level: Provides both read and write access to the virtual object's state.
  • Concurrency: Handlers using ObjectContext are exclusive. Only one such handler can run at a time for a given object key, ensuring strong consistency and preventing data conflicts.
  • Usage: Used for main business logic handlers where state modifications (e.g., updating a user profile, processing a transaction) are required.

ObjectSharedContext

  • Access Level: Provides read-only access to the virtual object's state.
  • Concurrency: Handlers using ObjectSharedContext are concurrent. Multiple shared handlers can run simultaneously for the same object key without blocking each other or waiting for exclusive handlers to complete (except during an active state transaction commit).
  • Usage: Typically used for querying the current status or state of an object, such as checking if a long-running operation is complete, without needing exclusive access.

Summary of Differences

| Feature | ObjectContext | ObjectSharedContext |
| --- | --- | --- |
| State Access | Read and Write | Read-only |
| Concurrency | Exclusive (one at a time) | Concurrent (multiple can run) |
| Handler Type | Exclusive handlers | Shared/Query handlers |
| Primary Use Case | Modifying object state | Querying object state |

Difference among basic service, virtual object and workflow

Per docs, basic services in Restate are for stateless or short-lived logic, Virtual Objects are for stateful, key-addressable entities, and Workflows are for long-running, multi-step processes that require durable execution for state consistency.

Basic Services

  • What they are: A grouping of related handlers that are exposed as callable endpoints.
  • State: Primarily stateless, though handlers can interact with other services.
  • Use case: Best for standard business logic, API integrations, and data processing where you don't need to maintain per-instance state between requests.

Virtual Objects

  • What they are: Stateful services that are addressable by a unique key.
  • State: Each object instance has its own persistent state associated with its unique key.
  • Concurrency: Restate ensures at most one exclusive handler runs for a given key at a time, using an intrinsic lock and a per-key queue for ordering.
  • Use case: Ideal for modeling stateful entities like a user account, a shopping cart, or a device, where you need to manage state and concurrency for a specific instance.

Workflows

  • What they are: Long-lived, multi-step processes with a defined lifecycle.
  • State: They are durable and their state is logged to ensure they can be recovered and completed even after failures.
  • Concurrency: They are designed for complex coordination and guarantee exactly-once execution for their steps.
  • Use case: Perfect for multi-step processes that require reliable, stateful orchestration, such as order processing, onboarding, or any multi-step business process where state consistency is critical.

| Feature | Basic Service | Virtual Object | Workflow |
| --- | --- | --- | --- |
| Primary Use | General business logic, data processing, API integrations. | Modeling stateful entities and sessions (e.g., a specific user, device, or conversation). | Orchestrating multi-step, long-running business processes that require external interactions. |
| State | Stateless (or state managed externally). | Stateful, with an isolated, key-addressable K/V store per instance. | Stateful, uses the durable execution engine to store progress and an embedded K/V store for state. |
| Addressing | Addressed by service and handler name. | Addressed by a unique key (e.g., user_id) and handler name. | Addressed by a unique workflow ID and handler name. |
| Concurrency | Handlers can run concurrently. | Exclusive handlers run one at a time per key; shared handlers can read concurrently. | Handlers run with durable execution guarantees; the run handler executes exactly once per workflow instance. |
| Lifecycle | Stateless, handlers run on request-response basis. | Long-lived, instances persist as long as their state is retained. | Long-lived processes with a defined lifecycle, capable of suspending and resuming. |

Summary of Differences

  • Basic Services are for general-purpose, stateless operations, similar to typical microservices. They are the starting point for most business logic without a need for distributed state management within Restate.
  • Virtual Objects are used when you need to model distinct, stateful entities (like an "account" or "session"). Their key feature is the guarantee that only one exclusive handler runs at a time for a given key, preventing concurrency conflicts when managing the entity's state.
  • Workflows are specifically designed for orchestrating complex, multi-step processes that need to be durable and resilient to failures. They can suspend their execution (e.g., to wait for a timer or an external signal) and will resume exactly where they left off, providing exactly-once execution semantics for the entire process.

In practice, you can combine these service types within the same application to leverage their respective strengths for different components of your business logic.

Does Restate address the outbox pattern?

Restate.dev effectively eliminates the need for implementing the outbox pattern manually by providing built-in primitives for reliable, durable execution and message delivery. The outbox pattern is a common solution in microservices to solve the "dual write problem" (reliably updating a database and publishing an event in a single atomic transaction). Restate handles this challenge internally through its architecture, which offers:

  • Durable Execution: Restate treats operations as durable functions (or "virtual objects") whose state and execution are persisted in a reliable log. This means that the intent to send a message is part of the durable state of the service.

  • Reliable Messaging: When a Restate service invokes another service or sends a message, Restate guarantees that the message will be delivered at least once, with automatic retries handled by the platform. The handler can complete and return without needing to manually manage an outbox table or background processor.

  • Idempotency: Restate automatically deduplicates calls made within the same handler execution, which simplifies the consumer side of messaging and helps prevent duplicate processing in case of retries, a common challenge when implementing the outbox pattern from scratch.

  • Built-in Consistency: By managing the state and the outbound messages within its own transaction log, Restate ensures that the system remains consistent, abstracting away the complex plumbing typically required by the outbox pattern.

In essence, Restate's design provides a more direct and operationally simpler solution to the problems the outbox pattern addresses, allowing developers to focus on business logic rather than distributed systems plumbing.
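For contrast, the consumer-side work that at-least-once delivery normally demands can be sketched in Go. This is a generic illustration of idempotent message handling, not a description of how Restate deduplicates internally; `message`, `consumer`, and the ID format are hypothetical.

```go
package main

import "fmt"

// message carries an idempotency key so redeliveries can be recognised.
type message struct {
	id      string
	payload string
}

// consumer processes each message ID at most once, even if the same
// message is delivered again after a retry.
type consumer struct {
	seen      map[string]bool
	processed []string
}

func newConsumer() *consumer { return &consumer{seen: map[string]bool{}} }

// handle returns true if the message was processed and false if it was
// recognised as a duplicate and skipped.
func (c *consumer) handle(m message) bool {
	if c.seen[m.id] {
		return false
	}
	c.seen[m.id] = true
	c.processed = append(c.processed, m.payload)
	return true
}

func main() {
	c := newConsumer()
	m := message{id: "order-1/confirm", payload: "order confirmed"}

	fmt.Println(c.handle(m))      // true: first delivery is processed
	fmt.Println(c.handle(m))      // false: the retry is skipped
	fmt.Println(len(c.processed)) // 1
}
```

With a hand-rolled outbox, this bookkeeping and the outbox table itself are the application's responsibility, which is the plumbing the answer above says Restate takes over.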

How to use virtual object for notifications?

https://docs.restate.dev/develop/go/external-events

When designing a notification system with Restate, virtual objects are the ideal primitive for representing and managing a user's notification state. A virtual object, identified by a user ID, encapsulates all the logic and state related to that user's notifications. This approach guarantees strong consistency and reliable delivery for each individual user. The virtual object notification pattern This pattern uses a dedicated virtual object for each user. It stores notification data in the object's durable state and provides handlers to manage and send notifications.

  1. Define the virtual object. Create a UserNotifications virtual object. Its state, including a list of notifications, is stored durably by Restate. The object is keyed by a userId.

  2. Send messages to the virtual object. From any other service or virtual object in your application, you can send one-way messages to the UserNotifications object to trigger an action.

  3. Implement durable and external communications. For real-world notification delivery, integrate with an external push service like Firebase Cloud Messaging (FCM) or a webhook. Restate's durability primitives ensure delivery is reliable even if the external service is temporarily unavailable.
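The shape of this pattern can be sketched in plain Go. This is an in-memory stand-in, not Restate SDK code: NotificationObjects, Add, MarkAllRead, and Unread are hypothetical names, and a per-user mutex imitates the one-invocation-at-a-time behavior that Restate provides per object key. Nothing here is durable.

```go
package main

import (
	"fmt"
	"sync"
)

// Notification is one entry in a user's notification list.
type Notification struct {
	Message string
	Read    bool
}

// userNotifications holds the state for a single user key.
type userNotifications struct {
	mu    sync.Mutex
	items []Notification
}

// NotificationObjects maps each user ID to its own state holder,
// imitating one virtual object per key.
type NotificationObjects struct {
	mu      sync.Mutex
	objects map[string]*userNotifications
}

func NewNotificationObjects() *NotificationObjects {
	return &NotificationObjects{objects: make(map[string]*userNotifications)}
}

// get returns the object for userID, creating it on first use.
func (n *NotificationObjects) get(userID string) *userNotifications {
	n.mu.Lock()
	defer n.mu.Unlock()
	obj, ok := n.objects[userID]
	if !ok {
		obj = &userNotifications{}
		n.objects[userID] = obj
	}
	return obj
}

// Add appends a notification for the user and returns the unread count.
func (n *NotificationObjects) Add(userID, message string) int {
	obj := n.get(userID)
	obj.mu.Lock() // calls for the same user are processed one at a time
	defer obj.mu.Unlock()
	obj.items = append(obj.items, Notification{Message: message})
	return countUnread(obj.items)
}

// MarkAllRead flags every notification of the user as read.
func (n *NotificationObjects) MarkAllRead(userID string) {
	obj := n.get(userID)
	obj.mu.Lock()
	defer obj.mu.Unlock()
	for i := range obj.items {
		obj.items[i].Read = true
	}
}

// Unread returns the number of unread notifications for the user.
func (n *NotificationObjects) Unread(userID string) int {
	obj := n.get(userID)
	obj.mu.Lock()
	defer obj.mu.Unlock()
	return countUnread(obj.items)
}

func countUnread(items []Notification) int {
	c := 0
	for _, it := range items {
		if !it.Read {
			c++
		}
	}
	return c
}

func main() {
	objs := NewNotificationObjects()
	var wg sync.WaitGroup
	// Several senders notify the same user concurrently.
	for i := 0; i < 5; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			objs.Add("user-1", fmt.Sprintf("order %d shipped", i))
		}(i)
	}
	wg.Wait()
	fmt.Println("unread for user-1:", objs.Unread("user-1"))
}
```

In a real deployment the list would live in the object's durable state and the senders would use one-way messages, so the serialization and persistence shown here by hand would come from Restate itself.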

Advantages of using virtual objects for notifications

  • Reliable, stateful delivery: Each virtual object is a durable, stateful entity. It guarantees that a notification is never lost, even if there are failures during the process of adding it to a user's queue or sending it to an external provider.

  • Strong consistency: Restate processes all invocations for a single virtual object key (the user ID) serially. This guarantees that notifications for a single user are handled in the order they arrive, preventing race conditions.

  • Decoupled microservices: The sending service (e.g., OrderService) doesn't need to know the details of notification delivery. It simply sends a reliable message to the UserNotifications object, and Restate handles the coordination.

  • Simplified logic: The notification logic, including managing the unread count and read status, is encapsulated within the virtual object, making it easy to manage.

  • Exactly-once semantics: Restate deduplicates messages between handlers, so each message sent to the UserNotifications object is processed once. The call to an external provider can still be retried after a failure, so pass an idempotency key to the provider if duplicate pushes must be avoided there.

How to archive or evict inactive virtual objects?

In Restate, the state of Virtual Objects is retained indefinitely by default and is not automatically evicted. To clean up inactive objects, you must implement your own application-level logic to clear the state and handle their lifecycle. Here are the methods for clearing state and implementing your cleanup logic.

Method 1: Use ctx.clearAll() within your service

You can implement a handler within your Virtual Object service that is responsible for clearing all of its state. This is useful for clearing an object when a certain event happens.

How it works: Add a cleanup handler to your service's code. This handler should use the ctx.clearAll() function, which deletes all state associated with the current object's key. Your application logic can then call this cleanup handler when it detects that an object has become inactive.

Method 2: Use a timer for automated cleanup

For inactivity-based eviction, you can set a timer that triggers a cleanup after a specified period of inactivity.

How it works: Every time the Virtual Object becomes active, reset a durable timer. If the timer expires, it triggers a function that performs the cleanup using ctx.clearAll().
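One way to get the "reset" behavior without cancelling earlier timers is to stamp every scheduled cleanup with a generation number and ignore timers whose generation is stale. The sketch below shows only that bookkeeping in plain Go. Session, Touch, and OnTimer are hypothetical names; in a Restate service the counter would be kept in the object's state and OnTimer would be a handler invoked by a delayed call, which is an assumption about how you would wire it rather than something shown here.

```go
package main

import "fmt"

// Session models the state of one object key plus a generation counter
// that is bumped on every activity.
type Session struct {
	State      map[string]string
	generation int
}

func NewSession() *Session {
	return &Session{State: make(map[string]string)}
}

// Touch records activity and returns the generation that the newly
// scheduled cleanup timer must carry.
func (s *Session) Touch(key, value string) int {
	s.State[key] = value
	s.generation++
	return s.generation
}

// OnTimer is what a delayed cleanup call would run. It clears the state
// only if no activity happened after the timer was scheduled, and reports
// whether it cleared anything.
func (s *Session) OnTimer(scheduledGeneration int) bool {
	if scheduledGeneration != s.generation {
		return false // a newer timer exists; this one is stale
	}
	s.State = make(map[string]string)
	return true
}

func main() {
	s := NewSession()
	first := s.Touch("cart", "1 item")
	second := s.Touch("cart", "2 items")

	// The timer from the first activity fires, but the object was used since.
	fmt.Println("first timer cleared state:", s.OnTimer(first))
	// The timer from the latest activity fires with no activity in between.
	fmt.Println("second timer cleared state:", s.OnTimer(second))
	fmt.Println("remaining keys:", len(s.State))
}
```

The design choice here is that stale timers are cheap no-ops, so the handler never has to track or cancel outstanding timers.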

Method 3: Use the Restate CLI for manual eviction

For manual, ad-hoc cleanup, you can use the Restate command-line interface (CLI) to clear an object's state.

How it works:

  1. Ensure you have the Restate CLI installed.

  2. Run the command restate state keys to see the keys of your Virtual Objects.

  3. Use the restate state clear command with the object's key to delete its state.

Summary of strategies

| Strategy | Pros | Cons |
| --- | --- | --- |
| Service handler with ctx.clearAll() | Direct, explicit control over when an object is cleared. | Requires external application logic to trigger the handler. |
| Durable timer for inactivity | Fully automated and integrated cleanup based on inactivity. | Requires careful management of timers to ensure correct behavior. |
| Restate CLI | Simple for manual or one-off cleanup tasks. | Not suitable for automated, production-level lifecycle management. |

What are the size limits for Restate primitives?

There are no hardcoded, fixed size limits for Restate primitives like state, journals, or messages. However, practical constraints can arise depending on your deployment environment and payload sizes. The system is engineered to handle large amounts of data efficiently, though excessive data can impact performance.

State and journal storage

  • Virtual Objects: The state for each Virtual Object is key-based and stored in an embedded RocksDB database by default.

  • Persistent storage: The journal for each durable invocation, which includes state and event history, is also stored in RocksDB. For clustered deployments, snapshots of this data are saved to an object store, like Amazon S3.

  • Performance impact: While there is no hard limit on the size of the state or journal, exceedingly large data can slow down handler execution and increase memory usage. Restate provides features like "lazy state" to help manage performance with large objects.

Function-as-a-Service (FaaS) payload limits

  • AWS Lambda: One of the most significant practical constraints comes from the payload size limits of the FaaS platform you use. For example, AWS Lambda has a payload limit of 6 MB.

  • Automatic compression: To mitigate this, Restate automatically compresses payloads for AWS Lambda when approaching the 6 MB limit, allowing for larger journals and state to be used with the platform. This feature is available in the Restate TypeScript SDK (version 1.9+).

Best practices for managing large data

To avoid potential performance impacts, consider these strategies when working with large data in Restate:

  • Keep state lean: For Virtual Objects, only keep the necessary, small pieces of data in the internal state.

  • Use external databases: For larger documents or objects that don't need transactional consistency with your handler's logic, store them in an external database and save only a reference ID in your Restate state.

  • Utilize lazy state: If you have large state objects that are only needed in certain code paths, use the lazy state feature to only fetch the data when required.

  • Manage invocation history: For auxiliary or high-volume Virtual Objects where the invocation history adds little value, you can configure Restate to not retain histories, which reduces storage consumption.

Why does saga example at https://github.com/restatedev/examples/tree/main/go/tutorials/tour-of-workflows-go/examples use WorkflowContext while in this framework.go it uses restate.Context, what is the difference and is it significant or does it matter?

WorkflowContext vs restate.Context

Key Differences:

  • WorkflowContext is specific to Workflow entities and is used in the required Run handler that implements the workflow logic. Restate executes Run exactly once per workflow ID.

  • restate.Context is the general context for handlers of regular Services. It provides durable steps, calls, and timers, but no keyed state. Virtual Object handlers receive restate.ObjectContext instead.

Why It Matters:

  • Workflow Guarantees: The Run handler with WorkflowContext executes exactly one time per workflow ID, providing stronger once-only guarantees.

  • Semantic Clarity: Using WorkflowContext makes it explicit that this is a workflow with special execution semantics.

  • Workflow-specific features: WorkflowContext gives access to state keyed by the workflow ID and to durable promises, neither of which is available through a plain restate.Context.

Your framework's DurableWorkflow uses restate.Context in a regular Service, which is fine for workflow patterns but does not make it a true Workflow entity.

Summary:

DurableWorkflow (your framework) - Uses ServiceContext with restate.Context

  • This is a workflow pattern for orchestration with saga compensation

  • Used in regular Services

  • Great for multi-step business processes

WorkflowEntity (Restate's Workflow) - Uses WorkflowContext with restate.WorkflowContext

  • This is a Workflow entity with once-only execution guarantees

  • Used for long-running workflows that need coordination

  • Perfect for approval workflows, saga orchestrators, etc.

When to Use Each:

Use DurableWorkflow (your pattern):

// Inside a regular Service handler
workflow := &DurableWorkflow{...}
workflow.Run(serviceContext) // Multi-step saga pattern

Use WorkflowEntity (Restate's Workflow):

// As a separate Workflow entity
type MyWorkflow struct { *WorkflowEntity }
func (w *MyWorkflow) Run(ctx restate.WorkflowContext) error {
    // Long-running workflow with awakeables, promises, etc.
    return nil // placeholder so the handler compiles
}

The saga example uses WorkflowContext because it's implementing a true Restate Workflow entity with once-only execution semantics, while your DurableWorkflow is a reusable pattern that can be embedded in any service handler!

Is it fine to use awakeable with restate virtual objects?

https://docs.restate.dev/develop/java/external-events

Yes, it is perfectly fine and a recommended pattern to use awakeables with Restate virtual objects. This is a key part of how virtual objects can coordinate with external systems in a durable, stateful way.

Why this pattern works

  • Virtual Objects for managing state and concurrency: Restate virtual objects are stateful, key-addressable services. This means they represent a specific entity (e.g., a user_id or session_id) and handle interactions with it, guaranteeing only one write-access handler runs at a time for that object's key.

  • Awakeables for handling external events: Awakeables are used for handling asynchronous external events, such as webhooks from a payment provider or an email link clicked by a user. An awakeable creates a durable "promise" that a handler can wait on. When the external event occurs, the awakeable is completed, and the suspended handler resumes.

  • The combination for durable, multi-step interactions: The durable, stateful nature of a virtual object, combined with the waiting capabilities of an awakeable, allows you to implement robust, long-running processes that interact with the outside world.

One caveat follows from the concurrency guarantee: while a write-access handler is waiting on an awakeable, other write-access calls for the same key queue up behind it. For long waits, keep that in mind when deciding which handler does the waiting.

A typical pattern for using awakeables with virtual objects

  1. Receive an external request: A client makes a request to a handler in your virtual object.

  2. Create an awakeable: Inside the virtual object handler, you create an awakeable. This generates a unique awakeable ID and a future-like object that you can await on.

  3. Initiate an external action: You send the awakeable ID to an external system. For example, if you are integrating with a payment provider, you start a transaction and provide the awakeable ID as part of the metadata for the webhook.

  4. Wait for the result: Your virtual object handler suspends, waiting for the awakeable to be completed. The handler does not block a thread and Restate saves its state, so the wait can last for hours or even months.

  5. External system responds: When the external system (e.g., the payment provider) sends a webhook, Restate uses the awakeable ID to complete the awakeable.

  6. Handler resumes: The suspended virtual object handler resumes from where it left off, and the program flow continues.
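The create, hand out ID, wait, resolve flow can be imitated in a few lines of plain Go. This is only an in-memory analogy under stated assumptions: Awakeables, Create, and Resolve are hypothetical names, not Restate SDK calls, and unlike a real awakeable nothing here survives a process restart.

```go
package main

import (
	"fmt"
	"sync"
)

// Awakeables is an in-memory registry of pending completions, keyed by ID.
type Awakeables struct {
	mu      sync.Mutex
	nextID  int
	pending map[string]chan string
}

func NewAwakeables() *Awakeables {
	return &Awakeables{pending: make(map[string]chan string)}
}

// Create registers a new pending awakeable. It returns the ID to hand to
// the external system and a function that blocks until the result arrives.
func (a *Awakeables) Create() (id string, await func() string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	a.nextID++
	id = fmt.Sprintf("awk-%d", a.nextID)
	ch := make(chan string, 1) // buffered so Resolve never blocks
	a.pending[id] = ch
	return id, func() string { return <-ch }
}

// Resolve completes the awakeable with the given payload. It returns false
// if the ID is unknown or was already completed.
func (a *Awakeables) Resolve(id, payload string) bool {
	a.mu.Lock()
	ch, ok := a.pending[id]
	delete(a.pending, id)
	a.mu.Unlock()
	if !ok {
		return false
	}
	ch <- payload
	return true
}

func main() {
	aw := NewAwakeables()

	// Steps 2 and 3: the handler creates the awakeable and passes the ID on.
	id, await := aw.Create()
	fmt.Println("sent to payment provider:", id)

	// Step 5: the external system calls back with the ID (simulated webhook).
	go aw.Resolve(id, "payment confirmed")

	// Steps 4 and 6: the handler waits, then continues with the result.
	fmt.Println("handler resumed with:", await())
}
```

The second call to Resolve with the same ID returning false mirrors the idea that an awakeable is completed once; duplicate webhooks then have no further effect on the waiting handler.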

What is the better term for workflow in the context of restate.dev (not to confuse with workflow/activity in temporal)?

In the context of restate.dev, a better term to use instead of "workflow" is "durable function" or "durable service". Restate uses the concept of "durable execution" for any function or service, not just a formal "workflow," which helps distinguish its approach from Temporal's more explicit workflow-activity model. Here's a breakdown of the differences:

| | Restate's "Durable Functions" | Temporal's "Workflows" |
| --- | --- | --- |
| Granularity | Applies to any function or service handler. Developers can start with a simple durable RPC call and gradually introduce finer-grained durable steps. | A "Workflow Execution" is a distinct, top-level concept that orchestrates "Activity Executions". |
| Complexity | It is lightweight and can feel like a regular RPC call, but with built-in durability. It eliminates the need for separate workers, as the Restate server pushes tasks directly to the functions. | The distinction between a "Workflow" and an "Activity" is more explicit, involving dedicated workers that pull tasks from queues. |
| Scope | Goes beyond traditional workflows. Restate can manage stateful AI agents, orchestrate microservices, and process events with durability, in addition to standard multi-step processes. | While very powerful, its core is built around the explicit coordination of activities via workflows. |
| Developer Experience | The SDK provides durable versions of standard programming constructs, such as durable steps (ctx.run), promises (ctx.promise), and timers (ctx.sleep), directly inside ordinary code. | Developers write deterministic "Workflow Functions" that orchestrate non-deterministic "Activity Functions," a mental model that differs from a standard application service. |

By framing the concept as a durable function or service, you emphasize that any function can be made resilient and stateful with Restate, without being confined to a rigid, top-level workflow definition.