Optionally allow capping streaming text speed at a maximum rate #236

@bastianolea

Description

For production applications that serve many concurrent users, very fast or even instant text generation can be a drawback: it lets users fire off more requests to the AI without much thought, burning tokens in the process.

An optional argument to cap text streaming speed could help. Limiting the rate to a value that is still fast enough to read along comfortably would encourage users to read answers as they stream, rather than wait for the full answer to appear. It would also space out new messages, since users would have to wait a reasonable amount of time before the full answer is shown.

This would be a purely cosmetic (UI-only) change.
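To make the request concrete, here is a minimal sketch of the idea in Python. It is not this project's API: the function name `throttle_stream` and the parameter `max_chars_per_sec` are hypothetical, and the sketch assumes the streamed response is available as an iterable of text chunks. Each chunk is delayed only as much as needed to keep the displayed text under the cap, so a stream that is already slower than the cap passes through unchanged.

```python
import time
from typing import Callable, Iterable, Iterator


def throttle_stream(
    chunks: Iterable[str],
    max_chars_per_sec: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[str]:
    """Yield chunks unchanged, but no faster than max_chars_per_sec.

    Hypothetical helper for illustration; `sleep` and `clock` are
    injectable so the timing logic can be tested without real delays.
    """
    if max_chars_per_sec <= 0:
        raise ValueError("max_chars_per_sec must be positive")

    # Earliest time at which the next chunk may be shown.
    ready_at = clock()
    for chunk in chunks:
        wait = ready_at - clock()
        if wait > 0:
            # Upstream is faster than the cap: hold the chunk back.
            sleep(wait)
        yield chunk
        # A chunk of n characters "uses up" n / rate seconds of display time.
        # max() avoids banking unused time when upstream was slower than the cap.
        ready_at = max(ready_at, clock()) + len(chunk) / max_chars_per_sec
```

A UI layer could wrap whatever iterator it already consumes, for example `for piece in throttle_stream(response_chunks, max_chars_per_sec=60): render(piece)`, where `response_chunks` and `render` are placeholders. Because the throttling happens only on the display side, token generation and cost per request are unaffected, which matches the cosmetic scope described above.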

Metadata

Labels

Priority: Low (low-impact bug, docs polish, papercut, or unclear low-severity request)
ai-triage:accepted (a human reviewed and accepted the AI triage result)
