POST /tables blocks HTTP thread for minutes with high-partition-count SASL/SSL Kafka realtime table #18743

@udaysagar2177

Environment: Pinot controller, realtime table, large partition count (e.g. 100+), multiple replica groups, Kafka over SASL/SSL.

Symptom: POST /tables takes several minutes to respond when the Kafka topic has a large number of partitions. The client times out, but the controller eventually completes the work correctly and writes the ideal state.

Root cause (traced from source):

addTable is fully synchronous on the HTTP thread. After InstanceAssignmentDriver finishes (ZK-only, fast), PinotLLCRealtimeSegmentManager.setUpNewTable() calls getNewPartitionGroupMetadataList(), which ends up in StreamMetadataProvider.computePartitionGroupMetadata(). That method iterates over the partitions one at a time: for each partition it constructs a new KafkaConsumer (which performs a full SASL/SSL handshake) and then calls fetchStreamPartitionOffset. With SASL/SSL, each handshake takes several seconds, so the total time grows linearly with the partition count, and the HTTP thread stays blocked for the whole duration.
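To put rough numbers on the linear scaling, here is a back-of-the-envelope estimate. The 3-second per-partition cost is an assumption chosen for illustration ("several seconds" per handshake), not a measurement, and `SerialFetchEstimate` is a hypothetical helper, not Pinot code.

```java
import java.time.Duration;

final class SerialFetchEstimate {

    // Serial cost: one consumer setup plus one offset fetch per partition, back to back.
    static Duration serialTotal(int partitionCount, Duration perPartitionCost) {
        return perPartitionCost.multipliedBy(partitionCount);
    }

    public static void main(String[] args) {
        Duration perPartition = Duration.ofSeconds(3); // assumed value, not measured
        for (int partitions : new int[] {10, 100, 500}) {
            System.out.println(partitions + " partitions -> " + serialTotal(partitions, perPartition));
        }
    }
}
```

Under that assumption, 100 partitions come to about 5 minutes of wall-clock time on the HTTP thread, which is consistent with the client-side timeouts described above.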

Call chain:

POST /tables (PinotTableRestletResource.java:262)
  → PinotHelixResourceManager.addTable() (line 1866)
    → PinotLLCRealtimeSegmentManager.setUpNewTable() (line 379)
      → getNewPartitionGroupMetadataList()
        → PinotTableIdealStateBuilder.getPartitionGroupMetadataList()
          → PartitionGroupMetadataFetcher.call()
            → StreamMetadataProvider.computePartitionGroupMetadata()
              → for i in 0..N:  ← serial, no parallelism
                  new KafkaPartitionLevelConnectionHandler(...)
                    → new KafkaConsumer<>()  ← SASL/SSL handshake per partition
                  → fetchStreamPartitionOffset()

Questions:

  1. Is this expected? Is there a known workaround for large partition counts with SASL/SSL?
  2. Is there a path to parallelize the per-partition offset fetch in computePartitionGroupMetadata?
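To make question 2 concrete, here is a minimal sketch of what a bounded-parallel fetch could look like. It is not Pinot code: `ParallelOffsetFetchSketch`, `fetchAll`, and `simulatedFetch` are hypothetical names, and the 200 ms sleep merely stands in for the per-partition consumer setup and offset request. It assumes each per-partition fetch is independent and safe to run on its own thread, which I have not verified for the actual provider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.function.IntToLongFunction;

final class ParallelOffsetFetchSketch {

    // Fetches one offset per partition on a bounded thread pool.
    // Returns offsets indexed by partition id; fails if any fetch fails or times out.
    static long[] fetchAll(int partitionCount, int maxThreads,
                           IntToLongFunction fetchOffset, long timeoutSeconds) {
        int threads = Math.max(1, Math.min(maxThreads, partitionCount));
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Callable<Long>> tasks = new ArrayList<>();
            for (int i = 0; i < partitionCount; i++) {
                final int partitionId = i;
                tasks.add(() -> fetchOffset.applyAsLong(partitionId));
            }
            // invokeAll returns futures in task order and cancels any task
            // that has not finished when the timeout expires.
            List<Future<Long>> futures = pool.invokeAll(tasks, timeoutSeconds, TimeUnit.SECONDS);
            long[] offsets = new long[partitionCount];
            for (int i = 0; i < partitionCount; i++) {
                offsets[i] = futures.get(i).get(); // CancellationException if timed out
            }
            return offsets;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while fetching offsets", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("offset fetch failed", e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    // Stand-in for "create consumer, handshake, fetch offset"; purely simulated.
    static long simulatedFetch(int partitionId) {
        try {
            Thread.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted on partition " + partitionId, e);
        }
        return 1000L + partitionId;
    }

    public static void main(String[] args) {
        long start = System.nanoTime();
        long[] offsets = fetchAll(8, 4, ParallelOffsetFetchSketch::simulatedFetch, 30);
        long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
        System.out.println(offsets.length + " offsets fetched in ~" + elapsedMs + " ms");
    }
}
```

The pool is bounded so that a table with hundreds of partitions does not open hundreds of broker connections at once, and the overall timeout keeps the calling thread from blocking indefinitely. With N partitions and T threads the wall-clock time should drop to roughly ceil(N / T) handshakes instead of N, if the handshakes really are independent.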
