POST /tables blocks HTTP thread for minutes with high-partition-count SASL/SSL Kafka realtime table

**Environment:** Pinot controller, realtime table, large partition count (e.g. 100+), multiple replica groups, Kafka over SASL/SSL.

**Symptom:** `POST /tables` takes several minutes to respond when the Kafka topic has a large number of partitions. The client times out, but the controller eventually completes the work correctly and writes the ideal state.

**Root cause (traced from source):**

`addTable` runs fully synchronously on the HTTP thread. After `InstanceAssignmentDriver` finishes (ZK-only, fast), `PinotLLCRealtimeSegmentManager.setUpNewTable()` calls `getNewPartitionGroupMetadataList()`, which eventually reaches `StreamMetadataProvider.computePartitionGroupMetadata()`. That method loops over all partitions sequentially: for each partition it constructs a new `KafkaConsumer` (a full SASL/SSL handshake) and calls `fetchStreamPartitionOffset`. With SASL/SSL, each handshake takes several seconds, so the total time scales linearly with partition count, and the HTTP thread is blocked for the whole duration.

Call chain:
```
POST /tables (PinotTableRestletResource.java:262)
  → PinotHelixResourceManager.addTable() (line 1866)
    → PinotLLCRealtimeSegmentManager.setUpNewTable() (line 379)
      → getNewPartitionGroupMetadataList()
        → PinotTableIdealStateBuilder.getPartitionGroupMetadataList()
          → PartitionGroupMetadataFetcher.call()
            → StreamMetadataProvider.computePartitionGroupMetadata()
              → for i in 0..N:  ← serial, no parallelism
                  new KafkaPartitionLevelConnectionHandler(...)
                    → new KafkaConsumer<>()  ← SASL/SSL handshake per partition
                  → fetchStreamPartitionOffset()
```
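To make the linear scaling concrete, here is a rough cost model of the loop above. It is only a back-of-the-envelope sketch: the class and method names are hypothetical, the 2-second handshake figure is an assumed value standing in for "several seconds", and it ignores everything except the per-partition handshake.

```java
// Hypothetical cost model, not Pinot code. It only illustrates how
// wall-clock time grows with partition count in the serial loop.
class HandshakeCostModel {

  // Serial loop: one handshake after another on a single thread.
  static double serialSeconds(int partitions, double handshakeSeconds) {
    return partitions * handshakeSeconds;
  }

  // Idealized parallel variant: handshakes run in "waves" of `workers`.
  static double parallelSeconds(int partitions, double handshakeSeconds, int workers) {
    int waves = (partitions + workers - 1) / workers; // ceil(partitions / workers)
    return waves * handshakeSeconds;
  }

  public static void main(String[] args) {
    // Assumed numbers: 100 partitions, 2 s per SASL/SSL handshake.
    System.out.println(serialSeconds(100, 2.0));
    System.out.println(parallelSeconds(100, 2.0, 16));
  }
}
```

Under these assumed numbers the serial path is already in the multi-minute range, which matches the observed symptom.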

**Questions:**
1. Is this expected? Is there a known workaround for large partition counts with SASL/SSL?
2. Is there a path to parallelize the per-partition offset fetch in `computePartitionGroupMetadata`?
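For question 2, this is roughly the shape of parallelization I have in mind. It is a minimal sketch using only `java.util.concurrent`, not a patch against Pinot: `OffsetFetcher` and `fetchAll` are hypothetical names, the fetcher stands in for the per-partition "create consumer, then `fetchStreamPartitionOffset`" step, and the 30-second per-partition timeout is an assumed value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

class ParallelOffsetFetch {

  // Hypothetical stand-in for the per-partition connect + offset fetch.
  interface OffsetFetcher {
    long fetchOffset(int partitionId) throws Exception;
  }

  // Submits one task per partition to a bounded pool and collects the
  // results in partition order. Any failure or timeout propagates.
  static Map<Integer, Long> fetchAll(int numPartitions, int parallelism, OffsetFetcher fetcher)
      throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(parallelism);
    try {
      List<Future<Long>> futures = new ArrayList<>();
      for (int p = 0; p < numPartitions; p++) {
        final int partitionId = p;
        Callable<Long> task = () -> fetcher.fetchOffset(partitionId);
        futures.add(pool.submit(task));
      }
      Map<Integer, Long> offsets = new TreeMap<>();
      for (int p = 0; p < numPartitions; p++) {
        // Assumed timeout; a real value would come from stream config.
        offsets.put(p, futures.get(p).get(30, TimeUnit.SECONDS));
      }
      return offsets;
    } finally {
      pool.shutdownNow(); // also cancels outstanding tasks on failure
    }
  }

  public static void main(String[] args) throws Exception {
    // Fake fetcher that sleeps briefly to imitate a slow handshake.
    Map<Integer, Long> offsets = fetchAll(8, 4, p -> {
      Thread.sleep(50);
      return p * 10L;
    });
    System.out.println(offsets);
  }
}
```

A bounded pool keeps the number of simultaneous broker connections under control. Whether the existing metadata provider can safely be driven this way is exactly what I am unsure about.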

Issue #18743
