Skip to content

fix: Avoid overflows when reading json inputs#8676

Open
whoisj wants to merge 18 commits intomainfrom
jwyman/tri-673
Open

fix: Avoid overflows when reading json inputs#8676
whoisj wants to merge 18 commits intomainfrom
jwyman/tri-673

Conversation

@whoisj
Copy link
Contributor

@whoisj whoisj commented Feb 24, 2026

Check input from HTTP+JSON sizes before allocating memory for them.

Protect against integer overflows while we're at it.

@whoisj whoisj added the PR: fix A bug fix label Feb 24, 2026
@mudit-eng
Copy link
Contributor

LGTM

@mattwittwer
Copy link
Contributor

One edge case:
max_input_size_ is checked per-input rather than cumulatively, allowing up to a 4x memory limit bypass.
This PR checks byte_size > max_input_size_ to prevents excessive memory allocation. However, because this check is evaluated on individual inputs rather than the cumulative total of the entire HTTP request, a single request can still cause the server to allocate up to 4x the memory limit.

The scenario:
The upstream EVBufferToJson limits the physical size of the incoming HTTP JSON payload to max_input_size_.
However, the most compact JSON representation of an element is 0, (2 bytes), while an FP64 element allocates 8 bytes in memory.
With a limit set to --http-max-input-size=128000000 (128 MB), a single 128 MB JSON payload formatted like this can allocate 512MB:

{
  "parameters": {
    "in_1": {"datatype": "FP64", "data": [ ... 16 million zeros (32 MB JSON) ... ]},
    "in_2": {"datatype": "FP64", "data": [ ... 16 million zeros (32 MB JSON) ... ]},
    "in_3": {"datatype": "FP64", "data": [ ... 16 million zeros (32 MB JSON) ... ]},
    "in_4": {"datatype": "FP64", "data": [ ... 16 million zeros (32 MB JSON) ... ]}
  }
}

What happens:
The upstream check passes because the total HTTP payload is exactly 128 MB.
ExactMappingInput parses in_1. The calculated byte_size is 128 MB. The check if (byte_size > max_input_size_) passes. Triton allocates 128 MB. This repeats for in_2, in_3, and in_4. Resulting in the server allocating 512 MB of memory for a single request, bypassing the 128 MB max_input_size_ limit.

Copy link
Contributor

@yinggeh yinggeh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Though I think it's a bit of over-engineering here.

@whoisj
Copy link
Contributor Author

whoisj commented Feb 27, 2026

max_input_size_ is checked per-input rather than cumulatively, allowing up to a 4x memory limit bypass.

Agreed about exceeding the overall limit. I'll create a follow up to address that separately. Thanks for pointing it out.

@whoisj whoisj requested a review from yinggeh March 2, 2026 22:09
@whoisj whoisj force-pushed the jwyman/tri-673 branch 2 times, most recently from c603e3d to 75c3c95 Compare March 3, 2026 00:09
@yinggeh
Copy link
Contributor

yinggeh commented Mar 4, 2026

PR title isn't following the convention.

@yinggeh
Copy link
Contributor

yinggeh commented Mar 4, 2026

I think test is required for such code change. Whether verify its functionality with existing ones or by creating a new test.

@whoisj whoisj requested a review from yinggeh March 4, 2026 20:03
@whoisj whoisj changed the title Avoid overflows when reading json inputs fix: Avoid overflows when reading json inputs Mar 5, 2026
@whoisj
Copy link
Contributor Author

whoisj commented Mar 6, 2026

I think test is required for such code change. Whether verify its functionality with existing ones or by creating a new test.

Added 2 new tests. Let me know if they look good to you.

@whoisj whoisj requested a review from yinggeh March 6, 2026 01:48
)
error_msg = response.content.decode()
self.assertEqual(
'{"error":"Request JSON size of 89478576 bytes exceeds the maximum allowed value of 67108864 bytes. Use --http-max-input-size to increase the limit."}',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to see tests that passing with size 67108864 but failed with 67108864+1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed the format.

fi

# Run test to verify that large inputs fail with default limit
python http_input_size_limit_test.py InferSizeLimitTest.test_type_size_explosion >> $CLIENT_LOG 2>&1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am confused with the naming here. Can you use a more descriptive name?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you have a suggestion?

@whoisj whoisj force-pushed the jwyman/tri-673 branch 4 times, most recently from 3f045a5 to 52f656f Compare March 9, 2026 15:49
@whoisj whoisj force-pushed the jwyman/tri-673 branch 2 times, most recently from db78671 to a67598d Compare March 11, 2026 20:39
@whoisj whoisj marked this pull request as draft March 19, 2026 22:02
@whoisj
Copy link
Contributor Author

whoisj commented Mar 19, 2026

Converted to a draft after hitting another SEGFAULT bug. Will reopen once the tests are clean.

@whoisj whoisj force-pushed the jwyman/tri-673 branch 5 times, most recently from e793a58 to 77b3f26 Compare March 20, 2026 20:12
@whoisj whoisj marked this pull request as ready for review March 23, 2026 16:08
@whoisj whoisj requested a review from yinggeh March 23, 2026 16:09
@whoisj whoisj force-pushed the jwyman/tri-673 branch 2 times, most recently from d864e63 to fcaabcf Compare March 23, 2026 20:08
@whoisj whoisj requested a review from mudit-eng March 23, 2026 21:43
std::vector<char>& serialized = infer_req->serialized_data_.back();
serialized.resize(byte_size);

char* serialized_base = &serialized[0];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

&serialized[0] is undefined behavior if vector is empty. Safer to do something like:

char* serialized_base = serialized.data();

or

char* serialized_base = serialized.empty() ? nullptr : serialized.data();

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure. I was just using the existing code, but I'll fix up regardless.

const MappingSchema* request_schema_{nullptr};
const MappingSchema* response_schema_{nullptr};
const bool streaming_{false};
const size_t max_input_size_{0};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

HTTPAPIServer already has protected member:size_t max_input_size_;. Can this create confusion?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how, they're both private fields.

" bytes. Use --http-max-input-size to increase the limit.")
("Request JSON size of " + std::to_string(length) + " + " +
std::to_string(overrun) +
" bytes exceeds the maximum allowed input size. "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this message is intuitive. This does not convey what is the maximum input size.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the updated format was a request from @yinggeh. Can you two decide between you which you prefer?

const std::string& name,
triton::common::TritonJson::Value& generate_request,
std::map<std::string, triton::common::TritonJson::Value>& input_metadata)
std::map<std::string, triton::common::TritonJson::Value>& input_metadata,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use unordered_map here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no idea, but that would be changing existing code and I'd rather not because I feel that is outside the scope of this PR.

const MappingSchema* RequestSchema() { return request_schema_; }
const MappingSchema* ResponseSchema() { return response_schema_; }

size_t MaxInputSize() { return max_input_size_; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not being called anywhere

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove.

// allowed input size.
if (byte_size + consumed_input_byte_size > max_input_size_ ||
byte_size + consumed_input_byte_size < consumed_input_byte_size) {
auto overrun = byte_size + consumed_input_byte_size - max_input_size_;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can byte_size + consumed_input_byte_size result in an overflow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no because of the check immediately previous.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you checked for overflow here: byte_size + consumed_input_byte_size < consumed_input_byte_size) but can't there be an overflow here: auto overrun = byte_size + consumed_input_byte_size - max_input_size_

@whoisj whoisj requested a review from mudit-eng March 25, 2026 16:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

PR: fix A bug fix

Development

Successfully merging this pull request may close these issues.

4 participants