Skip to content

Priority Linked Blocking Deque Implementation (not intended to be merged)#1

Open
mpoemsl wants to merge 32 commits intotorchserve-0.7.1from
feature/priority-linked-blocking-deque
Open

Priority Linked Blocking Deque Implementation (not intended to be merged)#1
mpoemsl wants to merge 32 commits intotorchserve-0.7.1from
feature/priority-linked-blocking-deque

Conversation

@mpoemsl
Copy link
Copy Markdown

@mpoemsl mpoemsl commented Mar 2, 2023

This PR implements request priorisation by replacing LinkedBlockingDeque in Model.java with a wrapper of multiple LinkedBlockingDeques (one for each priority class).

This PR builds on TorchServe 0.7.1. The behavior is as explained in #88. The implementation of this behavior is achieved with these changes:

  1. Jobs parse an int priority from the HTTP header X-Priority (default priority class is 0) and extend the Prioritisable interface (getter and setter for priority)
  2. Models store Jobs in a new class PriorityLinkedBlockingDeque which wraps multiple LinkedBlockingDeques (one for each priority class, the number of which is specified via TS_NUMBER_OF_PRIORITIES)
  3. PriorityLinkedBlockingDeque implements all methods of LinkedBlockingDeque that are used in TorchServe, namely addFirst and offer for insertion and poll for extraction.
  4. Insertion is forwarded to the appropriate LinkedBlockingDeque for a Job's priority (if it is full, a ServiceUnavailableException is thrown)
  5. Extraction follows this order to find a non-empty deque:
    5.1. Deque for priority 0
    5.2. Deque for sampled priority >0
    5.3. All other deques
    If all deques are empty, poll sampled deque until next insertion in any deque.
  6. A new metric RequestPriority logs the priority of each request.

@mpoemsl mpoemsl marked this pull request as draft March 2, 2023 12:15
@mpoemsl mpoemsl marked this pull request as ready for review March 6, 2023 09:26
@mpoemsl mpoemsl requested review from pypae and simonschoelly March 6, 2023 09:33
scheduled = begin;

Map<String, String> headers = input.getHeaders();
if (headers.containsKey("X-Priority")) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not that it is incorrect, but maybe it is a bit unnatural to use a header instead of the body for the priority? But if it is the only way, then lets keep it.

I am also not sure about the name X-Priority - apparently that header is already used in some cases for emails.
In the past, I started using the convention of prefixing headers with X-TS- so that it is clear, that it is our own header.

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea, I'll change it to X-TS-. I think at this point the request body is not yet parsed, so it has to be a header unless we want introduce overhead at the pre-queueing stage.

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class PriorityLinkedBlockingDeque<T extends Prioritisable> {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wanted to be future proof, we probably had to implement all interfaces implemented by LinkedBlockingDeque. But I guess just implementing the methods that are actually used is okay for now.

@mpoemsl mpoemsl changed the title Priority Linked Blocking Deque Priority Linked Blocking Deque Implementation (not intended to be merged) Apr 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants