Expose raw logit weights to the API if request_logits=true #7

Open
pathorn wants to merge 1 commit into deepinfra:main from pathorn:logit_weights

@pathorn pathorn commented Nov 28, 2023

What does this PR do?

Adds a new boolean input parameter, return_logits. If true, each Generation will contain the base64 encoding of the binary float32 array of logit weights inside the Token's details.

I have tested performance, and the extra output data does not add any measurable overhead.

It would be useful to have a test application on top of this API to demonstrate its usefulness.
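
As a starting point for such a test application, here is a minimal client-side sketch that turns the base64 string back into float32 values. It is dependency-free for illustration only. The function names are hypothetical, the little-endian byte order is an assumption (the description above does not state it), and the URL-safe alphabet is taken from the `general_purpose::URL_SAFE` call in the diff.

```rust
/// Decode URL-safe base64 (with or without trailing `=` padding) into bytes.
/// Returns None if a character outside the URL-safe alphabet is found.
fn base64_url_decode(s: &str) -> Option<Vec<u8>> {
    let mut out = Vec::with_capacity(s.len() / 4 * 3);
    let mut buf: u32 = 0; // bit accumulator
    let mut bits: u32 = 0; // number of valid bits currently in `buf`
    for c in s.bytes() {
        let v = match c {
            b'A'..=b'Z' => c - b'A',
            b'a'..=b'z' => c - b'a' + 26,
            b'0'..=b'9' => c - b'0' + 52,
            b'-' => 62,
            b'_' => 63,
            b'=' => break, // padding marks the end of the data
            _ => return None,
        };
        buf = (buf << 6) | v as u32;
        bits += 6;
        if bits >= 8 {
            bits -= 8;
            out.push((buf >> bits) as u8);
            buf &= (1u32 << bits) - 1; // keep only the leftover low bits
        }
    }
    Some(out)
}

/// Interpret bytes as consecutive float32 values.
/// ASSUMPTION: little-endian byte order; the PR does not specify it.
fn bytes_to_f32_le(bytes: &[u8]) -> Option<Vec<f32>> {
    if bytes.len() % 4 != 0 {
        return None; // not a whole number of float32 values
    }
    Some(
        bytes
            .chunks_exact(4)
            .map(|c| f32::from_le_bytes([c[0], c[1], c[2], c[3]]))
            .collect(),
    )
}

/// Hypothetical helper: decode the `logits` string of one token.
fn decode_logits(encoded: &str) -> Option<Vec<f32>> {
    bytes_to_f32_le(&base64_url_decode(encoded)?)
}
```

Under these assumptions, the index of each decoded value would be the token id, so a client could take the argmax or apply its own sampling on top of the returned array.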

@NikolaBorisov NikolaBorisov left a comment

I was thinking we could have options to return the top 100 or 1000 tokens in addition to all. Right now, returning all 32000 logits at 32 bits each is about 1 Mbit (roughly 128 KB) per token.
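
A rough sketch of what a top-k option could look like, plus the payload arithmetic for the full array. The function names are hypothetical and this is not code from the PR. Note that a top-k response has to carry token ids alongside the values, because position in the array no longer identifies the token.

```rust
/// Return the `k` largest logits as (token_id, logit) pairs, highest first.
fn top_k_logits(logits: &[f32], k: usize) -> Vec<(usize, f32)> {
    let mut indexed: Vec<(usize, f32)> = logits.iter().copied().enumerate().collect();
    // total_cmp is a total order on f32, so a NaN cannot panic the sort.
    indexed.sort_by(|a, b| b.1.total_cmp(&a.1));
    indexed.truncate(k);
    indexed
}

/// Raw payload size in bytes for `n` float32 logits, before base64.
fn raw_size_bytes(n: usize) -> usize {
    n * std::mem::size_of::<f32>()
}

/// Size in bytes after padded base64 encoding (4 output bytes per 3 input bytes).
fn base64_size_bytes(raw: usize) -> usize {
    (raw + 2) / 3 * 4
}
```

For a 32000-entry vocabulary, `raw_size_bytes(32000)` is 128000 bytes per generated token and `base64_size_bytes(128000)` is 170668 bytes, while 100 float32 values alone take 400 bytes before token ids are added.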

Comment thread proto/generate.proto
Comment on lines +188 to +189
/// Logit tokens
optional string logit_tokens = 3;

Here you have string, but bytes above. I think you want bytes?
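
One reason the distinction matters, sketched under the assumption that this field is meant to carry raw binary data: a protobuf `string` must hold valid UTF-8, and Rust bindings such as prost map it to `String`, whereas `bytes` maps to `Vec<u8>`. Raw float32 bytes are generally not valid UTF-8, while their base64 encoding is. The helper names below are hypothetical.

```rust
/// Serialize float32 values to raw bytes.
/// ASSUMPTION: little-endian byte order, chosen only for illustration.
fn f32_to_le_bytes(values: &[f32]) -> Vec<u8> {
    values.iter().flat_map(|v| v.to_le_bytes()).collect()
}

/// A payload fits a protobuf `string` field only if it is valid UTF-8.
fn fits_proto_string(payload: &[u8]) -> bool {
    std::str::from_utf8(payload).is_ok()
}
```

With these helpers, `fits_proto_string(&f32_to_le_bytes(&[1.0]))` is false because the byte 0x80 cannot stand alone in UTF-8, while the base64 text `AACAPw==` for the same bytes passes the check.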

Comment thread router/src/server.rs
if req.0.parameters.return_full_text.unwrap_or(false) {
add_prompt = Some(req.0.inputs.clone());
}
// let return_logits = req.0.parameters.return_logits;

delete?

Comment thread router/src/infer.rs
text: generation.token_text,
logprob: generation.token_logprob,
special: generation.token_is_special,
logits: if let Some(binfloats) = generation.logit_binary { Some(general_purpose::URL_SAFE.encode(binfloats)) } else { None },

maybe not inline this?
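
One possible shape for pulling this out of the struct literal, as a sketch only: `if let Some(x) = opt { Some(f(x)) } else { None }` is equivalent to `opt.map(f)`, so the expression can move into a small helper. The helper name is hypothetical, and the encoder is passed in as a closure so the example compiles without the base64 crate; `hex_encode` is just a stand-in encoder for demonstration.

```rust
/// Hypothetical helper: encode the raw logit bytes when they are present.
/// `encode` stands in for the base64 encoder used in the diff.
fn encode_logits<F>(logit_binary: Option<Vec<u8>>, encode: F) -> Option<String>
where
    F: Fn(&[u8]) -> String,
{
    // Option::map replaces the explicit `if let Some(..) { Some(..) } else { None }`.
    logit_binary.map(|bytes| encode(&bytes))
}

/// Stand-in encoder for illustration only (the PR uses URL-safe base64).
fn hex_encode(bytes: &[u8]) -> String {
    bytes.iter().map(|b| format!("{:02x}", b)).collect()
}
```

At the call site, the field would then read something like `logits: encode_logits(generation.logit_binary, ...)`, with the closure wrapping the same `general_purpose::URL_SAFE.encode` call that the diff currently inlines.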
