Generation speed issue #26

@eagle705

Description

I loaded the Llama 2 model successfully, following the example, but text generation is really slow.

[Screenshot: generation output]

[1] I'm not sure it uses MPS to accelerate generation. How can I confirm this?
[2] Is there a smaller LLM available than 7B?

Here is my environment:

  • Macbook Air / M2 / 16GB / Sonoma 14.5
  • Xcode 15.4
  • ckpt: coreml-projects/Llama-2-7b-chat-coreml
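Regarding [1], one way I would expect to investigate this (a sketch, not a confirmed method): Core ML selects the backend itself, so you can constrain the allowed compute units via `MLModelConfiguration` and compare generation speed across settings, or watch GPU/ANE load in Instruments' Core ML template. The model class name below is hypothetical; the real one depends on how the `.mlpackage` was compiled.

```swift
import CoreML

// Restrict which compute units Core ML may use, then compare timings.
// There is no direct "use MPS" flag; the GPU backend is chosen by Core ML.
let config = MLModelConfiguration()
config.computeUnits = .cpuAndGPU   // also try .all, .cpuOnly, .cpuAndNeuralEngine

// Hypothetical generated model class for illustration:
// let model = try Llama_2_7b_chat(configuration: config)
// If .cpuAndGPU is much faster than .cpuOnly, the GPU is being used.
```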
