
Releases: xiaobing318/llama.cpp

b6179

15 Aug 16:44
13d4335


Merge branch 'ggml-org:master' into master

b5077

08 Apr 11:43
8ca6e1c


server : webui : Improve Chat Input with Auto-Sizing Textarea (#12785)

* Update ChatScreen.tsx

* Add useAutosizeTextarea.ts

useAutosizeTextarea encapsulates the auto-sizing logic.

* Implement responsive auto-sizing chat textarea

Replaces the manual textarea resizing with an automatic height adjustment based on content.

- Adds a `useChatTextarea` hook to manage textarea state and auto-sizing logic via refs, preserving the existing optimization
- Textarea now grows vertically up to a maximum height (`lg:max-h-48`) on large screens (lg breakpoint and up).
- Disables auto-sizing and enables manual vertical resizing (`resize-vertical`) on smaller screens for better mobile usability.
- Aligns the "Send" button to the bottom of the textarea (`items-end`) for consistent positioning during resize.

* update compressed index.html.gz after `npm run build`
* refactor: replace OptimizedTextareaValue with AutosizeTextareaApi in the VSCode context hook

* chore: normalize line endings to LF
refactor: AutosizeTextareaApi -> chatTextareaApi

* refactor: Rename interface to PascalCase

---------

Co-authored-by: Xuan Son Nguyen <son@huggingface.co>
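
The auto-sizing behavior described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: it assumes a 16px root font (so Tailwind's `max-h-48` = 12rem = 192px), and the names `computeTextareaHeight`, `autosize`, and `Resizable` are hypothetical.

```typescript
// Hypothetical sketch of content-based auto-sizing with a height cap,
// mirroring the lg:max-h-48 behavior described in the release notes.

const MAX_HEIGHT_PX = 192; // assumption: max-h-48 = 12rem at a 16px root font

// Clamp the content height to the maximum allowed height.
function computeTextareaHeight(scrollHeightPx: number): number {
  return Math.min(scrollHeightPx, MAX_HEIGHT_PX);
}

// Minimal element shape so the logic is testable without a DOM.
interface Resizable {
  style: { height: string; overflowY: string };
  scrollHeight: number;
}

// In the real hook this would run on input events via a ref to the <textarea>.
function autosize(el: Resizable): void {
  el.style.height = "auto"; // reset first so scrollHeight reflects the content
  const h = computeTextareaHeight(el.scrollHeight);
  el.style.height = `${h}px`;
  // Show a scrollbar only once the cap is reached.
  el.style.overflowY = el.scrollHeight > MAX_HEIGHT_PX ? "auto" : "hidden";
}
```

On small screens the PR disables this logic entirely and falls back to the browser's native `resize-vertical` handle, which avoids fighting the on-screen keyboard on mobile.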

b4923

19 Mar 10:16
108e53c


llama : add support for GPT2, Bloom and CodeShell tied word embedding…

b4735

17 Feb 23:08
73e2ed3


CUDA: use async data loading for FlashAttention (#11894)

* CUDA: use async data loading for FlashAttention

---------

Co-authored-by: Diego Devesa <slarengh@gmail.com>

b4695

12 Feb 16:17
fef0cbe


cleanup: fix compile warnings associated with `gnu_printf` (#11811)