Commit df57cb2

[Version] v1.8.1. (#30)
1 parent aaf5dc5 commit df57cb2

2 files changed

Lines changed: 13 additions & 1 deletion

CHANGELOG.md

Lines changed: 12 additions & 0 deletions
@@ -1,4 +1,16 @@
 # CHANGELOG
+# [Version v1.8.1](https://github.com/intel/xFasterTransformer/releases/tag/v1.8.1)
+v1.8.1
+
+## Functionality
+- Expose the interface of embedding lookup.
+
+## Performance
+- Optimized the performance of grouped query attention (GQA).
+- Enhanced the performance of creating keys for the oneDNN primitive cache.
+- Set the [bs][nh][seq][hs] layout as the default for KV Cache, resulting in better performance.
+- Improved the task split imbalance issue in self-attention.
+
 # [Version v1.8.0](https://github.com/intel/xFasterTransformer/releases/tag/v1.8.0)
 v1.8.0 Continuous Batching on Single ARC GPU and AMX_FP16 Support.
 
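
The changelog entry on the KV Cache layout names the dimension order [bs][nh][seq][hs] (batch, head, sequence position, head size). As a rough illustration of what that ordering implies for memory access, the sketch below computes flat row-major offsets. It is not xFasterTransformer code: the function names, the example sizes, and the sequence-major layout used for contrast are all assumptions made for this example, and the commit does not state which layout was the default before.

```python
# Illustrative only: hypothetical helpers, not part of xFasterTransformer.

def offset_bnsh(b, h, s, d, num_heads, seq_len, head_size):
    """Flat offset of element (b, h, s, d) in a row-major [bs][nh][seq][hs] buffer."""
    return ((b * num_heads + h) * seq_len + s) * head_size + d


def offset_sbnh(b, h, s, d, batch_size, num_heads, head_size):
    """Flat offset in a hypothetical [seq][bs][nh][hs] buffer, shown for contrast."""
    return ((s * batch_size + b) * num_heads + h) * head_size + d


# Assumed example sizes.
BATCH, HEADS, SEQ, HEAD_SIZE = 2, 4, 8, 16

# Distance in elements between consecutive sequence positions of one head.
stride_bnsh = (offset_bnsh(0, 0, 1, 0, HEADS, SEQ, HEAD_SIZE)
               - offset_bnsh(0, 0, 0, 0, HEADS, SEQ, HEAD_SIZE))
stride_sbnh = (offset_sbnh(0, 0, 1, 0, BATCH, HEADS, HEAD_SIZE)
               - offset_sbnh(0, 0, 0, 0, BATCH, HEADS, HEAD_SIZE))

print(stride_bnsh, stride_sbnh)  # prints "16 128"
```

With [bs][nh][seq][hs], all keys (or values) of one head sit in one contiguous block, so a per-head attention loop walks memory with a stride of `head_size`. In the sequence-major contrast case the same walk jumps by `batch_size * num_heads * head_size` elements per step. Whether this is the reason for the reported speedup is not stated in the commit.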

VERSION

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-1.8.0
+1.8.1
