Commit fdf38b3

fix: avoid cleanup errors for partially initialized LlamaModel (abetlen#2173)
* Add attribute check for sampler in close method

  This solves a bug I uncovered, that causes an AttributeError if constantly re-initializing a model in a loop and Python garbage collects it, such as testing the highest GPU layer count you can go before CUDA OOMs.

* fix: avoid cleanup errors for partial model init

---------

Co-authored-by: abetlen <abetlen@gmail.com>
1 parent 6bdab5d commit fdf38b3

2 files changed

Lines changed: 4 additions & 1 deletion

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -7,6 +7,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+- fix: avoid cleanup errors for partially initialized `LlamaModel` objects by @usernames122 in #2173
 - fix: suppress stdout and stderr in Jupyter notebooks by @Anai-Guo in #2181
 - feat: enable arm64 musl builds by @acon96 in #2221
 - feat: Update llama.cpp to ggml-org/llama.cpp@d749821db

llama_cpp/_internals.py

Lines changed: 3 additions & 1 deletion
@@ -44,6 +44,9 @@ def __init__(
         self.params = params
         self.verbose = verbose
         self._exit_stack = ExitStack()
+        # LlamaModel does not use samplers, but close() can run after partial init.
+        self.sampler = None
+        self.custom_samplers = []
 
         model = None
 
@@ -65,7 +68,6 @@ def __init__(
 
         self.model = model
         self.vocab = vocab
-        self.sampler = None  # LlamaModel doesn't use samplers, but some cleanup code expects this attribute
 
         def free_model():
             if self.model is None:
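
The idea behind this change can be sketched in isolation: every attribute that `close()` reads is assigned before any step in `__init__` that can raise, so cleanup stays safe when construction stops partway. The `Resource` class, its `fail` flag, and the body of `close()` below are hypothetical stand-ins for illustration, not the actual `LlamaModel` code.

```python
from contextlib import ExitStack


class Resource:
    """Hypothetical stand-in for a model wrapper; not the llama_cpp class."""

    def __init__(self, fail: bool = False):
        # Assign everything close() reads *before* any step that can raise.
        self._exit_stack = ExitStack()
        self.sampler = None
        self.custom_samplers = []

        if fail:
            # Simulates a load failure (for example, running out of GPU memory).
            raise RuntimeError("simulated load failure")

        self.handle = object()
        # Register cleanup only once the handle actually exists.
        self._exit_stack.callback(self._free)

    def _free(self):
        self.handle = None

    def close(self):
        # Safe on a partially initialized instance because the defaults exist.
        if self.sampler is not None:
            self.sampler = None
        self.custom_samplers.clear()
        self._exit_stack.close()

    def __del__(self):
        self.close()
```

If the three default assignments were placed after the failing step instead, `close()` would raise `AttributeError` when the half-built object is garbage collected, which is the failure mode the commit message describes.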
