When calling the model through the Transformers library with max_new_tokens set to 20000, the following warning appears once the generated length reaches 4096: "This is a friendly reminder - the current text generation call will exceed the model's predefined maximum length (4096). Depending on the model, you may observe exceptions, performance degradation, or nothing at all."
However, generation does not stop after the warning: the call keeps occupying the GPU and eventually hangs.
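A common workaround is to clamp the generation budget so that the prompt plus the new tokens never exceeds the model's context window, instead of passing a fixed max_new_tokens=20000. The helper below is a minimal sketch of that clamping logic only; the function name, the default limit of 4096, and the idea of reading the limit from the model config (e.g. max_position_embeddings) are assumptions for illustration, not part of the original report.

```python
def clamp_max_new_tokens(prompt_len: int, requested_new_tokens: int,
                         model_max_len: int = 4096) -> int:
    """Return a generation budget that keeps prompt_len + new tokens
    within the model's predefined maximum length.

    model_max_len is assumed to be 4096 here, matching the warning in the
    report; in practice it could be read from the model config.
    """
    remaining = model_max_len - prompt_len  # tokens still available in context
    return max(0, min(requested_new_tokens, remaining))


# Example: a 100-token prompt leaves at most 3996 new tokens.
print(clamp_max_new_tokens(100, 20000))   # 3996
# A prompt that already fills the window leaves no budget.
print(clamp_max_new_tokens(4096, 20000))  # 0
```

The clamped value would then be passed as max_new_tokens to generate(), so the warning is never triggered and the call cannot run past the model's context window.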