The bug of Lumina-mGPT-7B-768

Hi, thanks for your work.
I am trying to run the baseline experiment with the "Lumina-mGPT-7B-768" model. My command is:
```
prompt=MSCOCO2017Val
model=lumina_mgpt
temperature=1
model_type=base

python main.py generate_images \
 --prompt $prompt \
 --model $model \
 --temperature $temperature \
 --model_type $model_type \
 --model_path Alpha-VLLM/Lumina-mGPT-7B-768 \
 --drafter_path sihwanpark/LANTERN-Lumina-mGPT-7B-768 \
 --output_dir /home/leihaodong/MM25/script_MM/PEANUT/exp/${model}_${temperature}/${model_type} \
 --num_images 10
```
I printed the input IDs and the embedding layer:
"tensor([[151645,  31115,    458,   2168,    315,    220,     22,     21,     23,
             87,     22,     21,     23,   4092,    311,    279,   2701,   9934,
            510,   8002,   6128,    304,    264,  47762,   3082,    448,    264,
           6716,    315,    259,  13181,   3143,   1105,     13, 171384]],
       device='cuda:0')
Embedding(65536, 4096)"
I added the print statements at https://github.com/jadohu/LANTERN/blob/3a56e451dd5ebfa1903d044f754c68c420de27a9/models/base_models/lumina_mgpt/modeling_lumina_mgpt.py#L1301

as follows:
```
        if inputs_embeds is None:
            print(input_ids)
            print(self.embed_tokens)
            inputs_embeds = self.embed_tokens(input_ids)
```
This produces the following error:

> ../aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [442,0,0], thread: [64,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
> ../aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [442,0,0], thread: [65,0,0] Assertion `srcIndex < srcSelectDimSize` failed.

I think the cause is that **T5's embedding does not fit the target model's embedding in "Lumina-mGPT-7B-768".** The input contains IDs such as 151645 and 171384, while the embedding has only 65536 rows, so the input indices are out of range for the embedding.
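
The out-of-range claim can be checked directly against the printed values. This is a minimal sketch, assuming the tensor printed above is the full input and that 65536 (from `Embedding(65536, 4096)`) is the number of rows in the embedding table. The helper name `out_of_range_ids` is hypothetical and not part of the repository.

```python
# Token IDs copied from the printed input tensor in this issue.
INPUT_IDS = [
    151645, 31115, 458, 2168, 315, 220, 22, 21, 23,
    87, 22, 21, 23, 4092, 311, 279, 2701, 9934,
    510, 8002, 6128, 304, 264, 47762, 3082, 448, 264,
    6716, 315, 259, 13181, 3143, 1105, 13, 171384,
]

# Number of rows in the embedding table, from "Embedding(65536, 4096)".
VOCAB_SIZE = 65536


def out_of_range_ids(ids, vocab_size):
    """Return (position, token_id) pairs that cannot index the embedding."""
    return [
        (pos, tok) for pos, tok in enumerate(ids)
        if not 0 <= tok < vocab_size
    ]


bad = out_of_range_ids(INPUT_IDS, VOCAB_SIZE)
print(bad)
```

With these values, only the first and last positions of the sequence fall outside the table, which is where the BOS and EOS tokens would normally sit.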

The key problem is that `bos_id` and `eos_id` are set incorrectly in your code for the "Lumina-mGPT-7B-768" model. I hope you can fix that.

https://github.com/jadohu/LANTERN/blob/3a56e451dd5ebfa1903d044f754c68c420de27a9/models/base_models/lumina_mgpt/xllmx/model/tokenizer.py#L24-L43

Thanks again for your work.

	if model_path.endswith(".model"):  # spm tokenizer
	    self.tokenizer_type = "spm"
	    # reload tokenizer
	    assert os.path.isfile(model_path), model_path
	    self.tokenizer = SentencePieceProcessor(model_file=model_path)
	    logger.info(f"Reloaded SentencePiece model from {model_path}")

	    # BOS / EOS token IDs
	    self.bos_id: int = self.tokenizer.bos_id()
	    self.eos_id: int = self.tokenizer.eos_id()
	    assert self.tokenizer.vocab_size() == self.tokenizer.get_piece_size()
	else:
	    self.tokenizer_type = "transformers"
	    self.tokenizer = AutoTokenizer.from_pretrained(model_path)
	    logger.info(f"load HF transformers tokenizer from {model_path}")
	    # BOS / EOS token IDs
	    self.bos_id: int = self.tokenizer.bos_token_id
	    if self.bos_id is None:
	        self.bos_id = self.tokenizer.eos_token_id
	    self.eos_id: int = self.tokenizer.eos_token_id
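
One possible safeguard is to validate the special-token IDs against the embedding size right after they are assigned, so that a mismatch fails early with a readable message instead of a CUDA assertion. This is only a sketch: `check_special_ids` is a hypothetical helper, and the example values assume that 151645 (the first ID in the printed input) is the BOS ID in use and that the embedding has 65536 rows.

```python
def check_special_ids(bos_id, eos_id, vocab_size):
    """Raise ValueError if BOS/EOS cannot index an embedding of vocab_size rows."""
    for name, token_id in (("bos_id", bos_id), ("eos_id", eos_id)):
        if token_id is None:
            raise ValueError(f"{name} is None")
        if not 0 <= token_id < vocab_size:
            raise ValueError(
                f"{name}={token_id} is outside the valid range [0, {vocab_size})"
            )


# Assumed values taken from the printed output in this issue.
try:
    check_special_ids(bos_id=151645, eos_id=151645, vocab_size=65536)
except ValueError as err:
    print(f"tokenizer/model mismatch: {err}")
```

Such a check would need the embedding size passed in from the model configuration, since the tokenizer alone does not know it.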

The bug of Lumina-mGPT-7B-768 #3
