The bug of Lumina-mGPT-7B-768 #3

@Haodong-Lei-Ray

Hi, and thanks for your work.
I am trying to run the baseline experiment with the "Lumina-mGPT-7B-768" model. My command is:

prompt=MSCOCO2017Val
model=lumina_mgpt
temperature=1
model_type=base

python main.py generate_images \
 --prompt $prompt \
 --model $model \
 --temperature $temperature \
 --model_type $model_type \
 --model_path Alpha-VLLM/Lumina-mGPT-7B-768 \
 --drafter_path sihwanpark/LANTERN-Lumina-mGPT-7B-768 \
 --output_dir /home/leihaodong/MM25/script_MM/PEANUT/exp/${model}_${temperature}/${model_type} \
 --num_images 10

I printed the input IDs and the embedding layer:
"tensor([[151645, 31115, 458, 2168, 315, 220, 22, 21, 23,
87, 22, 21, 23, 4092, 311, 279, 2701, 9934,
510, 8002, 6128, 304, 264, 47762, 3082, 448, 264,
6716, 315, 259, 13181, 3143, 1105, 13, 171384]],
device='cuda:0')
Embedding(65536, 4096)"
The values were printed just before this line:

inputs_embeds = self.embed_tokens(input_ids)

with the print statements added like this:

        if inputs_embeds is None:
            print(input_ids)
            print(self.embed_tokens)
            inputs_embeds = self.embed_tokens(input_ids)

Running it then fails with this error:

../aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [442,0,0], thread: [64,0,0] Assertion srcIndex < srcSelectDimSize failed.
../aten/src/ATen/native/cuda/Indexing.cu:1284: indexSelectLargeIndex: block: [442,0,0], thread: [65,0,0] Assertion srcIndex < srcSelectDimSize failed.

I think the cause is that T5's embedding does not match the target model's embedding in "Lumina-mGPT-7B-768". The input contains IDs such as 151645 and 171384, but the embedding table has only 65536 rows, so those indices are out of range.
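
A minimal sketch of a check that could confirm this before the embedding lookup. The helper name `find_out_of_range_ids` is hypothetical and not part of the repository; the IDs and the table size (65536) are copied from the printed output above.

```python
from typing import Iterable, List


def find_out_of_range_ids(token_ids: Iterable[int], num_embeddings: int) -> List[int]:
    """Return the token IDs that cannot index a table with `num_embeddings` rows."""
    return [t for t in token_ids if t < 0 or t >= num_embeddings]


# IDs copied from the printed tensor (its single row, flattened).
# With a real PyTorch tensor, input_ids.flatten().tolist() gives the same kind of list.
input_ids = [
    151645, 31115, 458, 2168, 315, 220, 22, 21, 23,
    87, 22, 21, 23, 4092, 311, 279, 2701, 9934,
    510, 8002, 6128, 304, 264, 47762, 3082, 448, 264,
    6716, 315, 259, 13181, 3143, 1105, 13, 171384,
]

bad = find_out_of_range_ids(input_ids, num_embeddings=65536)
print(bad)  # [151645, 171384]
```

Only the first and last IDs are out of range, which is where the BOS and EOS tokens would normally sit.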

The key problem is that bos_id and eos_id are set incorrectly in your code for the "Lumina-mGPT-7B-768" model. I hope you can fix that.

    if model_path.endswith(".model"):  # spm tokenizer
        self.tokenizer_type = "spm"
        # reload tokenizer
        assert os.path.isfile(model_path), model_path
        self.tokenizer = SentencePieceProcessor(model_file=model_path)
        logger.info(f"Reloaded SentencePiece model from {model_path}")
        # BOS / EOS token IDs
        self.bos_id: int = self.tokenizer.bos_id()
        self.eos_id: int = self.tokenizer.eos_id()
        assert self.tokenizer.vocab_size() == self.tokenizer.get_piece_size()
    else:
        self.tokenizer_type = "transformers"
        self.tokenizer = AutoTokenizer.from_pretrained(model_path)
        logger.info(f"load HF transformers tokenizer from {model_path}")
        # BOS / EOS token IDs
        self.bos_id: int = self.tokenizer.bos_token_id
        if self.bos_id is None:
            self.bos_id = self.tokenizer.eos_token_id
        self.eos_id: int = self.tokenizer.eos_token_id
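
A small sketch of how the fallback in the `else` branch could produce an out-of-range BOS ID. It assumes the loaded tokenizer defines no BOS token and has an EOS ID of 151645 (the first ID in the printed tensor). Which tokenizer is actually loaded has not been verified, and both `resolve_special_ids` and the stub object are hypothetical.

```python
from types import SimpleNamespace
from typing import Optional, Tuple


def resolve_special_ids(tokenizer) -> Tuple[Optional[int], Optional[int]]:
    """Mirror the quoted fallback: reuse the EOS ID when the tokenizer has no BOS ID."""
    bos_id = tokenizer.bos_token_id
    if bos_id is None:
        bos_id = tokenizer.eos_token_id
    return bos_id, tokenizer.eos_token_id


# Hypothetical stand-in for a Hugging Face tokenizer without a BOS token.
# eos_token_id=151645 is an assumption taken from the printed tensor, not a verified value.
stub = SimpleNamespace(bos_token_id=None, eos_token_id=151645)

bos_id, eos_id = resolve_special_ids(stub)
print(bos_id, eos_id)   # 151645 151645
print(bos_id >= 65536)  # True: this ID cannot index Embedding(65536, 4096)
```

Under that assumption, the prepended BOS ID comes from the text tokenizer's vocabulary rather than from the model's own 65536-entry vocabulary.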

Thanks again for your work.