Checkpoint tokenizer vocab mismatch causes lm_head shape error (152064 vs 151646) on first /predict #7

@Tancilon

Hi maintainers,
I’m hitting a vocab-size mismatch error when loading the provided checkpoint and making the first inference call.

Steps to reproduce:

  1. Create env and install deps (per README).
  2. Download checkpoint to ckpt/checkpoint-5554.
  3. Ensure SigLIP is available (e.g., ./siglip-so400m-patch14-384).
  4. Start server:
conda activate reconvla
python reconvla/serve/flask_server.py \
  --model-path ckpt/checkpoint-5554 \
  --action_stat reconvla/calvin/dataset/calvin_debug_dataset/validation/statistics.yaml \
  --port 9097
  5. Trigger a /predict call (e.g., via the evaluation script). The server errors on the first request.

Actual result:
ValueError: Trying to set a tensor of shape torch.Size([152064, 3584]) in "weight"
(which has shape torch.Size([151646, 3584])), this looks incorrect.

Additional info:

  • config.json:
    • "vocab_size": 152064
  • AutoTokenizer.from_pretrained("ckpt/checkpoint-5554"):
    • len(tokenizer)=151646, vocab_size=151643
  • vocab.json size: 151643
  • added_tokens.json: 3 tokens

This suggests the checkpoint expects a larger vocab (152064 rows in the stored weights) than the tokenizer files provide (151643 base tokens + 3 added tokens = 151646). If resize_token_embeddings(len(tokenizer)) is called, the embedding shrinks to 151646 rows and then conflicts with the stored weights, which still have 152064 rows.
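For anyone who wants to check the same numbers on their copy of the checkpoint, here is a minimal sketch that reads only the files listed above. The helper name vocab_report is my own (hypothetical), and it assumes that len(tokenizer) equals the vocab.json entries plus the added_tokens.json entries, which matches what I observed but may not hold for every tokenizer.

```python
import json
from pathlib import Path


def vocab_report(ckpt_dir):
    """Compare config.json's vocab_size with the tokenizer files in ckpt_dir.

    Hypothetical helper for diagnosis only; not part of the repository.
    """
    ckpt = Path(ckpt_dir)

    # Row count the stored embedding / lm_head weights are expected to have.
    config = json.loads((ckpt / "config.json").read_text(encoding="utf-8"))
    config_vocab = config["vocab_size"]

    # Base vocabulary shipped with the tokenizer.
    base_vocab = len(json.loads((ckpt / "vocab.json").read_text(encoding="utf-8")))

    # Extra tokens, if the file exists.
    added_path = ckpt / "added_tokens.json"
    added = 0
    if added_path.exists():
        added = len(json.loads(added_path.read_text(encoding="utf-8")))

    # Assumption: len(tokenizer) == base vocab + added tokens.
    tokenizer_len = base_vocab + added
    return {
        "config_vocab_size": config_vocab,
        "base_vocab": base_vocab,
        "added_tokens": added,
        "tokenizer_len": tokenizer_len,
        "gap": config_vocab - tokenizer_len,
    }


if __name__ == "__main__":
    print(vocab_report("ckpt/checkpoint-5554"))
```

With the numbers reported above, the gap would be 152064 - 151646 = 418 rows that exist in the weights but have no corresponding tokenizer entry.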

Possible fix / question:

  • Should the tokenizer files include the extra tokens used during training?
  • Or should the loader avoid shrinking the embeddings when len(tokenizer) < config.vocab_size and instead pad the tokenizer to match the model vocab?
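To make the second option concrete, here is a rough sketch of a guard that only grows the embedding matrix and never shrinks it. The function name resize_embeddings_if_needed is hypothetical, and I have not checked where (or whether) the repository's loader actually calls resize_token_embeddings, so treat this as an illustration of the idea rather than a patch. It relies only on get_input_embeddings() and resize_token_embeddings(), which Hugging Face models expose.

```python
def resize_embeddings_if_needed(model, tokenizer):
    """Grow the embedding matrix to fit the tokenizer, but never shrink it.

    Hypothetical guard, not the repository's loader. Returns the number of
    embedding rows after the call.
    """
    # Rows currently allocated in the input embedding (e.g. 152064).
    current_rows = model.get_input_embeddings().weight.shape[0]
    # Rows the tokenizer actually needs (e.g. 151646).
    needed_rows = len(tokenizer)

    if needed_rows > current_rows:
        # The tokenizer has ids the model cannot represent: grow.
        model.resize_token_embeddings(needed_rows)
        return needed_rows

    # The tokenizer fits already. Leave the extra rows untouched so the
    # stored checkpoint weights still match the module shapes.
    return current_rows
```

With the sizes from this issue (152064 rows in the model, 151646 tokens in the tokenizer) the guard would skip the resize, so the stored weights and the module shapes would stay aligned. The unused rows simply never get indexed by the tokenizer.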

Environment:

  • OS: Ubuntu (container)
  • Python: 3.10 (reconvla env)
  • GPU: RTX 4090, Driver 570.195.03, CUDA 12.8
