Tokenization for the large model

1) The config for the large model specifies a vocab size of 51200. Is there a separate tokenizer file for it? Oddly, the vocab size drops back down to 32 for xlarge, which makes me think the 51200 is a typo.
2) The tokenizer file specifies a vocab_size of 30, while the configs for base and small specify 32. Is the config value rounded up to a power of two for efficiency?
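
For what it's worth, here is a quick sanity check of the power-of-two guess in plain Python. The helper names are my own and not from the repo, and the three numbers are just the values quoted above. It only shows that 32 is the next power of two after 30 and that 51200 is not a power of two; it says nothing about how the configs were actually produced.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    if n < 1:
        raise ValueError("n must be >= 1")
    return 1 << (n - 1).bit_length()


def is_power_of_two(n: int) -> bool:
    """True if n is an exact power of two."""
    return n > 0 and (n & (n - 1)) == 0


# Values quoted in the questions above (not read from any file).
tokenizer_vocab = 30   # vocab_size in the tokenizer file
config_vocab = 32      # vocab size in the base/small/xlarge configs
large_vocab = 51200    # vocab size in the large config

# Does padding the tokenizer vocab up to a power of two give the config value?
print(next_power_of_two(tokenizer_vocab) == config_vocab)

# Is the large config's value itself a power of two?
print(is_power_of_two(large_vocab))
```

If the first check holds and the second does not, that would be consistent with 32 being a padded size and 51200 coming from somewhere else, but it does not rule out a separate tokenizer for large.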