More self-contained info in BERT models

Models that are constructed from pretrained models should carry their tokenizer and vocabulary with them, since both are a necessary part of the model (for example, you won't get the same results with a different tokenizer).

If users want to do something weird (like using a subset of the vocabulary), they can construct the model more manually; if they use `make_and_load_bert`, they're specifying a BERT model.

Even within {torchtransformers} (before moving to the more constrained models in {tidybert}), we can then include tools that handle more of the workflow automatically (e.g., the model can accept raw text as input).
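
To make the idea concrete, here is a minimal base-R sketch of what "bringing the tokenizer and vocabulary along" could look like. Everything in it is hypothetical: `bundle_model()`, `predict_text()`, the toy whitespace tokenizer, and the stand-in "model" are illustrations only, not part of {torchtransformers} or {tidybert}, and a real BERT model would use a WordPiece tokenizer and a torch module instead.

```r
# Hypothetical sketch: none of these names come from {torchtransformers}.

# Toy vocabulary mapping token -> integer id; "[UNK]" covers unknown tokens.
vocab <- c("[UNK]" = 1L, "hello" = 2L, "world" = 3L)

# Toy tokenizer: lowercase, split on whitespace, look up ids in the vocabulary.
tokenize_simple <- function(text, vocab) {
  tokens <- strsplit(tolower(text), "\\s+")[[1]]
  ids <- unname(vocab[tokens])
  ids[is.na(ids)] <- vocab[["[UNK]"]]  # out-of-vocabulary tokens map to [UNK]
  ids
}

# Stand-in for a pretrained model: it simply sums the token ids.
toy_model <- function(token_ids) sum(token_ids)

# Bundle the model with the tokenizer and vocabulary it was trained with,
# so the three pieces cannot drift apart.
bundle_model <- function(model, tokenizer, vocab) {
  structure(
    list(model = model, tokenizer = tokenizer, vocab = vocab),
    class = "bundled_model"
  )
}

# Because the bundle knows its own tokenizer, it can accept raw text.
predict_text <- function(bundle, text) {
  ids <- bundle$tokenizer(text, bundle$vocab)
  bundle$model(ids)
}

bert_like <- bundle_model(toy_model, tokenize_simple, vocab)
predict_text(bert_like, "Hello world")  # ids 2 and 3, so the toy model returns 5
predict_text(bert_like, "hello there")  # "there" maps to [UNK], so it returns 3
```

The point of the design is that the caller never chooses a tokenizer separately: anyone who wants a different vocabulary has to build the bundle by hand, which makes the nonstandard choice explicit.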

More self-contained info in BERT models #30
