Some abbreviations are written with inconsistent whitespace, for example "e. g." spelled with a space. The tokenizer should have a way to remove the spaces in such abbreviations, based on a list kept in some file, possibly producing an annotation that records the original spelling (maybe sic+hi@rend="x-space"):
e.g.
Alternatively, the tokenizer could add an attribute holding the original spelling (could do , though that is not really TEI).
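A minimal sketch of the list-based normalization described above, in Python. Everything named here (`Segment`, `build_pattern`, `normalize_abbreviations`, and the idea of passing the list as canonical spellings such as "e.g.") is an assumption made for illustration, not part of the existing tokenizer. The sketch only tolerates spaces and tabs after the internal periods, and it returns the original spelling alongside the normalized one so that a later step can decide how to annotate it (sic+hi, an attribute, or something else).

```python
import re
from typing import List, NamedTuple, Optional


class Segment(NamedTuple):
    text: str                 # text as the tokenizer should see it
    original: Optional[str]   # original spelling if it was changed, else None


def build_pattern(abbreviations: List[str]) -> "re.Pattern[str]":
    """Build one regex matching each abbreviation with optional inner spaces."""
    alternatives = []
    # Longest entries first, so a longer abbreviation wins over a shorter prefix.
    for abbr in sorted(abbreviations, key=len, reverse=True):
        parts = [re.escape(p) for p in abbr.split(".") if p]
        # "e.g." becomes e\.[ \t]*g\. and so matches both "e.g." and "e. g."
        alternatives.append(r"\.[ \t]*".join(parts) + r"\.")
    # The lookbehind keeps the match from starting in the middle of a word.
    return re.compile(r"(?<!\w)(?:" + "|".join(alternatives) + r")")


def normalize_abbreviations(text: str, abbreviations: List[str]) -> List[Segment]:
    """Split text into segments, removing spaces inside listed abbreviations."""
    pattern = build_pattern(abbreviations)
    segments: List[Segment] = []
    pos = 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            segments.append(Segment(text[pos:match.start()], None))
        original = match.group(0)
        canonical = re.sub(r"[ \t]+", "", original)
        # Record the original only when normalization actually changed it.
        changed = original != canonical
        segments.append(Segment(canonical, original if changed else None))
        pos = match.end()
    if pos < len(text):
        segments.append(Segment(text[pos:], None))
    return segments


if __name__ == "__main__":
    # In practice the list would be read from a file, one abbreviation per line.
    for segment in normalize_abbreviations("A list, e. g. this one.", ["e.g.", "i.e."]):
        print(segment)
```

Keeping the original spelling on the segment, rather than writing markup directly, leaves the choice of annotation open until the encoding question above is settled.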