When training a LoRA, many trainers offer a custom "keep token separator" option that protects a variable number of tags from shuffling. For example, when training a LoRA for a character and several outfit elements, one can use tags like CHARACTER, CHARACTERHAT, CHARACTERSHOES, CHARACTERBELT, etc. But if the dataset includes images where the character does not show all of the outfit elements, or where other characters wear these elements, the number of tags to keep from shuffling varies per image. That's where the custom "keep token separator" comes in handy: the custom tags are separated from the rest by the separator (commonly "|||"), which removes the need for multiple folders.
It'd be ideal if TagGUI could recognize those token separators. Currently, on a dataset tagged this way, it treats strings such as "CHARACTER ||| 1boy" or "CHARACTERHAT ||| white background" as single tags, which is not just messy but also distorts the tag count. Hence, a field for entering an extra custom separator that functions the same way a comma already does would be a nice addition.
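
To illustrate the requested behavior, here is a minimal Python sketch of how a caption could be split on both the comma and an extra separator. This is not TagGUI's actual code: the function names `split_tags` and `split_kept_tags` are hypothetical, and "|||" is just the example separator from above.

```python
import re


def split_tags(caption: str, separators: tuple[str, ...] = (",", "|||")) -> list[str]:
    """Split a caption into tags, treating every separator like a comma.

    Hypothetical helper for illustration only, not part of TagGUI.
    """
    # Escape each separator so characters like "|" are matched literally,
    # and try longer separators first.
    ordered = sorted(separators, key=len, reverse=True)
    pattern = "|".join(re.escape(sep) for sep in ordered)
    # Strip surrounding whitespace and drop empty pieces.
    return [tag.strip() for tag in re.split(pattern, caption) if tag.strip()]


def split_kept_tags(caption: str, keep_separator: str = "|||") -> tuple[list[str], list[str]]:
    """Return (tags before the keep separator, tags after it).

    If the separator is absent, no tags are treated as kept.
    """
    head, found, tail = caption.partition(keep_separator)
    if not found:
        return [], split_tags(caption, (",",))
    return split_tags(head, (",",)), split_tags(tail, (",",))


caption = "CHARACTER, CHARACTERHAT ||| 1boy, white background"
print(split_tags(caption))
print(split_kept_tags(caption))
```

With this approach, "CHARACTERHAT" and "1boy" come out as separate tags, so the tag count matches what the trainer sees. The second function keeps track of which side of the separator each tag was on, which would be needed to write the caption back without losing the separator.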