Probe the fine-tuned model for the traits and facts it has encoded about you. This doubles as a learning exercise and a safety check.
Directions: probing classifiers; activation analysis; surfacing what the model thinks is "you" and letting users correct it. Part of the exploratory roadmap.
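
As a rough illustration of the probing-classifier direction, here is a minimal sketch in plain Python. It assumes activations have already been extracted as fixed-length vectors and labelled with a binary trait; since no real model is available here, `make_synthetic_activations` is a hypothetical stand-in that generates Gaussian vectors shifted along one direction per label. The function names, dimensions, and hyperparameters are all illustrative assumptions, not part of any existing codebase.

```python
import math
import random


def make_synthetic_activations(n, dim, seed):
    """Hypothetical stand-in for real hidden states: noise shifted along one direction per label."""
    rng = random.Random(seed)
    direction = [1.0 if i % 2 == 0 else -1.0 for i in range(dim)]
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        sign = 1.0 if y == 1 else -1.0
        xs.append([rng.gauss(0.0, 1.0) + sign * d for d in direction])
        ys.append(y)
    return xs, ys


def score(w, b, x):
    """Linear probe score: positive means the probe predicts the trait is present."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def train_probe(xs, ys, lr=0.1, epochs=200):
    """Fit a logistic-regression probe with full-batch gradient descent."""
    dim = len(xs[0])
    w, b = [0.0] * dim, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            z = max(-30.0, min(30.0, score(w, b, x)))  # clamp to keep exp() well-behaved
            err = 1.0 / (1.0 + math.exp(-z)) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, xs, ys):
    """Fraction of examples where the sign of the score matches the label."""
    hits = sum(int((score(w, b, x) > 0.0) == (y == 1)) for x, y in zip(xs, ys))
    return hits / len(ys)


# Train on one synthetic split, evaluate on a separately seeded held-out split.
train_x, train_y = make_synthetic_activations(400, 8, seed=0)
test_x, test_y = make_synthetic_activations(200, 8, seed=1)
w, b = train_probe(train_x, train_y)
acc = probe_accuracy(w, b, test_x, test_y)
print(f"held-out probe accuracy: {acc:.2f}")
```

With real activations, high held-out accuracy would suggest the trait is linearly decodable from that layer. It would not by itself show that the model uses the information, so comparing against a probe trained on shuffled labels is a sensible baseline.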