Popular repositories
llm-instruction-conflicts (Public)
This repository contains the data and code for the paper "Control Illusion: The Failure of Instruction Hierarchies in Large Language Models".
Python 7
persona_vectors (Public)
Forked from safety-research/persona_vectors
Persona Vectors: Monitoring and Controlling Character Traits in Language Models
Python
