Hi, I am very interested in your method for detecting harmful data from hidden-layer states. In the paper, you evaluate detection performance with AUROC and AUPRC, which are threshold-independent metrics. In practical applications, however, a concrete threshold is needed to decide whether a sample is harmful. How should this threshold be determined under your method?
Additionally, do different LVLMs require different thresholds, and if so, how should the threshold be chosen for a specific model?
