Thank you for sharing this interesting research and the accompanying code.
While reviewing the results, I noticed that the performance of the pretrained model and the individual (fine-tuned) model differs from what is reported in papers [1], [2], and [3], as well as in many other studies, even though all of them use the ViT model from CLIP. Could you explain what might be causing these discrepancies?
Thank you for your attention to this matter.
[1] Editing Models with Task Arithmetic
[2] AdaMerging: Adaptive Model Merging for Multi-Task Learning
[3] Representation Surgery for Multi-Task Model Merging
Thank you.