Open
Description
Hi, have you ever tested the zero-shot accuracy on ScanNet200? That is, replace the class-agnostic mask proposals predicted by Mask3D with ground-truth instances as input to your mask feature computation module to get the mask features, then take the dot product of those features with the text embeddings from the CLIP text encoder and select the index of the maximum score as the predicted label for each ground-truth instance. I just wonder how accurate CLIP is.
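To make the question concrete, here is a minimal sketch of the evaluation I have in mind. It assumes the mask features and CLIP text embeddings have already been computed and exported as NumPy arrays; the function name `zero_shot_accuracy`, the array shapes, and the L2 normalization step are my own assumptions for illustration, not this repository's API.

```python
import numpy as np

def zero_shot_accuracy(mask_features, text_embeddings, gt_labels):
    """Classify each ground-truth instance by its best-matching text embedding.

    mask_features:   (num_instances, dim) features, one per ground-truth mask
    text_embeddings: (num_classes, dim) CLIP text embeddings, one per class name
    gt_labels:       (num_instances,) ground-truth class indices
    All names and shapes here are hypothetical.
    """
    mask_features = np.asarray(mask_features, dtype=np.float64)
    text_embeddings = np.asarray(text_embeddings, dtype=np.float64)
    gt_labels = np.asarray(gt_labels)

    # Assumption: L2-normalize both sides so the dot product is a cosine similarity.
    eps = 1e-12
    mask_features = mask_features / (np.linalg.norm(mask_features, axis=1, keepdims=True) + eps)
    text_embeddings = text_embeddings / (np.linalg.norm(text_embeddings, axis=1, keepdims=True) + eps)

    # (num_instances, num_classes) similarity scores
    scores = mask_features @ text_embeddings.T

    # Predicted label = index of the highest-scoring class for each instance
    predictions = scores.argmax(axis=1)
    accuracy = float((predictions == gt_labels).mean())
    return predictions, accuracy

# Toy data (made up, only to show the call): 3 instances, 3 classes, dim 3
toy_text = np.eye(3)
toy_masks = np.array([[0.9, 0.1, 0.0],
                      [0.0, 0.2, 0.8],
                      [0.1, 0.7, 0.2]])
toy_gt = np.array([0, 2, 0])
toy_predictions, toy_accuracy = zero_shot_accuracy(toy_masks, toy_text, toy_gt)
```

With real data, `mask_features` would come from running the mask feature computation module on ground-truth instance masks instead of the Mask3D proposals, and `gt_labels` would be the ScanNet200 class index of each instance.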