Feature request: Add Contrastive Representation Distillation (CRD) #887

@surajyadav-research

Description

Requesting a built-in Contrastive Representation Distillation (CRD) strategy (Tian et al., 2019) for tunix.distillation.DistillationTrainer.

What CRD does

Distills at the representation level using an InfoNCE/contrastive loss:

  • positive pair: (student rep, teacher rep) from the same sample
  • negatives: mismatched pairs (e.g., in-batch negatives)

Minimal form (a code sketch follows the list below):

  • logits = (z_s @ z_t.T) / tau
  • labels = arange(B)
  • loss = CE(logits, labels) (optional symmetric term on logits.T)
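
A minimal sketch of that loss in JAX/Optax, assuming student and teacher representations have already been projected to a shared dimension; the function name, arguments, and the `symmetric` flag are illustrative and not part of any existing tunix API:

```python
import jax
import jax.numpy as jnp
import optax


def crd_infonce_loss(student_reps, teacher_reps, tau=0.07, symmetric=True):
    """In-batch InfoNCE distillation loss.

    student_reps, teacher_reps: [B, D] arrays; the i-th rows of each form
    the positive pair, all mismatched rows serve as in-batch negatives.
    """
    # Teacher representations are targets only; block gradients through them.
    teacher_reps = jax.lax.stop_gradient(teacher_reps)

    # L2-normalize so the dot product is a cosine similarity.
    z_s = student_reps / (jnp.linalg.norm(student_reps, axis=-1, keepdims=True) + 1e-8)
    z_t = teacher_reps / (jnp.linalg.norm(teacher_reps, axis=-1, keepdims=True) + 1e-8)

    # logits[i, j] = sim(student_i, teacher_j) / tau; diagonal entries are the positives.
    logits = (z_s @ z_t.T) / tau
    labels = jnp.arange(student_reps.shape[0])

    loss = optax.softmax_cross_entropy_with_integer_labels(logits, labels).mean()
    if symmetric:
        # Optional teacher->student direction on the transposed logits.
        loss = 0.5 * (
            loss
            + optax.softmax_cross_entropy_with_integer_labels(logits.T, labels).mean()
        )
    return loss


# Toy usage with random representations (B=8, D=128).
key_s, key_t = jax.random.split(jax.random.PRNGKey(0))
print(crd_infonce_loss(jax.random.normal(key_s, (8, 128)),
                       jax.random.normal(key_t, (8, 128))))
```
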

Why this helps

  • Complements, and in many setups improves over, plain logit KD by transferring representation-level structure rather than only output probabilities

Reference

Tian et al., Contrastive Representation Distillation, arXiv:1910.10699 (2019): https://arxiv.org/pdf/1910.10699

I can contribute a PR + tests if you’re open to it.
