CLIP Projectors for Efficient Intra

Related Articles from SNS

IsoCLIP: Decomposing CLIP Projectors for Efficient Intra-modal Alignment

Abstract: Vision-Language Models like CLIP are extensively used for inter-modal tasks, which involve both visual and text modalities. However, when the individual modality encoders are applied to inherently intra-modal tasks like image-to-image retrieval, their performance suffers from intra-modal misalignment. In this paper we study intra-modal misalignment in CLIP with a focus on the role of the projectors that map pre-projection image and text embeddings into the...

arXiv CS 9d ago