GiPL: Generative augmented iterative Pseudo-Labeling for Cross-Domain Few-Shot Object Detection

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Jiacong Liu, Shu Luo, Yikai Qin, Yaze Zhao, Yongwei Jiang, Yixiong Zou

arXiv:2605.29539v2

Abstract: Vision-language foundation models have shown promising zero-shot generalization for Cross-Domain Few-Shot Object Detection (CD-FSOD). However, they face two critical challenges in fine-tuning: insufficient support set utilization due to sparse single-instance annotations, and severe overfitting under extremely limited target-domain samples. To address these issues, this paper proposes GiPL, an efficient two-branch training framework.

In the first branch, we design an iterative pseudo-label self-training paradigm, which performs zero-shot inference on the support set to generate reliable pseudo-annotations, fuses them with ground-truth labels, and iteratively optimizes the model to fully exploit support set data. In the second branch, we introduce a generative data augmentation pipeline using large vision-language models, which synthesizes domain-aligned, multi-object annotated images to enrich training samples and suppress overfitting.

Extensive experiments on three challenging CD-FSOD datasets (RUOD, CARPK, CarDD) under 1/5/10-shot settings demonstrate that GiPL consistently outperforms state-of-the-art methods with significant performance gains. Code is available at https://github.com/z-yaz/CDiscover.
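The first branch (predict on the support set, fuse pseudo-annotations with ground truth, fine-tune, repeat) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the fusion rule, so the confidence threshold, the IoU-based de-duplication against ground truth, and every name here (`Detection`, `fuse_labels`, `iterative_pseudo_labeling`, `predict`, `fine_tune`) are assumptions made for illustration. The detector is passed in as two plain callables so the example stays independent of any particular model.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


@dataclass(frozen=True)
class Detection:
    box: Box
    label: str
    score: float = 1.0  # ground-truth boxes keep the default score of 1.0


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def fuse_labels(
    ground_truth: Sequence[Detection],
    predictions: Sequence[Detection],
    score_thresh: float = 0.5,  # assumed value, not from the paper
    iou_thresh: float = 0.5,    # assumed value, not from the paper
) -> List[Detection]:
    """Keep all ground-truth boxes, then add confident predictions
    that do not overlap a box that is already kept."""
    fused = list(ground_truth)
    for pred in sorted(predictions, key=lambda d: d.score, reverse=True):
        if pred.score < score_thresh:
            continue  # too uncertain to serve as a pseudo-label
        if any(iou(pred.box, kept.box) >= iou_thresh for kept in fused):
            continue  # duplicates an existing annotation
        fused.append(pred)
    return fused


def iterative_pseudo_labeling(
    support_images: Sequence[object],
    ground_truth: Sequence[Sequence[Detection]],
    predict: Callable[[object], List[Detection]],
    fine_tune: Callable[[Sequence[object], List[List[Detection]]], None],
    rounds: int = 3,
) -> List[List[Detection]]:
    """Alternate between pseudo-labeling the support set and fine-tuning.

    In the first round `predict` is the zero-shot model; later rounds use
    whatever state `fine_tune` left the model in.
    """
    fused_per_image = [list(gt) for gt in ground_truth]
    for _ in range(rounds):
        fused_per_image = [
            fuse_labels(gt, predict(img))
            for img, gt in zip(support_images, ground_truth)
        ]
        fine_tune(support_images, fused_per_image)
    return fused_per_image
```

Pseudo-labels are rebuilt from the original ground truth in every round rather than accumulated, so an early false positive does not persist once the fine-tuned model stops predicting it. Whether GiPL accumulates or regenerates labels is not stated in the abstract.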
