Echo: A Joint-Embedding Predictive Architecture for Speaker Diarization and Speech Recognition in a Shared Latent Space

arXiv CS, Tuesday 02 June 2026, 04:00 UTC. By Louis Mouchon.


arXiv:2606.01909v1

Abstract: We present Echo, a proof-of-concept audio system built around a single 25 M-parameter ViT encoder. The encoder is pretrained with a JEPA objective and then specialised in stages to carry speaker identity, phonetic content, and dynamic source routing in the same 512-dimensional latent space, with no per-task fine-tuning at deployment. Lightweight heads handle diarization (ArcFace + VBx) and dynamic source separation (null-target K-set prediction). On synthetic VoxCeleb2 mixtures with unknown K, the canonical stack reaches 15.00% blind DER, 97.80% PIT separation accuracy with +9.52 dB latent SI-SDR, and a +53.50-point speaker/content factorisation gap on a held-out k-NN probe. The point of Echo is not a new SOTA on any single task but the coexistence of three tasks on one encoder at this footprint. We document the design stage by stage, report the dead-ends, and identify the structural wall on end-to-end ASR through the VQ bottleneck that still bounds the PoC.
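For readers unfamiliar with the separation metrics quoted in the abstract, the following is a minimal sketch of the textbook definitions of SI-SDR and permutation-invariant (PIT) assignment, applied to plain Python lists. It is not the paper's code: the function names `si_sdr` and `pit_assign`, the `eps` constant, and the toy vectors are all assumptions made for illustration, and the abstract does not say how the authors compute SI-SDR in latent space or how the null-target K-set prediction handles an unknown number of sources.

```python
import math
from itertools import permutations


def si_sdr(estimate, target, eps=1e-12):
    """Scale-invariant SDR in dB between two equal-length vectors.

    Hypothetical helper: the estimate is projected onto the target,
    and the ratio of projected energy to residual energy is returned.
    """
    dot = sum(e * t for e, t in zip(estimate, target))
    target_energy = sum(t * t for t in target)
    scale = dot / (target_energy + eps)
    projection = [scale * t for t in target]
    residual = [e - p for e, p in zip(estimate, projection)]
    signal_power = sum(p * p for p in projection)
    noise_power = sum(r * r for r in residual)
    return 10.0 * math.log10((signal_power + eps) / (noise_power + eps))


def pit_assign(estimates, targets):
    """Brute-force permutation-invariant assignment.

    Returns (perm, mean_si_sdr), where perm[i] is the index of the
    estimate matched to targets[i]. Cost grows as K!, so this is only
    suitable for small K.
    """
    k = len(targets)
    best_perm, best_score = None, -math.inf
    for perm in permutations(range(k)):
        score = sum(si_sdr(estimates[perm[i]], targets[i]) for i in range(k)) / k
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score


# Toy data (assumed): two sources whose estimates come out in swapped order.
targets = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
estimates = [[0.1, 2.0, 0.0, -2.1], [0.9, 0.1, 1.1, 0.0]]
perm, score = pit_assign(estimates, targets)
```

In this toy case the search recovers the swapped ordering, and rescaling an estimate leaves its SI-SDR essentially unchanged, which is the property that distinguishes the metric from plain SDR.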


Originally published by arXiv CS.
