
Model Multiplicity for Adversarial Detection

Related Articles from SNS

Model Multiplicity for Adversarial Detection in Small Language Model Training on Edge Devices

arXiv:2606.07857v1. Abstract: The rise of edge-based machine learning has enabled distributed adaptation of language models across mobile and IoT devices, offering privacy preservation and real-time responsiveness. However, distributed fine-tuning of language models on untrusted or heterogeneous edge nodes introduces new vulnerabilities.

arXiv CS 1d ago

Trans GAN-WT: A Feature Extraction and Interactive Learning-Based Anomaly Detection Model for Wind Turbine Time Series Data

Abstract: With the increasing scale and number of wind farms, wind turbines' daily operation and maintenance costs are increasing. To reduce operation and maintenance costs and enhance the reliability of wind turbine and system operation data before reaching catastrophic failures, monitoring the operating status of the equipment and detecting failures at an early stage is crucial. It is of great practical significance to utilize the working condition data for abnormal...

arXiv CS 7d ago

Human-Like Neural Nets by Catapulting

Speculative proposal to create artificial neural nets with human-like performance by high-learning-rate/regularization training of overparameterized NNs to trigger catapulting/grokking. Over-parameterization as a route to true generalization would resolve many outstanding mysteries of artificial versus natural intelligence. There are many mysteries about deep learning and human intelligence, but we could describe the biggest anomaly this way: why are...

Hacker News 3d ago

Dual Feature Decoupling for Fine-Grained OOD Detection

arXiv:2606.05536v1. Abstract: Out-of-distribution (OOD) detection is an indispensable technique when applying machine learning models to real-world scenarios. Most existing OOD detection methods have been developed under the idealized assumption of large inter-class distributional differences, while largely overlooking fine-grained tasks characterized by subtle variations, such as medical image classification and vehicle recognition. The high visual similarity among...

arXiv CS 5d ago

Audio Pirates: Black-box Audio Watermark Removal via Diffusion Priors

Abstract: With the rise of AI-generated audio, watermarking has become widely used for detecting misuse and protecting intellectual property. However, adversaries may try to remove these watermarks, making it critical to evaluate how well watermarking schemes withstand removal attacks. Existing attacks are often impractical: they either noticeably degrade perceptual quality or require access to the watermarking scheme.

arXiv CS 9d ago

What the Eyes See, the LLMs Miss: Exploiting Human Perception for Adversarial Text Attacks

arXiv:2606.09700v1. Abstract: Large language model (LLM)-powered content moderation systems have become a critical defense against harmful online content. However, these systems primarily operate on tokenized text and largely ignore the visual cues that humans naturally rely on when interpreting content.

arXiv CS 1d ago

Deep learning four decades of human migration

Abstract: Human migration is a fundamental driver of global demographic change, shaping population structure, labour markets and social policy across countries [1-3]. Although long-term migration patterns are often linked to economic development [4], they can shift rapidly in response to shocks such as conflict, environmental crises and political change [5]. Despite its importance, migration remains difficult to measure consistently: existing data are sparse, concentrated in high-income settings and...

Nature 17h ago

OTora: A Unified Red Teaming Framework for Reasoning-Level Denial-of-Service in LLM Agents

arXiv:2605.08876v2. Abstract: Large Language Models (LLMs) are increasingly deployed as autonomous agents that execute tool-augmented, multi-step tasks, where latency is a critical factor for real-world applications. Yet an overlooked threat is Reasoning-Level Denial-of-Service (R-DoS), in which an attacker preserves task correctness but degrades availability by inflating an agent's reasoning depth or tool-use budget.

arXiv CS 1d ago

Claude Fable 5

Claude Fable 5 and Claude Mythos 5. Today we’re launching Claude Fable 5: a Mythos-class model that we’ve made safe for general use. Fable 5’s capabilities exceed those of any model we’ve ever made generally available.

Hacker News 1d ago

Apple's AI Can Now Change Your Passwords. What Could Possibly Go Wrong?

Apple's new AI can automatically change compromised passwords, but giving an agent control of account credentials introduces risks involving prompt injection, lockouts, consent, and compromised devices.

Hacker News 22h ago