CIFAR
Related Articles from SNS
Student Capacity Moderates Knowledge Distillation Effectiveness: A Systematic Study Across ResNet Teacher-Student Pairs on CIFAR-10
arXiv:2605.31191v1 Announce Type: new Abstract: We investigate how teacher-student capacity relationships modulate knowledge distillation (KD) effectiveness in ResNet-based image classification on CIFAR-10. Across three teacher-student pairs -- R50->R18, R34->R18, and R50->R34 -- we compare Logit-KD and Feature-KD under controlled, reproducible conditions (3 seeds, mean+/-std reported throughout). We report three main findings.
The CIFAR Synthetic Evidence Corpus for Detecting AI-Generated Evidence
arXiv:2606.07916v1 Announce Type: new Abstract: The growing ability of generative models to produce realistic documents poses a direct challenge to evidentiary workflows in the justice system and the courts, where decisions increasingly depend on the authenticity of evidence such as receipts, communications, and administrative records. Unlike social media or academic settings, evidentiary documents are often only subtly altered, with small, localized edits that preserve overall plausibility...
HalfNet: Randomized Neural Networks with Learned Subspace Geometry
arXiv:2606.04583v1 Announce Type: new Abstract: Many researchers investigated neural networks with some of their weights fixed to values randomly drawn from a given distribution, e.g., $N(0, I)$. Our proposed HalfNet draws random weights from $N(0, \Sigma)$, where $\Sigma$, which defines the geometry of the distribution, has a low-rank factorization that we learn from data. Experiments on MNIST and CIFAR-10 demonstrate that HalfNet can match the performance of fully trained multilayer...
DHAuDS: A Dynamic and Heterogeneous Audio Benchmark for Test-Time Adaptation
arXiv:2511.18421v2 Announce Type: replace Abstract: Existing Test-time Adaptation (TTA) studies rely heavily on static and homogeneous corruption protocols, such as ImageNet-C and CIFAR-10-C/100-C, leading to inconsistent evaluation settings and potentially inflated robustness estimates when compared with real-world situations. TTA lacks a standardized evaluation infrastructure capable of modeling realistic heterogeneous acoustic degradation.
A Closer Look at In-Distribution vs. Out-of-Distribution Accuracy for Open-Set Test-time Adaptation
arXiv:2606.01973v1 Announce Type: new Abstract: Open-set test-time adaptation (TTA) updates models on new data in the presence of input shifts and unknown output classes. While recent methods have made progress on improving in-distribution (InD) accuracy for known classes, their ability to accurately detect out-of-distribution (OOD) unknown classes remains underexplored. We benchmark robust and open-set TTA methods (SAR, OSTTA, UniEnt, and SoTTA) on the standard corruption benchmarks of...
SaluNet: Enabling Total Plasticity in Normalization-Free Deep Networks
Announce Type: new Abstract: Normalization layers such as BatchNorm and LayerNorm have long been considered essential for stable training in deep networks. This work demonstrates that they can be fully replaced by a single learnable activation mechanism. We identify a plasticity suppression effect induced by standard normalization: learnable activation parameters rapidly lose adaptability when paired with normalization layers.
Frequency-Domain Latent Attention Gating for Cross-Domain Token Aggregation
arXiv:2606.08191v1 Announce Type: new Abstract: Token aggregation is a common bottleneck in models that map token representations to sample-level predictions, yet most pooling methods operate only in the original token domain. We propose FLaG, a plug-in aggregation module that transforms token representations with the real FFT, summarizes spectral components with learnable latent queries, applies a channel-wise gate, and reconstructs enhanced time-domain tokens for final pooling.
Synthetic Benchmarks Overstate Forward-Forward Scaling: Real-Data Limits of Layer-Local Training
Announce Type: new Abstract: Forward-Forward (FF) learning [Hinton, 2022] replaces backpropagation with strictly layer-local goodness updates. Recent FF-CNN work has narrowed the gap to BP on 32x32 benchmarks, raising the question of whether layer-local training is becoming a viable alternative at realistic scale. To probe this rigorously, we develop DTG-FF -- dynamic temperature goodness, decoupled normalization, and multi-layer fusion -- as an instrument that sets FF-family state of the art across nine...
GUDA: Counterfactual Group-wise Training Data Attribution for Diffusion Models via Unlearning
Announce Type: replace Abstract: Training-data attribution for vision generative models aims to identify which training data influenced a given output. While most methods score individual examples, practitioners often need group-level answers (e.g., artistic styles or object classes). Group-wise attribution is counterfactual: how would a model's behavior on a generated sample change if a group were absent from training?
Non-Parametric Probabilistic Robustness: A Conservative Risk Estimator under Unknown Perturbation Distributions
Announce Type: replace Abstract: Deep learning (DL) models, despite their remarkable success, remain vulnerable to small input perturbations that can cause erroneous outputs, motivating the recent proposal of probabilistic robustness (PR) as a complementary alternative to adversarial robustness (AR). However, existing PR formulations assume a fixed and known perturbation distribution, an unrealistic expectation in practice. To address this limitation, we propose non-parametric probabilistic...