Resource-Constrained Adaptive Inference for

Related Articles from SNS

Resource-Constrained Adaptive Inference for Sequential Pricing

arXiv:2606.03736v1 Announce Type: cross Abstract: Resource-constrained pricing controllers can make fixed-price inference impossible: the controller's resource state may remove the target price neighborhood from the feasible set, even when every realized action has a known positive density. We formalize this support-exclusion failure through a local non-identification result and a realized information clock. We then design a target-aware pricing controller that certifies feasible target...

arXiv CS 7d ago

Distilling Safe LLM Systems via Soft Prompts for On Device Settings

arXiv:2606.09388v1 Announce Type: new Abstract: Deploying safe large language models (LLMs) on resource-constrained edge devices presents a critical challenge: while dual-model systems combining LLMs with guard models provide effective safety guarantees, their substantial memory and computational demands make them prohibitively expensive for on-device deployment. This paper presents a comprehensive study of parameter-efficient safety alignment methods for resource-constrained settings....

arXiv CS 1d ago

CANS: Accelerating Multiuser Collaborative Edge Inference via Cooperative Autodidactic NeuroSurgeon

Announce Type: new Abstract: Recently, mobile edge computing (MEC)-enabled collaborative deep neural network (DNN) inference has emerged as a promising approach for delivering intelligent services to resource-constrained mobile devices. A representative scenario is multi-user collaborative edge inference, where distinct devices independently partition their DNN models and offload backend computation to a common edge server over wireless networks. However, determining the optimal DNN...

arXiv CS 1d ago

Scale When Needed: Adaptive Neuron-level Mixed Precision Quantization Aware Training

arXiv:2605.25054v2 Announce Type: replace Abstract: Deploying deep neural networks on resource-constrained 6G edge devices demands aggressive compression with minimal accuracy loss. Quantization-Aware Training (QAT) has emerged as a leading compression approach; however, existing mixed-precision methods typically operate at coarse layer- or channel-level granularity. These methods often rely on heuristic or search-based bit-allocation strategies, which may overlook fine-grained variability...

arXiv CS 2d ago

Reformulate LLM Reinforcement Learning for Efficient Training under Black-box Discrepancy

Announce Type: new Abstract: Reinforcement Learning (RL) has emerged as a pivotal post-training paradigm, yet it frequently suffers from unpredictable suboptimal performance or even training collapse. Recent findings attribute these failures to a hidden train-inference discrepancy (or mismatch), stemming from the disparate underlying engines and architectures. We find that the training policy can actively self-correct such a discrepancy when provided with an appropriate learning signal.

arXiv CS 1d ago

MSTN: A Lightweight and Fast Model for General TimeSeries Analysis

arXiv:2511.20577v5 Announce Type: replace Abstract: Real-world time series often exhibit strong non-stationarity, complex nonlinear dynamics, and behavior expressed across multiple temporal scales, from rapid local fluctuations to slow-evolving long-range trends. However, many contemporary architectures impose rigid, fixed-scale structural priors (such as patch-based tokenization, predefined receptive fields, or frozen backbone encoders), which can over-regularize temporal dynamics and limit...

arXiv CS 5d ago

Ethical Fairness in Ubiquitous Health Sensing without Known Attributes

arXiv:2603.13373v4 Announce Type: replace Abstract: In ubiquitous and mobile health systems, computational models infer human states from wearable, behavioral, and physiological sensing data. In these settings, high accuracy alone is insufficient; models must act ethically and equitably across diverse people, contexts, and devices.

arXiv CS 8d ago

GenAutoML: An Agentic Framework for Dynamic Architecture Generation and Optimization in Time-Series Analysis

arXiv:2606.05860v1 Announce Type: new Abstract: Designing neural architectures for time-series forecasting and anomaly detection remains a resource-intensive task that often requires substantial domain expertise. Traditional Automated Machine Learning (AutoML) systems typically rely on static, predefined search spaces, limiting their ability to adapt to diverse data characteristics. We present GenAutoML, an agentic framework that leverages Large Language Models (LLMs) as neural architects to...

arXiv CS 5d ago

Dual-Integrated Low-Latency Single-Lens Infrared Computational Imaging for Object Detection

Announce Type: replace-cross Abstract: Computational imaging enables compact infrared systems, but deep-learning pipelines that combine image reconstruction and object detection often introduce substantial inference latency. Most existing acceleration strategies compress the reconstruction network while overlooking physical priors from the optical path, leaving a trade-off between accuracy and speed. We present Physics-aware Dual-Integrated Network (PDI-Net), a low-latency framework that...

arXiv Physics 8d ago
