RM
Related Articles from SNS
Comprehensive Ab Initio Quantum Computations of CO$_{\rm 2}$-H$_{\rm 2}$ and CO$_{\rm 2}$-He Collisional Properties
Announce Type: replace Abstract: We present comprehensive \textsl{ab initio} fully quantum calculations of CO$_{\rm 2}$--H$_{\rm 2}$ and CO$_{\rm 2}$--He collisional properties. Our framework combines CCSD(T) potential-energy-surface calculations with close-coupling dynamical scattering in the \YUMI~framework to derive elastic and inelastic cross sections, rate coefficients, and pressure broadening parameters. We characterize the rotational dependence of the broadening coefficients up to...
Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill
arXiv:2606.03980v1 Announce Type: new Abstract: Reward models (RMs) provide critical feedback signals for LLM post-training, notably in reinforced fine-tuning (RFT) and reinforcement learning (RL) pipelines. However, current reward evaluation relies on heterogeneous criteria such as rule-based verifiers, ground-truth references, procedural checklists, and complex rubrics, where a unified mechanism to integrate all types of evidence remains unexplored. To this end, we propose Skill Reward...
Betti Numbers and Higher Weight Spectra of Reed-Muller Codes $RM_q(2,2)$
arXiv:2408.02548v3 Announce Type: replace-cross Abstract: We determine all the Betti numbers of the $q$-ary second order Reed-Muller codes of length $q^2$, and also of the elongations of matroids associated to these codes. We then use it to determine the higher weight spectra of these codes. As a special case, we recover some results of Kaplan and Matei about counting certain curves over finite fields with prescribed rational intersection points.
The Origin of Da Scaling: Suppressed Cooling in Fast-Cooling Mixing Layers
arXiv:2606.04093v1 Announce Type: cross Abstract: In numerical experiments simulating Turbulent Radiative Mixing Layers (TRMLs) it is observed that as the cooling time in the mixed gas, $t_{\rm cool}$, becomes very short compared to the dynamical time of the turbulence, $t_{\rm eddy}/t_{\rm cool} \gg 1$, there is a change in the scaling behavior of the total energy radiated in the TRML as a function of this ratio, also known as the Damk\"{o}hler number, ${\rm Da} \equiv t_{\rm eddy}/t_{\rm...
A Unifying Lens on Reward Uncertainty in RLHF
arXiv:2606.09073v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) is bottlenecked by \emph{reward hacking}, where the policy exploits errors in a proxy reward model (RM) and produces high RM scores without genuine quality gains. A natural mitigation is \emph{pessimism}: penalizing rewards in regions where the RM is uncertain. However, standard scalar RMs provide no principled notion of uncertainty.
Palindrome complexity versus factor complexity
arXiv:2606.08127v1 Announce Type: cross Abstract: Let ${\bf x} = (a_i)_{i \geq 0}$ be an infinite word over a finite alphabet $\Sigma$. Let $\rho (n)$ be the factor complexity function for $\bf x$ and ${\rm Pal}(n)$ be the palindrome complexity function for $\bf x$. We give a new relationship between these two quantities; namely, if $\bf x$ is not ultimately periodic, then $$ \lim_{n \rightarrow \infty} {{ {\rm Pal} (n) \log ({\rm Pal} (n) + 1)} \over {\rho (n)}} = 0. $$
Statistical orientation and distribution of columnar ice crystals in turbulent flows
Announce Type: new Abstract: We study the motion of columnar ice crystals that form in clouds over a range of low temperature. Our focus here is on elongated ice crystals, which are smaller than the size of the smallest eddies in the flow, with a moderate aspect ratio comprised between $3$ and $5$. We determine turbulent solutions of the Navier-Stokes equations over a range of turbulent kinetic energy dissipation characteristic of clouds ($4.41\;{\rm cm}^2/{\rm s}^3 \le \varepsilon \le...
RadioDiff-Inv2: Differentiable Diffusion Inversion under Location Drift from Sparse Noisy Measurements for Radio Map Estimation
arXiv:2606.08439v1 Announce Type: new Abstract: Radio map (RM) estimation is a key enabler for environment-aware optimization in 6G wireless networks. In practice, RM construction increasingly relies on crowdsourced received signal strength (RSS) feedback that is inherently sparse and noisy. A further and often overlooked challenge is location drift, whereby privacy constraints and user mobility cause reported sampling coordinates to deviate from the true measurement locations.
The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement
arXiv:2605.30888v1 Announce Type: new Abstract: Building strong reward models (RMs) for language model alignment is bottlenecked by the cost and difficulty of acquiring diverse and reliable preference data from human annotation or judge models. It is dramatically worse as the policy evolves beyond the static RM training. Therefore, we propose SAVE (Self-supervised reward model improvement via Value-Anchored On-policy feedback), a framework that grades on-policy responses as feedback by using...
MODIS Thermal Infrared Sounding (MOTIS): Estimating Tropical Cyclone Central Pressure from Warm-Core Anomalies
Announce Type: new Abstract: This study presents a novel framework for estimating the central sea-level pressure ($P_{\rm c}$) of tropical cyclones (TCs) using infrared radiometers. We leverage the long-overlooked combination of high spatial resolution and sounding capability of the Moderate Resolution Imaging Spectroradiometer (MODIS) to measure warm-core anomalies in TC eyes. We develop the MODIS Thermal Infrared Sounding (MOTIS) framework, which performs instrument-specific preprocessing...