RM

Related Articles from SNS

Comprehensive Ab Initio Quantum Computations of CO$_{\rm 2}$-H$_{\rm 2}$ and CO$_{\rm 2}$-He Collisional Properties

Announce Type: replace Abstract: We present comprehensive ab initio fully quantum calculations of CO$_{\rm 2}$--H$_{\rm 2}$ and CO$_{\rm 2}$--He collisional properties. Our framework combines CCSD(T) potential-energy-surface calculations with close-coupling dynamical scattering in the YUMI framework to derive elastic and inelastic cross sections, rate coefficients, and pressure broadening parameters. We characterize the rotational dependence of the broadening coefficients up to...

arXiv Physics 5d ago

Skill-RM: Unifying Heterogeneous Evaluation Criteria via Agent Skill

arXiv:2606.03980v1 Announce Type: new Abstract: Reward models (RMs) provide critical feedback signals for LLM post-training, notably in reinforced fine-tuning (RFT) and reinforcement learning (RL) pipelines. However, current reward evaluation relies on heterogeneous criteria such as rule-based verifiers, ground-truth references, procedural checklists, and complex rubrics, where a unified mechanism to integrate all types of evidence remains unexplored. To this end, we propose Skill Reward...

arXiv CS 7d ago

Betti Numbers and Higher Weight Spectra of Reed-Muller Codes $RM_q(2,2)$

arXiv:2408.02548v3 Announce Type: replace-cross Abstract: We determine all the Betti numbers of the $q$-ary second order Reed-Muller codes of length $q^2$, and also of the elongations of matroids associated to these codes. We then use it to determine the higher weight spectra of these codes. As a special case, we recover some results of Kaplan and Matei about counting certain curves over finite fields with prescribed rational intersection points.

arXiv CS 6d ago

The Origin of Da Scaling: Suppressed Cooling in Fast-Cooling Mixing Layers

arXiv:2606.04093v1 Announce Type: cross Abstract: In numerical experiments simulating Turbulent Radiative Mixing Layers (TRMLs) it is observed that as the cooling time in the mixed gas, $t_{\rm cool}$, becomes very short compared to the dynamical time of the turbulence, $t_{\rm eddy}/t_{\rm cool} \gg 1$, there is a change in the scaling behavior of the total energy radiated in the TRML as a function of this ratio, also known as the Damköhler number, ${\rm Da} \equiv t_{\rm eddy}/t_{\rm...

arXiv Physics 6d ago

A Unifying Lens on Reward Uncertainty in RLHF

arXiv:2606.09073v1 Announce Type: new Abstract: Reinforcement learning from human feedback (RLHF) is bottlenecked by reward hacking, where the policy exploits errors in a proxy reward model (RM) and produces high RM scores without genuine quality gains. A natural mitigation is pessimism: penalizing rewards in regions where the RM is uncertain. However, standard scalar RMs provide no principled notion of uncertainty.

arXiv CS 1d ago

Palindrome complexity versus factor complexity

arXiv:2606.08127v1 Announce Type: cross Abstract: Let ${\bf x} = (a_i)_{i \geq 0}$ be an infinite word over a finite alphabet $\Sigma$. Let $\rho (n)$ be the factor complexity function for $\bf x$ and ${\rm Pal}(n)$ be the palindrome complexity function for $\bf x$. We give a new relationship between these two quantities; namely, if $\bf x$ is not ultimately periodic, then $$ \lim_{n \rightarrow \infty} \frac{{\rm Pal}(n) \log ({\rm Pal}(n) + 1)}{\rho (n)} = 0. $$

arXiv CS 1d ago

Statistical orientation and distribution of columnar ice crystals in turbulent flows

Announce Type: new Abstract: We study the motion of columnar ice crystals that form in clouds over a range of low temperature. Our focus here is on elongated ice crystals, which are smaller than the size of the smallest eddies in the flow, with a moderate aspect ratio comprised between $3$ and $5$. We determine turbulent solutions of the Navier-Stokes equations over a range of turbulent kinetic energy dissipation characteristic of clouds ($4.41\;{\rm cm}^2/{\rm s}^3 \le \varepsilon \le...

arXiv Physics 5d ago

RadioDiff-Inv2: Differentiable Diffusion Inversion under Location Drift from Sparse Noisy Measurements for Radio Map Estimation

arXiv:2606.08439v1 Announce Type: new Abstract: Radio map (RM) estimation is a key enabler for environment-aware optimization in 6G wireless networks. In practice, RM construction increasingly relies on crowdsourced received signal strength (RSS) feedback that is inherently sparse and noisy. A further and often overlooked challenge is location drift, whereby privacy constraints and user mobility cause reported sampling coordinates to deviate from the true measurement locations.

arXiv CS 1d ago

The Flip Side of RLHF: On-Policy Feedback for Reward Model Self-Supervised Improvement

arXiv:2605.30888v1 Announce Type: new Abstract: Building strong reward models (RMs) for language model alignment is bottlenecked by the cost and difficulty of acquiring diverse and reliable preference data from human annotation or judge models. It is dramatically worse as the policy evolves beyond the static RM training. Therefore, we propose SAVE (Self-supervised reward model improvement via Value-Anchored On-policy feedback), a framework that grades on-policy responses as feedback by using...

arXiv CS 9d ago

MODIS Thermal Infrared Sounding (MOTIS): Estimating Tropical Cyclone Central Pressure from Warm-Core Anomalies

Announce Type: new Abstract: This study presents a novel framework for estimating the central sea-level pressure ($P_{\rm c}$) of tropical cyclones (TCs) using infrared radiometers. We leverage the long-overlooked combination of high spatial resolution and sounding capability of the Moderate Resolution Imaging Spectroradiometer (MODIS) to measure warm-core anomalies in TC eyes. We develop the MODIS Thermal Infrared Sounding (MOTIS) framework, which performs instrument-specific preprocessing...

arXiv Physics 5d ago