MyArxiv Feed
High Energy Physics - Experimental
☆ Single pion-production and pion propagation in Achilles
We extend the applicability of Achilles (A CHIcagoLand Lepton Event Simulator) by incorporating the single-pion production mechanism in a fully exclusive fashion. The electroweak interaction vertex is modeled by combining the state-of-the-art Dynamical Coupled-Channels approach with realistic hole spectral functions, which account for correlations in both the initial target state and the residual spectator system. Final-state interactions are treated using a semi-classical intranuclear cascade that leverages nuclear configurations to determine the correlated spatial distribution of protons and neutrons. The meson-baryon scattering amplitudes used in the cascade are computed within the Dynamical Coupled-Channels framework, consistent with the electroweak vertex. To model pion absorption, we employ the optical potential approach of Oset and Salcedo. As an alternative approach, we explicitly model the production and propagation of resonances which mediate pion-nucleon scattering and pion absorption. We validate out approach against pion-nucleon and pion-nucleus scattering data, and present comparisons with electron- and neutrino-nucleus measurements from e4$\nu$, T2K, MINER$\nu$A, and MicroBooNE.
comment: 24 pages, 17 figures
☆ Flow-dependent tagging of $^{214}$Pb decays in the LZ dark matter detector
The LUX-ZEPLIN (LZ) experiment is searching for dark matter interactions in a liquid xenon time projection chamber (LXe-TPC). This article demonstrates how control of the flow state in the LXe-TPC enables the identification of pairs of sequential alpha-decays, which are used to map fluid flow and ion drift in the liquid target. The resulting transport model is used to tag $^{214}$Pb beta-decays, a leading background to dark matter signals in LZ. Temporally evolving volume selections, at a cost of 9.0% of exposure, target the decay of each $^{214}$Pb atom up to 81 minutes after production, resulting in (63 $\pm$ 6$_{\mathrm{stat}}$ $\pm$ 7$_{\mathrm{sys}}$)% identification of $^{214}$Pb decays to ground state. We also demonstrate how flow-based tagging techniques enable a novel calibration side band that is concurrent with science data.
comment: 21 pages, 15 figures
☆ Measurement of the branching fraction of $\psip \to ωηη$
Using a sample of (2.712 $\pm$ 0.014)$\times 10^{9}$ $\psip$ events collected with the BESIII detector at the BEPCII collider in 2009, 2012, and 2021, the decay $\psip \to \omega \eta \eta $ is observed for the first time. The branching fraction of the $\psi(3686)\to\omega\eta\eta$ decay is measured to be (1.65 $\pm$ 0.02 $\pm$ 0.21)$\times 10^{-5}$, where the first uncertainty is statistical and the second systematic. Clear structures associated with the well-established $\omega(1420)$ and $f_{0}(1710)$ resonances are observed in the $\omega\eta$ and $\eta\eta$ invariant-mass spectra, respectively.
☆ $T_{bc\bar s}$ states in the process $Υ\to D^{-} \bar B^{0} D_s^{+}$
We perform a theoretical study of the decay process $\Upsilon \to D^{-} \bar{B}^{0} D_s^{+}$ in search of the doubly heavy tetraquark states $T_{bc\bar{s}}$ with quark content $bc\bar{s}\bar{d}$. These $T_{bc\bar{s}}$ states are assumed to be dynamically generated molecular states from the S-wave interactions between $\bar{B}_s^{(*)0} D^{(*)+}$ and $\bar{B}^{(*)0} D_s^{(*)+}$ meson pairs. Based on the total angular momentum and the type of the constituent mesons (pseudoscalars $P$ or vectors $V$), they are labeled as $T_{bc\bar{s}}^{0, PP}$, $T_{bc\bar{s}}^{0, VV}$, and $T_{bc\bar{s}}^{2, VV}$, respectively. The $\bar{B}^{0} D_s^{+}$ invariant mass distribution for this decay is calculated using current algebra, incorporating contributions from $T_{bc\bar{s}}$ states arising from final-state interactions. Our results reveal a clear peak structure in the $7415 - 7425$ MeV region, which is attributed to the $T_{bc\bar{s}}^{2, VV}$ state. Additionally, a distinct dip structure appears near the $\bar{B}^{*0} D_s^{*+}$ threshold, characteristic of the $T_{bc\bar{s}}^{2, VV}$ as a hadronic molecular state. A near-threshold enhancement associated with the $T_{bc\bar{s}}^{0, PP}$ state and a dip arising from the $T_{bc\bar{s}}^{0,\,VV}$ state are also identified, though the manifestation of these features depends sensitively on model parameter fine-tuning. Therefore, with increased experimental statistics, the decay channel $\Upsilon \to D^{-} \bar{B}^{0} D_s^{+}$ offers a promising avenue for discovering and characterizing the $T_{bc\bar{s}}$ states.
comment: 25 pages, 5 figures, 2 tables, suggestions and comments welcome
☆ Partial-wave decomposition of the diffractively produced $π^-π^+π^-η$ final state at COMPASS
One of the prime goals of the COMPASS experiment at CERN is the study of the light meson spectrum, with a particular emphasis on the search for exotic states. We present the first high-statistics partial-wave decomposition of the diffractive reaction $\pi^-+p\to \pi^-\pi^+\pi^-\eta+p$, which spans a wide range of decay channels, such as $f_1(1285)\pi^-$, $a_2^-(1320)\eta$, $\eta^\prime\pi^-$, and $\rho(770)a_0^-(980)$. This analysis also includes decay channels, i.e. $f_1(1285)\pi^-$ and $\eta^\prime\pi^-$, predicted by theoretical models for the lightest hybrid meson and providing the opportunity to verify the hybrid meson hypothesis of the $\pi_1(1600)$.
comment: The 21st International Conference on Hadron Spectroscopy and Structure (HADRON2025), 27 - 31 March, 2025, Osaka University, Japan
☆ Recommendations for Best Practices for Data Preservation and Open Science in HEP
These recommendations are the result of reflections by scientists and experts who are, or have been, involved in the preservation of high-energy physics data. The work has been done under the umbrella of the Data Lifecycle panel of the International Committee of Future Accelerators (ICFA), drawing on the expertise of a wide range of stakeholders. A key indicator of success in the data preservation efforts is the long-term usability of the data. Experience shows that achieving this requires providing a rich set of information in various forms, which can only be effectively collected and preserved during the period of active data use. The recommendations are intended to be actionable by the indicated actors and specific to the particle physics domain. They cover a wide range of actions, many of which are interdependent. These dependencies are indicated within the recommendations and can be used as a road map to guide implementation efforts. These recommendations are best accessed and viewed through the web application, see https://icfa-data-best-practices.app.cern.ch/
comment: These recommendations are best accessed and viewed through the web application, see https://icfa-data-best-practices.app.cern.ch/ Corresponding editor: Kati Lassila-Perini (contact via icfa-data-best-practices-contact at cern.ch)
☆ Baselines for Abelian Charge Fluctuations in Nuclear Collisions:Theory and Comparison with Experimental Data
We investigate fluctuations in the canonical ensemble of an Abelian charge, such as baryon number. Our focus is on cumulants and factorial cumulants of baryon and antibaryon multiplicity distributions, including their sum and difference, in both the full phase space and subsystems. In particular, we establish a correlation between net-baryon number fluctuations within a subsystem, which is pertinent for fluctuation analyses in nucleus-nucleus collisions, and fluctuations of baryon and antibaryon numbers in the total system. We derive analytical expressions for factorial cumulants of arbitrary order and present concise results in terms of the cumulants of the total baryon number. To account for dynamics beyond global conservation, we introduce local attractive and repulsive multi-particle interactions within a phenomenological framework. A comparison of calculated and generated cumulants with STAR and HADES data indicates that multiparticle interactions play a decisive role in the description of observed fluctuation patterns. At high collision energies, the data are well-reproduced by incorporating repulsive two-proton interactions, while at lower energies, attractive three-particle interactions become essential. Furthermore, our framework facilitates realistic event generation, enabling a direct comparison with experimental measurements.
☆ Beam-Normal Single-Spin Asymmetry in $^{208}$Pb at low energy: discrepancy resolved or new kinematic puzzle?
A longstanding discrepancy between measured and predicted beam-normal single-spin asymmetries $A_{n}$ in elastic electron scattering off $^{208}$Pb has challenged our understanding of two-photon exchange (TPE) in heavy nuclei. We report a new measurement at 570 MeV and $Q^2$=0.04 $GeV^2$/$c^2$, yielding$A_n = (-9.1 \pm 2.1~\text{(stat)} \pm 0.7~\text{(syst)})~\mathrm{ppm}$. This nonzero value contrasts with previous results at higher energies and suggests a kinematic dependence of TPE effects not captured by current theory, prompting a reevaluation of earlier interpretations.
☆ Study of the $χ_{cJ}\rightarrowΛ\barΛη^\prime$ decays
Using a data sample of $(2.712\pm0.014)\times10^{9}$ $\psi(3686)$ events collected with the BESIII detector at the BEPCII collider, we investigate the decays $\chi_{cJ} \rightarrow \Lambda \bar{\Lambda} \eta^\prime$ for $J=0,~1,~2$ via the radiative transition $\psi(3686) \rightarrow \gamma \chi_{cJ}$. The decays $\chi_{c0,2}\rightarrow\Lambda\bar{\Lambda}\eta^\prime$ are observed for the first time, with statistical significances of 6.7$\,\sigma$ and 6.4$\,\sigma$, respectively. Evidence for the decay $\chi_{c1}\rightarrow\Lambda\bar{\Lambda}\eta^\prime$ is found with a statistical significance of 3.3$\,\sigma$. The corresponding branching fractions are measured to be $\mathscr{B}(\chi_{c0}\rightarrow\Lambda\bar{\Lambda}\eta^\prime)=(7.56\pm1.42\pm0.90)\times10^{-5}$, $\mathscr{B}(\chi_{c1}\rightarrow\Lambda\bar{\Lambda}\eta^\prime)=(1.54\pm0.51\pm0.16)\times10^{-5}$, and $\mathscr{B}(\chi_{c2}\rightarrow\Lambda\bar{\Lambda}\eta^\prime)=(3.03\pm0.61\pm0.29)\times10^{-5}$, where the first uncertainties are statistical and the second systematic. No significant excited $\Lambda$ baryon states or $\Lambda\bar{\Lambda}$ near-threshold enhancements are observed.
☆ Structure of leptonic Yukawa couplings in the Zee model
The radiative neutrino mass matrix $m^\nu$ in the Zee model depends on leptonic Yukawa couplings $F$ to a singlet scalar and $Y^\ell$ to a new Higgs doublet. Leveraging the skew-symmetric structure of $F$, we derive a unique identity linking $F$ and $m^\nu$ that is explicitly independent of $Y^\ell$. This relation implies that five entries of $Y^\ell$ can, in principle, be determined directly from $m^\nu$ and $F$, while the remaining four can be selected based on phenomenological assumptions. As an illustration, we apply this framework to the two-zero texture $B2$, highlighting its enhancement of the muon $g-2$.
comment: 11 pages
☆ Decays of $Υ(10860)$ and $Υ(10753)$ into $ωχ_{bJ}$
In this paper, we model $\Upsilon(10753)$ as a $4S$-$3D$ bottomonium mixture and $\Upsilon(10860)$ as a $5S$-$4D$ mixture, and predict the properties of a new bottomonium state, $\Upsilon(10950)$, as the mixing partner of $\Upsilon(10860)$. The mixing angles are derived from dielectron decay widths and mass shifts of these bottomonia. We consider open-bottom meson loops in the decays of the $D$-wave bottomonium components, based on a nonrelativistic effective field theory power counting. We show that the $S$-$D$ mixing scheme is consistent with the experimental data of the decays of $\Upsilon(10860)$ into $\omega\chi_{bJ}$ ($J=0,1,2$) and $\Upsilon(10753)$ into $\omega\chi_{bJ}$ ($J=1,2$). Predictions for the dielectron widths and partial decay widths into $\chi_{bJ}\omega$ for $\Upsilon(10753)$ and $\Upsilon(10950)$ are presented.
comment: 14pages, 6figures
☆ A $Z_4$ Scotogenic Model with Higgs Portal
We propose a new Scotogenic type model based on a global $Z_4$ symmetry involving dark matter candidates. After the symmetry breaking as $Z_4$ to $Z_2$ via the singlet scalar vacuum expectation value (VEV), the lightest Majorana fermion works as a viable thermal freeze-out dark matter (DM) candidate, and the mass terms for active neutrinos are generated as a finite quantum correction at the 1-loop level. A key point of realising our Scotogenic structure is to introduce two types of Majorana fermions (heavy right-handed neutrinos) and inert Higgs doublets with opposite $Z_4$ parities. Since a large VEV for the singlet scalar is not so harmful in an appropriate realisation of the Higgs mechanism for the SM gauge symmetry, we can naturally realise a TeV-scale fermionic DM candidate, where constraints via direct detection experiments are less than those for sub-TeV DM. Our scenario involves the Higgs-portal DM interactions, which help the realisation of the correct DM relic abundance. Relying on the structure of the model, it is possible to find a natural partner for coannihilation. Our scenario can be investigated via the measurement of the Higgs trilinear self-coupling at the Large Hadron Collider. The simplest way to evade the domain-wall problem by adding a tiny soft $Z_2$ breaking term works, keeping a sufficient longevity of the decaying DM lifetime.
comment: 47 pages, 6 figures, 5 tables
☆ Search for $χ_{c1}\to π^{+}π^{-}η_c$ via $ψ(3686)\toγχ_{c1}$
Utilizing $(2712.4 \pm 14.3) \times 10^6$ $\psi(3686)$ events collected with the BESIII detector at the BEPCII collider, we search for the hadronic transition process $\chi_{c1} \to \pi^+\pi^-\eta_c$ following the decay $\psi(3686)\to \gamma \chi_{c1}$. No significant signal is observed, and an upper limit of $\mathcal{B}(\chi_{c1}\to\pi^+\pi^-\eta_c)$ is determined to be $3.1 times 10^{-4}$~at 90\% confidence level, which is one order of magnitude more stringent than the previous measurement.
☆ Search for a bound state of $Λ_{c}\barΣ_{c}$ near threshold
We search for a possible $\Lambda_{c} \bar{{\Sigma}}_{c}$ bound state, denoted as $H_{c}^{\pm}$, via the $ e^{+}e^{-} \to \pi^{+} \pi^{-} \Lambda_{c}^{+}\bar{\Lambda}_{c}^{-}$ process for the first time. This analysis utilizes 207.8 and 159.3 pb$^{-1}$ of $e^{+}e^{-}$ annihilation data at the center-of-mass energies of 4918.02 and 4950.93 MeV, respectively, collected with the BESIII detector at the BEPCII collider. No statistically significant signal is observed. The upper limits of the product of Born cross section and branching fraction $\sigma(e^{+}e^{-} \to \pi^{+} H_c^{-} + c.c.) \times \mathcal{B}(H_c^{-} \rightarrow \pi^{-}\Lambda_{c}^{+}\bar{\Lambda}_{c}^{-})$ at a 90\% confidence level are reported at each energy point and for various $H_{c}$ mass hypotheses (4715, 4720, 4725, 4730, and 4735 MeV/$c^{2}$) and widths (5, 10, or 20 MeV), with the upper limits ranging from 1.1 pb to 6.4 pb.
♻ ☆ Lifetime study of the ColdADC for the Deep Underground Neutrino Experiment
ColdADC is a custom ASIC digitizer implemented in 65 nm CMOS technology using specialized techniques for long-term reliability in cryogenic environments. ColdADC was developed for use in the DUNE Far Detector complex, which will consist of four liquid argon time projection chambers. Each contains 17 kilotons liquid argon as the target material in order to measure neutrino oscillations. Approximately 40,000 ColdADC ASICs will be installed for DUNE in the first two large detectors and will be operated at cryogenic temperatures during the experiment without replacement. The lifetime of the ColdADC is a critical parameter affecting the data quality and physics sensitivity of the experiment. A measurement of the lifetime of the ColdADC was carried out, and the results shown in this paper assure orders of magnitude longer lifetime of the ColdADC than the planned operation time of the detectors.
comment: Minor revision based on referee comments
♻ ☆ Non-parametric and continuous extraction of amplitudes in electroweak penguin decays
We introduce a novel approach to extract the decay-amplitudes in $B\to V(\to M_1M_2)\ell^+\ell^-$ processes, where $V$ represents a meson with either $J = 0$ (S-wave) or $J = 1$ (P-wave). This approach enables the decay-amplitudes across the dihadron and dilepton invariant-masses to be extracted from data in a model-independent and continuous way. To achieve this, the angular decay rate is expressed in a way that allows the application of the \sPlot technique to likelihood fits of the decay angles. We illustrate the abilities of this method on simulated $B^0\to K^+\pi^-\ell^+\ell^-$ data, containing both S- and P-wave contributions to the $K^+\pi^-$ system. We provide the weight functions and show that they allow the extraction of the absolute value of the P-wave and S-wave transversity amplitudes as a function of the dihadron and dimuon invariant-masses in a continuous, unbiased, and statistically-powerful way, while only relying on the angular distribution. As a consequence, the extracted amplitude shapes are model-independent -- up to the angular terms included in the fit -- and can be directly compared to theoretical predictions. A measurement using this technique can improve the sensitivity to potential new physics effects in $b\to s\ell^+\ell^-$ transitions, as well as the understanding of hadronic form factors, particularly in the S-wave system.
♻ ☆ Search for CP violation in e+e- -> psi(3770) -> DDbar via D -> KsPi0
Utilizing data sample of electron-positron collisions recorded with the BESIII detector at the center-of-mass energies of 3.773~GeV, corresponding to an integrated luminosity of 20.28~fb$^{-1}$, we report the first search for the CP forbidden process $e^+e^- \to \psi(3773) \to D^0\bar{D}^0 \to (K^0_S\pi^0)(K^0_S\pi^0)$. No significant signal is observed. We set the upper limit on the observed cross section to be 7.37~fb, and the upper limit on the joint branching fraction of the C-odd correlated neutral $D$ pair $\mathcal{B}[(D^0\bar{D}^0)_{\text{C-odd}} \to (K^0_S\pi^0)(K^0_S\pi^0)]$ to be $2.04 \times 10^{-6}$ at the 90\% confidence level.
comment: 9 pages, 4 figures
♻ ☆ Quantum Sensing Radiative Decays of Neutrinos and Dark Matter Particles
We explore a novel strategy for detecting the radiative decay of very weakly interacting particles by leveraging the extreme sensitivity of quantum devices, such as superconducting transmon qubits and trapped ion systems, to faint electromagnetic signals. By modeling the effective electric field induced by the decay photons, we evaluate the response of quantum sensors across two particle physics scenarios: the cosmic neutrino background and two-component dark matter. We assess the discovery potential of these devices and outline the parameter space accessible under current experimental capabilities. Our analysis demonstrates that quantum sensors can probe radiative decays of dark matter candidates using existing technology, while probing neutrino magnetic moments beyond current limits will require scalable quantum architectures with enhanced coherence.
comment: 12 pages, 4 figures, 2 tables, discussion added, conclusions unchanged
♻ ☆ Differentiating a HEP Analysis Pipeline within the Scikit-HEP Software Ecosystem
A first differentiable analysis pipeline is presented for an example high-energy physics (HEP) use case with publicly available collision data from the Compact Muon Solenoid detector at the Large Hadron Collider. The pipeline combines tools from the Scikit-HEP ecosystem with JAX. The study is based on an existing search for a hypothetical particle, the $Z^{\prime}$ boson, and uses a realistic, yet simplified, statistical model. The gradient-based optimization techniques employed in this work can advance HEP workflows by enabling end-to-end tuning of analysis parameters, improving both computational scalability and overall sensitivity. The challenges of adopting such techniques in HEP workflows are highlighted, along with practical mitigation to those challenges. This framework results in a significant improvement in expected statistical significance compared to a baseline analysis by fine-tuning $\mathcal{O}(10^3)$ parameters in the pipeline. Perspectives on future applications and recommendations for broader engagement with differentiable techniques in the field are also outlined.
Searches for direct slepton production in the compressed-mass corridor in $\sqrt{s}=13$ TeV $pp$ collisions with the ATLAS detector
This paper presents searches for the direct pair production of charged light-flavour sleptons, each decaying into a stable neutralino and an associated Standard Model lepton. The analyses focus on the challenging ``corridor'' region, where the mass difference, $\Delta m$, between the slepton ($\tilde{e}$ or $\tilde{\mu}$) and the lightest neutralino ($\tilde{\chi}^{0}_{1}$) is less or similar to the mass of the $W$ boson, $m(W)$, with the aim to close a persistent gap in sensitivity to models with $\Delta m \lesssim m(W)$. Events are required to contain a high-energy jet, significant missing transverse momentum, and two same-flavour opposite-sign leptons ($e$ or $\mu$). The analysis uses $pp$ collision data at $\sqrt{s} = 13$ TeV recorded by the ATLAS detector, corresponding to an integrated luminosity of 140 fb$^{-1}$. Several kinematic selections are applied, including a set of boosted decision trees. These are each optimised for different $\Delta m$ to provide expected sensitivity for the first time across the full $\Delta m$ corridor. The results are generally consistent with the Standard Model, with the most significant deviations observed with a local significance of 2.0 $\sigma$ in the selectron search, and 2.4 $\sigma$ in the smuon search. While these deviations weaken the observed exclusion reach in some parts of the signal parameter space, the previously present sensitivity gap to this corridor is largely reduced. Constraints at the 95% confidence level are set on simplified models of selectron and smuon pair production, where selectrons (smuons) with masses up to 300 (350) GeV can be excluded for $\Delta m$ between 2 GeV and 100 GeV.
comment: 66 pages in total, author list starting page 49, 14 figures, 11 tables, published in JHEP. All figures including auxiliary figures are available at https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/PAPERS/HMBS-2024-64/
High Energy Physics - Phenomenology
☆ Quantifying fluctuation signatures of the QCD critical point using maximum entropy freeze-out
A key question about the QCD phase diagram is whether there is a critical point somewhere on the boundary between the hadronic and quark-gluon plasma phases, and if so where. Heavy-ion collisions offer a unique opportunity to search for signatures of such a critical point by analyzing event-by-event fluctuations in particle multiplicities. To draw meaningful conclusions from experimental data, a theoretical framework is needed to link QCD thermodynamics with the particle spectra and correlations observed in detectors. The Equation of State (EoS) of QCD near a critical point can be related to the universal Gibbs free energy of the 3D Ising model using four currently unknown non-universal mapping parameters whose values are determined by the microscopic details of QCD. We utilize the maximum entropy approach to freeze-out the fluctuations in order to make estimates for factorial cumulants of proton multiplicities, assuming thermal equilibrium, for a family of EoS with a 3D Ising-like critical point, varying the microscopic inputs that determine the strength and structure of the critical features. We quantify the effect of the non-universal mapping parameters, and the distance between the critical point and the freeze-out curve, on the factorial cumulants of proton multiplicities.
comment: 57 pages, 9 figures and 4 appendices with 4 additional figures
☆ The Quark Jet Function for $k_T$-like Variables in NNLO QCD
The precise description of jet processes requires observables capable of efficiently capturing the dynamics of the energy flow in hadronic final states. We consider a class of tranverse-momentum like resolution variables that smoothly describe the $n+1$ to $n$ jet transition in multi-jet processes. We discuss a general method for the computation of the corresponding quark jet function at next-to-next-to-leading order in perturbative QCD. Rapidity divergences are regulated by using a time-like auxiliary vector. We present explicit results for a variant of $y_{23}$ in the $E$-scheme and in the WTA scheme.
comment: 20 pages, 0 figures
☆ Classification of color superconductivity by one-gluon exchange helicity amplitudes and renormalization group equations
Quark matter at high baryon density exhibits diverse pairing patterns classified by color, flavor, and angular momentum quantum numbers. We compute one-gluon exchange (OGE) helicity amplitudes and introduce a nonrelativistic classification of the pairing channel, justified by the channel decomposition in a Lorentz-noninvariant medium and the decoupling of renormalization group flows at leading order. We find the new attractive channel in OGE; the medium effects can render the vacuum-repulsive color $\boldsymbol{6}$ channel attractive in the spin-triplet sector. For color $\boldsymbol{\bar{3}}$ with antisymmetric flavor, the dominant pairing is ${}^1S_0$, while for symmetric flavor the same-helicity $^1P_1$ prevails over $^3S_1$, revising the conventional single-flavor picture. With a mismatch of the Fermi momenta, $^1S_0$ channel, leading to color-flavor locked or two-flavor color superconductor, remains most stable when the separation is small, and the color-spin locked pairing becomes favored as the mismatch gets large. We suggest there are possible quark-hadron continuity in certain cases as expected in the literature.
comment: 26 pages, 6 figures
☆ The Magnetic Origin of Primordial Black Holes: A Viable Dark Matter Scenario
Primordial Black Holes (PBHs) are compelling candidates for explaining the present-day relic abundance of cold dark matter (CDM), yet their formation typically requires finely tuned early-universe dynamics. In this work, we propose a novel PBH formation mechanism within a well-established magnetogenesis framework. This scenario simultaneously accounts for the large-scale magnetic fields observed today and generates an enhanced curvature power spectrum at intermediate scales, leading to PBH formation with masses that can survive until the present epoch. We identify a narrow reheating temperature range, $10^5\,\mathrm{GeV} \leq T_{re} \leq 3\times 10^5\,\mathrm{GeV}$, within which the resulting PBHs can constitute the entirety of the observed CDM abundance. Furthermore, our model predicts a stochastic gravitational wave (GW) background as a byproduct of the PBH formation process. Remarkably, the predicted GW signal lies within the sensitivity reach of upcoming space-based interferometers, such as the LISA, DECIGO, or SKA mission, offering a direct observational probe of this PBH generation mechanism.
comment: 12 pages, 4 figures, Comments and suggestions are always welcome!
☆ Single pion-production and pion propagation in Achilles
We extend the applicability of Achilles (A CHIcagoLand Lepton Event Simulator) by incorporating the single-pion production mechanism in a fully exclusive fashion. The electroweak interaction vertex is modeled by combining the state-of-the-art Dynamical Coupled-Channels approach with realistic hole spectral functions, which account for correlations in both the initial target state and the residual spectator system. Final-state interactions are treated using a semi-classical intranuclear cascade that leverages nuclear configurations to determine the correlated spatial distribution of protons and neutrons. The meson-baryon scattering amplitudes used in the cascade are computed within the Dynamical Coupled-Channels framework, consistent with the electroweak vertex. To model pion absorption, we employ the optical potential approach of Oset and Salcedo. As an alternative approach, we explicitly model the production and propagation of resonances which mediate pion-nucleon scattering and pion absorption. We validate out approach against pion-nucleon and pion-nucleus scattering data, and present comparisons with electron- and neutrino-nucleus measurements from e4$\nu$, T2K, MINER$\nu$A, and MicroBooNE.
comment: 24 pages, 17 figures
☆ Hits to Higgs: Reconstruction-Free Higgs Classification from Raw LHC Detector Data Using Higgsformers
We present a comparative study of Higgs event classification at the Large Hadron Collider that bypasses the traditional reconstruction chain. As a benchmark, we focus on distinguishing \( t\bar{t}H \) from \( t\bar{t} \) events with \( H \to b\bar{b} \), a particularly challenging task due to their similar final-state topologies. Our pipeline begins with event generation in \texttt{Pythia8}, fast simulation with ACTS/Fatras, and classification directly from raw detector hits. We show for the first time that a transformer model originally developed for inner tracker hit-to-track assignment can be retrained to classify Higgs events directly from raw hits. For comparison, we reconstruct the same events with \texttt{Delphes} and train object-based classifiers, including multilayer perceptrons and the Particle Transformer. We evaluate both approaches under varying dataset sizes and pileup levels. Despite operating solely on inner tracker hits, Higgsformer achieves strong performance, reaching an AUC of 0.792.
comment: 10 pages, 5 figures, 4 tables
☆ Triple top baryon $Ω_{ttt}$
Recently, the toponium, a bound state of a top-antitop quark pair, was discovered by the CMS and ATLAS collaborations. This observation has sparked renewed interest in the study of heavy-quark bound states, particularly the hypothetical triple top baryon ($\Omega_{ttt}$). We present a theoretical discussion on the mass, production, and decay of this exotic particle within the Standard Model. Using a variational method with an effective potential, we estimate the mass and binding energy of the $\Omega_{ttt}$. We further analyze its production cross-section at future high-energy colliders and its dominant weak decay channels. The possibility of experimental observation is also considered, and we conclude that its observation will be extremely challenging in the near future due to an anticipated tiny production cross-section and complex final states. However, we propose that the search for this particle would represent a new frontier in probing the strong force at unprecedented mass scales.
comment: 9 pages
☆ Quintessential dark energy crossing the phantom divide
Motivated by recent results from the DESI collaboration, we explore two classes of quintessence models that can give rise to crossing of the dark energy equation of state through the ``phantom divide'' $w=-1$. These are models with Lagrangians that involve higher powers of the kinetic energy $\dot\phi^2$, or where the dark matter (DM) mass is a function of $\phi$. Both have similar features with respect to the reconstructed redshift-dependent $w(z)$: moderate tuning of parameters is required to achieve the desired shape, and it is difficult or impossible for $w(z)$ to continue evolving smoothly as $z$ becomes large. Nevertheless, they give a strong improvement over $\Lambda$CDM in fitting the data. We point out that models of coupled dark matter and dark energy that cross the phantom divide are under pressure from constraints on long-range DM forces. They rule out the simplest renormalizable coupling of scalar DM to quintessence, but leave the fermionic case marginally allowed, while exponentially coupled models are safe from current constraints.
comment: 10 pages, 6 figures
☆ SN2023syz and SN2025cbj: Two Type IIn Supernovae Associated with IceCube High-energy Neutrinos
Type IIn supernovae (SNe IIn) are a subclass of core-collapse SNe in which strong interactions occur between the ejecta and dense circumstellar material, creating ideal conditions for the production of high-energy neutrinos. This makes them promising candidate sources of neutrinos. In this work, we conduct an association study between 163 SNe IIn observed by the Zwicky Transient Facility and 138 neutrino alert events detected by the IceCube neutrino observatory. After excluding alerts with poor localization, we find two SNe that are spatiotemporally coincident with neutrino events. IC231027A and IC250421A coincide with the positions of SN2023syz and SN2025cbj, respectively, within their localization uncertainties, and the neutrino arrival times are delayed by 38 days and 61 days relative to the discovery times of the corresponding SNe. Using Monte Carlo simulations, we estimate that the probability of such two coincidences occurring by chance in our sample is $p \sim 0.67\%$, suggesting a high likelihood that they arise from genuine associations, though the result is not yet statistical significant. Furthermore, model calculations show that the expected numbers of neutrino events from these SNe IIn could be consistent with the actual observations. Our study provides possible evidence that interacting SNe may be potential neutrino-emitting sources.
comment: 10 pages, 3 figures
☆ A simple algorithm for polarized parton evolution
We present an algorithm to include the correlation between the production and decay planes of gluons in a parton-shower simulation. The technique is based on identifying the charge currents responsible for the creation and annihilation of the vector field. It is applicable in both the hard-collinear and the soft wide-angle region. As a function of the number of particles, the algorithm scales linear in computing time and memory. We demonstrate agreement with fixed-order perturbative calculations in the relevant kinematical limits, and present a new observable that can be used to probe correlations beyond current-current interactions.
comment: 14 pages, 8 figures
☆ $T_{bc\bar s}$ states in the process $Υ\to D^{-} \bar B^{0} D_s^{+}$
We perform a theoretical study of the decay process $\Upsilon \to D^{-} \bar{B}^{0} D_s^{+}$ in search of the doubly heavy tetraquark states $T_{bc\bar{s}}$ with quark content $bc\bar{s}\bar{d}$. These $T_{bc\bar{s}}$ states are assumed to be dynamically generated molecular states from the S-wave interactions between $\bar{B}_s^{(*)0} D^{(*)+}$ and $\bar{B}^{(*)0} D_s^{(*)+}$ meson pairs. Based on the total angular momentum and the type of the constituent mesons (pseudoscalars $P$ or vectors $V$), they are labeled as $T_{bc\bar{s}}^{0, PP}$, $T_{bc\bar{s}}^{0, VV}$, and $T_{bc\bar{s}}^{2, VV}$, respectively. The $\bar{B}^{0} D_s^{+}$ invariant mass distribution for this decay is calculated using current algebra, incorporating contributions from $T_{bc\bar{s}}$ states arising from final-state interactions. Our results reveal a clear peak structure in the $7415 - 7425$ MeV region, which is attributed to the $T_{bc\bar{s}}^{2, VV}$ state. Additionally, a distinct dip structure appears near the $\bar{B}^{*0} D_s^{*+}$ threshold, characteristic of the $T_{bc\bar{s}}^{2, VV}$ as a hadronic molecular state. A near-threshold enhancement associated with the $T_{bc\bar{s}}^{0, PP}$ state and a dip arising from the $T_{bc\bar{s}}^{0,\,VV}$ state are also identified, though the manifestation of these features depends sensitively on model parameter fine-tuning. Therefore, with increased experimental statistics, the decay channel $\Upsilon \to D^{-} \bar{B}^{0} D_s^{+}$ offers a promising avenue for discovering and characterizing the $T_{bc\bar{s}}$ states.
comment: 25 pages, 5 figures, 2 tables, suggestions and comments welcome
☆ Perturbative and nonperturbative aspects of neutrino oscillations in quantum field theory
In this work, we present a comprehensive and pedagogical review of two quantum field theoretical approaches to neutrino flavor mixing and oscillations: the non-perturbative flavor Fock space formalism and the perturbative interaction picture framework.
comment: 60 pages, 2 figures. Invited Review for INTERNATIONAL JOURNAL OF MODERN PHYSICS A
♻ ☆ The Lifespan of our Universe
The Dark Energy Survey (DES) and the Dark Energy Spectroscopic Instrument (DESI) measurements claim that the dark energy equation of state $w \ne -1$. This observation can be explained by the axion Dark Energy (aDE) model of an ultralight axion plus a cosmological constant $\Lambda$. Despite a relatively large degeneracy, there is a high probability that $\Lambda <0$. This negative $\Lambda$ leads the universe to end in a big crunch. Using the best-fit values of the model as a benchmark, we find the lifespan of our universe to be 33 billion years.
comment: 11 pages, 4 figures. Accepted by JCAP
♻ ☆ A Semi-Analytic model for Effects of Fuzzy Dark Matter Granule Perturbations on Orbital Motion
In fuzzy dark matter scenarios, the quantum wave nature of ultralight axion-like particles generates stochastic density fluctuations inside dark matter halos. These fluctuations, known as granules, perturb the orbits of subhalos and other orbiting bodies. While previous studies have simulated these effects using N-body techniques or modeled them statistically using diffusion approximations, we propose an alternative framework based on representing the perturbations as a Fourier series with random coefficients, which can be applied to individual orbits, not just populations. We extend the model to finite-size subhalos, identifying a critical length scale below which subhalos behave as point-mass particles. In contrast, larger subhalos exhibit suppressed perturbations from granules due to their extended mass profiles. Using FDM-Simulator, we validate our finite-size model by isolating granule accelerations and confirming their statistical effects on subhalo dynamics.
comment: 17 pages, 13 figures, accepted to MNRAS
♻ ☆ Non-parametric and continuous extraction of amplitudes in electroweak penguin decays
We introduce a novel approach to extract the decay-amplitudes in $B\to V(\to M_1M_2)\ell^+\ell^-$ processes, where $V$ represents a meson with either $J = 0$ (S-wave) or $J = 1$ (P-wave). This approach enables the decay-amplitudes across the dihadron and dilepton invariant-masses to be extracted from data in a model-independent and continuous way. To achieve this, the angular decay rate is expressed in a way that allows the application of the \sPlot technique to likelihood fits of the decay angles. We illustrate the abilities of this method on simulated $B^0\to K^+\pi^-\ell^+\ell^-$ data, containing both S- and P-wave contributions to the $K^+\pi^-$ system. We provide the weight functions and show that they allow the extraction of the absolute value of the P-wave and S-wave transversity amplitudes as a function of the dihadron and dimuon invariant-masses in a continuous, unbiased, and statistically-powerful way, while only relying on the angular distribution. As a consequence, the extracted amplitude shapes are model-independent -- up to the angular terms included in the fit -- and can be directly compared to theoretical predictions. A measurement using this technique can improve the sensitivity to potential new physics effects in $b\to s\ell^+\ell^-$ transitions, as well as the understanding of hadronic form factors, particularly in the S-wave system.
♻ ☆ Probing Dark Matter-Electron Interactions in the Cosmic Microwave Background Radiation
In this article, we consider Dark Matter (DM) interactions and study the same in the light of the Cosmic Microwave Background Radiation (CMBR) data. In particular, we focus on the DM-electron interactions. Assuming that such interactions are mediated by rather heavy mediators, we consider effective operators describing the relevant interaction terms in the lagrangian. The presence of such interaction terms leads to both DM annihilation and DM-electron scattering (drag). We focus on operators which lead to velocity-independent DM annihilation and DM-electron scattering cross-sections. Using the CMBR data, we study the implications of both of these effects, imposing constraints on the respective effective operators. This analysis underscores the importance of taking both scattering and annihilation processes into consideration in the study of DM interactions. We observe that the constraints on the DM annihilation and scattering cross-sections can change, up to about 13\% and 12\%, respectively, for the benchmark scenarios we considered, depending on the mass of DM, as compared to the scenario where only DM annihilation is accounted for.
comment: Accepted in JCAP
♻ ☆ Neutron Stars in Causal Scalar-Tensor Theories
We study static, spherically symmetric neutron stars in a class of scalar-tensor theories with non-canonical kinetic terms (K-essence) obeying all causality and hyperbolicity conditions. These models have non-trivial dynamics that lead to a type of anti-screening of the scalar. They lead to small corrections in the solar system due to a small coupling, but can lead to large corrections in regimes of high densities, especially neutron stars. We solve the modified Tolman-Oppenheimer-Volkoff equations numerically using realistic equations of state (SLy4, WFF1, MS1, MPA1). For a given central density, we find that two distinct configurations may exist, forming two separate branches of solutions. We find that above a certain critical central density solutions with the correct asymptotic behavior at spatial infinity cannot be obtained. We obtain precise predictions for the mass-radius relation for neutron stars for different values of the parameters in the model and we compare to data.
comment: 12 pages in double column format, 8 figures. V2: Some further results. Accepted for publication in Phys. Rev. D
♻ ☆ Production of $K^+K^-$ Pairs Through Decay of $φ$ Meson
We develop a theoretical framework for the production of $K^+K^-$ pairs through the decay of $\phi$ mesons produced from a thermal background, based on the Nambu-Jona-Lasinio (NJL) model. The differential production rate of $K^+K^-$ is related to the self-energy of the $\phi$ meson, incorporating the contributions of the quark loop at leading order and the kaon loop at next-to-leading order in the $1/N_c$ expansion. We numerically evaluate the invariant mass spectrum of the $K^+K^-$ pair both in vacuum and at finite temperature. The inclusion of the kaon loop results in a finite width of the spectrum, improving agreement with experimental data. We also investigate the spin alignment of the $\phi$ meson induced by its motion relative to the thermal background. In the NJL model with only chiral condensates, we find that deviations from the unpolarized limit of 1/3 are negligible.
comment: 15 pages, 11 figures
♻ ☆ Kination-like Era Driven by the Effective Inflaton/Higgs Potential
Based on the minimal $U(1)_X$ extended Standard Model, we explore cosmic inflation where the $U(1)_X$ Higgs field serves as the inflaton. We demonstrate that a stiff era with an equation of state $w > 1/3$ can emerge during the inflaton's oscillatory phase after inflation, driven by the Coleman-Weinberg potential of the inflaton, arising due to radiative corrections. This leads to significant modulation and enhancement of the irreducible stochastic gravitational wave (GW) background from inflation, deviating from the conventional scale-invariant spectrum. Such a distinct GW spectrum could be detectable by next-generation GW interferometer missions, such as U-DECIGO. In our framework, the GW spectrum depends on the $U(1)_X$ gauge coupling and the mass of the $U(1)_X$ gauge boson ($Z^\prime$). As a result, future GW observations and $Z^\prime$ boson resonance searches at high-energy collider experiments are complementary to one another.
comment: Accepted in Physical Review D
♻ ☆ Quantum Sensing Radiative Decays of Neutrinos and Dark Matter Particles
We explore a novel strategy for detecting the radiative decay of very weakly interacting particles by leveraging the extreme sensitivity of quantum devices, such as superconducting transmon qubits and trapped ion systems, to faint electromagnetic signals. By modeling the effective electric field induced by the decay photons, we evaluate the response of quantum sensors across two particle physics scenarios: the cosmic neutrino background and two-component dark matter. We assess the discovery potential of these devices and outline the parameter space accessible under current experimental capabilities. Our analysis demonstrates that quantum sensors can probe radiative decays of dark matter candidates using existing technology, while probing neutrino magnetic moments beyond current limits will require scalable quantum architectures with enhanced coherence.
comment: 12 pages, 4 figures, 2 tables, discussion added, conclusions unchanged
Machine Learning - Statistics
☆ Understanding Tool-Integrated Reasoning
We study why Tool-Integrated Reasoning (TIR) makes Large Language Models (LLMs) more capable. While LLMs integrated with tools like Python code interpreters show great promise, a principled theory explaining why this paradigm is effective has been missing. This work provides the first formal proof that TIR fundamentally expands an LLM's capabilities. We demonstrate that tools enable a strict expansion of the model's empirical and feasible support, breaking the capability ceiling of pure-text models by unlocking problem-solving strategies that are otherwise impossible or intractably verbose. To guide model behavior without compromising training stability and performance, we also introduce Advantage Shaping Policy Optimization (ASPO), a novel algorithm that directly modifies the advantage function to guide the policy behavior. We conduct comprehensive experiments on challenging mathematical benchmarks, leveraging a Python interpreter as the external tool. Our results show that the TIR model decisively outperforms its pure-text counterpart on the pass@k metric. Crucially, this advantage is not confined to computationally-intensive problems but extends to those requiring significant abstract insight. We further identify the emergent cognitive patterns that illustrate how models learn to think with tools. Finally, we report improved tool usage behavior with early code invocation and much more interactive turns with ASPO. Overall, our work provides the first principled explanation for TIR's success, shifting the focus from the mere fact that tools work to why and how they enable more powerful reasoning.
☆ Echoes of the past: A unified perspective on fading memory and echo states
Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data. A fundamental property of RNNs is their ability to create reliable input/output responses, often linked to how the network handles its memory of the information it processed. Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory. Although these notions are often used interchangeably, their precise relationships remain unclear. This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results. By clarifying the relationships between these concepts, this research contributes to a deeper understanding of RNNs and their temporal information processing capabilities.
☆ Composition and Alignment of Diffusion Models using Constrained Learning
Diffusion models have become prevalent in generative modeling due to their ability to sample from complex distributions. To improve the quality of generated samples and their compliance with user requirements, two commonly used methods are: (i) Alignment, which involves fine-tuning a diffusion model to align it with a reward; and (ii) Composition, which combines several pre-trained diffusion models, each emphasizing a desirable attribute in the generated outputs. However, trade-offs often arise when optimizing for multiple rewards or combining multiple models, as they can often represent competing properties. Existing methods cannot guarantee that the resulting model faithfully generates samples with all the desired properties. To address this gap, we propose a constrained optimization framework that unifies alignment and composition of diffusion models by enforcing that the aligned model satisfies reward constraints and/or remains close to (potentially multiple) pre-trained models. We provide a theoretical characterization of the solutions to the constrained alignment and composition problems and develop a Lagrangian-based primal-dual training algorithm to approximate these solutions. Empirically, we demonstrate the effectiveness and merits of our proposed approach in image generation, applying it to alignment and composition, and show that our aligned or composed model satisfies constraints effectively, and improves on the equally-weighted approach. Our implementation can be found at https://github.com/shervinkhalafi/constrained_comp_align.
☆ The GINN framework: a stochastic QED correspondence for stability and chaos in deep neural networks
The development of a Euclidean stochastic field-theoretic approach that maps deep neural networks (DNNs) to quantum electrodynamics (QED) with local U(1) symmetry is presented. Neural activations and weights are represented by fermionic matter and gauge fields, with a fictitious Langevin time enabling covariant gauge fixing. This mapping identifies the gauge parameter with kernel design choices in wide DNNs, relating stability thresholds to gauge-dependent amplification factors. Finite-width fluctuations correspond to loop corrections in QED. As a proof of concept, we validate the theoretical predictions through numerical simulations of standard multilayer perceptrons and, in parallel, propose a gauge-invariant neural network (GINN) implementation using magnitude--phase parameterization of weights. Finally, a double-copy replica approach is shown to unify the computation of the largest Lyapunov exponent in stochastic QED and wide DNNs.
comment: 18 pages, 3 figures, 1 table
☆ Sparse minimum Redundancy Maximum Relevance for feature selection
We propose a feature screening method that integrates both feature-feature and feature-target relationships. Inactive features are identified via a penalized minimum Redundancy Maximum Relevance (mRMR) procedure, which is the continuous version of the classic mRMR penalized by a non-convex regularizer, and where the parameters estimated as zero coefficients represent the set of inactive features. We establish the conditions under which zero coefficients are correctly identified to guarantee accurate recovery of inactive features. We introduce a multi-stage procedure based on the knockoff filter enabling the penalized mRMR to discard inactive features while controlling the false discovery rate (FDR). Our method performs comparably to HSIC-LASSO but is more conservative in the number of selected features. It only requires setting an FDR threshold, rather than specifying the number of features to retain. The effectiveness of the method is illustrated through simulations and real-world datasets. The code to reproduce this work is available on the following GitHub: https://github.com/PeterJackNaylor/SmRMR.
☆ Federated Learning with Heterogeneous and Private Label Sets
Although common in real-world applications, heterogeneous client label sets are rarely investigated in federated learning (FL). Furthermore, in the cases they are, clients are assumed to be willing to share their entire label sets with other clients. Federated learning with private label sets, shared only with the central server, adds further constraints on learning algorithms and is, in general, a more difficult problem to solve. In this work, we study the effects of label set heterogeneity on model performance, comparing the public and private label settings -- when the union of label sets in the federation is known to clients and when it is not. We apply classical methods for the classifier combination problem to FL using centralized tuning, adapt common FL methods to the private label set setting, and discuss the justification of both approaches under practical assumptions. Our experiments show that reducing the number of labels available to each client harms the performance of all methods substantially. Centralized tuning of client models for representational alignment can help remedy this, but often at the cost of higher variance. Throughout, our proposed adaptations of standard FL methods perform well, showing similar performance in the private label setting as the standard methods achieve in the public setting. This shows that clients can enjoy increased privacy at little cost to model accuracy.
☆ Efficient Best-of-Both-Worlds Algorithms for Contextual Combinatorial Semi-Bandits
We introduce the first best-of-both-worlds algorithm for contextual combinatorial semi-bandits that simultaneously guarantees $\widetilde{\mathcal{O}}(\sqrt{T})$ regret in the adversarial regime and $\widetilde{\mathcal{O}}(\ln T)$ regret in the corrupted stochastic regime. Our approach builds on the Follow-the-Regularized-Leader (FTRL) framework equipped with a Shannon entropy regularizer, yielding a flexible method that admits efficient implementations. Beyond regret bounds, we tackle the practical bottleneck in FTRL (or, equivalently, Online Stochastic Mirror Descent) arising from the high-dimensional projection step encountered in each round of interaction. By leveraging the Karush-Kuhn-Tucker conditions, we transform the $K$-dimensional convex projection problem into a single-variable root-finding problem, dramatically accelerating each round. Empirical evaluations demonstrate that this combined strategy not only attains the attractive regret bounds of best-of-both-worlds algorithms but also delivers substantial per-round speed-ups, making it well-suited for large-scale, real-time applications.
☆ Lightweight posterior construction for gravitational-wave catalogs with the Kolmogorov-Arnold network
Neural density estimation has seen widespread applications in the gravitational-wave (GW) data analysis, which enables real-time parameter estimation for compact binary coalescences and enhances rapid inference for subsequent analysis such as population inference. In this work, we explore the application of using the Kolmogorov-Arnold network (KAN) to construct efficient and interpretable neural density estimators for lightweight posterior construction of GW catalogs. By replacing conventional activation functions with learnable splines, KAN achieves superior interpretability, higher accuracy, and greater parameter efficiency on related scientific tasks. Leveraging this feature, we propose a KAN-based neural density estimator, which ingests megabyte-scale GW posterior samples and compresses them into model weights of tens of kilobytes. Subsequently, analytic expressions requiring only several kilobytes can be further distilled from these neural network weights with minimal accuracy trade-off. In practice, GW posterior samples with fidelity can be regenerated rapidly using the model weights or analytic expressions for subsequent analysis. Our lightweight posterior construction strategy is expected to facilitate user-level data storage and transmission, paving a path for efficient analysis of numerous GW events in the next-generation GW detectors.
comment: 14 pages, 9 figures
☆ Revisiting Follow-the-Perturbed-Leader with Unbounded Perturbations in Bandit Problems
Follow-the-Regularized-Leader (FTRL) policies have achieved Best-of-Both-Worlds (BOBW) results in various settings through hybrid regularizers, whereas analogous results for Follow-the-Perturbed-Leader (FTPL) remain limited due to inherent analytical challenges. To advance the analytical foundations of FTPL, we revisit classical FTRL-FTPL duality for unbounded perturbations and establish BOBW results for FTPL under a broad family of asymmetric unbounded Fr\'echet-type perturbations, including hybrid perturbations combining Gumbel-type and Fr\'echet-type tails. These results not only extend the BOBW results of FTPL but also offer new insights into designing alternative FTPL policies competitive with hybrid regularization approaches. Motivated by earlier observations in two-armed bandits, we further investigate the connection between the $1/2$-Tsallis entropy and a Fr\'echet-type perturbation. Our numerical observations suggest that it corresponds to a symmetric Fr\'echet-type perturbation, and based on this, we establish the first BOBW guarantee for symmetric unbounded perturbations in the two-armed setting. In contrast, in general multi-armed bandits, we find an instance in which symmetric Fr\'echet-type perturbations violate the key condition for standard BOBW analysis, which is a problem not observed with asymmetric or nonnegative Fr\'echet-type perturbations. Although this example does not rule out alternative analyses achieving BOBW results, it suggests the limitations of directly applying the relationship observed in two-armed cases to the general case and thus emphasizes the need for further investigation to fully understand the behavior of FTPL in broader settings.
comment: Preprint
♻ ☆ Learning the Simplest Neural ODE
Since the advent of the ``Neural Ordinary Differential Equation (Neural ODE)'' paper, learning ODEs with deep learning has been applied to system identification, time-series forecasting, and related areas. Exploiting the diffeomorphic nature of ODE solution maps, neural ODEs has also enabled their use in generative modeling. Despite the rich potential to incorporate various kinds of physical information, training Neural ODEs remains challenging in practice. This study demonstrates, through the simplest one-dimensional linear model, why training Neural ODEs is difficult. We then propose a new stabilization method and provide an analytical convergence analysis. The insights and techniques presented here serve as a concise tutorial for researchers beginning work on Neural ODEs.
comment: Accepted SICE FES 2025
♻ ☆ Comparison of Data Reduction Criteria for Online Gaussian Processes
Gaussian Processes (GPs) are widely used for regression and system identification due to their flexibility and ability to quantify uncertainty. However, their computational complexity limits their applicability to small datasets. Moreover in a streaming scenario, more and more datapoints accumulate which is intractable even for Sparse GPs. Online GPs aim to alleviate this problem by e.g. defining a maximum budget of datapoints and removing redundant datapoints. This work provides a unified comparison of several reduction criteria, analyzing both their computational complexity and reduction behavior. The criteria are evaluated on benchmark functions and real-world datasets, including dynamic system identification tasks. Additionally, acceptance criteria are proposed to further filter out redundant datapoints. This work yields practical guidelines for choosing a suitable criterion for an online GP algorithm.
comment: 24 pages
♻ ☆ Subjective Perspectives within Learned Representations Predict High-Impact Innovation
Existing studies of innovation emphasize the power of social structures to shape innovation capacity. Emerging machine learning approaches, however, enable us to model innovators' personal perspectives and interpersonal innovation opportunities as a function of their prior experience. We theorize and then quantify subjective perspectives and their interaction based on innovator positions within the geometric space of concepts inscribed by dynamic machine-learned language representations. Using data on millions of scientists, inventors, screenplay writers, entrepreneurs, and Wikipedia contributors across their respective creative domains, here we show that measured subjective perspectives predict which ideas individuals and groups will creatively attend to and successfully combine in the future. Across all cases and time periods we examine, when perspective diversity is decomposed as the difference between collaborators' perspectives on their creation, and background diversity as the difference between their experiences, the former consistently anticipates creative achievement while the latter portends its opposite. We analyze a natural experiment and simulate creative collaborations between AI agents designed with various perspective and background diversity, which support our observational findings. We explore mechanisms underlying these findings and identify how successful collaborators leverage common language to weave together diverse experiences obtained through trajectories of prior work. These perspectives converge and provoke one another to innovate. We examine the significance of these findings for team formation and research policy.
comment: 123 pages, 23 figures
♻ ☆ Curvature Learning for Generalization of Hyperbolic Neural Networks
Hyperbolic neural networks (HNNs) have demonstrated notable efficacy in representing real-world data with hierarchical structures via exploiting the geometric properties of hyperbolic spaces characterized by negative curvatures. Curvature plays a crucial role in optimizing HNNs. Inappropriate curvatures may cause HNNs to converge to suboptimal parameters, degrading overall performance. So far, the theoretical foundation of the effect of curvatures on HNNs has not been developed. In this paper, we derive a PAC-Bayesian generalization bound of HNNs, highlighting the role of curvatures in the generalization of HNNs via their effect on the smoothness of the loss landscape. Driven by the derived bound, we propose a sharpness-aware curvature learning method to smooth the loss landscape, thereby improving the generalization of HNNs. In our method, we design a scope sharpness measure for curvatures, which is minimized through a bi-level optimization process. Then, we introduce an implicit differentiation algorithm that efficiently solves the bi-level optimization by approximating gradients of curvatures. We present the approximation error and convergence analyses of the proposed method, showing that the approximation error is upper-bounded, and the proposed method can converge by bounding gradients of HNNs. Experiments on four settings: classification, learning from long-tailed data, learning from noisy data, and few-shot learning show that our method can improve the performance of HNNs.
comment: Accepted by International Journal of Computer Vision (IJCV)
Machine Learning - Computer Science
☆ Predicting the Order of Upcoming Tokens Improves Language Modeling
Multi-Token Prediction (MTP) has been proposed as an auxiliary objective to improve next-token prediction (NTP) in language model training but shows inconsistent improvements, underperforming in standard NLP benchmarks. We argue that MTP's exact future token prediction is too difficult as an auxiliary loss. Instead, we propose Token Order Prediction (TOP), which trains models to order upcoming tokens by their proximity using a learning-to-rank loss. TOP requires only a single additional unembedding layer compared to MTP's multiple transformer layers. We pretrain models of 340M, 1.8B, and 7B parameters using NTP, MTP, and TOP objectives. Results on eight standard NLP benchmarks show that TOP overall outperforms both NTP and MTP even at scale. Our code is available at https://github.com/zaydzuhri/token-order-prediction
☆ Understanding Tool-Integrated Reasoning
We study why Tool-Integrated Reasoning (TIR) makes Large Language Models (LLMs) more capable. While LLMs integrated with tools like Python code interpreters show great promise, a principled theory explaining why this paradigm is effective has been missing. This work provides the first formal proof that TIR fundamentally expands an LLM's capabilities. We demonstrate that tools enable a strict expansion of the model's empirical and feasible support, breaking the capability ceiling of pure-text models by unlocking problem-solving strategies that are otherwise impossible or intractably verbose. To guide model behavior without compromising training stability and performance, we also introduce Advantage Shaping Policy Optimization (ASPO), a novel algorithm that directly modifies the advantage function to guide the policy behavior. We conduct comprehensive experiments on challenging mathematical benchmarks, leveraging a Python interpreter as the external tool. Our results show that the TIR model decisively outperforms its pure-text counterpart on the pass@k metric. Crucially, this advantage is not confined to computationally-intensive problems but extends to those requiring significant abstract insight. We further identify the emergent cognitive patterns that illustrate how models learn to think with tools. Finally, we report improved tool usage behavior with early code invocation and much more interactive turns with ASPO. Overall, our work provides the first principled explanation for TIR's success, shifting the focus from the mere fact that tools work to why and how they enable more powerful reasoning.
☆ Planning-Query-Guided Model Generation for Model-Based Deformable Object Manipulation
Efficient planning in high-dimensional spaces, such as those involving deformable objects, requires computationally tractable yet sufficiently expressive dynamics models. This paper introduces a method that automatically generates task-specific, spatially adaptive dynamics models by learning which regions of the object require high-resolution modeling to achieve good task performance for a given planning query. Task performance depends on the complex interplay between the dynamics model, world dynamics, control, and task requirements. Our proposed diffusion-based model generator predicts per-region model resolutions based on start and goal pointclouds that define the planning query. To efficiently collect the data for learning this mapping, a two-stage process optimizes resolution using predictive dynamics as a prior before directly optimizing using closed-loop performance. On a tree-manipulation task, our method doubles planning speed with only a small decrease in task performance over using a full-resolution model. This approach informs a path towards using previous planning and control data to generate computationally efficient yet sufficiently expressive dynamics models for new tasks.
comment: 9 pages, 7 figures
☆ Emotions as Ambiguity-aware Ordinal Representations
Emotions are inherently ambiguous and dynamic phenomena, yet existing continuous emotion recognition approaches either ignore their ambiguity or treat ambiguity as an independent and static variable over time. Motivated by this gap in the literature, in this paper we introduce \emph{ambiguity-aware ordinal} emotion representations, a novel framework that captures both the ambiguity present in emotion annotation and the inherent temporal dynamics of emotional traces. Specifically, we propose approaches that model emotion ambiguity through its rate of change. We evaluate our framework on two affective corpora -- RECOLA and GameVibe -- testing our proposed approaches on both bounded (arousal, valence) and unbounded (engagement) continuous traces. Our results demonstrate that ordinal representations outperform conventional ambiguity-aware models on unbounded labels, achieving the highest Concordance Correlation Coefficient (CCC) and Signed Differential Agreement (SDA) scores, highlighting their effectiveness in modeling the traces' dynamics. For bounded traces, ordinal representations excel in SDA, revealing their superior ability to capture relative changes of annotated emotion traces.
☆ Get Global Guarantees: On the Probabilistic Nature of Perturbation Robustness
In safety-critical deep learning applications, robustness measures the ability of neural models that handle imperceptible perturbations in input data, which may lead to potential safety hazards. Existing pre-deployment robustness assessment methods typically suffer from significant trade-offs between computational cost and measurement precision, limiting their practical utility. To address these limitations, this paper conducts a comprehensive comparative analysis of existing robustness definitions and associated assessment methodologies. We propose tower robustness to evaluate robustness, which is a novel, practical metric based on hypothesis testing to quantitatively evaluate probabilistic robustness, enabling more rigorous and efficient pre-deployment assessments. Our extensive comparative evaluation illustrates the advantages and applicability of our proposed approach, thereby advancing the systematic understanding and enhancement of model robustness in safety-critical deep learning applications.
☆ Leveraging Evolutionary Surrogate-Assisted Prescription in Multi-Objective Chlorination Control Systems
This short, written report introduces the idea of Evolutionary Surrogate-Assisted Prescription (ESP) and presents preliminary results on its potential use in training real-world agents as a part of the 1st AI for Drinking Water Chlorination Challenge at IJCAI-2025. This work was done by a team from Project Resilience, an organization interested in bridging AI to real-world problems.
☆ From Tabula Rasa to Emergent Abilities: Discovering Robot Skills via Real-World Unsupervised Quality-Diversity
Autonomous skill discovery aims to enable robots to acquire diverse behaviors without explicit supervision. Learning such behaviors directly on physical hardware remains challenging due to safety and data efficiency constraints. Existing methods, including Quality-Diversity Actor-Critic (QDAC), require manually defined skill spaces and carefully tuned heuristics, limiting real-world applicability. We propose Unsupervised Real-world Skill Acquisition (URSA), an extension of QDAC that enables robots to autonomously discover and master diverse, high-performing skills directly in the real world. We demonstrate that URSA successfully discovers diverse locomotion skills on a Unitree A1 quadruped in both simulation and the real world. Our approach supports both heuristic-driven skill discovery and fully unsupervised settings. We also show that the learned skill repertoire can be reused for downstream tasks such as real-world damage adaptation, where URSA outperforms all baselines in 5 out of 9 simulated and 3 out of 5 real-world damage scenarios. Our results establish a new framework for real-world robot learning that enables continuous skill discovery with limited human intervention, representing a significant step toward more autonomous and adaptable robotic systems. Demonstration videos are available at http://adaptive-intelligent-robotics.github.io/URSA .
comment: Accepted at CoRL 2025
Few-Shot Connectivity-Aware Text Line Segmentation in Historical Documents
A foundational task for the digital analysis of documents is text line segmentation. However, automating this process with deep learning models is challenging because it requires large, annotated datasets that are often unavailable for historical documents. Additionally, the annotation process is a labor- and cost-intensive task that requires expert knowledge, which makes few-shot learning a promising direction for reducing data requirements. In this work, we demonstrate that small and simple architectures, coupled with a topology-aware loss function, are more accurate and data-efficient than more complex alternatives. We pair a lightweight UNet++ with a connectivity-aware loss, initially developed for neuron morphology, which explicitly penalizes structural errors like line fragmentation and unintended line merges. To increase our limited data, we train on small patches extracted from a mere three annotated pages per manuscript. Our methodology significantly improves upon the current state-of-the-art on the U-DIADS-TL dataset, with a 200% increase in Recognition Accuracy and a 75% increase in Line Intersection over Union. Our method also achieves an F-Measure score on par with or even exceeding that of the competition winner of the DIVA-HisDB baseline detection task, all while requiring only three annotated pages, exemplifying the efficacy of our approach. Our implementation is publicly available at: https://github.com/RafaelSterzinger/acpr_few_shot_hist.
comment: 15 pages, accepted at ACPR2025
☆ Playstyle and Artificial Intelligence: An Initial Blueprint Through the Lens of Video Games
Contemporary artificial intelligence (AI) development largely centers on rational decision-making, valued for its measurability and suitability for objective evaluation. Yet in real-world contexts, an intelligent agent's decisions are shaped not only by logic but also by deeper influences such as beliefs, values, and preferences. The diversity of human decision-making styles emerges from these differences, highlighting that "style" is an essential but often overlooked dimension of intelligence. This dissertation introduces playstyle as an alternative lens for observing and analyzing the decision-making behavior of intelligent agents, and examines its foundational meaning and historical context from a philosophical perspective. By analyzing how beliefs and values drive intentions and actions, we construct a two-tier framework for style formation: the external interaction loop with the environment and the internal cognitive loop of deliberation. On this basis, we formalize style-related characteristics and propose measurable indicators such as style capacity, style popularity, and evolutionary dynamics. The study focuses on three core research directions: (1) Defining and measuring playstyle, proposing a general playstyle metric based on discretized state spaces, and extending it to quantify strategic diversity and competitive balance; (2) Expressing and generating playstyle, exploring how reinforcement learning and imitation learning can be used to train agents exhibiting specific stylistic tendencies, and introducing a novel approach for human-like style learning and modeling; and (3) Practical applications, analyzing the potential of these techniques in domains such as game design and interactive entertainment. Finally, the dissertation outlines future extensions, including the role of style as a core element in building artificial general intelligence (AGI).
comment: PhD Dissertation, National Yang Ming Chiao Tung University, 2025. This is the public version without Chinese abstract or postscript
☆ Saddle Hierarchy in Dense Associative Memory
Dense associative memory (DAM) models have been attracting renewed attention since they were shown to be robust to adversarial examples and closely related to state-of-the-art machine learning paradigms, such as the attention mechanisms in transformers and generative diffusion models. We study a DAM built upon a three-layer Boltzmann machine with Potts hidden units, which represent data clusters and classes. Through a statistical mechanics analysis, we derive saddle-point equations that characterize both the stationary points of DAMs trained on real data and the fixed points of DAMs trained on synthetic data within a teacher-student framework. Based on these results, we propose a novel regularization scheme that makes training significantly more stable. Moreover, we show empirically that our DAM learns interpretable solutions to both supervised and unsupervised classification problems. Pushing our theoretical analysis further, we find that the weights learned by relatively small DAMs correspond to unstable saddle points in larger DAMs. We implement a network-growing algorithm that leverages this saddle-point hierarchy to drastically reduce the computational cost of training dense associative memory.
comment: 55 pages, 10 figures
☆ Echoes of the past: A unified perspective on fading memory and echo states
Recurrent neural networks (RNNs) have become increasingly popular in information processing tasks involving time series and temporal data. A fundamental property of RNNs is their ability to create reliable input/output responses, often linked to how the network handles its memory of the information it processed. Various notions have been proposed to conceptualize the behavior of memory in RNNs, including steady states, echo states, state forgetting, input forgetting, and fading memory. Although these notions are often used interchangeably, their precise relationships remain unclear. This work aims to unify these notions in a common language, derive new implications and equivalences between them, and provide alternative proofs to some existing results. By clarifying the relationships between these concepts, this research contributes to a deeper understanding of RNNs and their temporal information processing capabilities.
☆ A Bag of Tricks for Efficient Implicit Neural Point Clouds
Implicit Neural Point Cloud (INPC) is a recent hybrid representation that combines the expressiveness of neural fields with the efficiency of point-based rendering, achieving state-of-the-art image quality in novel view synthesis. However, as with other high-quality approaches that query neural networks during rendering, the practical usability of INPC is limited by comparatively slow rendering. In this work, we present a collection of optimizations that significantly improve both the training and inference performance of INPC without sacrificing visual fidelity. The most significant modifications are an improved rasterizer implementation, more effective sampling techniques, and the incorporation of pre-training for the convolutional neural network used for hole-filling. Furthermore, we demonstrate that points can be modeled as small Gaussians during inference to further improve quality in extrapolated, e.g., close-up views of the scene. We design our implementations to be broadly applicable beyond INPC and systematically evaluate each modification in a series of experiments. Our optimized INPC pipeline achieves up to 25% faster training, 2x faster rendering, and 20% reduced VRAM usage paired with slight image quality improvements.
comment: Project page: https://fhahlbohm.github.io/inpc_v2/
☆ Active Query Selection for Crowd-Based Reinforcement Learning
Preference-based reinforcement learning has gained prominence as a strategy for training agents in environments where the reward signal is difficult to specify or misaligned with human intent. However, its effectiveness is often limited by the high cost and low availability of reliable human input, especially in domains where expert feedback is scarce or errors are costly. To address this, we propose a novel framework that combines two complementary strategies: probabilistic crowd modelling to handle noisy, multi-annotator feedback, and active learning to prioritize feedback on the most informative agent actions. We extend the Advise algorithm to support multiple trainers, estimate their reliability online, and incorporate entropy-based query selection to guide feedback requests. We evaluate our approach in a set of environments that span both synthetic and real-world-inspired settings, including 2D games (Taxi, Pacman, Frozen Lake) and a blood glucose control task for Type 1 Diabetes using the clinically approved UVA/Padova simulator. Our preliminary results demonstrate that agents trained with feedback on uncertain trajectories exhibit faster learning in most tasks, and we outperform the baselines for the blood glucose control task.
comment: 7 pages, 4 figures, 2 tables plus appendices
♻ ☆ Cohort-Aware Agents for Individualized Lung Cancer Risk Prediction Using a Retrieval-Augmented Model Selection Framework
Accurate lung cancer risk prediction remains challenging due to substantial variability across patient populations and clinical settings -- no single model performs best for all cohorts. To address this, we propose a personalized lung cancer risk prediction agent that dynamically selects the most appropriate model for each patient by combining cohort-specific knowledge with modern retrieval and reasoning techniques. Given a patient's CT scan and structured metadata -- including demographic, clinical, and nodule-level features -- the agent first performs cohort retrieval using FAISS-based similarity search across nine diverse real-world cohorts to identify the most relevant patient population from a multi-institutional database. Second, a Large Language Model (LLM) is prompted with the retrieved cohort and its associated performance metrics to recommend the optimal prediction algorithm from a pool of eight representative models, including classical linear risk models (e.g., Mayo, Brock), temporally-aware models (e.g., TD-VIT, DLSTM), and multi-modal computer vision-based approaches (e.g., Liao, Sybil, DLS, DLI). This two-stage agent pipeline -- retrieval via FAISS and reasoning via LLM -- enables dynamic, cohort-aware risk prediction personalized to each patient's profile. Building on this architecture, the agent supports flexible and cohort-driven model selection across diverse clinical populations, offering a practical path toward individualized risk assessment in real-world lung cancer screening.
♻ ☆ Emergent time-keeping mechanisms in a deep reinforcement learning agent performing an interval timing task
Drawing parallels between Deep Artificial Neural Networks (DNNs) and biological systems can aid in understanding complex biological mechanisms that are difficult to disentangle. Temporal processing, an extensively researched topic, is one such example that lacks a coherent understanding of its underlying mechanisms. In this study, we investigate temporal processing in a Deep Reinforcement Learning (DRL) agent performing an interval timing task and explore potential biological counterparts to its emergent behavior. The agent was successfully trained to perform a duration production task, which involved marking successive occurrences of a target interval while viewing a video sequence. Analysis of the agent's internal states revealed oscillatory neural activations, a ubiquitous pattern in biological systems. Interestingly, the agent's actions were predominantly influenced by neurons exhibiting these oscillations with high amplitudes and frequencies corresponding to the target interval. Parallels are drawn between the agent's time-keeping strategy and the Striatal Beat Frequency (SBF) model, a biologically plausible model of interval timing. Furthermore, the agent maintained its oscillatory representations and task performance when tested on different video sequences (including a blank video). Thus, once learned, the agent internalized its time-keeping mechanism and showed minimal reliance on its environment to perform the timing task. A hypothesis about the resemblance between this emergent behavior and certain aspects of the evolution of biological processes like circadian rhythms, has been discussed. This study aims to contribute to recent research efforts of utilizing DNNs to understand biological systems, with a particular emphasis on temporal processing.
comment: Accepted at 2025 Artificial Life Conference
♻ ☆ Distribution free M-estimation
The basic question of delineating those statistical problems that are solvable without making any assumptions on the underlying data distribution has long animated statistics and learning theory. This paper characterizes when a convex M-estimation or stochastic optimization problem is solvable in such an assumption-free setting, providing a precise dividing line between solvable and unsolvable problems. The conditions we identify show, perhaps surprisingly, that Lipschitz continuity of the loss being minimized is not necessary for distribution free minimization, and they are also distinct from classical characterizations of learnability in machine learning.
comment: 45 pages
♻ ☆ Local Learning Rules for Out-of-Equilibrium Physical Generative Models
We show that the out-of-equilibrium driving protocol of score-based generative models (SGMs) can be learned via local learning rules. The gradient with respect to the parameters of the driving protocol is computed directly from force measurements or from observed system dynamics. As a demonstration, we implement an SGM in a network of driven, nonlinear, overdamped oscillators coupled to a thermal bath. We first apply it to the problem of sampling from a mixture of two Gaussians in 2D. Finally, we train a 12x12 oscillator network on the MNIST dataset to generate images of handwritten digits 0 and 1.
comment: 6 pages, 2 figures
♻ ☆ Investigating the Robustness of Extreme Precipitation Super-Resolution Across Climates
The coarse spatial resolution of gridded climate models, such as general circulation models, limits their direct use in projecting socially relevant variables like extreme precipitation. Most downscaling methods estimate the conditional distributions of extremes by generating large ensembles, complicating the assessment of robustness under distributional shifts, such as those induced by climate change. To better understand and potentially improve robustness, we propose super-resolving the parameters of the target variable's probability distribution directly using analytically tractable mappings. Within a perfect-model framework over Switzerland, we demonstrate that vector generalized linear and additive models can super-resolve the generalized extreme value distribution of summer hourly precipitation extremes from coarse precipitation fields and topography. We introduce the notion of a "robustness gap", defined as the difference in predictive error between present-trained and future-trained models, and use it to diagnose how model structure affects the generalization of each quantile to a pseudo-global warming scenario. By evaluating multiple model configurations, we also identify an upper limit on the super-resolution factor based on the spatial auto- and cross-correlation of precipitation and elevation, beyond which coarse precipitation loses predictive value. Our framework is broadly applicable to variables governed by parametric distributions and offers a model-agnostic diagnostic for understanding when and why empirical downscaling generalizes to climate change and extremes.
comment: 40+3 pages, 9 figures, 1 table, submitted to WCE
♻ ☆ Quantum Graph Attention Network: A Novel Quantum Multi-Head Attention Mechanism for Graph Learning
We propose the Quantum Graph Attention Network (QGAT), a hybrid graph neural network that integrates variational quantum circuits into the attention mechanism. At its core, QGAT employs strongly entangling quantum circuits with amplitude-encoded node features to enable expressive nonlinear interactions. Distinct from classical multi-head attention that separately computes each head, QGAT leverages a single quantum circuit to simultaneously generate multiple attention coefficients. This quantum parallelism facilitates parameter sharing across heads, substantially reducing computational overhead and model complexity. Classical projection weights and quantum circuit parameters are optimized jointly in an end-to-end manner, ensuring flexible adaptation to learning tasks. Empirical results demonstrate QGAT's effectiveness in capturing complex structural dependencies and improved generalization in inductive scenarios, highlighting its potential for scalable quantum-enhanced learning across domains such as chemistry, biology, and network analysis. Furthermore, experiments confirm that quantum embedding enhances robustness against feature and structural noise, suggesting advantages in handling real-world noisy data. The modularity of QGAT also ensures straightforward integration into existing architectures, allowing it to easily augment classical attention-based models.
♻ ☆ Leveraging Multi-facet Paths for Heterogeneous Graph Representation Learning
Recent advancements in graph neural networks (GNNs) and heterogeneous GNNs (HGNNs) have advanced node embeddings and relationship learning for various tasks. However, existing methods often rely on domain-specific predefined meta-paths, which are coarse-grained and focus solely on aspects like node type, limiting their ability to capture complex interactions. We introduce MF2Vec, a model that uses multi-faceted (fine-grained) paths instead of predefined meta-paths. MF2Vec extracts paths via random walks and generates multi-faceted vectors, ignoring predefined schemas. This method learns diverse aspects of nodes and their relationships, constructs a homogeneous network, and creates node embeddings for classification, link prediction, and clustering. Extensive experiments show that MF2Vec outperforms existing methods, offering a more flexible and comprehensive framework for analyzing complex networks. The code is available at https://anonymous.4open.science/r/MF2Vec-6ABC.
Programming Languages
☆ An LLM-powered Natural-to-Robotic Language Translation Framework with Correctness Guarantees
The Large Language Models (LLM) are increasingly being deployed in robotics to generate robot control programs for specific user tasks, enabling embodied intelligence. Existing methods primarily focus on LLM training and prompt design that utilize LLMs to generate executable programs directly from user tasks in natural language. However, due to the inconsistency of the LLMs and the high complexity of the tasks, such best-effort approaches often lead to tremendous programming errors in the generated code, which significantly undermines the effectiveness especially when the light-weight LLMs are applied. This paper introduces a natural-robotic language translation framework that (i) provides correctness verification for generated control programs and (ii) enhances the performance of LLMs in program generation via feedback-based fine-tuning for the programs. To achieve this, a Robot Skill Language (RSL) is proposed to abstract away from the intricate details of the control programs, bridging the natural language tasks with the underlying robot skills. Then, the RSL compiler and debugger are constructed to verify RSL programs generated by the LLM and provide error feedback to the LLM for refining the outputs until being verified by the compiler. This provides correctness guarantees for the LLM-generated programs before being offloaded to the robots for execution, significantly enhancing the effectiveness of LLM-powered robotic applications. Experiments demonstrate NRTrans outperforms the existing method under a range of LLMs and tasks, and achieves a high success rate for light-weight LLMs.
☆ A Case Study on the Effectiveness of LLMs in Verification with Proof Assistants
Large language models (LLMs) can potentially help with verification using proof assistants by automating proofs. However, it is unclear how effective LLMs are in this task. In this paper, we perform a case study based on two mature Rocq projects: the hs-to-coq tool and Verdi. We evaluate the effectiveness of LLMs in generating proofs by both quantitative and qualitative analysis. Our study finds that: (1) external dependencies and context in the same source file can significantly help proof generation; (2) LLMs perform great on small proofs but can also generate large proofs; (3) LLMs perform differently on different verification projects; and (4) LLMs can generate concise and smart proofs, apply classical techniques to new definitions, but can also make odd mistakes.
comment: Accepted by LMPL 2025
♻ ☆ Correct Black-Box Monitors for Distributed Deadlock Detection: Formalisation and Implementation (Technical Report)
Many software applications rely on concurrent and distributed (micro)services that interact via message-passing and various forms of remote procedure calls (RPC). As these systems organically evolve and grow in scale and complexity, the risk of introducing deadlocks increases and their impact may worsen: even if only a few services deadlock, many other services may block while awaiting responses from the deadlocked ones. As a result, the "core" of the deadlock can be obfuscated by its consequences on the rest of the system, and diagnosing and fixing the problem can be challenging. In this work we tackle the challenge by proposing distributed black-box monitors that are deployed alongside each service and detect deadlocks by only observing the incoming and outgoing messages, and exchanging probes with other monitors. We present a formal model that captures popular RPC-based application styles (e.g., gen_servers in Erlang/OTP), and a distributed black-box monitoring algorithm that we prove sound and complete (i.e., identifies deadlocked services with neither false positives nor false negatives). We implement our results in a tool called DDMon for the monitoring of Erlang/OTP applications, and we evaluate its performance. This is the first work that formalises, proves the correctness, and implements distributed black-box monitors for deadlock detection. Our results are mechanised in Coq. DDMon is the companion artifact of this paper.
Machine Learning - Statistics
☆ Enhancing Trust-Region Bayesian Optimization via Newton Methods
Bayesian Optimization (BO) has been widely applied to optimize expensive black-box functions while retaining sample efficiency. However, scaling BO to high-dimensional spaces remains challenging. Existing literature proposes performing standard BO in multiple local trust regions (TuRBO) for heterogeneous modeling of the objective function and avoiding over-exploration. Despite its advantages, using local Gaussian Processes (GPs) reduces sampling efficiency compared to a global GP. To enhance sampling efficiency while preserving heterogeneous modeling, we propose to construct multiple local quadratic models using gradients and Hessians from a global GP, and select new sample points by solving the bound-constrained quadratic program. Additionally, we address the issue of vanishing gradients of GPs in high-dimensional spaces. We provide a convergence analysis and demonstrate through experimental results that our method enhances the efficacy of TuRBO and outperforms a wide range of high-dimensional BO techniques on synthetic functions and real-world applications.
☆ Deterministic Coreset Construction via Adaptive Sensitivity Trimming
We develop a rigorous framework for deterministic coreset construction in empirical risk minimization (ERM). Our central contribution is the Adaptive Deterministic Uniform-Weight Trimming (ADUWT) algorithm, which constructs a coreset by excising points with the lowest sensitivity bounds and applying a data-dependent uniform weight to the remainder. The method yields a uniform $(1\pm\varepsilon)$ relative-error approximation for the ERM objective over the entire hypothesis space. We provide complete analysis, including (i) a minimax characterization proving the optimality of the adaptive weight, (ii) an instance-dependent size analysis in terms of a \emph{Sensitivity Heterogeneity Index}, and (iii) tractable sensitivity oracles for kernel ridge regression, regularized logistic regression, and linear SVM. Reproducibility is supported by precise pseudocode for the algorithm, sensitivity oracles, and evaluation pipeline. Empirical results align with the theory. We conclude with open problems on instance-optimal oracles, deterministic streaming, and fairness-constrained ERM.
comment: 6 pages, 5 algorithms, 1 table
♻ ☆ Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning
The mean squared error and regularized versions of it are standard loss functions in supervised machine learning. However, calculating these losses for large data sets can be computationally demanding. Modifying an approach of J. Dick and M. Feischl [Journal of Complexity 67 (2021)], we present algorithms to reduce extensive data sets to a smaller size using rank-1 lattices. Rank-1 lattices are quasi-Monte Carlo (QMC) point sets that are, if carefully chosen, well-distributed in a multidimensional unit cube. The compression strategy in the preprocessing step assigns every lattice point a pair of weights depending on the original data and responses, representing its relative importance. As a result, the compressed data makes iterative loss calculations in optimization steps much faster. We analyze the errors of our QMC data compression algorithms and the cost of the preprocessing step for functions whose Fourier coefficients decay sufficiently fast so that they lie in certain Wiener algebras or Korobov spaces. In particular, we prove that our approach can lead to arbitrary high convergence rates as long as the functions are sufficiently smooth.
comment: 28 pages, 1 updated figure, 1 new figure, new Appendix; To be published in Mathematics of Computation
♻ ☆ Learning Optimal Classification Trees Robust to Distribution Shifts
We consider the problem of learning classification trees that are robust to distribution shifts between training and testing/deployment data. This problem arises frequently in high stakes settings such as public health and social work where data is often collected using self-reported surveys which are highly sensitive to e.g., the framing of the questions, the time when and place where the survey is conducted, and the level of comfort the interviewee has in sharing information with the interviewer. We propose a method for learning optimal robust classification trees based on mixed-integer robust optimization technology. In particular, we demonstrate that the problem of learning an optimal robust tree can be cast as a single-stage mixed-integer robust optimization problem with a highly nonlinear and discontinuous objective. We reformulate this problem equivalently as a two-stage linear robust optimization problem for which we devise a tailored solution procedure based on constraint generation. We evaluate the performance of our approach on numerous publicly available datasets, and compare the performance to a regularized, non-robust optimal tree. We show an increase of up to 12.48% in worst-case accuracy and of up to 4.85% in average-case accuracy across several datasets and distribution shifts from using our robust solution in comparison to the non-robust one.
comment: 51 pages, 10 figures
♻ ☆ Sharp Lower Bounds on Interpolation by Deep ReLU Neural Networks at Irregularly Spaced Data
We study the interpolation power of deep ReLU neural networks. Specifically, we consider the question of how efficiently, in terms of the number of parameters, deep ReLU networks can interpolate values at $N$ datapoints in the unit ball which are separated by a distance $\delta$. We show that $\Omega(N)$ parameters are required in the regime where $\delta$ is exponentially small in $N$, which gives the sharp result in this regime since $O(N)$ parameters are always sufficient. This also shows that the bit-extraction technique used to prove lower bounds on the VC dimension cannot be applied to irregularly spaced datapoints. Finally, as an application we give a lower bound on the approximation rates that deep ReLU neural networks can achieve for Sobolev spaces at the embedding endpoint.
♻ ☆ How many samples are needed to train a deep neural network?
Neural networks have become standard tools in many areas, yet many important statistical questions remain open. This paper studies the question of how much data are needed to train a ReLU feed-forward neural network. Our theoretical and empirical results suggest that the generalization error of ReLU feed-forward neural networks scales at the rate $1/\sqrt{n}$ in the sample size $n$ rather than the usual "parametric rate" $1/n$. Thus, broadly speaking, our results underpin the common belief that neural networks need "many" training samples.
♻ ☆ Activation degree thresholds and expressiveness of polynomial neural networks
We study the expressive power of deep polynomial neural networks through the geometry of their neurovariety. We introduce the notion of the activation degree threshold of a network architecture to express when the dimension of the neurovariety achieves its theoretical maximum. We prove the existence of the activation degree threshold for all polynomial neural networks without width-one bottlenecks and demonstrate a universal upper bound that is quadratic in the width of largest size. In doing so, we prove the high activation degree conjecture of Kileel, Trager, and Bruna. Certain structured architectures have exceptional activation degree thresholds, making them especially expressive in the sense of their neurovariety dimension. In this direction, we prove that polynomial neural networks with equi-width architectures are maximally expressive by showing their activation degree threshold is one.
comment: 24 pages, 1 figure
Programming Languages
☆ Compositional Verification in Concurrent Separation Logic with Permissions Regions
Concurrent separation logic with fractional permissions (CSLPerm) provides a promising reasoning system to verify most complex sequential and concurrent fine-grained programs. The logic with strong and weak separating conjunctions offers a solid foundation for producing concise and precise proofs. However, it lacks automation and compositionality support. This paper addresses this limitation by introducing a compositional verification system for concurrent programs that manipulate regions of shared memory. The centre of our system is novel logical principles and an entailment procedure that can infer the residual heaps in the frame rule for a fragment of CSL-Perm with explicit arithmetical constraints for memory heaps' disjointness. This procedure enables the compositional reasoning for concurrent threads and function calls. We have implemented the proposal in a prototype tool called CoSl, tested it with 10 challenging concurrent programs, including those beyond the state-of-the-art, and confirmed the advantage of our approach.
☆ MetaGen: A DSL, Database, and Benchmark for VLM-Assisted Metamaterial Generation
Metamaterials are micro-architected structures whose geometry imparts highly tunable-often counter-intuitive-bulk properties. Yet their design is difficult because of geometric complexity and a non-trivial mapping from architecture to behaviour. We address these challenges with three complementary contributions. (i) MetaDSL: a compact, semantically rich domain-specific language that captures diverse metamaterial designs in a form that is both human-readable and machine-parsable. (ii) MetaDB: a curated repository of more than 150,000 parameterized MetaDSL programs together with their derivatives-three-dimensional geometry, multi-view renderings, and simulated elastic properties. (iii) MetaBench: benchmark suites that test three core capabilities of vision-language metamaterial assistants-structure reconstruction, property-driven inverse design, and performance prediction. We establish baselines by fine-tuning state-of-the-art vision-language models and deploy an omni-model within an interactive, CAD-like interface. Case studies show that our framework provides a strong first step toward integrated design and understanding of structure-representation-property relationships.
♻ ☆ Synthesizing DSLs for Few-Shot Learning
We study the problem of synthesizing domain-specific languages (DSLs) for few-shot learning in symbolic domains. Given a base language and instances of few-shot learning problems, where each instance is split into training and testing samples, the DSL synthesis problem asks for a grammar over the base language that guarantees that small expressions solving training samples also solve corresponding testing samples. We prove that the problem is decidable for a class of languages whose semantics over fixed structures can be evaluated by tree automata and when expression size corresponds to parse tree depth in the grammar, and, furthermore, the grammars solving the problem correspond to a regular set of trees. We also prove decidability results for variants of the problem where DSLs are only required to express solutions for input learning problems and where DSLs are defined using macro grammars.
♻ ☆ Place Capability Graphs: A General-Purpose Model of Rust's Ownership and Borrowing Guarantees
Rust's novel type system has proved an attractive target for verification and program analysis tools, due to the rich guarantees it provides for controlling aliasing and mutability. However, fully understanding, extracting and exploiting these guarantees is subtle and challenging: existing models for Rust's type checking either support a smaller idealised language disconnected from real-world Rust code, or come with severe limitations in terms of precise modelling of Rust borrows, composite types storing them, function signatures and loops. In this paper, we present a novel model of Rust's type-checking called Place Capability Graphs, which lifts these limitations, and which can be directly calculated from the Rust compiler's own programmatic representations and analyses. We demonstrate that our model supports over 97% of Rust functions in the most popular public crates, and show its suitability as a general-purpose basis for verification and program analysis tools by developing promising new prototype versions of the existing Flowistry and Prusti tools.
♻ ☆ Complete the Cycle: Reachability Types with Expressive Cyclic References (Extended Version)
Local reasoning about programs that combine aliasing and mutable state is a longstanding challenge. Existing approaches -- ownership systems, linear and affine types, uniqueness types, and lexical effect tracking -- impose global restrictions such as uniqueness or linearity, or rely on shallow syntactic analyses. These designs fall short with higher-order functions and shared mutable state. Reachability Types (RT) track aliasing and separation in higher-order programs, ensuring runtime safety and non-interference. However, RT systems face three key limitations: (1) they prohibit cyclic references, ruling out non-terminating computations and fixed-point combinators; (2) they require deep tracking, where a qualifier must include all transitively reachable locations, reducing precision and hindering optimizations like fine-grained parallelism; and (3) referent qualifier invariance prevents referents from escaping their allocation contexts, making reference factories inexpressible. In this work, we address these limitations by extending RT with three mechanisms that enhance expressiveness. First, we introduce cyclic references, enabling recursive patterns to be encoded directly through the store. Second, we adopt shallow qualifier tracking, decoupling references from their transitively reachable values. Finally, we introduce an escaping rule with reference subtyping, allowing referent qualifiers to outlive their allocation context. These extensions are formalized in the $\mathsf{F}_{<:}^{\circ}$-calculus with a mechanized proof of type soundness, and case studies illustrate expressiveness through fixpoint combinators, non-interfering parallelism, and escaping
♻ ☆ Linear Layouts: Robust Code Generation of Efficient Tensor Computation Using $\mathbb{F}_2$
Efficient tensor computation is a cornerstone of modern deep learning (DL) workloads, yet existing approaches struggle to achieve flexible and performant design and implementation of tensor layouts -- mappings between logical tensors and hardware resources. The increasing complexity of DL algorithms and hardware demands a generic and systematic approach to handling tensor layouts. In this work, we introduce Linear Layouts, a novel approach that models tensor layouts using linear algebra over $\mathbb{F}_2$. By representing tensor layouts as binary matrices acting on the bits of the hardware representation, our approach enables a generic layout definition -- as opposed to the classical case-by-case approach -- and allows for generic layout-to-layout conversions, eliminating the quadratic explosion that plagues existing solutions. We integrate linear layouts with Triton and demonstrate their effectiveness in optimizing individual Triton operators as well as kernels written in Triton. We also show that linear layouts reduce engineering effort in the compiler backend while fixing several bugs in Triton's legacy layout system.
♻ ☆ Optimized Execution of FreeCHR
Constraint Handling Rules (CHR) is a rule-based programming language that rewrites collections of constraints. It is typically embedded into a general-purpose language. There exists a plethora of implementation for numerous host languages. However, the existing implementations often re-invent the method of embedding, which impedes maintenance and weakens assertions of correctness. To formalize and thereby standardize the embedding of a ground subset of CHR into arbitrary host languages, we introduced the framework FreeCHR and proved it to be a valid representation of classical CHR. For the sake of simplicity, abstract implementations of our framework did not yet include a concrete matching algorithm nor optimizations. In this paper, we introduce an improved execution and matching algorithm for FreeCHR. We also provide empirical evaluation of the algorithm.
comment: This is a preprint of a paper submitted to the 39th Workshop on (Constraint and Functional) Logic Programming (WLP 2025); minor changes throughout the paper, additional explanations in sections 3.3 to Def 1; re-arranged section 3, added appendix
♻ ☆ Efficient Decrease-and-Conquer Linearizability Monitoring
Linearizability has become the de facto correctness specification for implementations of concurrent data structures. While formally verifying such implementations remains challenging, linearizability monitoring has emerged as a promising first step to rule out early problems in the development of custom implementations, and serves as a key component in approaches that stress test such implementations. In this work, we investigate linearizability monitoring -- check if an execution history of an implementation is linearizable. While this problem is intractable in general, a systematic understanding of when it becomes tractable has remained elusive. We revisit this problem and first present a unified `decrease-and-conquer' algorithmic framework for linearizability monitoring. At its heart, this framework asks to identify special linearizability-preserving values in a given history -- values whose presence yields an equilinearizable sub-history when removed, and whose absence indicates non-linearizability. We prove that a polynomial time algorithm for the problem of identifying linearizability-preserving values, yields a polynomial time algorithm for linearizability monitoring, while conversely, intractability of this problem implies intractability of the monitoring problem. We demonstrate our framework's effectiveness by instantiating it for several popular data types -- sets, stacks, queues and priority queues -- deriving polynomial time algorithms for each, with the unambiguity restriction, where each insertion to the underlying data structure adds a distinct value. We optimize these algorithms to achieve the optimal log-linear time complexity by amortizing the cost of solving sub-problems through efficient data structures. Our implementation and evaluation on publicly available implementations show that our approach scales to large histories and outperforms existing tools.
♻ ☆ A2HCoder: An LLM-Driven Coding Agent for Hierarchical Algorithm-to-HDL Translation
In wireless communication systems, stringent requirements such as ultra-low latency and power consumption have significantly increased the demand for efficient algorithm-to-hardware deployment. However, a persistent and substantial gap remains between algorithm design and hardware implementation. Bridging this gap traditionally requires extensive domain expertise and time-consuming manual development, due to fundamental mismatches between high-level programming languages like MATLAB and hardware description languages (HDLs) such as Verilog-in terms of memory access patterns, data processing manners, and datatype representations. To address this challenge, we propose A2HCoder: a Hierarchical Algorithm-to-HDL Coding Agent, powered by large language models (LLMs), designed to enable agile and reliable algorithm-to-hardware translation. A2HCoder introduces a hierarchical framework that enhances both robustness and interpretability while suppressing common hallucination issues in LLM-generated code. In the horizontal dimension, A2HCoder decomposes complex algorithms into modular functional blocks, simplifying code generation and improving consistency. In the vertical dimension, instead of relying on end-to-end generation, A2HCoder performs step-by-step, fine-grained translation, leveraging external toolchains such as MATLAB and Vitis HLS for debugging and circuit-level synthesis. This structured process significantly mitigates hallucinations and ensures hardware-level correctness. We validate A2HCoder through a real-world deployment case in the 5G wireless communication domain, demonstrating its practicality, reliability, and deployment efficiency.
comment: 15 pages, 6 figures
Programming Languages
☆ Who Wins the Race? (R Vs Python) - An Exploratory Study on Energy Consumption of Machine Learning Algorithms
The utilization of Machine Learning (ML) in contemporary software systems is extensive and continually expanding. However, its usage is energy-intensive, contributing to increased carbon emissions and demanding significant resources. While numerous studies examine the performance and accuracy of ML, only a limited few focus on its environmental aspects, particularly energy consumption. In addition, despite emerging efforts to compare energy consumption across various programming languages for specific algorithms and tasks, there remains a gap specifically in comparing these languages for ML-based tasks. This paper aims to raise awareness of the energy costs associated with employing different programming languages for ML model training and inference. Through this empirical study, we measure and compare the energy consumption along with run-time performance of five regression and five classification tasks implemented in Python and R, the two most popular programming languages in this context. Our study results reveal a statistically significant difference in costs between the two languages in 95% of the cases examined. Furthermore, our analysis demonstrates that the choice of programming language can influence energy efficiency significantly, up to 99.16% during model training and up to 99.8% during inferences, for a given ML task.
comment: 18 pages including references, 5 figures
☆ Borrowing Dirty Qubits in Quantum Programs
Dirty qubits are ancillary qubits that can be borrowed from idle parts of a computation, enabling qubit reuse and reducing the demand for fresh, clean qubits-a resource that is typically scarce in practice. For such reuse to be valid, the initial states of the dirty qubits must not affect the functionality of the quantum circuits in which they are employed. Moreover, their original states, including any entanglement they possess, must be fully restored after use-a requirement commonly known as safe uncomputation. In this paper, we formally define the semantics of dirty-qubit borrowing as a feature in quantum programming languages, and introduce a notion of safe uncomputation for dirty qubits in quantum programs. We also present an efficient algorithm, along with experimental results, for verifying safe uncomputation of dirty qubits in certain quantum circuits.
Programming Languages
☆ Syntactic Completions with Material Obligations
Code editors provide essential services that help developers understand, navigate, and modify programs. However, these services often fail in the presence of syntax errors. Existing syntax error recovery techniques, like panic mode and multi-option repairs, are either too coarse, e.g. in deleting large swathes of code, or lead to a proliferation of possible completions. This paper introduces $\texttt{tylr}$, a parser and editor generator that completes arbitrarily malformed code by inserting obligations, which generalize holes to cover missing operands, operators, mixfix keywords, and sort transitions. $\texttt{tylr}$ is backed by a novel theory of tile-based parsing, which extends operator-precedence parsing in two ways. First, traditional token precedence comparisons are replaced by a notion of grammar walks, which form the basis for generating obligations. Second, a distinct "molding" system based on grammar zippers expand grammar expressivity by allowing the system to disambiguate between possible parses and completions based on an obligation minimization criterion. In addition to serving as a novel approach to error correction, $\texttt{tylr}$'s design enables the development of an editor that visually materializes obligations to the human user, serving as a novel hybrid between a text editor and a structure editor. We introduce $\texttt{tylr}$ by example, then formalize its key ideas. Finally, we conduct a human subjects study to evaluate the extent to which an editor like $\texttt{tylr}$ that materializes syntactic obligations might be usable and useful, finding both points of positivity and interesting new avenues for future work.
♻ ☆ Products of Recursive Programs for Hypersafety Verification (Extended Version)
We study the problem of automated hypersafety verification of infinite-state recursive programs. We propose an infinite class of product programs, specifically designed with recursion in mind, that reduce the hypersafety verification of a recursive program to standard safety verification. For this, we combine insights from language theory and concurrency theory to propose an algorithmic solution for constructing an infinite class of recursive product programs. One key insight is that, using the simple theory of visibly pushdown languages, one can maintain the recursive structure of syntactic program alignments which is vital to constructing a new product program that can be viewed as a classic recursive program -- that is, one that can be executed on a single stack. Another key insight is that techniques from concurrency theory can be generalized to help define product programs based on the view that the parallel composition of individual recursive programs includes all possible alignments from which a sound set of alignments that faithfully preserve the satisfaction of the hypersafety property can be selected. On the practical side, we formulate a family of parametric canonical product constructions that are intuitive to programmers and can be used as building blocks to specify recursive product programs for the purpose of relational and hypersafety verification, with the idea that the right product program can be verified automatically using existing techniques. We demonstrate the effectiveness of these techniques through an implementation and highly promising experimental results.
comment: 26 pages of main text (7 figures) and 45 pages in total; extended version of the paper "Products of Recursive Programs for Hypersafety Verification" accepted at OOPSLA 2025
Programming Languages
☆ SafeTree: Expressive Tree Policies for Microservices
A microservice-based application is composed of multiple self-contained components called microservices, and controlling inter-service communication is important for enforcing safety properties. Presently, inter-service communication is configured using microservice deployment tools. However, such tools only support a limited class of single-hop policies, which can be overly permissive because they ignore the rich service tree structure of microservice calls. Policies that can express the service tree structure can offer development and security teams more fine-grained control over communication patterns. To this end, we design an expressive policy language to specify service tree structures, and we develop a visibly pushdown automata-based dynamic enforcement mechanism to enforce service tree policies. Our technique is non-invasive: it does not require any changes to service implementations, and does not require access to microservice code. To realize our method, we build a runtime monitor on top of a service mesh, an emerging network infrastructure layer that can control inter-service communication during deployment. In particular, we employ the programmable network traffic filtering capabilities of Istio, a popular service mesh implementation, to implement an online and distributed monitor. Our experiments show that our monitor can enforce rich safety properties while adding minimal latency overhead on the order of milliseconds.
♻ ☆ Automated Discovery of Tactic Libraries for Interactive Theorem Proving
Enabling more concise and modular proofs is essential for advancing formal reasoning using interactive theorem provers (ITPs). Since many ITPs, such as Rocq and Lean, use tactic-style proofs, learning higher-level custom tactics is crucial for proof modularity and automation. This paper presents a novel approach to tactic discovery, which leverages Tactic Dependence Graphs (TDGs) to identify reusable proof strategies across multiple proofs. TDGs capture logical dependencies between tactic applications while abstracting away irrelevant syntactic details, allowing for both the discovery of new tactics and the refactoring of existing proofs into more modular forms. We have implemented this technique in a tool called TacMiner and compare it against an anti-unification-based approach Peano to tactic discovery. Our evaluation demonstrates that TacMiner can learn 3x as many tactics as Peano and reduces the size of proofs by 26% across all benchmarks. Furthermore, our evaluation demonstrates the benefits of learning custom tactics for proof automation, allowing a state-of-the-art proof automation tool to achieve a relative increase of 172% in terms of success rate.
♻ ☆ Proof Repair across Quotient Type Equivalences
Proofs in proof assistants like Rocq can be brittle, breaking easily in response to changes. To address this, recent work introduced an algorithm and tool in Rocq to automatically repair broken proofs in response to changes that correspond to type equivalences. However, many changes remained out of the scope of this algorithm and tool -- especially changes in underlying \emph{behavior}. We extend this proof repair algorithm so that it can express certain changes in behavior that were previously out of scope. We focus in particular on equivalences between \emph{quotient types} -- types equipped with a relation that describes what it means for any two elements of that type to be equal. Quotient type equivalences can be used to express interesting changes in representations of mathematical structures, as well as changes in the implementations of data structures. We extend this algorithm and tool to support quotient type equivalences in Rocq. Notably, since Rocq lacks quotient types entirely, our extensions use Rocq's setoid machinery in place of quotients. Specifically, (1) our extension to the algorithm supports new changes corresponding to setoids, and (2) our extension to the tool supports this new class of changes and further automates away some of the new proof obligations. We demonstrate our extensions on proof repair case studies for previously unsupported changes. We also perform manual proof repair in Cubical Agda, a language with a univalent metatheory, which allows us to construct the first ever internal proofs of correctness for proof repair.
comment: 26 pages, 22 figures, for associated code, see https://github.com/uwplse/pumpkin-pi/releases/tag/oopsla2025
♻ ☆ Software Model Checking via Summary-Guided Search (Extended Version)
In this work, we describe a new software model-checking algorithm called GPS. GPS treats the task of model checking a program as a directed search of the program states, guided by a compositional, summary-based static analysis. The summaries produced by static analysis are used both to prune away infeasible paths and to drive test generation to reach new, unexplored program states. GPS can find both proofs of safety and counter-examples to safety (i.e., inputs that trigger bugs), and features a novel two-layered search strategy that renders it particularly efficient at finding bugs in programs featuring long, input-dependent error paths. To make GPS refutationally complete (in the sense that it will find an error if one exists, if it is allotted enough time), we introduce an instrumentation technique and show that it helps GPS achieve refutation-completeness without sacrificing overall performance. We benchmarked GPS on a suite of benchmarks including both programs from the Software Verification Competition (SV-COMP) and from prior literature, and found that our implementation of GPS outperforms state-of-the-art software model checkers (including the top performers in SV-COMP ReachSafety-Loops category), both in terms of the number of benchmarks solved and in terms of running time.
comment: Extended version of paper in OOPSLA 2025 (with updated evaluations section compared to v1 manuscript). 37 pages