MyArxiv Feed
High Energy Physics - Experimental
☆ Extraction of ground-state nuclear deformations from ultra-relativistic heavy-ion collisions: Nuclear structure physics context
The collective-flow-assisted nuclear shape-imaging method in ultra-relativistic heavy-ion collisions has recently been used to characterize nuclear collective states. In this paper, we assess the foundations of the shape-imaging technique employed in these studies. We conclude that, on the whole, the discussion regarding low-energy nuclear physics is confusing and the suggested impact on nuclear structure research is overstated. Conversely, efforts to incorporate existing knowledge on nuclear shapes into analysis pipelines can be beneficial for benchmarking tools and calibrating models used to extract information from ultra-relativistic heavy-ion experiments.
comment: Comments welcome
☆ Subthreshold poles in electron-positron annihilation. $\rm D_s^+\rm D_s^-$ final state
We present evidence of a subthreshold pole in the $e^+ e^-\rightarrow \rm D_s^+\rm D_s^-$ process, which can be interpreted as an excited (2p) state of the $\rm D_s^+\rm D_s^-$ molecule with a binding energy of (28.1$^{+12.1}_{-9.3}$)~MeV.
comment: 3 pages, 2 figures
☆ Electromagnetic energy calibration of the SoLid detector with horizontal muons
SoLid is a very-short-baseline neutrino experiment searching for active-to-sterile oscillations of reactor antineutrinos. The detection principle is based on the pairing of two types of solid scintillators, polyvinyl toluene and $^6$Li:ZnS(Ag), a technology new to this field of physics. In addition to good neutron-gamma discrimination, this setup allows the detector to be highly segmented; the basic detection unit is a 5 cm cube. High segmentation provides numerous advantages, including precise localisation of the Inverse Beta Decay (IBD) products, the derivation of an antineutrino energy estimator based on the isolated positron energy, and a powerful background reduction tool that relies on the topological signature of the signal. The system is read out by a network of WLS fibres coupled to photosensors. A relative electromagnetic calibration is performed with horizontal cosmic muons. This source poses the simplest calibration problem, in which a single detection unit is involved. In addition, large muon energy deposits allow us to perform a calibration at the most detailed level (i.e. per fibre) and to accurately define the fraction of energy escaping to neighbouring detection cells. A statistical precision at the sub-percent level is reached. The paper also discusses two methods to calibrate the absolute energy scale. The first method relies on horizontal muons, though the precision is limited to around 10\% because of the uncertainty in the energy distribution of such muons. A novel, alternative method based on a radioactive AmBe source is proposed. It takes advantage of the electron-positron pair-production process and provides a calibration point at 3.4 MeV (i.e. in the core of the IBD positron spectrum). The paper concludes with various cross-checks, including a determination of the energy spectrum of the standard cosmogenic background candle, $^{12}$B.
comment: 23 pages, 23 figures
☆ Recasting Experimental Constraints on Relativistic Magnetic Monopoles
Magnetic monopoles with masses up to $10^{14}$ GeV can be accelerated to relativistic velocities in Galactic and intergalactic magnetic fields. The cosmic flux of relativistic monopoles is constrained by various experiments, with the limits given as functions of the monopole velocity (Lorentz factor) at the detectors. The velocity, however, is usually treated as a free parameter due to the ambiguity in the computation of the acceleration before the monopoles arrive at Earth. We explicitly evaluate the velocity by exploiting recent studies on cosmic magnetic fields and the monopole acceleration therein, to recast experimental limits in terms of the mass of monopoles. By applying our method to various terrestrial experiments, including the Pierre Auger Observatory, IceCube, MACRO, and the upcoming Cherenkov Telescope Array Observatory, as well as to astrophysical constraints, we report limits on the flux of monopoles for a wide range of monopole masses. We also highlight the role of monopoles as messengers of cosmic magnetic fields, and discuss the possibility of using monopole experiments to probe intergalactic magnetic fields.
comment: 14 pages, 9 figures
Search for heavy pseudoscalar and scalar bosons decaying to a top quark pair in proton-proton collisions at $\sqrt{s}$ = 13 TeV
A search for pseudoscalar or scalar bosons decaying to a top quark pair ($\mathrm{t\bar{t}}$) in final states with one or two charged leptons is presented. The analyzed proton-proton collision data were recorded at $\sqrt{s}$ = 13 TeV by the CMS experiment at the CERN LHC and correspond to an integrated luminosity of 138 fb$^{-1}$. The invariant mass $m_\mathrm{t\bar{t}}$ of the reconstructed $\mathrm{t\bar{t}}$ system and variables sensitive to its spin and parity are used to discriminate against the standard model $\mathrm{t\bar{t}}$ background. Interference between pseudoscalar or scalar boson production and the standard model $\mathrm{t\bar{t}}$ continuum is included, leading to peak-dip structures in the $m_\mathrm{t\bar{t}}$ distribution. An excess of the data above the background prediction, based on perturbative quantum chromodynamics (QCD) calculations, is observed near the kinematic $\mathrm{t\bar{t}}$ production threshold, while good agreement is found for high $m_\mathrm{t\bar{t}}$. The data are consistent with the background prediction if the contribution from the production of a color-singlet ${}^1\mathrm{S}_0^{[1]}$ $\mathrm{t\bar{t}}$ quasi-bound state $\eta_\mathrm{t}$, predicted by nonrelativistic QCD, is added. Upper limits at 95% confidence level are set on the coupling between the pseudoscalar or scalar bosons and the top quark for boson masses in the range 365$-$1000 GeV, relative widths between 0.5 and 25%, and two background scenarios with or without $\eta_\mathrm{t}$ contribution.
comment: Submitted to Reports on Progress in Physics. All figures and tables can be found at http://cms-results.web.cern.ch/cms-results/public-results/publications/HIG-22-013 (CMS Public Pages)
☆ Weak decays of $B_s$ meson in self-consistent covariant light-front approach
We present a comprehensive study of weak transition form factors and decay observables of the $B_s$ meson in transitions to pseudoscalar ($P$) and vector ($V$) mesons. The $B_s \to P(V)$ form factors are calculated using the self-consistent covariant light-front quark model, with a model-independent $z$-series expansion calibrated to lattice QCD and phenomenological inputs for quark masses and $\beta$ parameters. This enables reliable $q^2$-resolved predictions across the full kinematic range. Based on these form factors, we predict branching ratios and angular observables, including forward-backward asymmetries, polarization fractions, and leptonic convexity parameters for semileptonic decays. Nonleptonic two-body decay rates for $B_s \to PP$ and $B_s \to PV$ modes are also computed within the standard factorization framework using the same dynamical input. Comparisons with results from lattice QCD, light-cone sum rules, and other approaches are presented, highlighting both theoretical consistency and persistent tensions with experiment.
comment: 40 pages, 10 tables, 11 figures
Real-Time Graph-based Point Cloud Networks on FPGAs via Stall-Free Deep Pipelining
Graph-based Point Cloud Networks (PCNs) are powerful tools for processing sparse sensor data with irregular geometries, as found in high-energy physics detectors. However, deploying models in such environments remains challenging due to stringent real-time requirements for both latency and throughput. In this work, we present a deeply pipelined dataflow architecture for executing graph-based PCNs on FPGAs. Our method supports efficient processing of dynamic, sparse point clouds while meeting hard real-time constraints. We introduce specialized processing elements for core graph operations, such as GraVNet convolution and condensation point clustering, and demonstrate our design on the AMD Versal VCK190. Compared to a GPU baseline, our FPGA implementation achieves up to 5.25x speedup in throughput while maintaining latencies below 10 $\mu$s, satisfying the demands of real-time trigger systems in particle physics experiments. An open-source reference implementation is provided.
comment: Accepted to IEEE SBCCI 2025
Observation of the decays $B^{+} \to \Sigma_{c}(2455)^{++} \overline{\Xi}_{c}^{-}$ and $B^{0} \to \Sigma_{c}(2455)^{0} \overline{\Xi}_{c}^{0}$
We report the first observation of the two-body baryonic decays $B^{+} \to \Sigma_{c}(2455)^{++} \overline{\Xi}_{c}^{-}$ and $B^{0} \to \Sigma_{c}(2455)^{0} \overline{\Xi}_{c}^{0}$ with significances of $7.3\,\sigma$ and $6.2\,\sigma$, respectively, including statistical and systematic uncertainties. The branching fractions are measured to be $\mathcal{B}(B^{+} \to \Sigma_{c}(2455)^{++} \overline{\Xi}_{c}^{-}) = (5.74 \pm 1.11 \pm 0.42_{-1.53}^{+2.47}) \times 10^{-4}$ and $\mathcal{B}(B^{0} \to \Sigma_{c}(2455)^{0} \overline{\Xi}_{c}^{0}) = (4.83 \pm 1.12 \pm 0.37_{-0.60}^{+0.72}) \times 10^{-4}$. The first and second uncertainties are statistical and systematic, respectively, while the third ones arise from the absolute branching fractions of $\overline{\Xi}_{c}^{-}$ or $\overline{\Xi}_{c}^{0}$ decays. The data samples used for this analysis have integrated luminosities of 711~$\mathrm{fb}^{-1}$ and 365~$\mathrm{fb}^{-1}$, and were collected at the $\Upsilon(4S)$ resonance by the Belle and Belle~II detectors operating at the KEKB and SuperKEKB asymmetric-energy $e^{+}e^{-}$ colliders, respectively.
☆ Recent studies on heavy-flavor femtoscopy in heavy-ion collisions by STAR
In the initial stage of nucleus-nucleus collisions, heavy quarks are produced in hard partonic scatterings, so they participate in the entire evolution of the heavy-ion collision. During hadronization, different types of hadrons are produced, including D mesons and light-flavoured hadrons such as pions ($\pi$), kaons ($K$), and protons ($p$). The interactions observed between these hadrons depend on the size of the collision system: for example, hadron re-scattering, suppression of charm quarks, and collective effects are weak or absent in the $p+p$ system in comparison to the $Au+Au$ or $Pb+Pb$ collision systems. Femtoscopy is one of the most significant and unique tools for examining the final-state interactions between correlated pairs of particles at low momentum in the pair rest frame. It is also possible to explore the size and geometry of the emission source through measurements of femtoscopic correlation functions. Here we report studies of correlations between neutral charmed mesons ($D^0$ / $\overline{D^0}$) and identified charged hadrons ($\pi^{\pm}$, $K^{\pm}$, $p^{\pm}$) in Au+Au collisions at the STAR experiment using the femtoscopy technique. This is the first measurement of heavy-flavor femtoscopy in heavy-ion collisions at 200 GeV. The STAR results can provide valuable insights into the interactions between $D^{0}/\overline{D^0}$-$\pi^{\pm}$, $D^{0}/\overline{D^0}$-$K^{\pm}$ and $D^{0}/\overline{D^0}$-$p^{\pm}$ pairs during the hadronic phase. $D^0 (\overline{D^0})$ mesons are reconstructed via the $K^{\mp}-{\pi}^{\pm}$ decay channel using topological criteria enabled by the HFT (Heavy Flavor Tracker) detector with excellent track pointing resolution. These proceedings present a comparison between the STAR results and theory predictions obtained with the NLO-HMChPT (Next-to-Leading Order Heavy Meson Chiral Perturbation Theory) scheme, and discuss the associated physics implications.
comment: 6 pages, 2 figures, Zimanyi School 2024
☆ Replacing detector simulation with heterogeneous GNNs in flavour physics analyses
Driven by the increasing volume of recorded data, the demand for simulation from experiments based at the Large Hadron Collider will rise sharply in the coming years. Addressing this demand solely with existing computationally intensive workflows is not feasible. This paper introduces a new fast simulation tool designed to address this demand at the LHCb experiment. This tool emulates the detector response to arbitrary multibody decay topologies at LHCb. Rather than memorising specific decay channels, the model learns generalisable patterns within the response, allowing it to interpolate to channels not present in the training data. Novel heterogeneous graph neural network architectures are employed that are designed to embed the physical characteristics of the task directly into the network structure. We demonstrate the performance of the tool across a range of decay topologies, showing the networks can correctly model the relationships between complex variables. The architectures and methods presented are generic and could readily be adapted to emulate workflows at other simulation-intensive particle physics experiments.
Measurement of the $ D^{0}\rightarrow K^{-}\pi^{+}e^{+}e^{-} $ branching fraction and search for $ D^{0}\rightarrow \pi^{+}\pi^{-}e^{+}e^{-} $ and $D^{0}\rightarrow K^{+}K^{-}e^{+}e^{-} $ decays at Belle
We present a study of the rare charm meson decays $ D^{0}\rightarrow K^{+}K^{-}e^{+}e^{-} $, $ \pi^{+}\pi^{-}e^{+}e^{-} $, and $ K^{-}\pi^{+}e^{+}e^{-} $ using a 942 fb$^{-1}$ data set collected by the Belle detector at the KEKB asymmetric-energy $ e^{+}e^{-} $ collider. We use $ D^{0} $ candidates identified by the charge of the pion in $ D^{*} \rightarrow D^{0} \pi $ decays and normalize the branching fractions to $ D^{0} \rightarrow K^{-}\pi^{+}\pi^{-}\pi^{+} $ decays. The branching fraction for the decay $ D^{0} \rightarrow K^{-}\pi^{+}e^{+}e^{-} $ is measured to be (39.6 $\pm$ 4.5 (stat) $\pm$ 2.9 (syst)) $\times$ $10^{-7}$, with the dielectron mass in the $ \rho/\omega $ mass region $ 675 < m_{ee} < 875 $ MeV$/c^{2}$. We also search for $ D^{0}\rightarrow h^{-} h^{(\prime)+}e^{+}e^{-} $ ($ h^{(\prime)}=K,\,\pi $) decays with the dielectron mass near the $\eta$ and $\phi$ resonances, and away from these resonances for the $ K^{+}K^{-}e^{+}e^{-} $ and $ \pi^{+}\pi^{-}e^{+}e^{-} $ modes. For these modes, we find no significant signals and set 90$\%$ confidence level upper limits on their branching fractions at the $\mathcal{O}$(10$^{-7}$) level.
Cross sections of $\eta$ mesons in $p$$+$$p$ collisions at forward rapidity at $\sqrt{s}=500$ GeV and central rapidity at $\sqrt{s}=510$ GeV
We present the first measurements of the forward and midrapidity $\eta$-meson cross sections from $p$$+$$p$ collisions at $\sqrt{s}=500$ and $510$~GeV, respectively. We also report the midrapidity $\eta/\pi^0$ ratio at 510 GeV. The forward cross section is measured differentially in $\eta$-meson transverse momentum ($p_T$) from 1.0 to 6.5~GeV/$c$ for pseudorapidity $3.0<|\eta|<3.8$. The midrapidity cross section is measured from 3.5 to 44 GeV/$c$ for pseudorapidity $|\eta|<0.35$. Both cross sections serve as critical inputs to an updated global analysis of the $\eta$-meson fragmentation functions.
comment: 500 authors from 81 institutions, 14 pages, 7 figures, 3 tables. v1 is version submitted to Physical Review D. HEPdata tables for the points plotted in figures for this and previous PHENIX publications are (or will be) publicly available at http://www.phenix.bnl.gov/papers.html
☆ More visualisation of decay-time-dependent asymmetries in multibody B-meson decays
Methods have been proposed recently to weight data in order to allow visualisation of CP-violation effects in transitions of neutral B mesons to multibody final states that are not CP-eigenstates. These are useful since integration of the unweighted data over the phase space would otherwise wash out the effects of interest. A similar method, elaborated upon here, with a different weighting function can also be used to visualise CP-conserving $B_{(s)}^0$--$\overline{B}{}_{(s)}^0$ oscillations, rather than the CP-violating asymmetries. Together with the other weighting functions, this new method could be useful, for example, to demonstrate the accuracy of the calibration of the flavour tagging algorithms that are crucial for analyses such as the measurement of the CP-violating phase in $B_s^0 \to J/\!\psi\phi$ decays. Their application to the formalism in common use for such decays is explicitly demonstrated.
comment: Extension of method presented in arXiv:2401.13473 by same authors. 14 pages, 3 figures
☆ Modelling top-quark decays in $t\bar{t}t\bar{t}$ production at the LHC
We compare the fixed-order NLO QCD predictions for the $pp\to t\bar{t}t\bar{t}+X$ process in the $4\ell$ decay channel with the parton-shower based results obtained with the POWHEG and MC@NLO matching methods. In the first case, NLO QCD corrections are consistently included in both the $t\bar{t}t\bar{t}$ production step and the decays of the four top quarks, preserving all spin correlations. In the second approach, higher-order effects in top-quark decays with approximate spin correlations are simulated in the PYTHIA parton-shower framework. Additionally, we analyse the impact of including the so-called matrix element corrections in top-quark decays in both parton-shower matched predictions. The comparison is performed at the integrated and differential fiducial cross-section level for the LHC centre-of-mass energy of $\sqrt{s}=13.6$ TeV.
comment: 22 pages, 5 figures, 1 table
♻ ☆ Dispersive Determination of Nucleon Gravitational Form Factors
Being closely connected to the origin of the nucleon mass, the gravitational form factors of the nucleon have attracted significant attention in recent years. We present the first model-independent determinations of the gravitational form factors of the pion and nucleon at the physical pion mass, using a data-driven dispersive approach. The so-called "last global unknown property" of the nucleon, the $D$-term, is determined to be $-3.38^{+0.34}_{-0.35}$. The root mean square radius of the scalar trace density inside the nucleon is determined to be $(0.97 \pm0.03)~\text{fm}$. Notably, this value is larger than the proton charge radius, suggesting a modern structural view of the nucleon where gluons, responsible for most of the nucleon mass, are distributed over a larger spatial region than quarks, which dominate the charge distribution, indicating that the radius of the trace density may be regarded as a confinement radius. We also predict the nucleon angular momentum and mechanical radii, providing further insights into the intricate internal structure of the nucleon.
comment: v2: 10 pages, 6 figures, 1 table. Adopted a more conservative error analysis: minor shifts in results (central values unchanged). Expanded supplementary material: added derivations of GFF unitarity relations and input discussions. Clarifications of mass radius added, title changed, main conclusions unchanged
♻ ☆ Precision measurement of the longitudinal double-spin asymmetry for dijet production at intermediate pseudorapidity in polarized $pp$ collisions at $\sqrt{s}$ = 200 GeV
The STAR Collaboration reports precise measurements of the longitudinal double-spin asymmetry, $A_{LL}$, for dijet production with at least one jet at intermediate pseudorapidity $0.8 < \eta_{\rm jet} < 1.8$ in polarized proton-proton collisions at a center-of-mass energy of 200 GeV. This study explores partons scattered with a longitudinal momentum fraction ($x$) from 0.01 to 0.5, which are predominantly characterized by interactions between high-$x$ valence quarks and low-$x$ gluons. The results are in good agreement with previous measurements at 200 GeV with improved precision and are found to be consistent with the predictions of global analyses that find the gluon polarization to be positive. In contrast, the negative gluon polarization solution from the JAM Collaboration is found to be strongly disfavored.
comment: 17 pages, 9 figures
♻ ☆ Scalable Multi-Task Learning for Particle Collision Event Reconstruction with Heterogeneous Graph Neural Networks
The growing luminosity frontier at the Large Hadron Collider is challenging the reconstruction and analysis of particle collision events. Increased particle multiplicities are straining latency and storage requirements at the data acquisition stage, while new complications are emerging, including higher background levels and more frequent particle vertex misassociations. This in turn necessitates the development of more holistic and scalable reconstruction methods that take advantage of recent advances in machine learning. We propose a novel Heterogeneous Graph Neural Network (HGNN) architecture featuring unique representations for diverse particle collision relationships and integrated graph pruning layers for scalability. Trained with a multi-task paradigm in an environment mimicking the LHCb experiment, this HGNN significantly improves beauty hadron reconstruction performance. Notably, it concurrently performs particle vertex association and graph pruning within a single framework. We quantify reconstruction and pruning performance, demonstrate enhanced inference time scaling with event complexity, and mitigate potential performance loss using a weighted message passing scheme.
comment: 21 pages, 10 figures, 4 tables (planned submission to Machine Learning Science and Technology)
High Energy Physics - Phenomenology
☆ Extraction of ground-state nuclear deformations from ultra-relativistic heavy-ion collisions: Nuclear structure physics context
The collective-flow-assisted nuclear shape-imaging method in ultra-relativistic heavy-ion collisions has recently been used to characterize nuclear collective states. In this paper, we assess the foundations of the shape-imaging technique employed in these studies. We conclude that, on the whole, the discussion regarding low-energy nuclear physics is confusing and the suggested impact on nuclear structure research is overstated. Conversely, efforts to incorporate existing knowledge on nuclear shapes into analysis pipelines can be beneficial for benchmarking tools and calibrating models used to extract information from ultra-relativistic heavy-ion experiments.
comment: Comments welcome
☆ Interacting Scalar Fields as Dark Energy and Dark Matter in Einstein scalar Gauss Bonnet Gravity
A Gauss-Bonnet (GB) coupled scalar field $\phi$, responsible for the late-time cosmic acceleration and interacting with a coherent scalar field $\psi$ through an interaction potential $W(\phi,\psi)$, is considered from the point of view of particle physics for two different models. The non-minimal coupling between the GB curvature term and the field $\phi$ leads to a time-dependent speed of gravitational waves (GWs), which is fixed to unity in order to be consistent with current GW observations, rendering the GB coupling function model-independent. We investigate the dynamical stability of the system by formulating it in the form of an autonomous system, and constrain the model parameters using various sets of observational data. We find that both models are physically viable and closely follow the $\Lambda$CDM trend. We also use the updated Roman mock high redshift data to further constrain the parameters of the two models.
comment: 17 pages, 10 figures, 4 tables
☆ Subthreshold poles in electron-positron annihilation. $\rm D_s^+\rm D_s^-$ final state
We present evidence of a subthreshold pole in the $e^+ e^-\rightarrow \rm D_s^+\rm D_s^-$ process, which can be interpreted as an excited (2p) state of the $\rm D_s^+\rm D_s^-$ molecule with a binding energy of (28.1$^{+12.1}_{-9.3}$)~MeV.
comment: 3 pages, 2 figures
☆ Recasting Experimental Constraints on Relativistic Magnetic Monopoles
Magnetic monopoles with masses up to $10^{14}$ GeV can be accelerated to relativistic velocities in Galactic and intergalactic magnetic fields. The cosmic flux of relativistic monopoles is constrained by various experiments, with the limits given as functions of the monopole velocity (Lorentz factor) at the detectors. The velocity, however, is usually treated as a free parameter due to the ambiguity in the computation of the acceleration before the monopoles arrive at Earth. We explicitly evaluate the velocity by exploiting recent studies on cosmic magnetic fields and the monopole acceleration therein, to recast experimental limits in terms of the mass of monopoles. By applying our method to various terrestrial experiments, including the Pierre Auger Observatory, IceCube, MACRO, and the upcoming Cherenkov Telescope Array Observatory, as well as to astrophysical constraints, we report limits on the flux of monopoles for a wide range of monopole masses. We also highlight the role of monopoles as messengers of cosmic magnetic fields, and discuss the possibility of using monopole experiments to probe intergalactic magnetic fields.
comment: 14 pages, 9 figures
☆ Science of the LISA mission: A Summary for the European Strategy for Particle Physics
The LISA mission is an international collaboration between ESA, its member states, and NASA, for the detection of gravitational waves from space. It was adopted in January 2024 and is scheduled for launch in the mid-2030s. It will be a constellation of three identical spacecraft forming a near-equilateral triangle in a heliocentric orbit, exchanging laser beams along arms $2.5 \cdot 10^6$ km long. Laser interferometry is used to track separations between test masses, thus measuring spacetime strain variations as a function of time. LISA Science Objectives tackle many open questions in astrophysics, fundamental physics and cosmology, including ESA's Cosmic Vision questions "What are the fundamental laws of the universe?" and "How did the universe originate and of what is it made?". In this contribution, based on the LISA Red Book, we present a summary of the LISA Science Objectives relevant for the European Strategy for Particle Physics.
comment: 18 pages, 5 figures, input to the European Strategy for Particle Physics - 2026 update. arXiv admin note: text overlap with arXiv:2402.07571
☆ Weak decays of $B_s$ meson in self-consistent covariant light-front approach
We present a comprehensive study of weak transition form factors and decay observables of the $B_s$ meson in transitions to pseudoscalar ($P$) and vector ($V$) mesons. The $B_s \to P(V)$ form factors are calculated using the self-consistent covariant light-front quark model, with a model-independent $z$-series expansion calibrated to lattice QCD and phenomenological inputs for quark masses and $\beta$ parameters. This enables reliable $q^2$-resolved predictions across the full kinematic range. Based on these form factors, we predict branching ratios and angular observables, including forward-backward asymmetries, polarization fractions, and leptonic convexity parameters for semileptonic decays. Nonleptonic two-body decay rates for $B_s \to PP$ and $B_s \to PV$ modes are also computed within the standard factorization framework using the same dynamical input. Comparisons with results from lattice QCD, light-cone sum rules, and other approaches are presented, highlighting both theoretical consistency and persistent tensions with experiment.
comment: 40 pages, 10 tables, 11 figures
☆ Spin alignment of vector mesons in heavy-ion collision
We give a brief review of the spin alignment induced by the strong field and the shear-stress tensor. In experiments, a significant positive deviation of the spin density matrix element $\rho_{00}$ from 1/3 is observed for the $\phi$ meson, which can be explained by the anisotropy of the strong-field fluctuations in the meson's rest frame; this anisotropy is mainly a consequence of the motion of the meson relative to the quark-gluon plasma. On the other hand, the shear-induced spin alignment is of order $10^{-5}$ to $10^{-4}$ if the magnitude of the thermal shear tensor is $10^{-2}$.
comment: 4 pages, 2 figures; proceedings of the 10th Asian Triangle Heavy-Ion Conference (ATHIC 2025), Berhampur, Odisha, India, 13-16 Jan. 2025
☆ Connecting the Quenched $g_A$ in Nuclear Matter To Dense Compact-Star Matter
An argument is developed that the long-standing mystery in nuclear physics of the effective axial-current coupling constant in nuclei, $g_A^{\rm eff}\approx 1$, could be understood in terms of the mechanism referred to as "pseudo-conformal sound speed" in dense compact-star matter, $v_{\rm pcs}^2/c^2\approx 1/3$. Both pros and cons are presented using an effective field theory anchored on a renormalization-group approach to interacting baryons on the Fermi surface, which enables one to go beyond Weinberg's highly successful EFT $\chi$EFT$_\pi$ with the pion field only (in nuclear medium) by implementing heavy-meson degrees of freedom. Both hidden local symmetry and hidden scale symmetry, the former involving the vector mesons $\rho$ and $\omega$ and the latter involving the hidden scalar meson, a dilaton $\hat{\sigma}$ ($f_0(500)$), play a crucial role. Going beyond the density regime applicable to normal nuclear matter $n_0$, the notion of "hadron-quark continuity" is brought in via the topological structure of the nucleon, i.e., the skyrmion, considered to be valid in QCD in the large-$N_c$ limit. The new inputs for the argumentation are the large-$N$ limit of the Grassmannian model for hidden local symmetry and the IR fixed point in QCD for $N_f \leq 3$ involving a "genuine/QCD-conformal dilaton" for hidden scale symmetry.
comment: 11 pages, 3 figures
☆ Eta Fragmentation Functions Revisited
We revisit the extraction of parton-to-eta meson fragmentation functions at next-to-leading order accuracy in QCD in the light of the recent hadroproduction measurements in proton-proton collisions obtained by the PHENIX, LHCb, and ALICE collaborations. In addition to an increased precision, the data explore complementary rapidity ranges and center-of-mass system energies. The analysis exploits the theoretical scale dependence to ease tensions among the data sets at different energies that are potentially caused by QCD corrections beyond the next-to-leading order. The resulting set of fragmentation functions yields a consistent description of all available data. Estimates of uncertainties are obtained with the Monte Carlo replica method.
comment: 10 pages, 6 figures
☆ Effect of Off-diagonal NSI Parameters on Entanglement Measurements in Neutrino Oscillations
In this work, we explore the influence of off-diagonal non-standard interaction (NSI) parameters on quantum entanglement within the three-flavor neutrino oscillation framework. By expressing three key entanglement measures (Entanglement of Formation (EOF), Concurrence, and Negativity) in terms of oscillation probabilities, we analyze how these quantum correlations are affected by the NSI parameters $\epsilon_{e\mu}$, $\epsilon_{e\tau}$, and $\epsilon_{\mu\tau}$, including their complex phases. Using the DUNE experiment as a benchmark, we find that NSI effects are most significant at low energies, with Negativity showing the highest sensitivity even at high energies. It is observed that $\epsilon_{e \mu}$ and $\epsilon_{e \tau}$ affect entanglement measures mainly through the appearance channel, while the impact of $\epsilon_{\mu \tau}$ on EOF, Concurrence, and Negativity is predominantly linked to the disappearance channel.
comment: 25 pages, 8 figures, 2 tables
♻ ☆ Dispersive Determination of Nucleon Gravitational Form Factors
Being closely connected to the origin of the nucleon mass, the gravitational form factors of the nucleon have attracted significant attention in recent years. We present the first model-independent determinations of the gravitational form factors of the pion and nucleon at the physical pion mass, using a data-driven dispersive approach. The so-called "last global unknown property" of the nucleon, the $D$-term, is determined to be $-3.38^{+0.34}_{-0.35}$. The root mean square radius of the scalar trace density inside the nucleon is determined to be $(0.97 \pm0.03)~\text{fm}$. Notably, this value is larger than the proton charge radius, suggesting a modern structural view of the nucleon where gluons, responsible for most of the nucleon mass, are distributed over a larger spatial region than quarks, which dominate the charge distribution, indicating that the radius of the trace density may be regarded as a confinement radius. We also predict the nucleon angular momentum and mechanical radii, providing further insights into the intricate internal structure of the nucleon.
comment: v2: 10 pages, 6 figures, 1 table. Adopted a more conservative error analysis: minor shifts in results (central values unchanged). Expanded supplementary material: added derivations of GFF unitarity relations and input discussions. Clarifications of mass radius added, title changed, main conclusions unchanged
♻ ☆ Particle Production Scenario in an Algebraically Coupled Quintessence Field with a Dark Matter Fluid
We investigate the dynamics of an algebraically coupled quintessence field with a dark matter fluid, focusing on particle production through the action principle via a modified interaction Lagrangian. The interaction parameter serves as the source of dark matter particle production and entropy generation. As particle creation occurs due to the interaction between the field and fluid sectors, the system exhibits additional pressure. Our analysis includes studying the system's dynamics by considering an exponential type of interaction corresponding to the field's exponential potential. We assess the system's background dynamics using the dynamical system stability technique to derive the constraints on the model parameters. Additionally, we determine the best-fit values of the model parameters against two combinations of data sets: (i) CC+Pantheon+SH0ES, and (ii) CC+Pantheon+SH0ES+SDSS BAO+ DESI BAO. By employing a comprehensive data analysis technique, we compare the evidence of our models to flat $\Lambda$CDM. Based on the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC), one of the models emerges as a robust alternative to $\Lambda$CDM when considering the joint data sets.
comment: 32 pages, 16 figures, and 5 tables. Accepted for publication in Chinese Journal of Physics
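As a reading aid for the information criteria named in the abstract above, here is a minimal sketch of the standard AIC and BIC formulas. It assumes a Gaussian likelihood, so that $-2\ln L_{\max}$ equals $\chi^2_{\min}$ up to a model-independent constant. The function names and arguments are hypothetical and are not taken from the paper.

```python
import math


def aic(chi2_min, num_params):
    """Akaike Information Criterion: chi2_min + 2k (Gaussian-likelihood assumption)."""
    return chi2_min + 2 * num_params


def bic(chi2_min, num_params, num_data):
    """Bayesian Information Criterion: chi2_min + k ln(n) (same assumption)."""
    return chi2_min + num_params * math.log(num_data)


def delta(criterion_model, criterion_reference):
    """Difference relative to a reference model; negative values favour the model."""
    return criterion_model - criterion_reference
```

Only differences between models are meaningful here, which is why the sketch includes `delta`. A comparison against flat $\Lambda$CDM would pass the reference model's criterion as the second argument.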
♻ ☆ Composite ALPs in Composite Higgs Models
We study the effective Lagrangian for the Axion-Like Particle (ALP) arising in a class of composite Higgs models. We discuss the peculiarities of the coupling to the quark sector specific to these models, including flavor-violating terms. We then confront the main bounds from rare B-meson decays.
comment: 12 pages, 1 figure. V2: Improved discussion of the RG flow; JHEP version
♻ ☆ QCD predictions for physical multimeson scattering amplitudes
We use lattice QCD calculations of the finite-volume spectra of systems of two and three mesons to determine, for the first time, three-particle scattering amplitudes with physical quark masses. Our results are for combinations of $\pi^+$ and $K^+$, at a lattice spacing $a=0.063\;$fm, and in the isospin-symmetric limit. We also obtain accurate results for maximal-isospin two-meson amplitudes, with those for $\pi^+ K^+$ and $2K^+$ being the first determinations at the physical point. Dense lattice spectra are obtained using the stochastic Laplacian-Heaviside method, and the analysis leading to scattering amplitudes is done using the relativistic finite-volume formalism. Results are compared to chiral perturbation theory and to phenomenological fits to experimental data, finding good agreement.
comment: 7 pages, 4 figures, 2 tables. v3: minor updates in figs 3,4. Matches published version
♻ ☆ What is the Quark-Gluon Plasma made of?
This article surveys our present understanding of the internal structure of the fully developed quark-gluon plasma at temperatures outside the crossover region. The theoretical part of the review covers perturbative and nonperturbative approaches to quark-gluon plasma structure, in particular, hard-thermal loop effective theory, lattice QCD and the functional renormalization group. The phenomenological part of the review scrutinizes the information that has been derived from bulk observables and hard probes in relativistic heavy ion collisions in terms of how it informs our knowledge about the structure of the quark-gluon plasma. The final section lists possible avenues for future progress.
comment: To appear in the book (edited volume) {\it Quark Gluon Plasma at Fifty: A Commemorative Review}
♻ ☆ Looking for Black Hole Morsels in Astrophysical Mergers via Hawking Radiation
Gravitational wave observation has provided numerous insights into the merger of astrophysical black holes. In contrast to other violent events (e.g. supernovae), they are, however, not expected to lead to significant emissions of photons and neutrinos. In this paper we discuss a scenario that would lead to characteristic observable gamma ray bursts, which would provide numerous hints to physics beyond General Relativity. Starting from the hypothesis that micro-black holes (called morsels) are formed during the merger process, we show that it is possible to observe their Hawking radiation, which takes the form of gamma ray bursts of a uniquely characteristic form: with energies in the TeV range, their temporal structure is unlike that stemming from any other astrophysical event. Notably, the time delay from the gravitational wave event is correlated to the mass distribution of the morsels. The integrated mass of the morsels, allowed by the unaccounted merger mass, leads to a Hawking radiation in photons that is above the sensitivity of atmospheric Cherenkov telescopes such as HESS, LHAASO and HAWC, and gamma ray space telescopes, such as Fermi-LAT. This renders the hypothesis of morsel creation experimentally testable, and we provide the first concrete bounds on the total mass of morsels formed in specific events.
comment: 12 pages, 3 figures. Extended discussion on models, bound from primordial black hole searches added to Figure 2. Accepted for publication
♻ ☆ On the Temperature Effects in QCD Axion Mass Mixing
In this work, we extend the QCD axion mass mixing in the early Universe and investigate the temperature effects in the mixing. We explore the scenario where two $Z_{\mathcal N}$ QCD axions undergo mass mixing during the QCD phase transition, yielding three distinct mixing scenarios: the mixing I, II, and III. These scenarios are realized through fine-tuning of the axion decay constants, the temperature parameters, as well as the value of $\mathcal N$. We conduct a thorough analysis of the level crossing phenomena in these three mixing scenarios, detailing the conditions under which they occur. Notably, in the mixing I and II, the level crossing precedes the critical temperature of the QCD phase transition ($T_{\rm QCD}$), with minimal non-essential discrepancies in the cosmological evolution of the mass eigenvalues at $T_{\rm QCD}$. In contrast, the mixing III exhibits a unique double level crossing, occurring both before and at $T_{\rm QCD}$. Despite superficial similarities in axion evolution between the mixing II and III, we uncover fundamental differences between them. Additionally, we briefly address the transition in energy density between the two axions within our mixing scenarios. This work contributes to a deeper understanding of the role of the QCD axion in the early Universe and its potential implications for cold dark matter.
comment: 22 pages, 7 figures. Published in AOP
♻ ☆ Regge trajectories, detectors, and distributions in the critical ${\rm O}(N)$ model
We explore light-ray operators in the critical O$(N)$ model in the large-$N$ limit, focusing on leading-twist and leading "horizontal" trajectories. We distinguish between light-ray operators in two conformal frames: detector operators, which characterize event shapes of final states, and distribution operators, which probe initial-state distributions. In particular, we identify parton distribution functions (PDFs) and collinear functions as matrix elements of appropriate distribution operators. We renormalize some simple detector operators at leading order in $1/N$, allowing us to extract the Regge intercept and the anomalous spin of the leading horizontal trajectory. We furthermore renormalize distribution versions of these operators, obtaining the leading-twist splitting function and a BFKL-type kernel, which match results from the detector frame. Finally, we show how these results can be read off from OPE data encoded in the Bethe-Salpeter resummation of conformal four-point functions.
comment: 43+20 pages, 19 figures, latex
♻ ☆ Mass Mixing between QCD Axions
We introduce a novel level crossing phenomenon in the mass mixing between the QCD axions, one canonical QCD axion and one $Z_{\mathcal N}$ axion. The level crossing can take place at or slightly before the QCD phase transition critical temperature, depending on the ratio of the axion decay constants $\sim1.69$. The cosmological evolution of the mass eigenvalues in these two scenarios is similar; however, the transition of axion energy density differs significantly. Finally, we estimate the relic density of the QCD axion dark matter in this context. Additionally, this level crossing may have some interesting cosmological implications.
comment: 12 pages, 4 figures. Published in CPC
♻ ☆ Solar System Constraints on Light Propagation from Higher Derivative Corrections to General Relativity and Implications for Fundamental Physics
While the two derivative action of gravitation is specified uniquely, higher derivative operators are also allowed with coefficients that are not specified uniquely by effective field theory. We focus on a four derivative operator in which the Riemann tensor couples directly to the electromagnetic field $a\,R_{\mu\nu\alpha\beta}F^{\mu\nu}F^{\alpha\beta}$. We compute the corresponding corrections to the Shapiro time delay in the solar system and compare this to data from the Cassini probe. We place an observational upper bound on the coefficient $a$ at 95\% confidence $|a|<26\,(1000\,\mbox{km})^2$. By way of motivation, we also compare this to a weak gravity conjecture (WGC) prediction of a bound on the coefficients $a,\,b$ of four derivative operators involving the graviton and the photon; this includes the above term $a\,R_{\mu\nu\alpha\beta}F^{\mu\nu}F^{\alpha\beta}$ as well as $b\,F^4$. We show that by using the observed value of the $b$ coefficient from measurements of light by light scattering, which arises in the Standard Model from integrating out the electron, the WGC predicted bound for $a$ is $a\lesssim 7.8\,(1000\,\mbox{km})^2$. This is consistent with the above observational bound, but is intriguingly close and can be further probed in other observations.
comment: 8 pages in double column format, 3 figures. V2: Some updates, added references
Machine Learning - Statistics
☆ QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions ICML 2025
As machine learning models grow increasingly competent, their predictions can supplement scarce or expensive data in various important domains. In support of this paradigm, algorithms have emerged to combine a small amount of high-fidelity observed data with a much larger set of imputed model outputs to estimate some quantity of interest. Yet current hybrid-inference tools target only means or single quantiles, limiting their applicability for many critical domains and use cases. We present QuEst, a principled framework to merge observed and imputed data to deliver point estimates and rigorous confidence intervals for a wide family of quantile-based distributional measures. QuEst covers a range of measures, from tail risk (CVaR) to population segments such as quartiles, that are central to fields such as economics, sociology, education, medicine, and more. We extend QuEst to multidimensional metrics, and introduce an additional optimization technique to further reduce variance in this and other hybrid estimators. We demonstrate the utility of our framework through experiments in economic modeling, opinion polling, and language model auto-evaluation.
comment: Published as a conference paper at ICML 2025
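To make the tail-risk measure mentioned in the abstract above concrete, the sketch below computes a plain empirical CVaR from observed losses only. It is not the QuEst estimator: it does not combine observed and imputed data and produces no confidence interval. The function name and the tail-size convention are assumptions chosen for illustration.

```python
import math


def empirical_cvar(losses, alpha):
    """Average of the worst (1 - alpha) fraction of the observed losses."""
    if not losses:
        raise ValueError("losses must be non-empty")
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must lie in [0, 1)")
    ordered = sorted(losses)
    # Number of tail observations; the small offset guards against
    # floating-point round-off such as (1 - 0.8) * 10 not being exactly 2.
    k = max(1, math.ceil((1.0 - alpha) * len(ordered) - 1e-12))
    tail = ordered[-k:]
    return sum(tail) / k
```

With `alpha = 0` the function reduces to the sample mean, and as `alpha` approaches 1 it approaches the sample maximum.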
☆ DICE: Discrete inverse continuity equation for learning population dynamics
We introduce the Discrete Inverse Continuity Equation (DICE) method, a generative modeling approach that learns the evolution of a stochastic process from given sample populations at a finite number of time points. Models learned with DICE capture the typically smooth and well-behaved population dynamics, rather than the dynamics of individual sample trajectories that can exhibit complex or even chaotic behavior. The DICE loss function is developed specifically to be invariant, even in discrete time, to spatially constant but time-varying spurious constants that can emerge during training; this invariance increases training stability and robustness. Generating a trajectory of sample populations with DICE is fast because samples evolve directly in the time interval over which the stochastic process is formulated, in contrast to approaches that condition on time and then require multiple sampling steps per time step. DICE is stable to train, in situations where other methods for learning population dynamics fail, and DICE generates representative samples with orders of magnitude lower costs than methods that have to condition on time. Numerical experiments on a wide range of problems from random waves, Vlasov-Poisson instabilities and high-dimensional chaos are included to justify these assertions.
☆ Distribution-dependent Generalization Bounds for Tuning Linear Regression Across Tasks
Modern regression problems often involve high-dimensional data and a careful tuning of the regularization hyperparameters is crucial to avoid overly complex models that may overfit the training data while guaranteeing desirable properties like effective variable selection. We study the recently introduced direction of tuning regularization hyperparameters in linear regression across multiple related tasks. We obtain distribution-dependent bounds on the generalization error for the validation loss when tuning the L1 and L2 coefficients, including ridge, lasso and the elastic net. In contrast, prior work develops bounds that apply uniformly to all distributions, but such bounds necessarily degrade with feature dimension, d. While these bounds are shown to be tight for worst-case distributions, our bounds improve with the "niceness" of the data distribution. Concretely, we show that under additional assumptions that instances within each task are i.i.d. draws from broad well-studied classes of distributions including sub-Gaussians, our generalization bounds do not get worse with increasing d, and are much sharper than prior work for very large d. We also extend our results to a generalization of ridge regression, where we achieve tighter bounds that take into account an estimate of the mean of the ground truth distribution.
comment: 49 pages
☆ Vecchia-Inducing-Points Full-Scale Approximations for Gaussian Processes
Gaussian processes are flexible, probabilistic, non-parametric models widely used in machine learning and statistics. However, their scalability to large data sets is limited by computational constraints. To overcome these challenges, we propose Vecchia-inducing-points full-scale (VIF) approximations combining the strengths of global inducing points and local Vecchia approximations. Vecchia approximations excel in settings with low-dimensional inputs and moderately smooth covariance functions, while inducing point methods are better suited to high-dimensional inputs and smoother covariance functions. Our VIF approach bridges these two regimes by using an efficient correlation-based neighbor-finding strategy for the Vecchia approximation of the residual process, implemented via a modified cover tree algorithm. We further extend our framework to non-Gaussian likelihoods by introducing iterative methods that substantially reduce computational costs for training and prediction by several orders of magnitude compared to Cholesky-based computations when using a Laplace approximation. In particular, we propose and compare novel preconditioners and provide theoretical convergence results. Extensive numerical experiments on simulated and real-world data sets show that VIF approximations are computationally efficient as well as more accurate and numerically stable than state-of-the-art alternatives. All methods are implemented in the open source C++ library GPBoost with high-level Python and R interfaces.
☆ Sure Convergence and Constructive Universal Approximation for Multi-Layer Neural Networks
We propose a new neural network model, 01Neuro, built on indicator activation neurons. Its boosted variant possesses two key statistical properties: (1) Sure Convergence, where model optimization can be achieved with high probability given sufficient computational resources; and (2) Constructive Universal Approximation: In the infinite sample setting, the model can approximate any finite sum of measurable functions, each depending on only k out of p input features, provided the architecture is properly tuned. Unlike most universal approximation results that are agnostic to training procedures, our guarantees are directly tied to the model's explicit construction and optimization algorithm. To improve prediction stability, we integrate stochastic training and bagging into the boosted 01Neuro framework. Empirical evaluations on simulated and real-world tabular datasets with small to medium sample sizes highlight its strengths: effective approximation of interaction components (multiplicative terms), stable prediction performance (comparable to Random Forests), robustness to many noisy features, and insensitivity to feature scaling. A major limitation of the current implementation of boosted 01Neuro is its higher computational cost, which is approximately 5 to 30 times that of Random Forests and XGBoost.
comment: 39 pages, 3 figures, 8 tables
☆ Intervening to learn and compose disentangled representations
In designing generative models, it is commonly believed that in order to learn useful latent structure, we face a fundamental tension between expressivity and structure. In this paper we challenge this view by proposing a new approach to training arbitrarily expressive generative models that simultaneously learn disentangled latent structure. This is accomplished by adding a simple decoder-only module to the head of an existing decoder block that can be arbitrarily complex. The module learns to process concept information by implicitly inverting linear representations from an encoder. Inspired by the notion of intervention in causal graphical models, our module selectively modifies its architecture during training, allowing it to learn a compact joint model over different contexts. We show how adding this module leads to disentangled representations that can be composed for out-of-distribution generation. To further validate our proposed approach, we prove a new identifiability result that extends existing work on identifying structured representations in nonlinear models.
comment: 45 pages, 14 figures
☆ Optimal Model Selection for Conformalized Robust Optimization
In decision-making under uncertainty, Contextual Robust Optimization (CRO) provides reliability by minimizing the worst-case decision loss over a prediction set, hedging against label variability. While recent advances use conformal prediction to construct prediction sets for machine learning models, the downstream decisions critically depend on model selection. This paper introduces novel model selection frameworks for CRO that unify robustness control with decision risk minimization. We first propose Conformalized Robust Optimization with Model Selection (CROMS), which automatically selects models to approximately minimize the average decision risk in CRO solutions. We develop two algorithms: E-CROMS, which is computationally efficient, and F-CROMS, which enjoys a marginal robustness guarantee in finite samples. Further, we introduce Conformalized Robust Optimization with Individualized Model Selection (CROiMS), which performs individualized model selection by minimizing the conditional decision risk given the covariate of test data. This framework advances conformal prediction methodology by enabling covariate-aware model selection. Theoretically, CROiMS achieves asymptotic conditional robustness and decision efficiency under mild assumptions. Numerical results demonstrate significant improvements in decision efficiency and robustness across diverse synthetic and real-world applications, outperforming baseline approaches.
☆ Mutual Information Optimal Control of Discrete-Time Linear Systems
In this paper, we formulate a mutual information optimal control problem (MIOCP) for discrete-time linear systems. This problem can be regarded as an extension of a maximum entropy optimal control problem (MEOCP). Unlike the MEOCP, where the prior is fixed to the uniform distribution, the MIOCP optimizes the policy and prior simultaneously. As analytical results, under the policy and prior classes consisting of Gaussian distributions, we derive the optimal policy and prior of the MIOCP with the prior and policy fixed, respectively. Using the results, we propose an alternating minimization algorithm for the MIOCP. Through numerical experiments, we discuss how our proposed algorithm works.
☆ A General Class of Model-Free Dense Precision Matrix Estimators
We introduce prototype consistent model-free, dense precision matrix estimators that have broad application in economics. Using quadratic form concentration inequalities and novel algebraic characterizations of confounding dimension reductions, we are able to: (i) obtain non-asymptotic bounds for precision matrix estimation errors and also (ii) consistency in high dimensions; (iii) uncover the existence of an intrinsic signal-to-noise -- underlying dimensions tradeoff; and (iv) avoid exact population sparsity assumptions. In addition to its desirable theoretical properties, a thorough empirical study of the S&P 500 index shows that a tuning parameter-free special case of our general estimator exhibits a doubly ascending Sharpe Ratio pattern, thereby establishing a link with the famous double descent phenomenon dominantly present in recent statistical and machine learning literature.
♻ ☆ Role of scrambling and noise in temporal information processing with quantum systems
Scrambling quantum systems have attracted attention as effective substrates for temporal information processing. Here we consider a quantum reservoir processing framework that captures a broad range of physical computing models with quantum systems. We examine the scalability and memory retention of the model with scrambling reservoirs modelled by high-order unitary designs in both noiseless and noisy settings. In the former regime, we show that measurement readouts become exponentially concentrated with increasing reservoir size, yet strikingly do not worsen with the reservoir iterations. Thus, while repeatedly reusing a small scrambling reservoir with quantum data might be viable, scaling up the problem size deteriorates generalization unless one can afford an exponential shot overhead. In contrast, the memory of early inputs and initial states decays exponentially in both reservoir size and reservoir iterations. In the noisy regime, we also prove that memory decays exponentially in time for local noisy channels. These results required us to introduce new proof techniques for bounding concentration in temporal quantum models.
comment: 11+41 pages, 6+6 figures, 1 table
♻ ☆ Distributional Diffusion Models with Scoring Rules
Diffusion models generate high-quality synthetic data. They operate by defining a continuous-time forward process which gradually adds Gaussian noise to data until fully corrupted. The corresponding reverse process progressively "denoises" a Gaussian sample into a sample from the data distribution. However, generating high-quality outputs requires many discretization steps to obtain a faithful approximation of the reverse process. This is expensive and has motivated the development of many acceleration methods. We propose to accomplish sample generation by learning the posterior {\em distribution} of clean data samples given their noisy versions, instead of only the mean of this distribution. This allows us to sample from the probability transitions of the reverse process on a coarse time scale, significantly accelerating inference with minimal degradation of the quality of the output. This is accomplished by replacing the standard regression loss used to estimate conditional means with a scoring rule. We validate our method on image and robot trajectory generation, where we consistently outperform standard diffusion models at few discretization steps.
♻ ☆ A dimensionality reduction technique based on the Gromov-Wasserstein distance
Analyzing relationships between objects is a pivotal problem within data science. In this context, dimensionality reduction (DR) techniques are employed to generate smaller and more manageable data representations. This paper proposes a new method for dimensionality reduction, based on optimal transportation theory and the Gromov-Wasserstein distance. We offer a new probabilistic view of the classical Multidimensional Scaling (MDS) algorithm and the nonlinear dimensionality reduction algorithm, Isomap (Isometric Mapping or Isometric Feature Mapping) that extends the classical MDS, in which we use the Gromov-Wasserstein distance between the probability measure of high-dimensional data, and its low-dimensional representation. Through gradient descent, our method embeds high-dimensional data into a lower-dimensional space, providing a robust and efficient solution for analyzing complex high-dimensional datasets.
comment: This is a supplementary material for the paper, published as a conference paper at the 7th International Conference on Geometric Information Science - GSI'25
♻ ☆ Computation-Aware Gaussian Processes: Model Selection And Linear-Time Inference NeurIPS 2024
Model selection in Gaussian processes scales prohibitively with the size of the training dataset, both in time and memory. While many approximations exist, all incur inevitable approximation error. Recent work accounts for this error in the form of computational uncertainty, which enables -- at the cost of quadratic complexity -- an explicit tradeoff between computation and precision. Here we extend this development to model selection, which requires significant enhancements to the existing approach, including linear-time scaling in the size of the dataset. We propose a novel training loss for hyperparameter optimization and demonstrate empirically that the resulting method can outperform SGPR, CGGP and SVGP, state-of-the-art methods for GP model selection, on medium to large-scale datasets. Our experiments show that model selection for computation-aware GPs trained on 1.8 million data points can be done within a few hours on a single GPU. As a result of this work, Gaussian processes can be trained on large-scale datasets without significantly compromising their ability to quantify uncertainty -- a fundamental prerequisite for optimal decision-making.
comment: Advances in Neural Information Processing Systems (NeurIPS 2024)
♻ ☆ Follow-the-Perturbed-Leader Approaches Best-of-Both-Worlds for the m-Set Semi-Bandit Problems
We consider a common case of the combinatorial semi-bandit problem, the $m$-set semi-bandit, where the learner selects exactly $m$ arms from the total $d$ arms. In the adversarial setting, the best regret bound, known to be $\mathcal{O}(\sqrt{nmd})$ for time horizon $n$, is achieved by the well-known Follow-the-Regularized-Leader (FTRL) policy. However, this requires explicitly computing the arm-selection probabilities by solving optimization problems at each time step and sampling according to them. This problem can be avoided by the Follow-the-Perturbed-Leader (FTPL) policy, which simply pulls the $m$ arms that rank among the $m$ smallest (estimated) losses with random perturbation. In this paper, we show that FTPL with a Fr\'echet perturbation also enjoys the near optimal regret bound $\mathcal{O}(\sqrt{nm}(\sqrt{d\log(d)}+m^{5/6}))$ in the adversarial setting and approaches best-of-both-worlds regret bounds, i.e., achieves a logarithmic regret for the stochastic setting. Moreover, our lower bounds show that the extra factors are unavoidable with our approach; any improvement would require a fundamentally different and more challenging method.
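The arm-selection step described in the abstract above can be sketched as follows. This is a rough illustration only: the shape parameter of the Fr\'echet distribution, the learning rate `eta`, the sign convention of the perturbation, and all function names are assumptions, and the loss-estimation step needed in a full bandit algorithm is omitted.

```python
import math
import random


def sample_frechet(rng, shape=2.0):
    """Draw from a Frechet distribution with CDF exp(-x**(-shape)) by inversion."""
    u = rng.random()
    while u <= 0.0:  # avoid log(0); rng.random() never returns 1.0
        u = rng.random()
    return (-math.log(u)) ** (-1.0 / shape)


def ftpl_select(estimated_losses, m, eta, rng, shape=2.0):
    """Return the indices of the m arms with the smallest perturbed losses."""
    perturbed = [
        loss - sample_frechet(rng, shape) / eta for loss in estimated_losses
    ]
    order = sorted(range(len(perturbed)), key=perturbed.__getitem__)
    return sorted(order[:m])
```

No arm-selection probabilities are computed explicitly here. A single sort of the perturbed losses replaces the per-round optimization problem that FTRL would require.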
♻ ☆ Training-Conditional Coverage Bounds under Covariate Shift
Conformal prediction methodology has recently been extended to the covariate shift setting, where the distribution of covariates differs between training and test data. While existing results ensure that the prediction sets from these methods achieve marginal coverage above a nominal level, their coverage rate conditional on the training dataset (referred to as training-conditional coverage) remains unexplored. In this paper, we address this gap by deriving upper bounds on the tail of the training-conditional coverage distribution, offering probably approximately correct (PAC) guarantees for these methods. Our results quantify the relationship between the quality of the prediction sets and the severity of distributional changes, and can potentially be used to compute more efficient prediction sets.
♻ ☆ Model-free Posterior Sampling via Learning Rate Randomization
In this paper, we introduce Randomized Q-learning (RandQL), a novel randomized model-free algorithm for regret minimization in episodic Markov Decision Processes (MDPs). To the best of our knowledge, RandQL is the first tractable model-free posterior sampling-based algorithm. We analyze the performance of RandQL in both tabular and non-tabular metric space settings. In tabular MDPs, RandQL achieves a regret bound of order $\widetilde{O}(\sqrt{H^{5}SAT})$, where $H$ is the planning horizon, $S$ is the number of states, $A$ is the number of actions, and $T$ is the number of episodes. For a metric state-action space, RandQL enjoys a regret bound of order $\widetilde{O}(H^{5/2} T^{(d_z+1)/(d_z+2)})$, where $d_z$ denotes the zooming dimension. Notably, RandQL achieves optimistic exploration without using bonuses, relying instead on a novel idea of learning rate randomization. Our empirical study shows that RandQL outperforms existing approaches on baseline exploration environments.
comment: This revision fixed an error connected to an incorrect use of Proposition 7 inside of Lemma 4, and a misprint in Lemma 12. In the current version, we modified the martingale construction and applied the same argument as before; no results need to be modified as a result of these fixes
♻ ☆ Enhancing variational quantum algorithms by balancing training on classical and quantum hardware
Quantum computers offer a promising route to tackling problems that are classically intractable, such as prime factorization, large-scale linear algebra, and the simulation of complex quantum systems, but potentially require fault-tolerant quantum hardware. On the other hand, variational quantum algorithms (VQAs) are a promising approach for leveraging near-term quantum computers to solve complex problems. However, there remain major challenges in their trainability and resource costs on quantum hardware. Here we address these challenges by adopting Hardware Efficient and dynamical LIe algebra supported Ansatz (HELIA), and propose two training methods that combine an existing classical-enhanced g-sim method and the quantum-based Parameter-Shift Rule (PSR). Our improvement comes from distributing the resources required for gradient estimation and training to both classical and quantum hardware. We numerically evaluate our approach for ground-state estimation of 6 to 18-qubit Hamiltonians using the Variational Quantum Eigensolver (VQE) and quantum phase classification for up to 12-qubit Hamiltonians using quantum neural networks. For VQE, our method achieves higher accuracy and success rates, with an average reduction in quantum hardware calls of up to 60% compared to purely quantum-based PSR. For classification, we observe test accuracy improvements of up to 2.8%. We also numerically demonstrate the capability of HELIA in mitigating barren plateaus, paving the way for training large-scale quantum models.
comment: 37 pages, 14 figures, 6 tables, 4 algorithms
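For readers unfamiliar with the Parameter-Shift Rule mentioned in the abstract above, the following is a minimal sketch of the textbook rule for a single rotation gate, not the paper's HELIA or g-sim training methods. The single-qubit circuit and the function names `expectation` and `parameter_shift_gradient` are illustrative assumptions.

```python
import numpy as np


def expectation(theta):
    """<Z> after RY(theta) acts on |0>, via a 2-amplitude state-vector simulation."""
    state = np.array([np.cos(theta / 2.0), np.sin(theta / 2.0)])
    pauli_z = np.array([[1.0, 0.0], [0.0, -1.0]])
    return float(state @ pauli_z @ state)


def parameter_shift_gradient(f, theta):
    """Gradient of f at theta from two evaluations shifted by +/- pi/2.

    This form holds for gates generated by a Pauli operator (such as RY);
    it is the standard rule, assumed here for illustration.
    """
    shift = np.pi / 2.0
    return (f(theta + shift) - f(theta - shift)) / 2.0


theta = 0.3
psr_grad = parameter_shift_gradient(expectation, theta)
exact_grad = -np.sin(theta)  # analytic derivative of cos(theta)
```

Each gradient entry costs two circuit evaluations, which is why shifting part of the gradient estimation to classical hardware can plausibly reduce quantum hardware calls.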
♻ ☆ On the quality of randomized approximations of Tukey's depth
Tukey's depth (or halfspace depth) is a widely used measure of centrality for multivariate data. However, exact computation of Tukey's depth is known to be a hard problem in high dimensions. As a remedy, randomized approximations of Tukey's depth have been proposed. In this paper we explore when such randomized algorithms return a good approximation of Tukey's depth. We study the case when the data are sampled from a log-concave isotropic distribution. We prove that, if one requires that the algorithm runs in polynomial time in the dimension, the randomized algorithm correctly approximates the maximal depth $1/2$ and depths close to zero. On the other hand, for any point of intermediate depth, any good approximation requires exponential complexity.
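A common randomized approximation of the kind discussed above replaces the minimum over all halfspaces by a minimum over finitely many random directions. The sketch below assumes that scheme; the function name `random_tukey_depth`, the number of directions, and the Gaussian test sample are illustrative choices, not taken from the paper.

```python
import numpy as np


def random_tukey_depth(x, data, num_directions=200, seed=0):
    """Approximate the Tukey depth of x with respect to the rows of data.

    For each random unit direction u, count the fraction of points on
    either side of the hyperplane through x orthogonal to u, and return
    the smallest fraction seen. This is an upper bound on the exact depth.
    """
    rng = np.random.default_rng(seed)
    directions = rng.standard_normal((num_directions, data.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    projections = (data - x) @ directions.T  # shape (n_points, num_directions)
    frac_pos = np.mean(projections >= 0.0, axis=0)
    frac_neg = np.mean(projections <= 0.0, axis=0)
    return float(min(frac_pos.min(), frac_neg.min()))


rng = np.random.default_rng(1)
sample = rng.standard_normal((500, 2))
depth_center = random_tukey_depth(np.zeros(2), sample)
depth_far = random_tukey_depth(np.array([100.0, 0.0]), sample)
```

In this low-dimensional example a few hundred directions suffice; the abstract's hardness result concerns how the number of directions must grow with the dimension for points of intermediate depth.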
♻ ☆ Universal approximation results for neural networks with non-polynomial activation function over non-compact domains
This paper extends the universal approximation property of single-hidden-layer feedforward neural networks beyond compact domains, which is of particular interest for the approximation within weighted $C^k$-spaces and weighted Sobolev spaces over unbounded domains. More precisely, by assuming that the activation function is non-polynomial, we establish universal approximation results within function spaces defined over non-compact subsets of a Euclidean space, including $L^p$-spaces, weighted $C^k$-spaces, and weighted Sobolev spaces, where the latter two include the approximation of the (weak) derivatives. Moreover, we provide some dimension-independent rates for approximating a function with sufficiently regular and integrable Fourier transform by neural networks with non-polynomial activation function.
comment: arXiv admin note: text overlap with arXiv:2312.08410
♻ ☆ Breach in the Shield: Unveiling the Vulnerabilities of Large Language Models
Large Language Models (LLMs) and Vision-Language Models (VLMs) have achieved impressive performance across a wide range of tasks, yet they remain vulnerable to carefully crafted perturbations. In this study, we seek to pinpoint the sources of this fragility by identifying parameters and input dimensions (pixels or token embeddings) that are susceptible to such perturbations. To this end, we propose a stability measure called FI, First-order local Influence, which is rooted in information geometry and quantifies the sensitivity of individual parameters and input dimensions. Our extensive analysis across LLMs and VLMs (from 1.5B to 13B parameters) reveals that: (I) A small subset of parameters or input dimensions with high FI values contributes disproportionately to model brittleness. (II) Mitigating the influence of these vulnerable parameters during model merging leads to improved performance.
Machine Learning - Computer Science
☆ Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations
LiDAR representation learning aims to extract rich structural and semantic information from large-scale, readily available datasets, reducing reliance on costly human annotations. However, existing LiDAR representation strategies often overlook the inherent spatiotemporal cues in LiDAR sequences, limiting their effectiveness. In this work, we propose LiMA, a novel long-term image-to-LiDAR Memory Aggregation framework that explicitly captures longer range temporal correlations to enhance LiDAR representation learning. LiMA comprises three key components: 1) a Cross-View Aggregation module that aligns and fuses overlapping regions across neighboring camera views, constructing a more unified and redundancy-free memory bank; 2) a Long-Term Feature Propagation mechanism that efficiently aligns and integrates multi-frame image features, reinforcing temporal coherence during LiDAR representation learning; and 3) a Cross-Sequence Memory Alignment strategy that enforces consistency across driving sequences, improving generalization to unseen environments. LiMA maintains high pretraining efficiency and incurs no additional computational overhead during downstream tasks. Extensive experiments on mainstream LiDAR-based perception benchmarks demonstrate that LiMA significantly improves both LiDAR semantic segmentation and 3D object detection. We hope this work inspires more effective pretraining paradigms for autonomous driving. The code has been made publicly accessible for future research.
comment: ICCV 2025; 26 pages, 12 figures, 10 tables; Code at http://github.com/Xiangxu-0103/LiMA
☆ Spatio-Temporal LLM: Reasoning about Environments and Actions
Despite the significant recent progress of Multimodal Large Language Models (MLLMs), MLLMs still struggle to correctly answer prompts that require a holistic spatio-temporal understanding. Specifically, it is challenging to address prompts that refer to 1) the entirety of an environment that an agent equipped with an MLLM can operate in; and simultaneously also refer to 2) recent actions that just happened and are encoded in a video clip. However, such a holistic spatio-temporal understanding is important for agents operating in the real world. To address this issue, we first develop a framework to collect a large-scale dataset. Using the collected "Reasoning about Environments and Actions" (REA) dataset, we show that recent methods indeed struggle to correctly answer the prompts. To improve, we develop a "spatio-temporal LLM" (ST-LLM), a model equipped with projectors to improve both spatial understanding of an environment and temporal understanding of recent observations. On the collected REA data, we show that the proposed method significantly improves results compared to prior work. Code and data are available at https://zoezheng126.github.io/STLLM-website/.
comment: Code and data are available at https://zoezheng126.github.io/STLLM-website/
☆ From Marginal to Joint Predictions: Evaluating Scene-Consistent Trajectory Prediction Approaches for Automated Driving
Accurate motion prediction of surrounding traffic participants is crucial for the safe and efficient operation of automated vehicles in dynamic environments. Marginal prediction models commonly forecast each agent's future trajectories independently, often leading to sub-optimal planning decisions for an automated vehicle. In contrast, joint prediction models explicitly account for the interactions between agents, yielding socially and physically consistent predictions on a scene level. However, existing approaches differ not only in their problem formulation but also in the model architectures and implementation details used, making it difficult to compare them. In this work, we systematically investigate different approaches to joint motion prediction, including post-processing of the marginal predictions, explicitly training the model for joint predictions, and framing the problem as a generative task. We evaluate each approach in terms of prediction accuracy, multi-modality, and inference efficiency, offering a comprehensive analysis of the strengths and limitations of each approach. Several prediction examples are available at https://frommarginaltojointpred.github.io/.
comment: Accepted at International Conference on Intelligent Transportation Systems 2025 (ITSC 2025)
☆ Physics-Guided Dual Implicit Neural Representations for Source Separation
Significant challenges exist in efficient data analysis of most advanced experimental and observational techniques because the collected signals often include unwanted contributions--such as background and signal distortions--that can obscure the physically relevant information of interest. To address this, we have developed a self-supervised machine-learning approach for source separation using a dual implicit neural representation framework that jointly trains two neural networks: one for approximating distortions of the physical signal of interest and the other for learning the effective background contribution. Our method learns directly from the raw data by minimizing a reconstruction-based loss function without requiring labeled data or pre-defined dictionaries. We demonstrate the effectiveness of our framework by considering a challenging case study involving large-scale simulated as well as experimental momentum-energy-dependent inelastic neutron scattering data in a four-dimensional parameter space, characterized by heterogeneous background contributions and unknown distortions to the target signal. The method is found to successfully separate physically meaningful signals from a complex or structured background even when the signal characteristics vary across all four dimensions of the parameter space. An analytical approach that informs the choice of the regularization parameter is presented. Our method offers a versatile framework for addressing source separation problems across diverse domains, ranging from superimposed signals in astronomical measurements to structural features in biomedical image reconstructions.
☆ Multi-Disease Deep Learning Framework for GWAS: Beyond Feature Selection Constraints
Traditional GWAS has advanced our understanding of complex diseases but often misses nonlinear genetic interactions. Deep learning offers new opportunities to capture complex genomic patterns, yet existing methods mostly depend on feature selection strategies that either constrain analysis to known pathways or risk data leakage when applied across the full dataset. Further, covariates can inflate predictive performance without reflecting true genetic signals. We explore different deep learning architecture choices for GWAS and demonstrate that careful architectural choices can outperform existing methods under strict no-leakage conditions. Building on this, we extend our approach to a multi-label framework that jointly models five diseases, leveraging shared genetic architecture for improved efficiency and discovery. Applied to five million SNPs across 37,000 samples, our method achieves competitive predictive performance (AUC 0.68-0.96), offering a scalable, leakage-free, and biologically meaningful approach for multi-disease GWAS analysis.
☆ Logit Reweighting for Topic-Focused Summarization
Generating abstractive summaries that adhere to a specific topic remains a significant challenge for language models. While standard approaches, such as fine-tuning, are resource-intensive, simpler methods like prompt engineering often struggle to maintain topical focus, particularly with smaller models. To address this, we propose a lightweight method that enhances topical relevance by directly reweighting the logits of topic-relevant tokens during generation. We evaluate three such reweighting techniques: Constant Shift, which adds a constant value to logits; Factor Scaling, which multiplies them by a factor; and Threshold Selection, which selectively boosts logits that exceed a probability threshold. Experiments on the NEWTS topical summarization dataset, using both Gemma-2B and Llama-3-8B models, show that these techniques effectively increase the use of topic-relevant vocabulary. Notably, the Threshold Selection method successfully improves topical focus without compromising summary quality, a trade-off often seen in other approaches. Our findings demonstrate that directly reweighting logits is a practical and resource-efficient alternative to fine-tuning, offering a promising pathway for precisely controlling the thematic content of generated text.
comment: 11 pages, 13 figures
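The three reweighting techniques named in the abstract above can be sketched on a toy logit vector as follows. This is an illustrative reading of the one-line descriptions, not the authors' implementation: the function names, the default values of `shift`, `factor` and `threshold`, and the choice to boost by a constant in Threshold Selection are all assumptions.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D logit vector."""
    exps = np.exp(logits - np.max(logits))
    return exps / exps.sum()


def constant_shift(logits, topic_ids, shift=2.0):
    """Add a constant to the logits of topic-relevant tokens."""
    out = np.asarray(logits, dtype=float).copy()
    out[np.asarray(topic_ids)] += shift
    return out


def factor_scaling(logits, topic_ids, factor=1.5):
    """Multiply topic-token logits by a factor (note: negative logits become more negative)."""
    out = np.asarray(logits, dtype=float).copy()
    out[np.asarray(topic_ids)] *= factor
    return out


def threshold_selection(logits, topic_ids, threshold=0.1, shift=2.0):
    """Boost only those topic tokens whose current probability exceeds the threshold."""
    out = np.asarray(logits, dtype=float).copy()
    topic_ids = np.asarray(topic_ids)
    probs = softmax(out)
    selected = topic_ids[probs[topic_ids] > threshold]
    out[selected] += shift
    return out


logits = np.array([1.0, 2.0, 0.5, -1.0])  # toy vocabulary of four tokens
topic_ids = [1, 3]                        # hypothetical topic-relevant token ids
```

In a real decoder such a function would be applied to the next-token logits at every generation step before sampling.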
☆ Cascade: Token-Sharded Private LLM Inference ICML 2025
As LLMs continue to increase in parameter size, the computational resources required to run them are available to fewer parties. Therefore, third-party inference services -- where LLMs are hosted by third parties with significant computational resources -- are becoming increasingly popular. However, third party inference raises critical concerns about user data privacy. To mitigate these risks, privacy researchers have developed provably secure schemes for third-party inference, such as Secure Multi-Party Computation (SMPC). However, SMPC protocols have significant computational and communication overhead, and do not scale to large models. In this work, we propose a new multi-party inference protocol, Cascade, that avoids these punitive costs by leveraging sharding in the sequence dimension to maintain privacy, trading off cryptographic privacy guarantees for increased performance and scalability. We demonstrate that Cascade is resistant to a generalization of a recent attack that is highly effective against other statistical privacy schemes, and that it is further resistant to learning-based attacks. As Cascade is orders of magnitude faster than existing schemes, our findings offer practical solutions for secure deployment of modern state-of-the-art LLMs.
comment: To be published in ICML 2025 Main Proceedings as "Hidden No More: Attacking and Defending Private Third-Party LLM Inference", together with arXiv:2505.18332
☆ NavigScene: Bridging Local Perception and Global Navigation for Beyond-Visual-Range Autonomous Driving
Autonomous driving systems have made significant advances in Q&A, perception, prediction, and planning based on local visual information, yet they struggle to incorporate broader navigational context that human drivers routinely utilize. We address this critical gap between local sensor data and global navigation information by proposing NavigScene, an auxiliary navigation-guided natural language dataset that simulates a human-like driving environment within autonomous driving systems. Moreover, we develop three complementary paradigms to leverage NavigScene: (1) Navigation-guided Reasoning, which enhances vision-language models by incorporating navigation context into the prompting approach; (2) Navigation-guided Preference Optimization, a reinforcement learning method that extends Direct Preference Optimization to improve vision-language model responses by establishing preferences for navigation-relevant summarized information; and (3) Navigation-guided Vision-Language-Action model, which integrates navigation guidance and vision-language models with conventional driving models through feature fusion. Extensive experiments demonstrate that our approaches significantly improve performance across perception, prediction, planning, and question-answering tasks by enabling reasoning capabilities beyond visual range and improving generalization to diverse driving scenarios. This work represents a significant step toward more comprehensive autonomous driving systems capable of navigating complex, unfamiliar environments with greater reliability and safety.
comment: Accepted by ACM Multimedia 2025
☆ QuEst: Enhancing Estimates of Quantile-Based Distributional Measures Using Model Predictions ICML 2025
As machine learning models grow increasingly competent, their predictions can supplement scarce or expensive data in various important domains. In support of this paradigm, algorithms have emerged to combine a small amount of high-fidelity observed data with a much larger set of imputed model outputs to estimate some quantity of interest. Yet current hybrid-inference tools target only means or single quantiles, limiting their applicability for many critical domains and use cases. We present QuEst, a principled framework to merge observed and imputed data to deliver point estimates and rigorous confidence intervals for a wide family of quantile-based distributional measures. QuEst covers a range of measures, from tail risk (CVaR) to population segments such as quartiles, that are central to fields such as economics, sociology, education, medicine, and more. We extend QuEst to multidimensional metrics, and introduce an additional optimization technique to further reduce variance in this and other hybrid estimators. We demonstrate the utility of our framework through experiments in economic modeling, opinion polling, and language model auto-evaluation.
comment: Published as a conference paper at ICML 2025
☆ A 3D Machine Learning based Volume Of Fluid scheme without explicit interface reconstruction
We present a machine-learning based Volume Of Fluid method to simulate multi-material flows on three-dimensional domains. One of the novelties of the method is that the flux fraction is computed by evaluating a previously trained neural network and without explicitly reconstructing any local interface approximating the exact one. The network is trained on a purely synthetic dataset generated by randomly sampling numerous local interfaces and which can be adapted to improve the scheme on less regular interfaces when needed. Several strategies to ensure the efficiency of the method and the satisfaction of physical constraints and properties are suggested and formalized. Numerical results on the advection equation are provided to show the performance of the method. We observe numerical convergence as the size of the mesh tends to zero $h=1/N_h\searrow 0$, with a better rate than two reference schemes.
☆ Bridging Prediction and Intervention Problems in Social Systems
Many automated decision systems (ADS) are designed to solve prediction problems -- where the goal is to learn patterns from a sample of the population and apply them to individuals from the same population. In reality, these prediction systems operationalize holistic policy interventions in deployment. Once deployed, ADS can shape impacted population outcomes through an effective policy change in how decision-makers operate, while also being defined by past and present interactions between stakeholders and the limitations of existing organizational, as well as societal, infrastructure and context. In this work, we consider the ways in which we must shift from a prediction-focused paradigm to an interventionist paradigm when considering the impact of ADS within social systems. We argue this requires a new default problem setup for ADS beyond prediction, to instead consider predictions as decision support, final decisions, and outcomes. We highlight how this perspective unifies modern statistical frameworks and other tools to study the design, implementation, and evaluation of ADS systems, and point to the research directions necessary to operationalize this paradigm shift. Using these tools, we characterize the limitations of focusing on isolated prediction tasks, and lay the foundation for a more intervention-oriented approach to developing and deploying ADS.
☆ Pre-Trained Policy Discriminators are General Reward Models
We offer a novel perspective on reward modeling by formulating it as a policy discriminator, which quantifies the difference between two policies to generate a reward signal, guiding the training policy towards a target policy with desired behaviors. Based on this conceptual insight, we propose a scalable pre-training method named Policy Discriminative Learning (POLAR), which trains a reward model (RM) to discern identical policies and discriminate different ones. Unlike traditional reward modeling methods relying on absolute preferences, POLAR captures the relative difference between one policy and an arbitrary target policy, which is a scalable, high-level optimization objective suitable for modeling generic ranking relationships. Leveraging the POLAR pre-training paradigm, we present a series of RMs with parameter scales from 1.8B to 7B. Empirical results show that POLAR substantially outperforms traditional non-pre-trained methods, significantly enhancing RM performance. For instance, POLAR-7B could improve preference accuracy from 54.8% to 81.0% on STEM tasks and from 57.9% to 85.5% on creative writing tasks compared to SOTA baselines. POLAR also shows robust generalization capabilities in RLHF using Reinforcement Fine-tuning (RFT), providing reliable reward signals and markedly enhancing policy performance--improving LLaMa3.1-8B from an average of 47.36% to 56.33% and Qwen2.5-32B from 64.49% to 70.47% on 20 benchmarks. Moreover, scaling experiments reveal a clear power-law relationship between computation and performance, supported by linear correlation coefficients approaching 0.99. The impressive performance, strong generalization, and scaling properties suggest that POLAR is a promising direction for developing general and strong reward models.
☆ Train-before-Test Harmonizes Language Model Rankings
Existing language model benchmarks provide contradictory model rankings, even for benchmarks that aim to capture similar skills. This dilemma of conflicting rankings hampers model selection, clouds model comparisons, and adds confusion to a growing ecosystem of competing models. Recent work attributed ranking disagreement to the phenomenon of training on the test task: As released, different models exhibit a different level of preparation for any given test task. A candidate solution to the problem is train-before-test: Give each model the same benchmark-specific finetuning before evaluation. Our primary contribution is a broad empirical evaluation of train-before-test across 24 benchmarks and 61 models. We show that train-before-test significantly improves ranking agreement consistently across all benchmarks. Whereas rankings have little external validity to start with, they enjoy a significant degree of external validity when applying train-before-test: Model rankings transfer gracefully from one benchmark to the other. Even within the same model family, train-before-test reduces strong ranking disagreement to near-perfect agreement. In addition, train-before-test reduces the model-score matrix to essentially rank one, revealing new insights into the latent factors of benchmark performance. Our work supports the recommendation to make train-before-test a default component of LLM benchmarking.
☆ $\varphi$-Adapt: A Physics-Informed Adaptation Learning Approach to 2D Quantum Material Discovery
Characterizing quantum flakes is a critical step in quantum hardware engineering because the quality of these flakes directly influences qubit performance. Although computer vision methods for identifying two-dimensional quantum flakes have emerged, they still face significant challenges in estimating flake thickness. These challenges include limited data, poor generalization, sensitivity to domain shifts, and a lack of physical interpretability. In this paper, we introduce one of the first Physics-informed Adaptation Learning approaches to overcome these obstacles. We focus on two main issues, i.e., data scarcity and generalization. First, we propose a new synthetic data generation framework that produces diverse quantum flake samples across various materials and configurations, reducing the need for time-consuming manual collection. Second, we present $\varphi$-Adapt, a physics-informed adaptation method that bridges the performance gap between models trained on synthetic data and those deployed in real-world settings. Experimental results show that our approach achieves state-of-the-art performance on multiple benchmarks, outperforming existing methods. Our proposed approach advances the integration of physics-based modeling and domain adaptation. It also addresses a critical gap in leveraging synthesized data for real-world 2D material analysis, offering impactful tools for deep learning and materials science communities.
♻ ☆ Human2LocoMan: Learning Versatile Quadrupedal Manipulation with Human Pretraining
Quadrupedal robots have demonstrated impressive locomotion capabilities in complex environments, but equipping them with autonomous versatile manipulation skills in a scalable way remains a significant challenge. In this work, we introduce a cross-embodiment imitation learning system for quadrupedal manipulation, leveraging data collected from both humans and LocoMan, a quadruped equipped with multiple manipulation modes. Specifically, we develop a teleoperation and data collection pipeline, which unifies and modularizes the observation and action spaces of the human and the robot. To effectively leverage the collected data, we propose an efficient modularized architecture that supports co-training and pretraining on structured modality-aligned data across different embodiments. Additionally, we construct the first manipulation dataset for the LocoMan robot, covering various household tasks in both unimanual and bimanual modes, supplemented by a corresponding human dataset. We validate our system on six real-world manipulation tasks, where it achieves an average success rate improvement of 41.9% overall and 79.7% under out-of-distribution (OOD) settings compared to the baseline. Pretraining with human data contributes a 38.6% success rate improvement overall and 82.7% under OOD settings, enabling consistently better performance with only half the amount of robot data. Our code, hardware, and data are open-sourced at: https://human2bots.github.io.
♻ ☆ SEE-2-SOUND: Zero-Shot Spatial Environment-to-Spatial Sound
Generating combined visual and auditory sensory experiences is critical for the consumption of immersive content. Recent advances in neural generative models have enabled the creation of high-resolution content across multiple modalities such as images, text, speech, and videos. Despite these successes, there remains a significant gap in the generation of high-quality spatial audio that complements generated visual content. Furthermore, current audio generation models excel in either generating natural audio or speech or music but fall short in integrating spatial audio cues necessary for immersive experiences. In this work, we introduce SEE-2-SOUND, a zero-shot approach that decomposes the task into (1) identifying visual regions of interest; (2) locating these elements in 3D space; (3) generating mono-audio for each; and (4) integrating them into spatial audio. Using our framework, we demonstrate compelling results for generating spatial audio for high-quality videos, images, and dynamic images from the internet, as well as media generated by learned approaches.
comment: Project Page: https://see2sound.github.io/
♻ ☆ OminiControl: Minimal and Universal Control for Diffusion Transformer
We present OminiControl, a novel approach that rethinks how image conditions are integrated into Diffusion Transformer (DiT) architectures. Current image conditioning methods either introduce substantial parameter overhead or handle only specific control tasks effectively, limiting their practical versatility. OminiControl addresses these limitations through three key innovations: (1) a minimal architectural design that leverages the DiT's own VAE encoder and transformer blocks, requiring just 0.1% additional parameters; (2) a unified sequence processing strategy that combines condition tokens with image tokens for flexible token interactions; and (3) a dynamic position encoding mechanism that adapts to both spatially-aligned and non-aligned control tasks. Our extensive experiments show that this streamlined approach not only matches but surpasses the performance of specialized methods across multiple conditioning tasks. To overcome data limitations in subject-driven generation, we also introduce Subjects200K, a large-scale dataset of identity-consistent image pairs synthesized using DiT models themselves. This work demonstrates that effective image control can be achieved without architectural complexity, opening new possibilities for efficient and versatile image generation systems.
comment: Accepted to ICCV 2025
♻ ☆ Multilevel Picard approximations and deep neural networks with ReLU, leaky ReLU, and softplus activation overcome the curse of dimensionality when approximating semilinear parabolic partial differential equations in $L^p$-sense
We prove that multilevel Picard approximations and deep neural networks with ReLU, leaky ReLU, and softplus activation are capable of approximating solutions of semilinear Kolmogorov PDEs in $L^\mathfrak{p}$-sense, $\mathfrak{p}\in [2,\infty)$, in the case of gradient-independent, Lipschitz-continuous nonlinearities, while the computational effort of the multilevel Picard approximations and the required number of parameters in the neural networks grow at most polynomially in both dimension $d\in \mathbb{N}$ and reciprocal of the prescribed accuracy $\epsilon$.
♻ ☆ ST-LoRA: Low-rank Adaptation for Spatio-Temporal Forecasting
Spatio-temporal forecasting is essential for understanding future dynamics within real-world systems by leveraging historical data from multiple locations. Existing methods often prioritize the development of intricate neural networks to capture the complex dependencies of the data. These methods neglect node-level heterogeneity and face over-parameterization when attempting to model node-specific characteristics. In this paper, we present a novel low-rank adaptation framework for existing spatio-temporal prediction models, termed ST-LoRA, which alleviates the aforementioned problems through node-level adjustments. Specifically, we introduce the node-adaptive low-rank layer and node-specific predictor, capturing the complex functional characteristics of nodes while maintaining computational efficiency. Extensive experiments on multiple real-world datasets demonstrate that our method consistently achieves superior performance across various forecasting models with minimal computational overhead, improving performance by 7% with only 1% additional parameter cost. The source code is available at https://github.com/RWLinno/ST-LoRA.
comment: Published at ECML-PKDD 2025
♻ ☆ MMD-OPT: Maximum Mean Discrepancy Based Sample Efficient Collision Risk Minimization for Autonomous Driving
We propose MMD-OPT: a sample-efficient approach for minimizing the risk of collision under an arbitrary prediction distribution of the dynamic obstacles. MMD-OPT is based on embedding distributions in a Reproducing Kernel Hilbert Space (RKHS) and the associated Maximum Mean Discrepancy (MMD). We show how these two concepts can be used to define a sample-efficient surrogate for the collision risk estimate. We perform extensive simulations to validate the effectiveness of MMD-OPT on both synthetic and real-world datasets. Importantly, we show that trajectory optimization with our MMD-based collision risk surrogate leads to safer trajectories in low-sample regimes than popular alternatives based on Conditional Value at Risk (CVaR).
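As background for the MMD quantity referred to above, here is a minimal sketch of the standard biased empirical estimate of the squared MMD between two sample sets under a Gaussian (RBF) kernel. It is not the paper's collision-risk surrogate; the kernel choice, the `bandwidth` parameter, and the function names are assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between the rows of a and the rows of b."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd_squared(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of MMD^2 between samples x and y."""
    k_xx = rbf_kernel(x, x, bandwidth).mean()
    k_yy = rbf_kernel(y, y, bandwidth).mean()
    k_xy = rbf_kernel(x, y, bandwidth).mean()
    return float(k_xx + k_yy - 2.0 * k_xy)


rng = np.random.default_rng(0)
samples = rng.standard_normal((50, 2))
same = mmd_squared(samples, samples)           # identical sample sets
shifted = mmd_squared(samples, samples + 3.0)  # clearly different distributions
```

The estimate is zero for identical sample sets and grows as the two empirical distributions move apart, which is the property a distribution-level risk surrogate can build on.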
Programming Languages
☆ React-tRace: A Semantics for Understanding React Hooks
React has become the most widely used web front-end framework, enabling the creation of user interfaces in a declarative and compositional manner. Hooks are a set of APIs that manage side effects in functional components in React. However, their semantics are often seen as opaque to developers, leading to UI bugs. In this paper, we formalize the essence of React Hooks in a semantics we name React-tRace, providing a framework that clarifies their behavior. We demonstrate that our model captures the behavior of React by showing theoretically that it embodies essential properties of Hooks and by empirically comparing our React-tRace definitional interpreter against a test suite. Furthermore, we showcase a practical visualization tool based on the formalization to demonstrate how developers can better understand the semantics of Hooks.
comment: Conditionally accepted to OOPSLA 2025
☆ ChipSeek-R1: Generating Human-Surpassing RTL with LLM via Hierarchical Reward-Driven Reinforcement Learning
Large Language Models (LLMs) show significant potential for automating Register-Transfer Level (RTL) code generation. However, current approaches face a critical challenge: they cannot simultaneously optimize for functional correctness and hardware quality (Power, Performance, Area - PPA). Methods based on supervised fine-tuning often generate functionally correct but PPA-suboptimal code, lacking mechanisms to learn optimization principles. In contrast, post-processing techniques that attempt to improve PPA metrics after generation are often inefficient because they operate externally without updating the LLM's parameters, thus failing to enhance the model's intrinsic design capabilities. To bridge this gap, we introduce ChipSeek-R1, a hierarchical reward-driven reinforcement learning framework to train LLMs to generate RTL code that achieves both functional correctness and optimized PPA metrics. ChipSeek-R1 employs a hierarchical reward system, which incorporates direct feedback on syntax, functional correctness (from simulators) and PPA metrics (from synthesis tools) during reinforcement learning. This enables the model to learn complex hardware design trade-offs via trial-and-error, generating RTL code that is both functionally correct and PPA-optimized. Evaluating ChipSeek-R1 on standard benchmarks (VerilogEval, RTLLM), we achieve state-of-the-art results in functional correctness. Notably, on the RTLLM benchmark, ChipSeek-R1 generated 27 RTL designs surpassing the PPA metrics of the original human-written code. Our findings demonstrate the effectiveness of integrating toolchain feedback into LLM training and highlight the potential for reinforcement learning to enable automated generation of human-surpassing RTL code. Our code is open-sourced in an anonymous GitHub repository.
♻ ☆ Datalog with First-Class Facts
Datalog is a popular logic programming language for deductive reasoning tasks in a wide array of applications, including business analytics, program analysis, and ontological reasoning. However, Datalog's restriction to flat facts over atomic constants leads to challenges in working with tree-structured data, such as derivation trees or abstract syntax trees. To ameliorate Datalog's restrictions, popular extensions of Datalog support features such as existential quantification in rule heads (Datalog$^\pm$, Datalog$^\exists$) or algebraic data types (Soufflé). Unfortunately, these are imperfect solutions for reasoning over structured and recursive data types, with general existentials leading to complex implementations requiring unification, and ADTs unable to trigger rule evaluation and failing to support efficient indexing. We present DL$^{\exists!}$, a Datalog with first-class facts, wherein every fact is identified with a Skolem term unique to the fact. We show that this restriction offers an attractive price point for Datalog-based reasoning over tree-shaped data, demonstrating its application to databases, artificial intelligence, and programming languages. We implemented DL$^{\exists!}$ as a system, Slog, which leverages the uniqueness restriction of DL$^{\exists!}$ to enable a communication-avoiding, massively-parallel implementation built on MPI. We show that Slog outperforms leading systems (Nemo, Vlog, RDFox, and Soufflé) on a variety of benchmarks, with the potential to scale to thousands of threads.
comment: arXiv admin note: text overlap with arXiv:2211.11573
High Energy Physics - Experimental
☆ da4ml: Distributed Arithmetic for Real-time Neural Networks on FPGAs
Neural networks with a latency requirement on the order of microseconds, like the ones used at the CERN Large Hadron Collider, are typically deployed on FPGAs fully unrolled and pipelined. A bottleneck for the deployment of such neural networks is area utilization, which is directly related to the required constant matrix-vector multiplication (CMVM) operations. In this work, we propose an efficient algorithm for implementing CMVM operations with distributed arithmetic (DA) on FPGAs that simultaneously optimizes for area consumption and latency. The algorithm achieves resource reduction similar to state-of-the-art algorithms while being significantly faster to compute. The proposed algorithm is open-sourced and integrated into the \texttt{hls4ml} library, a free and open-source library for running real-time neural network inference on FPGAs. We show that the proposed algorithm can reduce on-chip resources by up to a third for realistic, highly quantized neural networks while simultaneously reducing latency, enabling the implementation of previously infeasible networks.
☆ Probing KSVZ Axion Dark Matter near 5.9 GHz Using an 8-Cell Cavity Haloscope
We report on a search for axion dark matter in the frequency range near 5.9 GHz, conducted using the haloscope technique. The experiment employed an 8-cell microwave resonator designed to extend the accessible frequency range by a multi-fold factor relative to conventional single-cell configurations, while maintaining a large detection volume. To enhance sensitivity, a flux-driven Josephson parametric amplifier (JPA) operating near the quantum noise limit was utilized, together with a sideband-summing method that coherently combines mirrored spectral components generated by the JPA. Data were acquired over the frequency range 5.83-5.94 GHz. With no statistically significant excess observed, we exclude axion-photon couplings $g_{a\gamma\gamma}$ down to $1.2 \times 10^{-14}$ GeV$^{-1}$ at a 90% confidence level. The achieved sensitivity approaches the KSVZ benchmark prediction, setting the most stringent limits to date in this range.
comment: 6 pages, 4 figures
♻ ☆ Testing the RG Running of the Leptonic Dirac CP Phase with Reactor Neutrinos
We propose using the near detector at reactor neutrino experiments to probe the renormalization group (RG) running effect on the leptonic Dirac CP phase $\delta_D$. Although the reactor neutrino oscillation cannot directly measure $\delta_D$, it can probe the deviation $\Delta \delta \equiv \delta_D(Q^2_d) - \delta_D(Q^2_p)$ caused by the RG running. The key element is that the momentum transfers at the neutrino production ($Q^2_p$) and detection ($Q^2_d$) processes are mismatched and can differ by two orders of magnitude. We illustrate this concept with the upcoming Taishan Antineutrino Observatory (TAO, also known as JUNO-TAO) experiment and obtain the projected sensitivity to the CP RG running beta function $\beta_\delta$.
comment: 8 pages, 4 figures. Published version in PRD
Programming Languages
☆ Retargeting an Abstract Interpreter for a New Language by Partial Evaluation
It is well-known that abstract interpreters can be systematically derived from their concrete counterparts using a "recipe," but developing sound static analyzers remains a time-consuming task. Reducing the effort required and mechanizing the process of developing analyzers continues to be a significant challenge. Is it possible to automatically retarget an existing abstract interpreter for a new language? We propose a novel technique to automatically derive abstract interpreters for various languages from an existing abstract interpreter. By leveraging partial evaluation, we specialize an abstract interpreter for a source language. The specialization is performed using the semantics of target languages written in the source language. Our approach eliminates the need to develop analyzers for new targets from scratch. We show that our method can effectively retarget an abstract interpreter for one language into a correct analyzer for another language.
comment: Presented at the Student Research Competition (SRC) at PLDI 2025 (https://pldi25.sigplan.org/details/pldi-2025-src/1/)
☆ CCR 2.0: High-level Reasoning for Conditional Refinements
In recent years, great progress has been made in the field of formal verification for low-level systems. Many of these efforts are based on one of two popular approaches: refinement or separation logic. These two approaches are very different in nature and offer complementary benefits in terms of compositionality. Recently, to fuse these benefits in a unified mechanism, a new approach called Conditional Contextual Refinement (CCR 1.0 for short) was proposed. In this paper, we advance the model of CCR 1.0 and provide novel and intuitive reasoning principles, resulting in CCR 2.0. Specifically, CCR 2.0 (i) comes with a better compositionality theorem, having the practical benefit of facilitating more proof reuse, and (ii) provides a proof technique that hides model-level (i.e., resources of the separation logic) details from the user. Achieving this goal was challenging due to non-trivial counterexamples, which required us to devise novel notions. Our results are formalized in Coq.
♻ ☆ Qudit Quantum Programming with Projective Cliffords
This paper introduces a novel abstraction for programming quantum operations, specifically projective Cliffords, as functions over the qudit Pauli group. Generalizing the idea behind Pauli tableaux, we introduce a type system and lambda calculus for projective Cliffords called LambdaPC, which captures well-formed Clifford operations via a Curry-Howard correspondence with a particular encoding of the Clifford and Pauli groups. Importantly, the language captures not just qubit operations, but qudit operations for any dimension $d$. Throughout the paper we explore what it means to program with projective Cliffords through a number of examples and a case study focusing on stabilizer error correcting codes.
comment: 42 pages
Programming Languages
☆ Semantically Separating Nominal Wyvern for Usability and Decidability
The Dependent Object Types (DOT) calculus incorporates concepts from functional languages (e.g. modules) with traditional object-oriented features (e.g. objects, subtyping) to achieve greater expressivity (e.g. F-bounded polymorphism). However, this merger of paradigms comes at the cost of subtype decidability. Recent work on bringing decidability to DOT has either sacrificed expressiveness or ease of use. The unrestricted construction of recursive types and type bounds has made subtype decidability a much harder problem than in traditional object-oriented programming. Recognizing this, our paper introduces Nominal Wyvern, a DOT-like dependent type system that takes an alternative approach: instead of having a uniform structural syntax like DOT, Nominal Wyvern is designed around a "semantic separation" between the nominal declaration of recursive types on the one hand, and the structural refinement of those types when they are used on the other. This design naturally guides the user to avoid writing undecidably recursive structural types. From a technical standpoint, this separation also makes guaranteeing decidability possible by allowing for an intuitive adaptation of material/shape separation, a technique for achieving subtype decidability by separating types responsible for subtyping constraints from types that represent concrete data. The result is a type system with syntax and structure familiar to OOP users that achieves decidability without compromising the expressiveness of F-bounded polymorphism and module systems as they are used in practice.
Programming Languages
☆ RVISmith: Fuzzing Compilers for RVV Intrinsics
Modern processors are equipped with single instruction multiple data (SIMD) instructions for fine-grained data parallelism. Compiler auto-vectorization techniques that target SIMD instructions face performance limitations due to insufficient information available at compile time, requiring programmers to manually manipulate SIMD instructions. SIMD intrinsics, a type of built-in function provided by modern compilers, enable programmers to manipulate SIMD instructions within high-level programming languages. Bugs in compilers for SIMD intrinsics can introduce potential threats to software security, producing unintended calculation results, data loss, program crashes, etc. To detect bugs in compilers for SIMD intrinsics, we propose RVISmith, a randomized fuzzer that generates well-defined C programs that include various invocation sequences of RVV (RISC-V Vector Extension) intrinsics. We design RVISmith to achieve the following objectives: (i) achieving high intrinsic coverage, (ii) improving sequence variety, and (iii) avoiding known undefined behaviors. We implement RVISmith based on the ratified RVV intrinsic specification and evaluate our approach with three modern compilers: GCC, LLVM, and XuanTie. Experimental results show that RVISmith achieves 11.5 times higher intrinsic coverage than the state-of-the-art fuzzer for RVV intrinsics. By differential testing that compares results across different compilers, optimizations, and equivalent programs, we detect and report 13 previously unknown bugs of the three compilers under test to date. Of these bugs, 10 are confirmed and another 3 are fixed by the compiler developers.
comment: To appear in ACM CCS 2025
☆ Specification-Guided Repair of Arithmetic Errors in Dafny Programs using LLMs
Formal verification offers strong assurances of software correctness. However, debugging and repairing the underlying faults can be complex and time-consuming when verification fails. Automated Program Repair (APR) aims to ease this by automatically identifying and fixing faults. Traditional APR techniques often depend on test suites for validation, but these may fail to capture all scenarios. In contrast, formal specifications provide stronger correctness criteria for effective repairs. We present an innovative APR tool for Dafny, a verification-aware programming language that uses formal specifications - including pre-conditions, post-conditions, and invariants - as oracles for fault localization and repair. Assuming the correctness of the specifications and focusing on arithmetic bugs, we localize faults through a series of steps, which include using Hoare Logic to determine the state of each statement within the program and state-of-the-art Large Language Models (LLMs) to synthesize candidate fixes. The chosen models were GPT-4o mini, Llama 3, Mistral 7B, and Llemma 7B. We evaluate our approach using DafnyBench, a benchmark of real-world Dafny programs. Our tool achieves 89.6% accuracy in fault localization, with GPT-4o mini yielding the highest repair success rate (74.18%). These results highlight the potential of combining formal reasoning with LLM-driven program synthesis for automated program repair.
☆ Towards Automatic Error Recovery in Parsing Expression
Error recovery is an essential feature for a parser that is to be plugged into Integrated Development Environments (IDEs), which must build Abstract Syntax Trees (ASTs) even for syntactically invalid programs in order to offer features such as automated refactoring and code completion. Parsing Expression Grammars (PEGs) are a formalism that naturally describes recursive top-down parsers using a restricted form of backtracking. Labeled failures are a conservative extension of PEGs that adds an error reporting mechanism for PEG parsers, and these labels can also be associated with recovery expressions to provide an error recovery mechanism. These expressions can use the full expressivity of PEGs to recover from syntactic errors. Manually annotating a large grammar with labels and recovery expressions can be difficult. In this work, we present an algorithm that automatically annotates a PEG with labels and builds their corresponding recovery expressions. We evaluate this algorithm by adding error recovery to the parser of the Titan programming language. The results show that with a small amount of manual intervention our algorithm can be used to produce error recovering parsers for PEGs where most of the alternatives are disjoint.
comment: arXiv admin note: substantial text overlap with arXiv:1905.02145
♻ ☆ Generically Automating Separation Logic by Functors, Homomorphisms and Modules
Foundational verification considers the functional correctness of programming languages with formalized semantics and uses proof assistants (e.g., Coq, Isabelle) to certify proofs. The need for verifying complex programs compels it to involve expressive Separation Logics (SLs) that exceed the scopes of well-studied automated proof theories, e.g., symbolic heap. Consequently, automation of SL in foundational verification relies heavily on ad-hoc heuristics that lack a systematic meta-theory and face scalability issues. To mitigate the gap, we propose a theory to specify SL predicates using abstract algebras including functors, homomorphisms, and modules over rings. Based on this theory, we develop a generic SL automation algorithm to reason about any data structures that can be characterized by these algebras. In addition, we also present algorithms for automatically instantiating the algebraic models to real data structures. The instantiation reuses the algebraic models of component structures and preserves their data abstractions. Case studies on formalized imperative semantics show our algorithm can instantiate the algebraic models automatically for a variety of complex data structures. Experimental results indicate the automatically instantiated reasoners from our generic theory show similar results to the state-of-the-art systems made of specifically crafted reasoning rules. The presented theories, proofs, and the verification framework are formalized in Isabelle/HOL.
comment: Accepted by POPL'25
Programming Languages
☆ DecoRTL: A Run-time Decoding Framework for RTL Code Generation with LLMs
As one of their many applications, large language models (LLMs) have recently shown promise in automating register transfer level (RTL) code generation. However, conventional LLM decoding strategies, originally designed for natural language, often fail to meet the structural and semantic demands of RTL, leading to hallucinated, repetitive, or invalid code outputs. In this paper, we first investigate the root causes of these decoding failures through an empirical analysis of token-level entropy during RTL generation. Our findings reveal that LLMs exhibit low confidence in regions of structural ambiguity or semantic complexity, showing that standard decoding strategies fail to differentiate between regions requiring determinism (syntax-critical regions) and those that benefit from creative exploratory variability (design-critical regions). Then, to overcome this, we introduce DecoRTL, a novel run-time decoding strategy that is both syntax-aware and contrastive for RTL code generation. DecoRTL integrates two complementary components: (i) self-consistency sampling, which generates multiple candidates and re-ranks them based on token-level agreement to promote correctness while maintaining diversity; and (ii) syntax-aware temperature adaptation, which classifies tokens by their syntactical and functional roles and adjusts the sampling temperature accordingly, enforcing low temperature for syntax-critical tokens and higher temperature for exploratory ones. Our approach operates entirely at inference time without requiring any additional model fine-tuning. Through evaluations on multiple open-source LLMs using the VerilogEval benchmark, we demonstrate significant improvements in syntactic validity, functional correctness, and output diversity, while the execution (performance) overhead is imperceptible.
comment: Accepted to the International Conference on Computer-Aided Design (ICCAD 2025)
♻ ☆ A Lightweight Method for Generating Multi-Tier JIT Compilation Virtual Machine in a Meta-Tracing Compiler Framework
Meta-compiler frameworks, such as RPython and Graal/Truffle, generate high-performance virtual machines (VMs) from interpreter definitions. Although they generate VMs with high-quality just-in-time (JIT) compilers, they still lack an important feature that dedicated VMs (i.e., VMs that are developed for specific languages) have, namely \emph{multi-tier compilation}. Multi-tier compilation uses light-weight compilers at early stages and highly-optimizing compilers at later stages in order to balance between compilation overheads and code quality. We propose a novel approach to enabling multi-tier compilation in the VMs generated by a meta-compiler framework. Instead of extending the JIT compiler backend of the framework, our approach drives an existing (heavyweight) compiler backend in the framework to quickly generate unoptimized native code by merely embedding directives and compile-time operations into interpreter definitions. As a validation of the approach, we developed 2SOM, a Simple Object Machine with a two-tier JIT compiler based on RPython. 2SOM first applies the tier-1 threaded code generator that is generated by our proposed technique, then, to the loops that exceed a threshold, applies the tier-2 tracing JIT compiler that is generated by the original RPython framework. Our performance evaluation that runs a program with a realistic workload showed that 2SOM improved, when compared against an RPython-based VM, warm-up performance by 15\%, with merely a 5\% reduction in peak performance.
comment: ECOOP 2025. Fixed DOI
Programming Languages
☆ Structural Code Search using Natural Language Queries
Searching code is a common task that developers perform to understand APIs, learn common code patterns, and navigate code. Currently, developers most commonly search using keywords and regular expressions that are easy to use and widely available. Beyond keywords and regular expressions, structural code search tools allow developers to search for code based on its syntactic structure. This has numerous applications ranging from bug finding to systematically refactoring code. However, these structural code search tools operate on queries expressed in domain-specific languages (DSL) that can be difficult to learn and write. We propose to allow developers to use natural language to search for code structurally. Expressing queries in natural language provides an intuitive way to search for code and lowers the barrier to entry. In this work, we develop a novel general approach that combines the reasoning capabilities of an LLM to interpret natural language search queries with the power of structural search tools to efficiently and accurately retrieve relevant code. We then instantiate this approach for two structural code search DSLs: Semgrep and GQL. In our evaluation, we construct a new benchmark for structural code search consisting of 400 queries over 10 Java projects. We show that our approach for structural code search based on translating NL queries to DSL queries using an LLM is effective and robust, achieving a high precision and recall ranging from 55% - 70%. Further, our approach significantly outperforms baselines based on semantic code search and LLM retrievals by up to 57% and 14% on F1 scores.
☆ LeanLTL: A unifying framework for linear temporal logics in Lean
We propose LeanLTL, a unifying framework for linear temporal logics in Lean 4. LeanLTL supports reasoning about traces that represent either infinite or finite linear time. The library allows traditional LTL syntax to be combined with arbitrary Lean expressions, making it straightforward to define properties involving numerical or other types. We prove that standard flavors of LTL can be embedded in our framework. The library also provides automation for reasoning about LeanLTL formulas in a way that facilitates using Lean's existing tactics. Finally, we provide examples illustrating the utility of the library in reasoning about systems that come from applications.
comment: 9 pages, 3 figures; for associated project files see https://github.com/UCSCFormalMethods/LeanLTL; to be published in LIPIcs for ITP '25
☆ Globality and Regions
We obtain a characterization of global variables by unifying abstraction with region abstraction in a region-based language. More precisely, in a previous work a language called global was presented, whose virtue is to provide a conceptually clear way of introducing imperative operations in a functional language. Memory safety is provided by the concept of linear protection, which connects the global system to a linear one. In this paper we show that the concept of global variable provided by the global language arises from Tofte and Talpin's region language through the unification of abstraction and region abstraction.
☆ Advanced LPeg techniques: A dual case study approach
This paper presents advanced optimization techniques for Lua Parsing Expression Grammars (LPeg) through two complementary case studies: a high-performance JSON parser and a sophisticated Glob-to-LPeg pattern converter. We demonstrate how strategic grammar construction can dramatically improve parsing performance without modifying the underlying LPeg library. For the JSON parser, we implement substitution capture and table construction optimization to reduce memory allocation overhead and improve object processing. For the Glob converter, we introduce segment-boundary separation, implement Cox's flattened search strategy, and develop optimized braced condition handling to prevent exponential backtracking. Comprehensive benchmarks demonstrate that our JSON parser achieves processing speeds up to 125 MB/s on complex documents, consistently outperforming dkjson and showing competitive results against rxi_json across most test cases. Our Glob-to-LPeg converter exhibits 14-92% better performance than Bun.Glob and runs 3-14 times faster than Minimatch across diverse pattern matching scenarios. This research provides practical optimization techniques for LPeg-based parsers, contributing valuable strategies to the text processing ecosystem.
♻ ☆ OblivIO: Securing reactive programs by oblivious execution with bounded traffic overheads
Traffic analysis attacks remain a significant problem for online security. Communication between nodes can be observed by network level attackers as it inherently takes place in the open. Despite online services increasingly using encrypted traffic, the shape of the traffic is not hidden. To prevent traffic analysis, the shape of a system's traffic must be independent of secrets. We investigate adapting the data-oblivious approach to the reactive setting and present OblivIO, a secure language for writing reactive programs driven by network events. Our approach pads with dummy messages to hide which program sends are genuinely executed. We use an information-flow type system to provably enforce timing-sensitive noninterference. The type system is extended with potentials to bound the overhead in traffic introduced by our approach. We address challenges that arise from joining data-oblivious and reactive programming and demonstrate the feasibility of our resulting language by developing an interpreter that implements security critical operations as constant-time algorithms.
comment: 40 pages, 16 figures, Technical report for paper submitted to CSF 2023
♻ ☆ Quantifying the Importance of Data Alignment in Downstream Model Performance
Contrary to the conventional emphasis on dataset size, we explore the role of data alignment -- an often overlooked aspect of data quality -- in training capable Large Language Models (LLMs). To do so, we use the Task2Vec-based alignment coefficient, a quantitative measure of the similarity between two datasets, to quantify the impact of alignment between training data and evaluation data on downstream performance. In particular, we conduct controlled \textit{interventional} experiments for two settings: 1. the impact of increased alignment coefficients between various pre-training (pt) against evaluation datasets, and 2. the impact of increased alignment coefficients between domain specific fine-tuning (ft) against domain specific evaluation. The domain specific task we explore is Autoformalization -- the machine translation task between natural language and code for formal verification. In both settings, we find a strong, predictable negative correlation between the alignment coefficient of a model's training and evaluation data and the model's loss/perplexity on the respective downstream task. These findings suggest a re-evaluation of LLM training approaches, demonstrating the relevance of data alignment compared to data quantity, especially in specialized downstream tasks such as Autoformalization.
♻ ☆ Expressivity of AuDaLa: Turing Completeness and Possible Extensions
AuDaLa is a recently introduced programming language that follows the new data autonomous paradigm. In this paradigm, small pieces of data execute functions autonomously. Considering the paradigm and the design choices of AuDaLa, it is interesting to determine the expressivity of the language. In this paper, we implement Turing machines in AuDaLa and prove that implementation correct. This proves that AuDaLa is Turing complete, giving an initial indication of AuDaLa's expressivity. Additionally, we give examples of how to add extensions to AuDaLa to increase its practical expressivity and to better match conventional parallel languages, allowing for a more straightforward and performant implementation of algorithms.
comment: 30 pages, 1 figure, submitted to LMCS, extension of submission to FORTE (preprint at arXiv:2404.12934)