Workshop
2nd ICML Workshop on Machine Learning for Astrophysics
Francois Lanusse · Marc Huertas-Company · Brice Menard · Laurence Perreault-Levasseur · J. Xavier Prochaska · Uros Seljak · Francisco Villaescusa-Navarro · Ashley Villar
Meeting Room 317 B
As modern astrophysical surveys deliver an unprecedented amount of data, from the imaging of hundreds of millions of distant galaxies to the mapping of cosmic radiation fields at ultra-high resolution, conventional data analysis methods are reaching their limits in both computational complexity and optimality. Deep Learning has rapidly been adopted by the astronomical community as a promising way of exploiting these forthcoming big-data datasets and of extracting the physical principles that underlie these complex observations. This has led to an unprecedented exponential growth of publications combining Machine Learning and astrophysics. Yet, many of these works remain at an exploratory level and have not been translated into real scientific breakthroughs.

Following a successful initial iteration of this workshop at ICML 2022, our continued goal for this workshop series is to bring together Machine Learning researchers and domain experts in the field of Astrophysics to discuss the key open issues which hamper the use of Deep Learning for scientific discovery.
Schedule
Sat 12:00 p.m. – 12:05 p.m.

Welcome
(Opening Remarks)
Francois Lanusse
Sat 12:05 p.m. – 12:35 p.m.

Keynote I: Detecting and Adapting to Distribution Shift
(Keynote presentation)
Chelsea Finn
Sat 12:35 p.m. – 12:50 p.m.

Shared Stochastic Gaussian Process Decoders: A Probabilistic Generative Model for Quasar Spectra
(Oral)
This work proposes a scalable probabilistic latent variable model based on Gaussian processes (Lawrence, 2004) in the context of multiple observation spaces. We focus on an application in astrophysics where it is typical for data sets to contain both observed spectral features as well as scientific properties of astrophysical objects such as galaxies or exoplanets. In our application, we study the spectra of very luminous galaxies known as quasars, together with their properties, such as the mass of their central supermassive black hole, their accretion rate, and their luminosity; hence, there can be multiple observation spaces. A single data point is then characterised by different classes of observations which may have different likelihoods. Our proposed model extends the baseline stochastic variational Gaussian process latent variable model (GPLVM) to this setting, proposing a seamless generative model where the quasar spectra and the scientific labels can be generated simultaneously when modelled with a shared latent space acting as input to different sets of Gaussian process decoders, one for each observation space. Further, this framework allows training in the missing-data setting, where a large number of dimensions per data point may be unobserved. We demonstrate high-fidelity reconstructions of the spectra and the scientific labels during test-time inference and briefly discuss the scientific interpretations of the results along with the significance of such a generative model.
Vidhi Ramesh · Anna-Christina Eilers
Sat 12:50 p.m. – 1:05 p.m.

Disentangling gamma-ray observations of the Galactic Center using differentiable probabilistic programming
(Oral)
We motivate the use of differentiable probabilistic programming techniques in order to account for the large model space inherent to astrophysical $\gamma$-ray analyses. Targeting the long-standing Galactic Center $\gamma$-ray Excess (GCE) puzzle, we construct a differentiable forward model and likelihood that makes liberal use of GPU acceleration and vectorization in order to simultaneously account for a continuum of possible spatial morphologies consistent with the Excess emission in a fully probabilistic manner. Our setup allows for efficient inference over the large model space using variational methods. Beyond application to $\gamma$-ray data, a goal of this work is to showcase how differentiable probabilistic programming can be used as a tool to enable flexible analyses of astrophysical datasets.
Yitian Sun · Siddharth Mishra-Sharma · Tracy Slatyer · Yuqing Wu
Sat 1:05 p.m. – 1:30 p.m.

Morning Coffee Break
(Break)
Sat 1:30 p.m. – 2:00 p.m.

Keynote II: Foundation Models for Radio Astronomy
(Keynote presentation)
The Square Kilometre Array (SKA) will be the world's largest radio telescope, producing data volumes approaching exascale within a few years of operation. Extracting scientific value from those data in a timely manner will be a challenge that quickly goes beyond traditional analyses and instead requires robust domain-specific AI solutions. Here I will discuss how we have been building foundation models that can be adapted across different SKA precursor instruments, by applying self-supervised learning with instance differentiation to learn a multi-purpose representation for use in radio astronomy. For a standard radio astronomy use case, our models exceed baseline supervised classification performance by a statistically significant margin for most label volumes in the in-distribution classification case and for all label volumes in the out-of-distribution case. I will also show how such learned representations can be more widely scientifically useful, for example in similarity searches that allow us to find hybrid radio galaxies without any pre-labelled examples.
Anna Scaife
Sat 2:00 p.m. – 2:15 p.m.

Positional Encodings for Light Curve Transformers: Playing with Positions and Attention
(Oral)
We conducted empirical experiments to assess the transferability of a light curve transformer to datasets with different cadences and flux distributions using various positional encodings (PEs). We proposed a new approach to incorporate the temporal information directly into the output of the last attention layer. Our results indicated that using trainable PEs leads to significant improvements in transformer performance and training times. Our proposed PE on attention can be trained faster than the traditional non-trainable PE transformer while achieving competitive results when transferred to other datasets.
Guillermo Cabrera-Vives · Daniel Moreno-Cartagena · Pavlos Protopapas · Cristobal Donoso · Manuel Perez-Carrasco · Martina Cádiz-Leyton
Sat 2:15 p.m. – 2:30 p.m.

Detecting Tidal Features using Self-Supervised Learning
(Oral)
Low surface brightness substructures around galaxies, known as tidal features, are a valuable tool in the detection of past or ongoing galaxy mergers. Their properties can answer questions about the progenitor galaxies involved in the interactions. This paper presents promising results from a self-supervised machine learning model, trained on data from the Ultradeep layer of the Hyper Suprime-Cam Subaru Strategic Program optical imaging survey, designed to automate the detection of tidal features. We find that self-supervised models are capable of detecting tidal features and that our model outperforms previous automated tidal feature detection methods. The previous state-of-the-art method achieved 76% completeness for 22% contamination, while our model achieves considerably higher (96%) completeness for the same level of contamination.
Alice Desmons · Sarah Brough · Francois Lanusse
Sat 2:30 p.m. – 2:45 p.m.

Flow Matching for Scalable Simulation-Based Inference
(Oral)
Neural posterior estimation methods based on discrete normalizing flows have become established tools for simulation-based inference (SBI), but scaling them to high-dimensional problems can be challenging. Building on recent advances in generative modeling, we here present flow matching posterior estimation (FMPE), a technique for SBI using continuous normalizing flows. Like diffusion models, and in contrast to discrete flows, flow matching allows for unconstrained architectures, providing enhanced flexibility for complex data modalities. Flow matching, therefore, enables exact density evaluation, fast training, and seamless scalability to large architectures, making it ideal for SBI. To showcase the improved scalability of our approach, we apply it to a challenging astrophysics problem: for gravitational-wave inference, FMPE outperforms methods based on comparable discrete flows, reducing training time by 30% with substantially improved accuracy.
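For context, flow matching trains a continuous normalizing flow by regressing a vector field onto per-sample target velocities; a standard form of the conditional flow-matching objective for posterior estimation (our notation, not taken from the abstract; the paper's exact parameterization may differ) is

```latex
\mathcal{L}(\phi) \;=\;
\mathbb{E}_{t,\; (\theta_1, x) \sim p(\theta, x),\; \theta_t \sim p_t(\theta_t \mid \theta_1)}
\big\| v_\phi(\theta_t, t, x) - u_t(\theta_t \mid \theta_1) \big\|^2 ,
```

where $u_t$ is the velocity field of a simple conditional probability path (e.g. Gaussian) and the learned $v_\phi$, conditioned on the observation $x$, defines an ODE whose flow transports a base density to the posterior $p(\theta \mid x)$.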
Jonas Wildberger · Maximilian Dax · Simon Buchholz · Stephen R. Green · Jakob Macke · Bernhard Schölkopf
Sat 2:45 p.m. – 3:00 p.m.

Time Delay Cosmography with a Neural Ratio Estimator
(Oral)
We explore the use of a Neural Ratio Estimator (NRE) to determine the Hubble constant ($H_0$) in the context of time delay cosmography. Assuming a Singular Isothermal Ellipsoid (SIE) mass profile for the deflector, we simulate time delay measurements, image position measurements, and modeled lensing parameters. We train the NRE to output the posterior distribution of $H_0$ given the time delay measurements, the relative Fermat potentials (calculated from the modeled parameters and the measured image positions), the deflector redshift, and the source redshift. We compare the accuracy and precision of the NRE with traditional explicit likelihood methods in the limit where the latter is tractable and reliable, using Gaussian noise to emulate measurement uncertainties in the input parameters. The NRE posteriors closely track those from the conventional method and, while they show a slight tendency to overestimate uncertainties for the quad lensing configuration, they can be combined in a population inference without bias.
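For readers outside lensing, the link between the measured delays, the Fermat potentials, and $H_0$ is the standard time-delay-cosmography relation (our addition, in common notation):

```latex
\Delta t_{ij} \;=\; \frac{D_{\Delta t}}{c}\,\bigl(\phi_i - \phi_j\bigr),
\qquad
D_{\Delta t} \;=\; (1 + z_d)\,\frac{D_d D_s}{D_{ds}} \;\propto\; H_0^{-1},
```

where $\phi_i$ are the Fermat potentials at the image positions, $z_d$ is the deflector redshift, and $D_d$, $D_s$, $D_{ds}$ are angular diameter distances to the deflector, to the source, and between them; measuring $\Delta t_{ij}$ and modeling $\phi_i$ therefore constrains $H_0$.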

Ève Campeau-Poirier · Laurence Perreault-Levasseur · Adam Coogan · Yashar Hezaveh
Sat 3:00 p.m. – 4:00 p.m.

Lunch Break
(Break)
Sat 4:00 p.m. – 4:30 p.m.

Keynote III: Astrophysics Meets MLOps
(Keynote presentation)
Harnessing the power of machine learning (ML) for astrophysical discovery necessitates not only sophisticated models but also the implementation of robust operations, or MLOps. This talk will highlight the potential of MLOps to streamline the deployment of ML in astrophysics. We'll delve into the iterative cycle of data acquisition, model retraining, evaluation, deployment, and monitoring/telemetry, which collectively forms the engine of successful AI ventures. We'll explore key MLOps practices from industry, emphasizing the critical role of experiment tracking, reproducibility, data/model provenance and versioning, and effective collaboration in this process.
Dmitry Duev
Sat 4:30 p.m. – 4:45 p.m.

Diffusion generative modeling for galaxy surveys: emulating clustering for inference at the field level
(Oral)
We introduce a diffusion generative model to describe the distribution of galaxies in our Universe directly as a collection of points in 3D space, without resorting to binning or voxelization. The custom diffusion model, which employs graph neural networks as the backbone score function, can be used as an emulator that accurately reproduces essential summary statistics of the galaxy distribution and enables cosmological parameter estimation using gradient-based inference techniques. This approach allows for a comprehensive analysis of cosmological data by circumventing limitations inherent to summary-statistics-based as well as likelihood-free methods.
Carolina Cuesta · Siddharth Mishra-Sharma
Sat 4:45 p.m. – 5:00 p.m.

Field-Level Inference with Microcanonical Langevin Monte Carlo
(Oral)
Field-level inference provides a means to optimally extract information from upcoming cosmological surveys, but requires efficient sampling of a high-dimensional parameter space. This work applies Microcanonical Langevin Monte Carlo (MCLMC) to sample the initial conditions of the Universe, as well as the cosmological parameters $\sigma_8$ and $\Omega_m$, from simulations of cosmic structure. MCLMC is shown to be over an order of magnitude more efficient than traditional Hamiltonian Monte Carlo (HMC) for a $\sim 2.6 \times 10^5$-dimensional problem. Moreover, the efficiency of MCLMC compared to HMC greatly increases as the dimensionality increases, suggesting gains of many orders of magnitude for the dimensionalities required by upcoming cosmological surveys.
Adrian Bayer · Uros Seljak · Chirag Modi
Sat 5:00 p.m. – 5:15 p.m.

Spotting Hallucinations in Inverse Problems with Data-Driven Priors
(Oral)
Hallucinations are an inescapable consequence of solving inverse problems with deep neural networks. The expressiveness of recent generative models is the reason why they can yield results far superior to conventional regularizers; it can also lead to realistic-looking but incorrect features, potentially undermining trust in important aspects of the reconstruction. We present a practical and computationally efficient method to determine which regions in the solutions of inverse problems with data-driven priors are prone to hallucinations. By computing the diagonal elements of the Fisher information matrix of the likelihood and the data-driven prior separately, we can flag regions where the information is prior-dominated. Our diagnostic can be directly compared to the reconstructed solutions and enables users to decide if measurements in such regions are robust for their application. Our method scales linearly with the number of parameters and is thus applicable in high-dimensional settings, allowing it to be rolled out broadly for the large-volume data products of future wide-field surveys.
Matt Sampson · Peter Melchior
Sat 5:15 p.m. – 5:45 p.m.

Keynote IV: Teaching LLMs to Reason
(Keynote presentation)
What is required to make LLMs useful scientific assistants? In this talk we cover how LLMs perform on scientific reasoning tasks, including some recent results.
Ross Taylor
Sat 5:45 p.m. – 7:00 p.m.

Poster Session
(Poster session)
Sat 7:00 p.m. – 7:55 p.m.

Panel: How will new technologies such as foundation models/generative models/LLMs change the way we do scientific discoveries?
(Discussion Panel)
Peter Melchior · Yashar Hezaveh · Megan Ansdell · Yuan-Sen Ting · David W. Hogg · Irina Rish
Sat 7:55 p.m. – 8:00 p.m.

Workshop Wrap-Up
(Closing Remarks)


Learning the galaxy-environment connection with graph neural networks
(Poster)
Galaxies co-evolve with their host dark matter halos. Models of the galaxy-halo connection, calibrated using cosmological hydrodynamic simulations, can be used to populate dark matter halo catalogs with galaxies. We present a new method for inferring baryonic properties from dark matter subhalo properties using message-passing graph neural networks (GNNs). After training on subhalo catalog data from the IllustrisTNG300-1 hydrodynamic simulation, our GNN can infer stellar mass from the host and neighboring subhalo positions, kinematics, masses, and maximum circular velocities. We find that GNNs can also robustly estimate stellar mass from subhalo properties in 2D projection. While other methods typically model the galaxy-halo connection in isolation, our GNN incorporates information from galaxy environments, leading to more accurate stellar mass inference.
John F. Wu · Christian Jespersen


Multi-fidelity Emulator for Cosmological Large-Scale 21 cm Lightcone Images: A Few-shot Transfer Learning Approach with GAN
(Poster)
Large-scale numerical simulations ($\gtrsim 500\,\mathrm{Mpc}$) of cosmic reionization are required to match the large survey volume of the upcoming Square Kilometre Array (SKA). We present a multi-fidelity emulation technique for generating large-scale lightcone images of cosmic reionization. We first train generative adversarial networks (GANs) on small-scale simulations and transfer that knowledge to large-scale simulations with hundreds of training images. Our method achieves high accuracy in generating lightcone images, as measured by various statistics with errors mostly below 10%. This approach saves 90% of the computational resources compared to conventional training methods. Our technique enables efficient and accurate emulation of large-scale images of the Universe.
Kangning Diao · Yi Mao


PPDONet: Deep Operator Networks for Fast Prediction of Steady-State Solutions in Disk-Planet Systems
(Poster)
We have created a tool called the Protoplanetary Disk Operator Network (PPDONet) that quickly predicts disk-planet interactions in protoplanetary disks. Our tool uses Deep Operator Networks (DeepONets), a type of neural network that learns nonlinear operators to accurately represent both deterministic and stochastic differential equations. PPDONet maps three key parameters in a disk-planet system, namely the Shakura & Sunyaev viscosity $\alpha$, the disk aspect ratio $h_\mathrm{0}$, and the planet-star mass ratio $q$, to the steady-state solutions for disk surface density, radial velocity, and azimuthal velocity. We have validated the accuracy of PPDONet's solutions with an extensive array of tests. Our tool can calculate the result of a disk-planet interaction for a given system in under a second using a standard laptop. PPDONet is publicly accessible for use.
Shunyuan Mao · Ruobing Dong · Lu Lu · Kwang Moo Yi · Sifan Wang · Paris Perdikaris


Cosmology with Galaxy Photometry Alone
(Poster)
We present the first cosmological constraints from only the observed photometry of galaxies using neural density estimation (NDE). Villaescusa-Navarro et al. (2022) recently demonstrated that the internal physical properties of a single galaxy contain a significant amount of cosmological information. These physical properties, however, cannot be directly measured from observations. In this work, we present how we can go beyond theoretical demonstrations to infer cosmological constraints from actual galaxy observables. We use ensembled NDE and the CAMELS suite of hydrodynamical simulations to infer cosmological parameters from galaxy photometry. We find that the cosmological information in the photometry of a single galaxy is severely limited. However, since NDE dramatically reduces the cost of evaluating the posterior, we can feasibly combine the constraining power of photometry from many galaxies using hierarchical population inference and place significant cosmological constraints. With the observed photometry of $\sim$15,000 NASA-Sloan Atlas galaxies, we constrain $\Omega_m = 0.310^{+0.080}_{-0.098}$ and $\sigma_8 = 0.792^{+0.099}_{-0.090}$.
ChangHoon Hahn · Peter Melchior · Francisco Villaescusa-Navarro · Romain Teyssier


Cosmological Data Compression and Inference with Self-Supervised Machine Learning
(Poster)
The influx of massive amounts of new data from current and upcoming cosmological surveys necessitates compression schemes that can efficiently summarize the data with minimal loss of information. We investigate the potential of self-supervised machine learning to construct optimal summaries of cosmological datasets. Using a particular self-supervised method, VICReg (Variance-Invariance-Covariance Regularization), deployed on lognormal random fields as well as hydrodynamical cosmological simulations, we find that self-supervised learning can deliver highly informative summaries which can be used for downstream tasks, including providing precise and accurate constraints when used for parameter inference. Our results indicate that self-supervised machine learning techniques offer a promising new approach for cosmological data compression and analysis.
Aizhan Akhmetzhanova · Siddharth Mishra-Sharma · Cora Dvorkin


Neural Astrophysical Wind Models
(Poster)
The bulk kinematics and thermodynamics of hot supernovae-driven galactic winds are critically dependent on both the amount of swept-up cool clouds and non-spherical collimated flow geometry. However, accurately parameterizing these physics is difficult because their functional forms are often unknown, and because the coupled nonlinear flow equations contain singularities. We show that deep neural networks embedded as individual terms in the governing coupled ordinary differential equations (ODEs) can robustly discover both of these physics, without any prior knowledge of the true function structure, as a supervised learning task. We optimize a loss function based on the Mach number, rather than the three explicitly solved-for conserved variables, and apply a penalty term towards near-diverging solutions. The same neural network architecture is used for learning both the hidden mass-loading and surface area expansion rates. This work further highlights the feasibility of neural ODEs as a promising discovery tool with mechanistic interpretability for nonlinear inverse problems.
Dustin Nguyen


Assessing Summary Statistics with Mutual Information for Cosmological Inference
(Poster)
The ability to compress observational data and accurately estimate physical parameters relies heavily on informative summary statistics. In this paper, we introduce the use of mutual information (MI) as a means of evaluating the quality of summary statistics in inference tasks. MI can assess the sufficiency of summaries and provide a quantitative basis for comparison. We show that commonly adopted metrics for comparing statistics can be considered as processes of MI estimation, but with different assumptions. Based on this, we propose to estimate MI using the Barber-Agakov lower bound and normalizing-flow-based variational distributions. To demonstrate the effectiveness of our method, we compare three different summary statistics (namely the power spectrum, bispectrum, and scattering transform) in the context of inferring reionization parameters from mock SKA images. We find that this approach is able to correctly assess the informativeness of different summary statistics and allows us to select the optimal statistic for our inference task.
Ce Sui · Xiaosheng Zhao · Tao Jing · Yi Mao


Multi-Class Deep SVDD: Anomaly Detection Approach in Astronomy with Distinct Inlier Categories
(Poster)
With the increasing volume of astronomical data generated by modern survey telescopes, automated pipelines and machine learning techniques have become crucial for analyzing and extracting knowledge from these datasets. Anomaly detection, the task of identifying irregular or unexpected patterns in the data, is a complex challenge in astronomy. In this paper, we propose Multi-Class Deep SVDD (MCDSVDD), an extension of the state-of-the-art anomaly detection algorithm One-Class Deep SVDD, specifically designed to handle different inlier categories with distinct data distributions. MCDSVDD uses a neural network to map the data into hyperspheres, where each hypersphere represents a specific inlier category. The distance of each sample from the centers of these hyperspheres determines the anomaly score. We evaluate the effectiveness of MCDSVDD by comparing its performance with several anomaly detection algorithms on a large dataset of astronomical light curves obtained from the Zwicky Transient Facility (ZTF). Our results demonstrate the efficacy of MCDSVDD in detecting anomalous sources while leveraging the presence of different inlier categories.
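A minimal sketch of the scoring rule described above (our illustration with toy values, not the authors' code): given learned embeddings and per-class hypersphere centers, the anomaly score is the distance to the nearest center.

```python
import numpy as np

def mcdsvdd_score(z, centers):
    """Anomaly score: distance from each embedding to the nearest
    inlier-class hypersphere center (smaller = more inlier-like)."""
    # z: (n_samples, d) embeddings; centers: (n_classes, d) learned centers
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # squared distances
    return np.sqrt(d2.min(axis=1))  # nearest-center distance per sample

# toy usage: two inlier classes; one point near a center, one far from both
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
z = np.array([[0.1, 0.0], [5.0, 5.0]])
scores = mcdsvdd_score(z, centers)
```

In the paper the embeddings come from a trained network and samples with large scores are flagged as anomaly candidates.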
Manuel Perez-Carrasco · Guillermo Cabrera-Vives · Lorena Hernandez-García · Paula Sanchez-Saez · Amelia Bayo · Alejandra Muñoz-Arancibia · Nicolás Astorga


Bayesian Uncertainty Quantification in High-dimensional Stellar Magnetic Field Models
(Poster)
Spectropolarimetric inversion techniques, known as Zeeman Doppler imaging (ZDI), have become the standard tools for reconstructing surface magnetic field maps of stars. Accurate and efficient uncertainty quantification of such magnetic field maps is an open problem in current research, and the high dimensionality of the spherical-harmonic magnetic field parameterization makes inference inherently difficult. We propose a probabilistic machine learning framework for stellar surface magnetic field reconstruction using a gradient-based Metropolis-adjusted Langevin algorithm. By efficient implementation in JAX, our framework allows for reliable uncertainty quantification of the global stellar magnetic field topology. We test the proposed scheme on the bright, massive star Tau Scorpii, and show that our approach enables accurate computation of the posterior magnetic field distribution with fast convergence.
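To make the sampler concrete, here is a generic single step of the Metropolis-adjusted Langevin algorithm (MALA) on a toy Gaussian target; this is our NumPy sketch of the standard algorithm, not the authors' JAX implementation.

```python
import numpy as np

def mala_step(theta, log_p, grad_log_p, eps, rng):
    """One MALA step targeting exp(log_p): Langevin proposal + MH correction."""
    prop = theta + 0.5 * eps**2 * grad_log_p(theta) + eps * rng.standard_normal(theta.shape)

    def log_q(x, y):
        # log of the (asymmetric) Gaussian proposal density q(x | y), up to a constant
        m = y + 0.5 * eps**2 * grad_log_p(y)
        return -((x - m) ** 2).sum() / (2 * eps**2)

    log_alpha = log_p(prop) - log_p(theta) + log_q(theta, prop) - log_q(prop, theta)
    if np.log(rng.uniform()) < log_alpha:
        return prop, True   # accept
    return theta, False     # reject

# toy target: standard 2D Gaussian
rng = np.random.default_rng(0)
log_p = lambda t: -0.5 * (t ** 2).sum()
grad_log_p = lambda t: -t
theta = np.zeros(2)
samples = []
for _ in range(2000):
    theta, _ = mala_step(theta, log_p, grad_log_p, 0.8, rng)
    samples.append(theta)
samples = np.array(samples)
```

The gradient step biases proposals toward high-probability regions, which is what makes the method practical in high-dimensional parameter spaces such as spherical-harmonic field coefficients.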
Jennifer Andersson · Oleg Kochukhov · Zheng Zhao · Jens Sjölund


A Comparative Study on Generative Models for High Resolution Solar Observation Imaging
(Poster)
Solar activity is one of the main drivers of variability in our solar system and the key source of space weather phenomena that affect Earth and near-Earth space. The extensive record of high-resolution extreme ultraviolet (EUV) observations from the Solar Dynamics Observatory (SDO) offers an unprecedented, very large dataset of solar images. In this work, we make use of this comprehensive dataset to investigate the capabilities of current state-of-the-art generative models to accurately capture the data distribution behind the observed solar activity states. Starting from StyleGAN-based methods, we uncover severe deficits of this model family in handling fine-scale details of solar images when training on high-resolution samples, contrary to training on natural face images. When switching to the diffusion-based generative model family, we observe strong improvements in fine-scale detail generation. For the GAN family, we are able to achieve similar improvements in fine-scale generation when turning to ProjectedGANs, which use multi-scale discriminators with a pretrained frozen feature extractor. We conduct ablation studies to clarify the mechanisms responsible for proper fine-scale handling. Using distributed training on supercomputers, we are able to train generative models at up to 1024x1024 resolution that produce high-quality samples indistinguishable to human experts, as suggested by the evaluation we conduct. We make all code, models, and workflows used in this study publicly available.
Mehdi Cherti · Alexander Czernik · Stefan Kesselheim · Frederic Effenberger · Jenia Jitsev


3D ScatterNet: Inference from 21 cm Lightcones
(Poster)
The Square Kilometre Array (SKA) will have the sensitivity to measure the 3D lightcones of the 21 cm signal from the epoch of reionization. This signal, however, is highly non-Gaussian and cannot be fully interpreted by the traditional power spectrum. In this work, we introduce 3D ScatterNet, which combines normalizing flows with the solid harmonic wavelet scattering transform, a 3D CNN featurizer with inductive bias, to perform implicit likelihood inference (ILI) from 21 cm lightcones. We show that 3D ScatterNet outperforms ILI with a 3D CNN in the literature. It also reaches better performance than ILI with the power spectrum for varied lightcone effects and varied signal contaminations.
Xiaosheng Zhao · Yi Mao


Weisfeiler-Lehman Graph Kernel Method: A New Approach to Weak Chemical Tagging
(Poster)
Stars' chemical signatures provide invaluable insights into stellar cluster formation. This study utilized the Weisfeiler-Lehman (WL) graph kernel to examine a 15-dimensional elemental abundance space. Simulating chemical distributions with normalizing flows affirmed the effectiveness of our algorithm. The results highlight the capability of the WL algorithm, coupled with Gaussian process regression, to identify patterns within elemental abundance point clouds correlated with various cluster mass functions. Notably, the WL algorithm exhibits superior interpretability, efficacy, and robustness compared to deep sets and graph convolutional neural networks, and enables optimal training with significantly fewer simulations ($\mathcal{O}(10)$), a reduction of at least two orders of magnitude relative to graph neural networks.
Yuan-Sen Ting · Bhavesh Sharma


Population-Level Inference for Galaxy Properties from Broadband Photometry
(Poster)
We present a method to infer galaxy properties and redshifts at the population level from photometric data using normalizing flows. Our method, PopSED, can reliably recover the redshift and stellar mass distribution of $10^{5}$ galaxies using SDSS ugriz photometry in <1 GPU-hour, being $10^{6}$ times faster than the traditional SED modeling method. The approach can also be applied to spectroscopic data, including DESI and Gaia XP spectra. Our method provides an efficient and self-consistent way to learn the population posterior without deriving the posteriors for every individual object and then combining them.
Jiaxuan Li · Peter Melchior · ChangHoon Hahn · Song Huang


A cross-modal adversarial learning method for estimating photometric redshifts of quasars
(Poster)
Quasars play a crucial role in studying various important physical processes. We propose a cross-modal contrastive learning method for estimating the photometric redshifts of quasars. Our model utilizes adversarial training to enable the conversion between photometric data features (magnitudes, colors, etc.) and photometric image features in five bands (u, g, r, i, z), in order to extract modality-invariant features. We used $\Delta z=(z_{photo}-z_{spec})/(1+z_{spec})$ as the evaluation metric. The latest SOTA method, which implements cross-modal generation of simulated spectra from photometric data, was chosen as the baseline. First, the proposed method was tested on the same SDSS DR17 dataset of 415,930 quasars ($1 \le z_{spec} \le 5$) as the baseline method. Compared to the baseline, the RMSE of our $\Delta z$ decreased from 0.1235 to 0.1031. Further evaluation on a larger dataset of 465,292 quasars achieved a lower RMSE of $\Delta z$ of 0.0861. The method can also be generalized to other tasks such as galaxy classification and redshift estimation.
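As a concrete reading of the evaluation metric (our sketch with made-up redshift values, not the paper's data), the normalized residual and its RMSE can be computed as:

```python
import numpy as np

def delta_z(z_photo, z_spec):
    """Normalized photometric-redshift residual, (z_photo - z_spec) / (1 + z_spec)."""
    return (z_photo - z_spec) / (1.0 + z_spec)

# toy values for illustration only
z_spec = np.array([1.0, 2.0, 3.5])
z_photo = np.array([1.1, 1.9, 3.6])
dz = delta_z(z_photo, z_spec)
rmse = np.sqrt(np.mean(dz ** 2))  # aggregate accuracy over the sample
```

The $(1 + z_{spec})$ normalization makes a fixed absolute redshift error count for less at high redshift, which is the standard convention for photometric-redshift accuracy.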

Chen Zhang · Yanxia Zhang · Bin Jiang · Meixia Qu · Wenyu Wang


Harnessing the Power of Adversarial Prompting and Large Language Models for Robust Hypothesis Generation in Astronomy
(Poster)
This study investigates the application of Large Language Models (LLMs), specifically GPT-4, within Astronomy. We employ in-context prompting, supplying the model with up to 1000 papers from the NASA Astrophysics Data System, to explore the extent to which performance can be improved by immersing the model in domain-specific literature. Our findings point towards a substantial boost in hypothesis generation when using in-context prompting, a benefit that is further accentuated by adversarial prompting. We illustrate how adversarial prompting empowers GPT-4 to extract essential details from a vast knowledge base to produce meaningful hypotheses, signaling an innovative step towards employing LLMs for scientific research in Astronomy.
Ioana Ciuca · Yuan-Sen Ting · Sandor Kruk · Kartheik Iyer


Diffusion Models for Probabilistic Deconvolution of Galaxy Images
(Poster)
Telescopes capture images with a particular point-spread function (PSF). Inferring what an image would have looked like with a much sharper PSF, a problem known as PSF deconvolution, is ill-posed because PSF convolution is not an invertible transformation. Deep generative models are appealing for PSF deconvolution because they can infer a posterior distribution over candidate images that, if convolved with the PSF, could have generated the observation. However, classical deep generative models such as VAEs and GANs often provide inadequate sample diversity. As an alternative, we propose a classifier-free conditional diffusion model for PSF deconvolution of galaxy images. We empirically demonstrate that this diffusion model captures a greater diversity of possible deconvolutions compared to a conditional VAE.
Zhiwei Xue · Yuhang Li · Yash Patel · Jeffrey Regier


Closing the stellar labels gap: An unsupervised, generative model for Gaia BP/RP spectra
(
Poster
)
The recent release of 220+ million BP/RP spectra in Gaia DR3 presents an opportunity to apply deep learning models to an unprecedented number of stellar spectra, at extremely low resolution. The BP/RP dataset is so massive that no previous spectroscopic survey can provide enough stellar labels to cover the BP/RP parameter space. We present an unsupervised, deep, generative model for BP/RP spectra: a scatter variational autoencoder. We design a non-traditional variational autoencoder which is capable of modeling both (i) BP/RP coefficients and (ii) intrinsic scatter. Our model learns a latent space from which to generate BP/RP spectra (with scatter) directly from the data itself without requiring any stellar labels. We demonstrate that our model accurately reproduces BP/RP spectra in regions of parameter space where supervised learning fails or cannot be implemented. 
Alex Laroche · Joshua Speagle 🔗 
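The role of the modeled intrinsic scatter can be sketched with a heteroscedastic Gaussian likelihood: coefficients the decoder cannot reproduce are absorbed into a larger per-dimension scatter rather than inflating the reconstruction loss. A toy NumPy illustration (the numbers and the decoder stand-in are hypothetical, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_nll(x, mean, log_scatter):
    """Per-dimension Gaussian negative log-likelihood with intrinsic scatter."""
    var = np.exp(2 * log_scatter)
    return 0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

x = rng.normal(size=5)                             # toy BP/RP coefficients
mean = x + np.array([0.0, 0.0, 0.0, 0.0, 2.0])     # last coefficient poorly fit
tight = gaussian_nll(x, mean, log_scatter=np.zeros(5))
loose = gaussian_nll(x, mean, log_scatter=np.array([0.0, 0.0, 0.0, 0.0, 1.0]))

# a larger learned scatter on the badly-fit coefficient lowers its NLL,
# letting the model flag dimensions it cannot explain instead of overfitting
print(tight[-1], loose[-1])
```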


Graph Representation of the Magnetic Field Topology in HighFidelity Plasma Simulations for Machine Learning Applications
(
Poster
)
Topological analysis of the magnetic field in simulated plasmas allows the study of various physical phenomena in a wide range of settings. One such application is magnetic reconnection, a phenomenon related to the dynamics of the magnetic field topology, which is difficult to detect and characterize in three dimensions. We propose a scalable pipeline for topological data analysis and spatiotemporal graph representation of three-dimensional magnetic vector fields. We demonstrate our methods on simulations of the Earth's magnetosphere produced by Vlasiator, a supercomputer-scale Vlasov-theory-based simulation for near-Earth space. The purpose of this work is to challenge the machine learning community to explore graph-based machine learning approaches to address a largely open scientific problem with wide-ranging potential impact. 
Ioanna Bouri · Fanni Franssila · Markku J. Alho · Giulia Cozzani · Ivan Zaitsev · Minna Palmroth · Teemu Roos 🔗 


SimBIG: Field-level Simulation-based Inference of Large-scale Structure
(
Poster
)
Traditional methods for cosmological parameter inference from Large-Scale Structure (LSS) rely on summary statistics, such as power spectra, which may not fully capture the complex non-linear and non-Gaussian features of the LSS. Simulation-Based Inference (SBI), which uses forward models of the observables and machine learning to learn a posterior distribution over the parameters, can provide more robust inferences. This work presents novel constraints using SBI on LSS at the field level using Convolutional Neural Networks (CNNs) and Bayesian Neural Networks. We use the SimBIG forward modeling pipeline to generate realistic mock observations of the Baryon Oscillation Spectroscopic Survey (BOSS) at different cosmologies. We show that our method provides tighter constraints on cosmological parameters than methods based on compressing the data to the power spectrum, likely due to the CNN's ability to exploit non-Gaussian information. Furthermore, we validate our pipeline on out-of-distribution data generated using different forward models and show that our constraints generalize well, providing some robustness against model misspecification. This paper not only presents field-level parameter constraints from real LSS observations, but also introduces methods that will be useful for future analyses on larger boxes and smaller scales with SDSS data and future surveys like DESI. 
Pablo Lemos · Liam Parker · Chang-Hoon Hahn · Bruno Régaldo-Saint Blancard · Elena Massara · Shirley Ho · David Spergel · Chirag Modi · Azadeh Moradinezhad Dizgah · Michael Eickenberg · Jiamin Hou
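The core idea of amortized simulation-based inference — learn a mapping from simulated data back to parameters, then apply it to an observation — can be sketched with a linear estimator standing in for the CNN. The forward model, parameter names, and noise level below are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)

# toy forward model: three summary statistics as noisy linear functions
# of two parameters (hypothetical stand-ins for e.g. Omega_m, sigma_8)
A = np.array([[1.0, 0.5], [0.2, 1.3], [0.7, 0.7]])

def simulate(theta):
    return theta @ A.T + 0.1 * rng.normal(size=(len(theta), 3))

theta_train = rng.uniform(0, 1, size=(50_000, 2))   # draws from the prior
x_train = simulate(theta_train)

# amortized estimator: least-squares regression of parameters on data
# (a linear stand-in for the CNN posterior estimator)
W, *_ = np.linalg.lstsq(x_train, theta_train, rcond=None)

# once trained, inference on any new observation is a single forward pass
theta_test = rng.uniform(0, 1, size=(1000, 2))
x_test = simulate(theta_test)
err = np.abs(x_test @ W - theta_test).mean()
print(err)  # small: the estimator generalizes across observations
```

A neural posterior estimator replaces the linear map with a network that outputs a full posterior distribution rather than a point estimate.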



Domain Adaptation via Minimax Entropy for Real/Bogus Classification of Astronomical Alerts
(
Poster
)
Time-domain astronomy is advancing towards the analysis of multiple massive datasets in real time, prompting the development of multi-stream machine learning models. In this work, we study Domain Adaptation (DA) for real/bogus classification of astronomical alerts using four different datasets: HiTS, DES, ATLAS, and ZTF. We study the domain shift between these datasets, and improve a naive deep learning classification model using a fine-tuning approach and semi-supervised deep DA via Minimax Entropy (MME). We compare the balanced accuracy of these models for different source-target scenarios. We find that both the fine-tuned and MME models significantly improve the base model with as few as one labeled item per class coming from the target dataset, and that MME does not compromise its performance on the source dataset. 
Guillermo Cabrera-Vives · César Bolívar · Francisco Förster · Alejandra Muñoz-Arancibia · Manuel Pérez-Carrasco · esteban reyes · Larry Denneau 🔗 
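The quantity at the heart of Minimax Entropy can be written down compactly: the prediction entropy on unlabeled target alerts is maximized with respect to the classifier and minimized with respect to the feature extractor (Saito et al., 2019). A NumPy sketch of the entropy term itself, with toy logits:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(logits):
    """Mean prediction entropy over unlabeled target examples; in MME this
    is the quantity played adversarially between classifier and features."""
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

confident = np.array([[5.0, -5.0], [-4.0, 4.0]])   # toy real/bogus logits
uncertain = np.array([[0.1, 0.0], [0.0, 0.1]])
print(mean_entropy(confident), mean_entropy(uncertain))
```

In training, the sign flip between the two players is typically implemented with a gradient-reversal layer.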


SimBIG: Galaxy Clustering beyond the Power Spectrum
(
Poster
)
The study of the Universe revolves around understanding the fundamental parameters that describe the model of our Universe. These fundamental parameters are usually constrained by analyzing what we can observe from the sky such as galaxy distributions, the cosmic microwave background, etc. The paper uses the SimBIG framework, which leverages machine learning techniques and simulation-based inference to improve the constraints on these fundamental parameters by analyzing galaxy clustering. When we apply SimBIG to a fraction of the BOSS galaxy survey, we achieve significantly (1.2× and 2.7×) tighter constraints on cosmological parameters such as Ωm and σ8 compared to standard power spectrum analyses. Using only 10% of the BOSS volume, we obtain constraints on H0 and S8 that are competitive with those from other probes. Future work will extend SimBIG to upcoming galaxy surveys for even stronger constraints. 
Chang-Hoon Hahn · Pablo Lemos · Bruno Régaldo-Saint Blancard · Liam Parker · Michael Eickenberg · Shirley Ho · Jiamin Hou · Elena Massara · Chirag Modi · Azadeh Moradinezhad Dizgah · David Spergel



FLORAH: A generative model for halo assembly histories
(
Poster
)
Dark matter accounts for 85% of the matter in our Universe. The mass assembly history (MAH) of dark matter halos plays a leading role in shaping the formation and evolution of galaxies. MAHs are used extensively in semi-analytic models of galaxy formation, yet current analytical methods to generate them are unable to capture their relationship with the halo internal structure and large-scale environment. This paper introduces FLORAH, a machine-learning framework for generating assembly histories of dark matter halos. We train FLORAH on the assembly histories from the MultiDark N-body simulations and demonstrate its ability to recover key properties such as the time evolution of mass and dark matter concentration. By applying the Santa Cruz semi-analytic model on FLORAH-generated assembly histories, we show that FLORAH correctly captures assembly bias, which cannot be reproduced with current analytical methods. FLORAH is the first step towards a machine-learning-based framework for planting merger trees; this will allow the exploration of different galaxy formation scenarios with great computational efficiency at unprecedented accuracy. 
Tri Nguyen · Chirag Modi · Rachel Somerville · L. Y. Aaron Yung 🔗 


A Multi-input Convolutional Neural Network to Automate and Expedite Bright Transient Identification for the Zwicky Transient Facility
(
Poster
)
The Bright Transient Survey (BTS) relies on visual inspection ("scanning") to select sources for accomplishing its mission of spectroscopically classifying all bright extragalactic transients found by the Zwicky Transient Facility (ZTF). We present a multi-input convolutional neural network to provide a bright transient score to individual ZTF detections using their image data and 16 extracted features. Our model has the ability to eliminate the need for human scanning by automatically identifying and requesting spectroscopic observations of new bright (m<18.5 mag) transient candidates. In validation, the model is 92% pure and 97% complete, outperforming human scanners. The model is now running in real time on all new ZTF alert packets, allowing for real-time and real-world validation. 
Nabeel Rehemtulla · Adam Miller · Michael Coughlin · Theophile Jegou Du Laz 🔗 


A Novel Application of Conditional Normalizing Flows: Stellar Age Inference with Gyrochronology
(
Poster
)
Age data are critical building blocks of stellar evolutionary models, but challenging to measure for low-mass main-sequence (MS) stars. An unexplored solution in this regime is the application of probabilistic machine learning methods to gyrochronology, a stellar dating technique that is uniquely well suited for these stars. While accurate analytical gyrochronological models have eluded the field, here we demonstrate that a data-driven approach can be successful, by applying conditional normalizing flows to photometric data from open star clusters. We evaluate the flow results in the context of a Bayesian framework, and show that our inferred ages recover literature values well. This work demonstrates the potential of a probabilistic data-driven solution to significantly improve the effectiveness of gyrochronological stellar dating. 
Phil VanLane · Joshua Speagle 🔗 
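In one dimension, a conditional normalizing flow reduces to a change of variables whose parameters depend on the conditioning data. A minimal sketch with a single affine transform and toy linear conditioners (the paper's flow is of course deeper):

```python
import numpy as np

def affine_flow_logprob(age, cond, params):
    """Log-density of a 1-D conditional affine flow with a standard-normal
    base: z = (age - mu(cond)) / sigma(cond). The conditioners mu and
    log-sigma are toy linear functions of the photometric condition."""
    w_mu, b_mu, w_ls, b_ls = params
    mu = w_mu * cond + b_mu
    log_sigma = w_ls * cond + b_ls
    z = (age - mu) * np.exp(-log_sigma)
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))
    # change of variables: log p(age) = log N(z; 0, 1) + log |dz/d(age)|
    return log_base - log_sigma

params = (0.5, 1.0, 0.1, 0.0)           # hypothetical conditioner weights
ages = np.linspace(-10.0, 12.0, 4401)
density = np.exp(affine_flow_logprob(ages, cond=1.0, params=params))
integral = density.sum() * (ages[1] - ages[0])
print(integral)  # ≈ 1: the flow defines a proper conditional density
```

Stacking several such transforms, each with learned conditioners, yields a flexible conditional density over age given photometry.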


A Hierarchy of Normalizing Flows for Modelling the Galaxy-Halo Relationship
(
Poster
)
Using a large sample of galaxies taken from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, a suite of hydrodynamic simulations varying both cosmological and astrophysical parameters, we train a normalizing flow (NF) to map the probability of various galaxy and halo properties conditioned on astrophysical and cosmological parameters. By leveraging the learnt conditional relationships we can explore a wide range of interesting questions, whilst enabling simple marginalization over nuisance parameters. We demonstrate how the model can be used as a generative model for arbitrary values of our conditional parameters; we generate halo masses and matched galaxy properties, and produce realizations of the halo mass function as well as a number of galaxy scaling relations and distribution functions. The model represents a unique and flexible approach to modelling the galaxy-halo relationship. 
Chris Lovell · Sultan Hassan · Francisco Villaescusa-Navarro · Shy Genel · Chang-Hoon Hahn · Daniel Angles-Alcazar · James Kwon · Natalí Soler Matubaro de Santi · Kartheik Iyer · Giulio Fabbian · Greg Bryan



Toward a Spectral Foundation Model: An Attention-Based Approach with Domain-Inspired Fine-Tuning and Wavelength Parameterization
(
Poster
)
Astrophysical explorations are underpinned by large-scale stellar spectroscopy surveys, necessitating a paradigm shift in spectral fitting techniques. Our study proposes three enhancements to transcend the limitations of current spectral emulation models. We implement an attention-based emulator, adept at unveiling long-range information between wavelength pixels. We leverage a domain-specific fine-tuning strategy where the model is pre-trained on spectra with fixed stellar parameters and variable elemental abundances, followed by fine-tuning on the entire domain. Moreover, by treating wavelength as an autonomous model parameter, akin to neural radiance fields, the model can generate spectra on any wavelength grid. In the case of a training set of O(1000), our approach exceeds current leading methods by a factor of 5-10 across all metrics. 
Tomasz Różański · Yuan-Sen Ting · Maja Jablonska 🔗 
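Treating wavelength as an input coordinate, as in neural radiance fields, usually means encoding it with sinusoidal features before feeding it to the network together with the stellar labels. A sketch of such an encoding (the frequency choices here are illustrative, not the paper's):

```python
import numpy as np

def encode_wavelength(lam, n_freq=4):
    """NeRF-style positional encoding of (normalized) wavelength, so an
    emulator can predict flux on an arbitrary wavelength grid."""
    freqs = 2.0 ** np.arange(n_freq)
    ang = np.outer(lam, freqs)
    return np.concatenate([np.sin(ang), np.cos(ang)], axis=1)

coarse = encode_wavelength(np.linspace(0.0, 1.0, 7))
fine = encode_wavelength(np.linspace(0.0, 1.0, 1001))
# the same featurization works for any requested grid; an MLP taking these
# features plus stellar parameters can then emulate the spectrum pointwise
print(coarse.shape, fine.shape)
```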


Using Multiple Vector Channels Improves $E(n)$-Equivariant Graph Neural Networks
(
Poster
)
We present a natural extension to $E(n)$-equivariant graph neural networks that uses multiple equivariant vectors per node. We formulate the extension and show that it improves performance across benchmark tasks on different physical systems, with minimal differences in runtime or number of parameters. The proposed multi-channel EGNN outperforms the standard single-channel EGNN on N-body charged particle dynamics, molecular property predictions, and predicting the trajectories of solar system bodies. Given the additional benefits and minimal additional cost of the multi-channel EGNN, we suggest that this extension may be of practical use to researchers working in machine learning for astrophysics and cosmology.

Daniel Levy · Sékou-Oumar Kaba · Carmelo Gonzales · Santiago Miret · Siamak Ravanbakhsh 🔗 
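The multi-channel construction stays equivariant because each vector channel is updated by relative positions weighted by rotation-invariant coefficients. A NumPy sketch with fixed weights standing in for the learned (distance-dependent, hence invariant) messages:

```python
import numpy as np

rng = np.random.default_rng(2)

def multichannel_update(pos, vecs, weights):
    """Toy update for several equivariant vector channels per node: each
    channel accumulates relative positions scaled by invariant per-edge
    coefficients."""
    rel = pos[:, None, :] - pos[None, :, :]        # (n, n, dim)
    msg = np.einsum('ijc,ijd->icd', weights, rel)  # (n, channels, dim)
    return vecs + msg

n, c, d = 5, 3, 3
pos = rng.normal(size=(n, d))
vecs = rng.normal(size=(n, c, d))
w = rng.normal(size=(n, n, c))   # stand-ins for learned invariant messages

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # a random orthogonal matrix

out_then_rot = multichannel_update(pos, vecs, w) @ Q
rot_then_out = multichannel_update(pos @ Q, vecs @ Q, w)
print(np.allclose(out_then_rot, rot_then_out))  # the update commutes with rotation
```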


Multiscale Flow for Robust and Optimal Cosmological Analysis
(
Poster
)
We propose Multiscale Flow, a generative Normalizing Flow that creates samples and models the field-level likelihood of two-dimensional cosmological data such as weak lensing, thus enabling simulation-based likelihood inference. Multiscale Flow uses a hierarchical decomposition of cosmological fields via a wavelet basis, and then models different wavelet components separately as Normalizing Flows. This decomposition allows us to separate the information from different scales and identify distribution shifts in the data such as unknown scale-dependent systematics. The resulting likelihood analysis can not only identify these types of systematics, but can also be made optimal, in the sense that the Multiscale Flow can learn the full likelihood at the field level without any dimensionality reduction. 
Biwei Dai · Uros Seljak 🔗 
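The claim that no information is lost rests on the wavelet decomposition being invertible. A one-level Haar split of a toy field, showing that the coarse and detail bands reconstruct the input exactly (the flows modeling each band are not sketched here):

```python
import numpy as np

rng = np.random.default_rng(3)

def haar2d(x):
    """One level of a 2-D Haar transform: coarse band + 3 detail bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    coarse = (a + b + c + d) / 2
    details = ((a + b - c - d) / 2, (a - b + c - d) / 2, (a - b - c + d) / 2)
    return coarse, details

def inv_haar2d(coarse, details):
    h, v, dg = details
    a, b = (coarse + h + v + dg) / 2, (coarse + h - v - dg) / 2
    c, d = (coarse - h + v - dg) / 2, (coarse - h - v + dg) / 2
    out = np.empty((2 * coarse.shape[0], 2 * coarse.shape[1]))
    out[0::2, 0::2], out[0::2, 1::2] = a, b
    out[1::2, 0::2], out[1::2, 1::2] = c, d
    return out

field = rng.normal(size=(8, 8))   # stand-in for a weak-lensing map
coarse, details = haar2d(field)
print(np.allclose(inv_haar2d(coarse, details), field))  # lossless split
```

Applying the split recursively to the coarse band gives the hierarchy of scales, each of which can be modeled by its own flow.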


Towards Unbiased Gravitational-Wave Parameter Estimation using Score-Based Likelihood Characterization
(
Poster
)
Gravitational wave (GW) parameter estimation has conventionally relied on the assumption of Gaussian and stationary noise. However, noise from real-world detectors, such as LIGO, Virgo and KAGRA, often deviates considerably from these assumptions. In this paper, we use score-based diffusion models to learn an empirical noise distribution directly from detector data, which can then be combined with the forward simulator of the physical model to provide an unbiased model of the likelihood function. We validate the method by performing inference on a simulated gravitational wave event injected into real detector noise from LIGO, demonstrating its potential for providing accurate and scalable GW parameter estimation. 
Ronan Legin · Kaze Wong · Maximiliano Isi · Alexandre Adam · Laurence PerreaultLevasseur · Yashar Hezaveh 🔗 
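The construction reduces to evaluating the learned noise density at the residual between data and waveform template. A toy NumPy version with a Gaussian noise model standing in for the score-based diffusion model, and a hypothetical damped-sinusoid signal:

```python
import numpy as np

rng = np.random.default_rng(6)

def log_likelihood(data, template, noise_logpdf):
    """p(data | theta) = p_noise(data - h(theta)): the noise model
    evaluated at the residual after subtracting the template."""
    return noise_logpdf(data - template)

def gaussian_noise_logpdf(r):
    # stand-in for the empirical, score-based noise model of the paper
    return -0.5 * np.sum(r**2 + np.log(2 * np.pi))

t = np.linspace(0.0, 1.0, 1000)
signal = 3.0 * np.sin(2 * np.pi * 30 * t) * np.exp(-4 * t)  # toy burst
data = signal + rng.normal(size=t.size)                     # injected event

ll_signal = log_likelihood(data, signal, gaussian_noise_logpdf)
ll_noise_only = log_likelihood(data, np.zeros_like(t), gaussian_noise_logpdf)
print(ll_signal > ll_noise_only)  # the injected template is preferred
```

Replacing the Gaussian density with a learned score-based one keeps the same likelihood structure while dropping the Gaussianity and stationarity assumptions.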


nbi: the Astronomer's Package for Neural Posterior Estimation
(
Poster
)
Despite the growing popularity of Neural Posterior Estimation (NPE) methods in astronomy, the adoption of such techniques into routine data analysis has been slow. We identify three critical issues: the steep learning curve of NPE for domain scientists, the inexactness of inference, and the under-specification of physical forward models. To address the first two issues, we introduce a new framework and open-source software \textit{nbi}: Neural Bayesian Inference, which implements both amortized and sequential NPE. First, \textit{nbi} provides built-in ``featurizer'' networks with demonstrated efficacy on sequential data, such as light curves and spectra, thus eliminating the need for customization on the user end. Second, we introduce a modified algorithm, SNPE-IS, which facilitates asymptotically exact inference by using the surrogate posterior only as a proposal distribution for importance sampling. These features allow \textit{nbi} to be applied off-the-shelf to astronomical inference problems involving light curves and spectra, which would otherwise be tackled with MCMC and Nested Sampling. Our package\footnote{Anonymized for review.} is at \url{https://github.com/nbireview/nbi}. An application paper is concurrently submitted to this workshop and included in the appendix for reviewing purposes. 
Keming Zhang · Josh Bloom 🔗 
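Using the surrogate posterior only as a proposal for importance sampling means inexactness in the surrogate is corrected by the weights, provided the exact (unnormalized) posterior can be evaluated. A one-dimensional NumPy sketch with Gaussians standing in for both densities:

```python
import numpy as np

rng = np.random.default_rng(4)

def gauss_logpdf(x, mu, sigma):
    return -0.5 * (((x - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma**2))

# surrogate posterior (deliberately biased and too wide) -> proposal only
q_mu, q_sigma = 0.3, 1.5
# "exact" posterior, evaluable up to a constant (likelihood x prior)
p_mu, p_sigma = 0.0, 1.0

theta = rng.normal(q_mu, q_sigma, size=200_000)   # draws from the surrogate
log_w = gauss_logpdf(theta, p_mu, p_sigma) - gauss_logpdf(theta, q_mu, q_sigma)
w = np.exp(log_w - log_w.max())
w /= w.sum()                                      # self-normalized weights

# the raw surrogate mean is biased; the importance-weighted mean is not
print(theta.mean(), np.sum(w * theta))
```

As long as the proposal covers the posterior, such self-normalized estimates converge to the true posterior expectations, which is the sense in which the scheme is asymptotically exact.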


RealTime Stellar Spectra Fitting with Amortized Neural Posterior Estimation
(
Poster
)
In this paper, we demonstrate the utility of Amortized Neural Posterior Estimation (ANPE) for the problem of stellar spectra fitting. We introduce an effective approach to handle the measurement noise properties inherent in spectral data. This allows training-time data to resemble actual observed data, an aspect that is crucial for ANPE applications. We apply this approach to train an ANPE model for the APOGEE survey, which observed over 2 million spectra, and demonstrate its efficacy on both mock and real APOGEE spectra. To train the ANPE model, we applied our new NPE framework, Neural Bayesian Inference (\textit{nbi}), which is concurrently submitted to this workshop as an NPE framework optimized for stress-free astronomical applications. Application of this framework allowed us to train the ANPE model with minimal customization and coding effort. Given the association of spectral data properties with the observing instrument, we propose the idea of an ANPE ``model zoo,'' where models are trained for specific instruments and distributed with the \textit{nbi} framework to facilitate real-time stellar parameter inference. 
Keming Zhang · Tharindu Jayasinghe · Josh Bloom 🔗 