As modern astrophysical surveys deliver an unprecedented amount of data, from the imaging of hundreds of millions of distant galaxies to the mapping of cosmic radiation fields at ultra-high resolution, conventional data analysis methods are reaching their limits in both computational complexity and optimality. Deep Learning has rapidly been adopted by the astronomical community as a promising way of exploiting these forthcoming big-data datasets and of extracting the physical principles that underlie these complex observations. This has led to an unprecedented exponential growth of publications, with about 500 astrophysics papers mentioning deep learning or neural networks in their abstract in the last year alone. Yet many of these works remain at an exploratory level and have not been translated into real scientific breakthroughs.

The goal of this workshop is to bring together Machine Learning researchers and domain experts in the field of Astrophysics to discuss the key open issues which hamper the use of Deep Learning for scientific discovery. Rather than focusing on the benefits of deep learning for astronomy, the proposed workshop aims at overcoming its limitations.

Topics that we aim to cover include, but are not limited to, high-dimensional Bayesian inference, simulation-based inference, uncertainty quantification and robustness to covariate shifts, anomaly and outlier detection, and symmetries and equivariance. In addition, we plan on hosting meta-research panel discussions on successfully bringing ML to Astrophysics.
Fri 5:45 a.m. – 6:00 a.m.

Welcome and Workshop Introduction
(Introduction)
Fri 6:00 a.m. – 6:50 a.m.

Simulation-based inference and the places it takes us
(Invited Keynote Presentation)
Many fields of science make extensive use of mechanistic forward models which are implemented through numerical simulators, requiring the use of simulation-based approaches to statistical inference. I will talk about our recent work on developing simulation-based inference methods using flexible density estimators parameterised with neural networks, our efforts on benchmarking these approaches, and applications to modelling problems in astrophysics, neuroscience and computational imaging.
Jakob Macke
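To make the idea concrete for readers new to simulation-based inference, the sketch below implements rejection Approximate Bayesian Computation, the simplest simulation-based method, on a toy Gaussian simulator. The simulator, summary statistic, and tolerance are illustrative choices of ours, not the neural density-estimation approach discussed in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulator(theta, n=50):
    """Toy mechanistic forward model: n noisy observations around theta."""
    return theta + rng.normal(0.0, 1.0, size=n)

def summary(x):
    """Summary statistic used to compare simulated and observed data."""
    return x.mean()

# "Observed" data generated at a known ground-truth parameter.
theta_true = 2.0
x_obs = simulator(theta_true)

# Rejection ABC: keep prior draws whose simulated summary falls
# within a small tolerance of the observed summary.
prior_draws = rng.uniform(-5.0, 5.0, size=20000)
accepted = [t for t in prior_draws
            if abs(summary(simulator(t)) - summary(x_obs)) < 0.2]
posterior_mean = float(np.mean(accepted))
```

The accepted draws approximate the posterior over theta; neural simulation-based inference methods replace this wasteful accept/reject loop with a density estimator trained on (theta, x) pairs.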
Fri 6:50 a.m. – 7:00 a.m.

Q&A
(Keynote talk live Q&A)
Fri 7:00 a.m. – 7:30 a.m.

Break
Fri 7:30 a.m. – 7:45 a.m.

GaMPEN: An ML Framework for Estimating Galaxy Morphological Parameters and Quantifying Uncertainty
(Oral)
We introduce a novel machine learning framework for estimating the Bayesian posteriors of morphological parameters for arbitrarily large numbers of galaxies. The Galaxy Morphology Posterior Estimation Network (GaMPEN) estimates values and uncertainties for a galaxy's bulge-to-total light ratio, effective radius, and flux. GaMPEN also uses a Spatial Transformer Network (STN) to automatically crop input galaxy frames to an optimal size before determining their morphology. Training and testing GaMPEN on galaxies simulated to match z < 0.75 galaxies in Hyper Suprime-Cam Wide images, we demonstrate that GaMPEN can accurately quantify uncertainties and estimate parameters. GaMPEN is the first machine learning framework for determining posterior distributions of multiple morphological parameters, and is also the first application of an STN to optical imaging in astronomy.
Aritra Ghosh · C. Megan Urry · Amrit Rau · Laurence Perreault-Levasseur
Fri 7:45 a.m. – 8:00 a.m.

Unsupervised Learning for Stellar Spectra with Deep Normalizing Flows
(Oral)
Stellar spectra encode detailed information about stars. However, most machine learning approaches in stellar spectroscopy focus on supervised learning. We introduce Mendis, an unsupervised learning method which adopts normalizing flows, consisting of Neural Spline Flows and GLOW, to describe the complex distribution of spectral space. A key advantage of Mendis is that we can describe the conditional distribution of spectra, conditioned on stellar parameters, to further unveil the underlying structures of the spectra. In particular, our study demonstrates that Mendis can robustly capture the pixel correlations in the spectra, opening up the possibility of detecting unknown atomic transitions from stellar spectra. The probabilistic nature of Mendis also enables a rigorous determination of outliers in extensive spectroscopic surveys without the need to first measure elemental abundances through existing analysis pipelines.
Ioana Ciuca · Yuan-Sen Ting
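As a generic illustration of the change-of-variables principle underlying normalizing flows (a one-parameter toy of ours, not the Mendis architecture), a single affine transform of a standard-normal base distribution already yields an exact, normalized log-density:

```python
import numpy as np

def base_logpdf(z):
    """Log-density of the standard-normal base distribution."""
    return -0.5 * z**2 - 0.5 * np.log(2.0 * np.pi)

def flow_logpdf(x, scale, shift):
    """Log-density of x under the flow x = scale * z + shift, via the
    change-of-variables formula: invert the transform, evaluate the
    base density, and subtract the log absolute Jacobian."""
    z = (x - shift) / scale
    return base_logpdf(z) - np.log(abs(scale))
```

Deep flows such as Neural Spline Flows and GLOW stack many such invertible transforms, each contributing its own Jacobian term to the log-density.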
Fri 8:00 a.m. – 8:15 a.m.

Strong Lensing Source Reconstruction Using Continuous Neural Fields
(Oral)
From the nature of dark matter to the expansion rate of our Universe, observations of distant galaxies distorted through strong gravitational lensing have the potential to answer some of the major open questions in astrophysics. Modeling galaxy-galaxy strong lensing observations presents a number of challenges, as the exact configuration of both the background source and the foreground lens galaxy is unknown. With a number of upcoming surveys promising high-resolution lensing images, there is a timely need for methods that can efficiently model lenses at their full complexity. In this work, we introduce a novel method that uses continuous neural fields to reconstruct the complex morphology of a source galaxy while simultaneously inferring a distribution over foreground lens configurations. We demonstrate the efficacy of our method through experiments on simulated data targeting high-resolution lensing images similar to those anticipated in near-future astrophysical surveys.
Siddharth Mishra-Sharma · Ge Yang
Fri 8:15 a.m. – 9:05 a.m.

Capturing the First Portrait of Our Milky Way's Black Hole & Beyond
(Invited Keynote Presentation)
This talk will present the methods and procedures used to produce the first image of Sagittarius A*, the black hole at the heart of the Milky Way galaxy. It has been theorized for decades that a black hole will leave a "shadow" on a background of hot gas. Taking a picture of this black hole shadow could help to address a number of important scientific questions, both on the nature of black holes and the validity of general relativity. Unfortunately, due to its small size, traditional imaging approaches would require an Earth-sized radio telescope. In this talk, I discuss techniques we have developed to photograph the M87* and Sagittarius A* black holes using the Event Horizon Telescope, a network of telescopes scattered across the globe. Imaging Sagittarius A* proved even more challenging than M87*, due to the time-variability and interstellar scattering that had to be accounted for. I will summarize how the data from the 2017 observations were imaged and highlight the challenges that had to be addressed in order to capture an image of Sagittarius A*, including newly developed methods we used to characterize the morphology and uncertainty. Although we have learned a lot from these images already, remaining scientific questions motivate us to improve this computational telescope to see black hole phenomena still invisible to us. This talk will also discuss how we are developing techniques that will allow us to extract the evolving structure of our own Milky Way's black hole over the course of a night in the future, perhaps even in three dimensions.
Katherine Bouman
Fri 9:05 a.m. – 9:15 a.m.

Q&A
(Keynote talk live Q&A)
Fri 9:15 a.m. – 10:30 a.m.

Lunch Break
(Break)
Fri 10:30 a.m. – 11:20 a.m.

Uncertainty Quantification in Deep Learning
(Invited Keynote Presentation)
Deep learning models are bad at signalling failure: they can make predictions with high confidence, and this is problematic in real-world applications such as healthcare, self-driving cars, and natural language systems, where there are considerable safety implications, or where there are discrepancies between the training data and the data the model makes predictions on. There is a pressing need both for understanding when models should not make predictions and for improving model robustness to natural changes in the data. We'll give an overview of this problem setting. We also highlight promising avenues from recent work, including methods which average over multiple neural network predictions, such as Bayesian neural nets and ensembles, as well as the recent surge in large pretrained models.
Dustin Tran
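The simplest of the averaging methods mentioned in the abstract is a deep ensemble whose members each predict a Gaussian mean and variance; the ensemble is then read as an equally weighted mixture. A minimal sketch of that combination step (generic, not tied to any particular library):

```python
import numpy as np

def ensemble_predict(means, variances):
    """Combine per-member Gaussian predictions into a single predictive
    mean and variance, treating the ensemble as an equally weighted
    Gaussian mixture (law of total variance)."""
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)
    mix_mean = means.mean(axis=0)
    # Within-member noise plus between-member disagreement.
    mix_var = variances.mean(axis=0) + (means**2).mean(axis=0) - mix_mean**2
    return mix_mean, mix_var
```

Members that disagree inflate the between-member term, so the predictive variance grows exactly where the model should be signalling possible failure.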
Fri 11:20 a.m. – 11:30 a.m.

Q&A
(Keynote talk live Q&A)
Fri 11:30 a.m. – 11:45 a.m.

Reconstructing the Universe with Variational Self-Boosted Sampling
(Oral)
Forward modeling approaches in cosmology seek to reconstruct the initial conditions at the beginning of the Universe from the observed survey data. However, the high dimensionality of the parameter space poses a challenge to exploring the full posterior with traditional algorithms such as Hamiltonian Monte Carlo (HMC) and variational inference (VI). Here we develop a hybrid scheme called variational self-boosted sampling (VBS) that learns a variational approximation for the proposal distribution of HMC with samples generated on the fly, and in turn generates independent samples as proposals for the MCMC chain to reduce its autocorrelation length. We use a normalizing flow with Fourier-space convolutions as our variational distribution to scale to the high dimensions of interest. We show that after a short initial warm-up and training phase, VBS generates better-quality samples than simple VI and reduces the correlation length in the sampling phase by a factor of 10–50 over using only HMC.
Chirag Modi · Yin Li · David Blei
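For readers unfamiliar with the HMC building block that VBS augments, here is a minimal leapfrog HMC sampler for a one-dimensional standard-normal target. This is purely illustrative (the step size and trajectory length are arbitrary choices of ours); VBS itself pairs HMC with a learned normalizing-flow proposal.

```python
import numpy as np

rng = np.random.default_rng(1)

def grad_neg_logp(x):
    """Gradient of -log p(x) for a standard-normal target."""
    return x

def hmc_step(x, step=0.2, n_leap=10):
    """One HMC transition: sample a momentum, integrate Hamiltonian
    dynamics with leapfrog, then Metropolis accept/reject."""
    p = rng.normal()
    x_new, p_new = x, p
    p_new -= 0.5 * step * grad_neg_logp(x_new)
    for i in range(n_leap):
        x_new += step * p_new
        if i < n_leap - 1:
            p_new -= step * grad_neg_logp(x_new)
    p_new -= 0.5 * step * grad_neg_logp(x_new)
    h_old = 0.5 * x**2 + 0.5 * p**2
    h_new = 0.5 * x_new**2 + 0.5 * p_new**2
    return x_new if rng.uniform() < np.exp(h_old - h_new) else x

x, samples = 3.0, []
for _ in range(2000):
    x = hmc_step(x)
    samples.append(x)
samples = np.array(samples)
```

The chain's remaining autocorrelation is what schemes like VBS attack, by occasionally injecting independent proposals from the learned variational distribution.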
Fri 11:45 a.m. – 12:00 p.m.

TNT: Vision Transformer for Turbulence Simulations
(Oral)
Turbulent dynamics is difficult to predict due to its multi-scale nature and sensitivity to small perturbations. Classical solvers for turbulence simulation generally operate on fine grids and are computationally inefficient. In this paper, we propose the Turbulence Neural Transformer (TNT), a learned machine learning (ML) simulator based on the Transformer architecture that predicts turbulent dynamics on coarse grids. TNT extends the positional embeddings of the vanilla Transformer to a spatiotemporal setting to learn representations in the 3D time-series domain, and applies Temporal Mutual Self-Attention (TMSA), which captures adjacent dependencies, to extract deep and dynamic features. TNT is capable of generating comparatively long-range predictions stably and accurately, and we show that TNT outperforms the state-of-the-art U-Net-based simulator on all metrics evaluated. We also test the model's performance with different components removed and evaluate its robustness to different initial conditions. Although more experiments are needed, we conclude that TNT has great potential to outperform existing solvers and generalize to most simulation datasets.
Yuchen Dang · Zheyuan Hu · Miles Cranmer · Michael Eickenberg · Shirley Ho
Fri 12:00 p.m. – 12:30 p.m.

Break
Fri 12:30 p.m. – 1:20 p.m.

Equivariant machine learning, structured like classical physics
(Invited Keynote Presentation)
In this talk we give an introduction to equivariant machine learning: a methodology that restricts the learning to the space of functions that obey the symmetries of classical physics. We will mention how we can apply this methodology to learn properties of detailed cosmological simulations from more approximate simulations.
Soledad Villar
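The defining property can be checked numerically in a few lines. In this toy example of ours (not taken from the talk), a vector function of the form f(x) = x · g(‖x‖), built from the rotation-invariant norm, commutes with rotations: f(Rx) = R f(x).

```python
import numpy as np

def equivariant_fn(x):
    """A rotation-equivariant vector function: the input vector scaled
    by a scalar function of its rotation-invariant norm."""
    return x * np.tanh(np.linalg.norm(x))

def rotation(theta):
    """2D rotation matrix."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

x = np.array([1.0, 2.0])
R = rotation(0.7)
lhs = equivariant_fn(R @ x)   # transform first, then apply f
rhs = R @ equivariant_fn(x)   # apply f first, then transform
```

Restricting a model to functions of this form builds the symmetry in by construction, rather than hoping it is learned from data augmentation.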
Fri 1:20 p.m. – 1:30 p.m.

Q&A
(Keynote talk live Q&A)
Fri 1:30 p.m. – 1:45 p.m.

Galaxy Merger Reconstruction with Equivariant Graph Normalizing Flows
(Oral)
A key yet unresolved question in modern-day astronomy is how galaxies formed and evolved under the paradigm of the ΛCDM model. A critical limiting factor lies in the lack of robust tools to describe the merger history through a statistical model. In this work, we employ a generative graph network, the E(n) Equivariant Graph Normalizing Flows model. We demonstrate that, by treating the progenitors as a graph, our model robustly recovers their distributions, including their masses, merging redshifts and pairwise distances at redshift z = 2, conditioned on their z = 0 properties. The generative nature of the model enables other downstream tasks, including likelihood-free inference, anomaly detection, and identifying subtle correlations among progenitor features.
Kwok Sun Tang · Yuan-Sen Ting
Fri 1:45 p.m. – 2:00 p.m.

Hybrid Physical-Neural ODEs for Fast N-body Simulations
(Oral)
We present a new scheme to compensate for the small-scale approximations of Particle-Mesh (PM) schemes for cosmological N-body simulations. Such simulations are fast, low-cost realizations of the large-scale structure, but they lack resolution on small scales and cannot give accurate halo matter profiles. To recover this missing accuracy, we employ a neural network as a Fourier-space filter to parameterize the correction terms and compensate for the PM approximations. We compare our results to those obtained with the Potential Gradient Descent (PGD) scheme. We find that our approach outperforms PGD on smaller scales and is more robust to changes in the simulation settings used during training.
Denise Lanzieri · Francois Lanusse · Jean-Luc Starck
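The correction described above acts multiplicatively on the Fourier modes of the simulated field. In the paper the transfer function is parameterized by a neural network; the surrounding plumbing can be sketched with a fixed, hand-written filter (the function names and the Gaussian low-pass form below are our illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)

def apply_fourier_filter(field, transfer_fn):
    """Multiply a 2D field's Fourier modes by an isotropic transfer
    function of |k|, then transform back to real space."""
    n = field.shape[0]
    k = np.fft.fftfreq(n)
    kx, ky = np.meshgrid(k, k, indexing="ij")
    kmag = np.sqrt(kx**2 + ky**2)
    field_k = np.fft.fft2(field)
    return np.fft.ifft2(field_k * transfer_fn(kmag)).real

field = rng.normal(size=(32, 32))
# An identity transfer function leaves the field unchanged ...
same = apply_fourier_filter(field, lambda k: np.ones_like(k))
# ... while a low-pass filter suppresses small-scale power.
smooth = apply_fourier_filter(field, lambda k: np.exp(-(k / 0.1) ** 2))
```

Replacing the fixed lambda with a learned, differentiable transfer function is what allows the correction to be trained end-to-end inside the ODE integrator.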
Fri 2:00 p.m. – 2:15 p.m.

Uncovering dark matter density profiles in dwarf galaxies with graph neural networks
(Oral)
Dwarf galaxies are small dark matter-dominated galaxies, some of which are embedded within the Milky Way; their lack of baryonic matter (stars and gas) makes them perfect testbeds for dark matter detection. Understanding the distribution of dark matter in these systems can be used to pin down microphysical dark matter interactions that influence the formation and evolution of structures in our Universe. We introduce a new approach for inferring the dark matter density profiles of dwarf galaxies from the observable kinematics of stars bound to these systems using graph-based machine learning. Our approach aims to address some of the limitations of established methods based on dynamical Jeans modeling, such as the necessity of assuming equilibrium and the reliance on second-order moments of the stellar velocity distribution. We show that by leveraging more information about the available phase space of bound stars, this method can place stronger constraints on dark matter profiles in dwarf galaxies and has the potential to resolve some of the ongoing puzzles associated with the small-scale structure of dark matter halos.
Tri Nguyen · Siddharth Mishra-Sharma · Lina Necib
Fri 2:15 p.m. – 2:30 p.m.

Short Break, no coffee provided
(Break)
Fri 2:30 p.m. – 3:30 p.m.

Machine Learning for Scientific Discovery
(Discussion Panel)
Josh Bloom · Daniela Huppenkothen · Laurence Perreault-Levasseur · George Stein · Francisco Villaescusa-Navarro
Fri 3:30 p.m. – 5:00 p.m.

Poster Session


Parameter Estimation in Realistic Binary Microlensing Light Curves with Neural Controlled Differential Equation
(Poster)
Machine learning methods have been suggested and applied to parameter estimation in binary microlensing events as a replacement for the time-consuming, sampling-based approach. However, the equal-step time series required by existing attempts is rarely realized in ground-based surveys. In this work, we apply the neural controlled differential equation (neural CDE) to handle microlensing light curves of realistic data quality. Our method can infer binary parameters efficiently and accurately from light curves with irregular time steps (including large gaps). Our work also demonstrates the power of neural CDEs and other advanced machine learning methods in identifying and characterizing transient events in ongoing and future ground-based time-domain surveys, given that it is common for astronomical time series from the ground to have irregular sampling and data gaps. The extended journal paper can be found at arXiv:2206.08199.
Haimeng Zhao · Wei Zhu
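The key ingredient that lets a neural CDE consume irregular light curves is that the observations are first turned into a continuous control path, which the model then integrates against. The path-building step can be sketched with plain linear interpolation (the times, values, and grid below are made up for illustration; practical neural CDEs typically use smoother interpolants):

```python
import numpy as np

# An irregularly sampled light curve with a large gap, as is typical
# of ground-based surveys.
t_obs = np.array([0.0, 0.3, 0.45, 1.9, 2.0, 2.6])
y_obs = np.sin(t_obs)

def control_path(t):
    """Continuous path through the observations; a neural CDE integrates
    its learned dynamics against (the derivative of) such a path."""
    return np.interp(t, t_obs, y_obs)

# The path is defined at every time, including inside the gap, so no
# resampling onto an equal-step grid is ever required.
t_grid = np.linspace(0.0, 2.6, 27)
x_grid = control_path(t_grid)
```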


Full-Sky Gravitational Lensing Simulations Using Generative Adversarial Networks
(Poster)
We present a new method that uses a generative adversarial network to learn how to locally redistribute the mass in log-normal mass maps to achieve N-body-quality full-sky weak lensing maps. Our mass maps reproduce a broad range of weak lensing summary statistics with percent-level accuracy. Producing a single full-sky map requires ~10 seconds on an average compute node with no GPU acceleration. Relative to running a dark matter simulation, our algorithm reduces the run time by more than four orders of magnitude.
Pier Fiedorowicz · Eduardo Rozo · Supranta Boruah · William Coulton · Shirley Ho · Giulio Fabbian


An Unsupervised Learning Approach for Quasar Continuum Prediction
(Poster)
Modeling quasar spectra is a fundamental task in astrophysics, as quasars are a telltale sign of cosmic evolution. We introduce a novel unsupervised learning algorithm, Quasar Factor Analysis (QFA), for recovering the intrinsic quasar continua from noisy quasar spectra. QFA assumes that the Lyα forest can be approximated as a Gaussian process and that the continuum can be well described by a latent factor model. We show that QFA can learn, through unsupervised learning and directly from the quasar spectra, the quasar continua and Lyα forest simultaneously. Compared to previous methods, QFA robustly achieves state-of-the-art performance for quasar continuum prediction without the need for predefined training continua. In addition, the generative and probabilistic nature of QFA paves the way to understanding the evolution of black holes as well as performing out-of-distribution detection and other Bayesian downstream inferences.
Zechang Sun · Yuan-Sen Ting · Zheng Cai


Astroconformer: Inferring Surface Gravity of Stars from Stellar Light Curves with Transformer
(Poster)
We introduce Astroconformer, a Transformer-based model to analyze stellar light curves from the Kepler mission. Astroconformer embeds light curves into a low-dimensional representation, and we demonstrate that the model can robustly infer the stellar surface gravity as a downstream task. Importantly, as the Transformer captures long-range information in the time series, it outperforms the state-of-the-art data-driven method in the field, and the critical role of self-attention is demonstrated through ablation experiments. Furthermore, the attention map from Astroconformer exemplifies the long-range correlation information learned by the model, leading to a more interpretable deep learning approach for asteroseismology. Besides data from Kepler, we also show that the method can generalize to sparse-cadence light curves from the Rubin Observatory, paving the way for a new era of asteroseismology harnessing information from long-cadence ground-based observations.
Jiashu Pan · Yuan-Sen Ting · Jie Yu


Don't Pay Attention to the Noise: Learning Self-supervised Representations of Light Curves with a Denoising Time Series Transformer
(Poster)
Astrophysical light curves are particularly challenging data objects due to the intensity and variety of noise contaminating them. Yet, despite the astronomical volumes of light curves available, the majority of algorithms used to process them still operate on a per-sample basis. To remedy this, we propose a simple Transformer model called the Denoising Time Series Transformer (DTST) and show that it excels at removing noise and outliers in time-series datasets when trained with a masked objective, even when no clean targets are available. Moreover, the use of self-attention enables rich and illustrative queries into the learned representations. We present experiments on real stellar light curves from the Transiting Exoplanet Survey Satellite (TESS), showing the advantages of our approach compared to traditional denoising techniques.
Mario Morvan · Nikolaos Nikolaou · Kai Yip · Ingo Waldmann


Fast Estimation of Physical Galaxy Properties using Simulation-Based Inference
(Poster)
Astrophysical surveys present the challenge of scaling accurate simulation-based inference up to billions of different examples. We develop a method to train fast, accurate and amortised approximate posteriors that avoids the biases of, e.g., variational inference. To train our approximate posterior, we first sample from it, then run a few steps of an MCMC method (we use HMC), then update the approximate posterior parameters to maximize the probability of the resulting MCMC samples. This allows us to amortise the posterior implied by any MCMC procedure. On our astrophysical samples, the amortised approximate posterior is very close to the true MCMC posterior, yet is approximately five orders of magnitude faster.
Maxime Robeyns · Mike Walmsley · Sotiria Fotopoulou · Laurence Aitchison


Reduced Order Model for Chemical Kinetics: A case study with Primordial Chemical Network
(Poster)
Chemical kinetics plays an important role in governing the thermal evolution of reactive flow problems. The possible interactions between chemical species increase drastically with the number of species considered in the system. Various approaches have been proposed to simplify chemical networks with the aim of reducing their computational complexity. These techniques often require domain experts to identify important reaction pathways and possible simplifications by hand. Here, we propose a combination of an autoencoder and a neural ordinary differential equation to model the temporal evolution of chemical kinetics in a reduced subspace. We demonstrate that our model achieves a 10-fold speedup compared to a commonly used astrochemistry solver for a 9-species primordial network, while maintaining one percent accuracy across a wide range of densities and temperatures.
Kwok Sun Tang · Matthew Turk


Robust Simulation-Based Inference with Bayesian Neural Networks
(Poster)
Simulation-based inference is quickly becoming a standard technique for analysing data from cosmological surveys. While there have been significant recent advances in core density estimation models, applications of such techniques to real data rely entirely on the generalization power of neural networks, which is largely unconstrained outside the training distribution. Due to our inability to generate simulations which perfectly approximate real observations, and the large computational expense of simulating the full parameter space, simulation-based inference methods in cosmology are vulnerable to such generalization issues. Here, we discuss the effects of both of these issues, and show how using Bayesian neural networks can mitigate biases and lead to more reliable inference outside the training set. We introduce cosmoSWAG, the first application of Stochastic Weight Averaging to cosmology, and apply it to simulation-based inference from the cosmic microwave background.
Pablo Lemos · Miles Cranmer · Muntazir Abidi · Chang Hoon Hahn · Michael Eickenberg · Elena Massara · David Yallup · Shirley Ho


Galaxies on graph neural networks: towards robust synthetic galaxy catalogs with deep generative models
(Poster)
Future astronomical imaging surveys are set to provide precise constraints on cosmological parameters, such as dark energy. However, producing synthetic data for these surveys, to test and validate analysis methods, carries a very high computational cost. In particular, generating mock galaxy catalogs at sufficiently large volume and high resolution will soon become computationally unreachable. In this paper, we address this problem with a deep generative model that creates robust mock galaxy catalogs which may be used to test and develop the analysis pipelines of future weak lensing surveys. We build our model on a custom-built Graph Convolutional Network, placing each galaxy on a graph node and then connecting the graphs within each gravitationally bound system. We train our model on a cosmological simulation with realistic galaxy populations to capture the 2D and 3D orientations of galaxies. The samples from the model exhibit statistical properties comparable to those in the simulations. To the best of our knowledge, this is the first instance of a generative model on graphs in an astrophysical/cosmological context.
Yesukhei Jagvaral · Rachel Mandelbaum · Francois Lanusse · Siamak Ravanbakhsh · Sukhdeep Singh · Duncan Campbell


Estimating Cosmological Constraints from Galaxy Cluster Abundance using Simulation-Based Inference
(Poster)
Inferring the values and uncertainties of cosmological parameters in a cosmological model is of paramount importance for modern cosmic observations. In this paper, we apply a simulation-based inference (SBI) approach to estimate cosmological constraints from a simplified galaxy cluster observation analysis. Using data generated from the Quijote simulation suite and analytical models, we train a machine learning algorithm to learn the probability function between cosmological parameters and the possible galaxy cluster observables. The posterior distribution of the cosmological parameters at a given observation is then obtained by sampling the predictions from the trained algorithm. Our results show that the SBI method can successfully recover the true values of the cosmological parameters within the 2σ limit for this simplified galaxy cluster analysis, and yields posterior constraints similar to those obtained with a likelihood-based Markov Chain Monte Carlo method, the current state-of-the-art in similar cosmological studies.
Moonzarin Reza · Yuanyuan Zhang


Bayesian Neural Networks for classification tasks in the Rubin big data era
(Poster)
Upcoming surveys such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST) will detect up to 10 million time-varying sources in the sky every night for ten years. This information will be transmitted in a continuous stream to brokers that will select the most promising events for a variety of science cases using machine learning algorithms. In this work, we study the benefits and challenges of Bayesian Neural Networks (BNNs) for this type of classification task. BNNs are found to be accurate classifiers which also provide additional information: they quantify the classification uncertainty, which can be harnessed to analyse this upcoming data avalanche more efficiently.
Anais Möller · Thibault Main de Boissiere


Inferring Structural Parameters of Low-Surface-Brightness Galaxies with Uncertainty Quantification using Bayesian Neural Networks
(Poster)
Measuring the structural parameters (size, total brightness, light concentration, etc.) of galaxies is a significant first step towards a quantitative description of different galaxy populations. In this work, we demonstrate that a Bayesian Neural Network (BNN) can be used for the inference, with uncertainty quantification, of such morphological parameters from simulated low-surface-brightness galaxy images. Compared to traditional profile-fitting methods, we show that the uncertainties obtained using BNNs are comparable in magnitude and well-calibrated, and that the point estimates of the parameters are closer to the true values. Our method is also significantly faster, which is very important with the advent of the era of large galaxy surveys and big data in astrophysics.
Dimitrios Tanoglidis · Alex Drlica-Wagner · Aleksandra Ciprijanovic


SIMBIG: Likelihood-Free Inference of Galaxy Clustering
(Poster)
We present SIMBIG, a likelihood-free inference framework for analyzing galaxy clustering using a fully simulation-based approach. We apply SIMBIG to the BOSS CMASS galaxy sample using an $N$-body simulation-based forward model that includes a flexible galaxy-halo model, detailed survey geometry, and realistic observational systematics. As a demonstration and validation, we use SIMBIG to analyze the galaxy power spectrum out to $k_{\rm max} = 0.5\,h/{\rm Mpc}$. We derive constraints on $\Omega_m$ and $\sigma_8$ that are factors of 1.1 and 3, respectively, tighter than previous results. This improvement comes from the extra cosmological information available on nonlinear scales, which we can extract with our simulation-based approach. Furthermore, we use a suite of test simulations to confirm that our LFI approach produces conservative estimates of the true posterior. In subsequent work, we will apply SIMBIG to analyze higher-order statistics and non-standard observables such as the bispectrum, marked power spectrum, and wavelet scattering-like statistics.
Chang-Hoon Hahn · Muntazir Abidi · Michael Eickenberg · Shirley Ho · Pablo Lemos · Elena Massara · Azadeh Moradinezhad Dizgah · Bruno Régaldo-Saint Blancard


Automated discovery of interpretable gravitational-wave population models
(Poster)
We present an approach to automatically discover analytic population models for gravitational-wave (GW) events from data. As more GW events are detected, flexible models such as Gaussian Mixture Models have become more important in fitting the distribution of GW properties due to their expressivity. However, flexible models come with a cost: a large number of parameters that lack physical motivation, making it very difficult to interpret the implications of these models. In this work, we demonstrate the use of symbolic regression to distill such flexible models into interpretable analytic expressions. We recover common GW population models such as a power-law-plus-Gaussian, and find a new empirical population model which combines accuracy and simplicity. Our example shows that using flexible models together with symbolic regression is a promising pathway to automatically discovering physically insightful descriptions of the ever-growing GW catalog.
Kaze Wong · Miles Cranmer


Accelerated Galaxy SED Modeling using Amortized Neural Posterior Estimation
(Poster)
State-of-the-art spectral energy distribution (SED) analyses use Bayesian inference to derive physical properties of galaxies from observed photometry or spectra. They require sampling from a high-dimensional space of model parameters and take 10–100 CPU hours per galaxy. This renders them practically infeasible for analyzing the billions of galaxies that will be observed by upcoming galaxy surveys (e.g. DESI, PFS, Rubin, Webb, and Roman). In this work, we present an alternative approach using Amortized Neural Posterior Estimation (ANPE). ANPE is a likelihood-free inference method that employs neural networks to estimate the posterior over the full range of observations. Once trained, it requires no additional model evaluations to estimate the posterior. We present SEDflow, an ANPE method for producing posteriors of the recent Hahn et al. (2022) SED model from optical photometry and spectra. SEDflow takes ~1 second per galaxy to obtain the posterior distributions of 12 model parameters, all of which are in excellent agreement with traditional Markov Chain Monte Carlo sampling results.
Chang-Hoon Hahn · Peter Melchior


Scalable Bayesian Inference for Detection and Deblending in Astronomical Images
(Poster)
link »
SlidesLive Video » We present a new probabilistic method for detecting, deblending, and cataloging astronomical sources called the Bayesian Light Source Separator (BLISS). BLISS is based on deep generative models, which embed neural networks within a Bayesian model. For posterior inference, BLISS uses a new form of variational inference known as Forward Amortized Variational Inference. The BLISS inference routine is fast, requiring a single forward pass of the encoder networks on a GPU once the encoder networks are trained. BLISS can perform fully Bayesian inference on megapixel images in seconds, and produces highly accurate catalogs. BLISS is highly extensible, and has the potential to directly answer downstream scientific questions in addition to producing probabilistic catalogs. 
Ismael Mendoza · Runjing Liu · Ziteng Pang · Zhe Zhao · Camille Avestruz · Jeffrey Regier 🔗 


Toward Galaxy Foundation Models with Hybrid Contrastive Learning
(Poster)
link »
SlidesLive Video » New astronomical tasks are often related to earlier tasks for which labels have already been collected. We adapt the contrastive framework BYOL to leverage those labels as a pretraining task while also enforcing augmentation invariance. For large-scale pretraining, we introduce GZEvo, a set of 96.5M volunteer responses for 552k galaxy images plus a further 1.34M comparable unlabelled galaxies. Most of the 206 possible GZEvo answers are unknown for any given galaxy, and so our pretraining task uses a Dirichlet loss that naturally handles missing answers. Our hybrid pretraining/contrastive method achieves higher accuracy on our downstream task (classifying ringed galaxies) than both direct training and the purely contrastive equivalent. Surprisingly, the simple approach of purely supervised pretraining performs best, achieving a relative error reduction of 17\% vs. direct training on 50k+ labels. 
Mike Walmsley · Inigo Slijepcevic · Micah Bowles · Anna Scaife 🔗 


DeepBench: A library for simulating benchmark datasets for scientific analysis
(Poster)
link »
SlidesLive Video » The astronomy community is experiencing a lack of benchmark datasets tailored towards machine learning and computer vision problems. The overall goal of this software is to fill this need. We introduce the Python library DeepBench, which is designed to provide a method of producing highly reproducible datasets at varying levels of complexity, size, and content. The software includes simulation of basic geometric shapes and astronomical structures, such as stars and elliptical galaxies, as well as tools to collect and store the dataset for consumption by a machine learning algorithm. We also present a trained ResNet50 model as an illustration of the expected use of the software as a benchmarking tool for assessing different architectures’ suitability to scientifically motivated problems. We envision this tool being useful in a suite of contexts at the intersection of astronomy and machine learning. For example, it could help those new to machine learning principles and software build their skills with a toy-model dataset that looks like astronomical data. Experts can also use this tool to build simple datasets that allow them to check their models. Finally, the geometric/polygon images can be used as highly simplified versions of astronomical objects, addressing a spectrum of problems from object classification to deblending. 
Maggie Voetberg · Ashia Lewis · Brian Nord 🔗 
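As a flavor of what such a benchmark generator produces, here is a minimal numpy sketch that renders parameterized geometric shapes as labelled images. The `make_ellipse` helper and the orientation-labelling task are hypothetical illustrations, not DeepBench's actual API.

```python
import numpy as np

def make_ellipse(size=32, a=8.0, b=4.0, angle=0.0):
    """Render a filled ellipse (a stand-in "galaxy") on a square pixel grid."""
    y, x = np.mgrid[:size, :size] - size / 2.0
    c, s = np.cos(angle), np.sin(angle)
    u, v = c * x + s * y, -s * x + c * y
    return ((u / a) ** 2 + (v / b) ** 2 <= 1.0).astype(float)

# A tiny, fully reproducible benchmark set: images plus labels for a toy
# orientation-classification task.
angles = np.linspace(0, np.pi, 16, endpoint=False)
images = np.stack([make_ellipse(angle=t) for t in angles])
labels = (angles < np.pi / 2).astype(int)
```

Because the generative parameters (axes, angle) are known exactly, classifier mistakes on such a set can be traced back to specific shape regimes, which is the point of a controlled benchmark.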


Calibrated Predictive Distributions for Photometric Redshifts
(Poster)
link »
SlidesLive Video »
Many astrophysical analyses depend on estimates of redshifts (a proxy for distance) determined from photometric (i.e., imaging) data alone. Inaccurate estimates of photometric redshift uncertainties can result in large systematic errors. However, probability distribution outputs from many photometric redshift methods do not follow the frequentist definition of a Probability Density Function (PDF) for redshift: the fraction of times the true redshift falls between two limits $z_{1}$ and $z_{2}$ should be equal to the integral of the PDF between these limits. Previous works have used the global distribution of Probability Integral Transform (PIT) values to recalibrate PDFs, but offsetting inaccuracies in different regions of feature space can conspire to limit the efficacy of the method. We leverage a recently developed regression technique that characterizes the local PIT distribution at any location in feature space to perform a local recalibration of photometric redshift PDFs, resulting in calibrated predictive distributions. Though we focus on an example from astrophysics, our method can produce predictive distributions which are calibrated at all locations in feature space for any use case.

Biprateep Dey · David Zhao · Brett Andrews · Jeff Newman · Rafael Izbicki · Ann Lee 🔗 


Learnable wavelet neural networks for cosmological inference
(Poster)
link »
Convolutional neural networks (CNNs) have been shown both to extract more information than the traditional two-point statistics from cosmological fields and to marginalise over astrophysical effects extremely well. However, CNNs require large amounts of training data, which is potentially problematic in the domain of expensive cosmological simulations, and it is difficult to interpret the network. In this work we apply the learnable scattering transform, a kind of convolutional neural network that uses trainable wavelets as filters, to the problem of cosmological inference and marginalisation over astrophysical effects. We present two models based on the scattering transform, one constructed for performance and one constructed for interpretability, and perform a comparison with a CNN. We find that scattering architectures are able to outperform a CNN, significantly so in the case of small training samples. Additionally we present a lightweight scattering network that is highly interpretable. 
Chris Pedersen 🔗 
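The scattering operation the abstract builds on can be sketched in a few lines of numpy for a 1D signal: band-pass filter, take the modulus, then average. In the *learnable* variant, the filter parameters (here `xi` and `sigma`) become trainable; this fixed-filter, 1D version is only a sketch of the structure (real cosmological applications are 2D, with Morlet wavelets).

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy 1D "field" (a real application would use 2D matter density maps).
N = 256
signal = rng.normal(size=N)

def bandpass(N, xi, sigma):
    """Gaussian band-pass filter in Fourier space, centred on frequency xi.
    In a learnable scattering network, xi and sigma would be trainable."""
    freqs = np.fft.fftfreq(N)
    return np.exp(-0.5 * ((freqs - xi) / sigma) ** 2)

# First-order scattering coefficients: filter, take the modulus (the
# nonlinearity), then average (the low-pass step).
coeffs = []
for j in range(4):
    xi = 0.25 / 2 ** j                      # one centre frequency per octave
    band = np.fft.ifft(np.fft.fft(signal) * bandpass(N, xi, xi / 2))
    coeffs.append(np.abs(band).mean())
coeffs = np.array(coeffs)
```

Because each coefficient is tied to one frequency band, the resulting summary statistics remain physically interpretable, which is the property the interpretable model in the abstract exploits.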


Learning Galaxy Properties from Merger Trees
(Poster)
link »
SlidesLive Video » Efficiently mapping baryonic properties onto dark matter is a major challenge in astrophysics. Although semi-analytic models (SAMs) and hydrodynamical simulations have made impressive advances in reproducing galaxy observables across large cosmological volumes, these methods still require significant computation times, representing a barrier to many applications. However, with Machine Learning, simulations and SAMs can now be emulated in seconds. Graph Neural Networks (GNNs) are a powerful class of learning algorithms which can naturally incorporate the very structure of the data and have been shown to perform extremely well on physical modeling; among the most inherently graph-like structures found in astrophysics are the dark matter merger trees used by SAMs. In this paper we show that several baryonic targets, as predicted by a SAM, can be emulated to unprecedented accuracy using a trained GNN, four orders of magnitude faster than the SAM. The GNN accurately predicts stellar masses for a range of redshifts, and interpolates successfully at redshifts where it was not trained. We compare our results to the current state of the art in the field, and show improvements in reconstruction RMSE of up to a factor of two. 
Miles Cranmer · Christian Jespersen · Peter Melchior · Shirley Ho · Rachel Somerville · Austen Gabrielpillai 🔗 
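The "merger tree as graph" idea maps directly onto message passing: each halo aggregates features from its progenitors along the tree's edges. This numpy sketch shows a single untrained message-passing step on a toy five-node tree; the edge list, feature choice, and weight shapes are all illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy merger tree: node 0 is the z=0 halo; edges point progenitor -> descendant.
edges = [(1, 0), (2, 0), (3, 1), (4, 1)]
feats = rng.normal(size=(5, 3))     # e.g. (mass, vmax, spin) per halo, standardized

# One message-passing step: every halo sums its progenitors' features, then
# self and aggregated features are mixed through (random, untrained) weights.
W_self = rng.normal(size=(3, 3))
W_msg = rng.normal(size=(3, 3))
agg = np.zeros_like(feats)
for src, dst in edges:
    agg[dst] += feats[src]
hidden = np.tanh(feats @ W_self + agg @ W_msg)

# Stacking such layers and reading out node 0 would yield the emulated
# baryonic property (e.g. stellar mass) for the z=0 halo.
```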


LINNA: Likelihood Inference Neural Network Accelerator
(Poster)
link »
SlidesLive Video » Bayesian posterior inference of modern multi-probe cosmological analyses incurs massive computational costs. These computational costs have severe environmental impacts, and the long wall-clock time slows scientific productivity. To address these difficulties, we introduce LINNA: the Likelihood Inference Neural Network Accelerator. Relative to the baseline of modern survey cosmological analyses, LINNA reduces the computational cost associated with posterior inference by a factor of 8–50. To accomplish these reductions, LINNA automatically builds training data sets, creates neural network emulators, and produces a Markov chain that samples the posterior. We explicitly verify that LINNA accurately reproduces the first-year DES cosmological constraints derived from a variety of different data vectors with our default code settings, without needing to retune the algorithm every time. Further, we find that LINNA is sufficient for enabling accurate and efficient sampling for LSST Y10 multi-probe analyses. 
ChunHao To · Eduardo Rozo 🔗 
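The build-emulate-sample pipeline sketched in the abstract can be demonstrated end to end on a toy problem: evaluate an "expensive" likelihood on a training set, fit a cheap surrogate, then run MCMC on the surrogate alone. The polynomial fit below stands in for LINNA's neural network emulator, and the banana-shaped likelihood is an arbitrary test function, assumed for illustration.

```python
import numpy as np

rng = np.random.default_rng(4)

def expensive_loglike(theta):
    """Stand-in for a costly theory code: a banana-shaped log-likelihood."""
    a, b = theta[..., 0], theta[..., 1]
    return -0.5 * (a ** 2 + 4.0 * (b - a ** 2) ** 2)

# 1) Evaluate the expensive code on a modest training set.
train_theta = rng.uniform(-2.0, 2.0, size=(400, 2))
train_y = expensive_loglike(train_theta)

# 2) Fit a cheap emulator (polynomial least squares stands in here for a
#    neural network emulator).
def features(t):
    a, b = t[..., 0], t[..., 1]
    return np.stack([np.ones_like(a), a ** 2, b, b ** 2, a ** 2 * b, a ** 4], axis=-1)

w, *_ = np.linalg.lstsq(features(train_theta), train_y, rcond=None)
emu = lambda t: features(t) @ w

# 3) Run Metropolis-Hastings on the emulator alone -- no further calls to the
#    expensive code are needed while sampling.
chain, cur = [], np.zeros(2)
for _ in range(5000):
    prop = cur + rng.normal(0.0, 0.3, 2)
    if np.log(rng.uniform()) < emu(prop) - emu(cur):
        cur = prop
    chain.append(cur.copy())
chain = np.array(chain)
```

The cost accounting is the point: the expensive code is called 400 times up front, while the 5000-step chain touches only the surrogate, mirroring the factor-of-8–50 savings claimed for the real analyses.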


Population-Level Inference of Strong Gravitational Lenses with Neural Network-Based Selection Correction
(Poster)
link »
A new generation of sky surveys is poised to provide unprecedented volumes of data containing hundreds of thousands of new strong lensing systems in the coming years. Convolutional neural networks are currently the only state-of-the-art method that can handle the onslaught of data to discover and infer the parameters of individual systems. However, many important measurements that involve strong lensing require population-level inference of these systems. In this work, we propose a hierarchical inference framework that uses the inference of individual lensing systems in combination with the selection function to estimate population-level parameters. In particular, we show that it is possible to model the selection function of a CNN-based lens finder with a neural network classifier, enabling fast inference of population-level parameters without the need for expensive Monte Carlo simulations. 
Ronan Legin 🔗 
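Why the selection function matters for population-level inference can be seen in a toy numpy example: a detector that preferentially finds large lenses biases the naive population mean, and weighting each detected system by the inverse of its detection probability removes that bias. The sigmoid selection function and the Einstein-radius population below are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(5)

# True population: a lens property (say, Einstein radius) ~ N(1.0, 0.3).
radii = rng.normal(1.0, 0.3, 20000)

# Detection probability rising with radius -- a stand-in for the selection
# function of a CNN lens finder (which the paper models with a classifier).
def p_detect(r):
    return 1.0 / (1.0 + np.exp(-4.0 * (r - 1.0)))

detected = radii[rng.uniform(size=radii.size) < p_detect(radii)]

# The naive mean of the detected sample is biased high; weighting each lens
# by 1/p_detect (inverse selection probability) removes the bias.
naive = detected.mean()
corrected = np.average(detected, weights=1.0 / p_detect(detected))
```

The full hierarchical framework folds this correction into the posterior over population parameters rather than a point estimate, but the inverse-probability logic is the same.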


Pixelated Reconstruction of Gravitational Lenses using Recurrent Inference Machines
(Poster)
link »
Modeling strong gravitational lenses in order to quantify the distortions in the images of background sources and to reconstruct the mass density in the foreground lenses has traditionally been a difficult computational challenge. As the quality of gravitational lens images increases, the task of fully exploiting the information they contain becomes computationally and algorithmically more difficult. In this work, we use a neural network based on the Recurrent Inference Machine (RIM) to simultaneously reconstruct an undistorted image of the background source and the lens mass density distribution as pixelated maps. The method iteratively reconstructs the model parameters (the source and density map pixels) by learning to optimize their likelihood given the data using the physical model (a ray-tracing simulation), regularized by a prior implicitly learned by the neural network from its training data. When compared to more traditional parametric models, the proposed method is significantly more expressive and can reconstruct complex mass distributions, which we demonstrate by using realistic lensing galaxies taken from the cosmological hydrodynamic simulation IllustrisTNG. 
Alexandre Adam 🔗 


Autoencoding Galaxy Spectra
(Poster)
link »
We introduce a generative model for galaxy spectra based on an autoencoder architecture. Our encoder combines convolutional and attentive elements to identify important spectral features. The decoder is a fully connected network, tasked with generating rest-frame spectra, which are then explicitly redshifted to the observed redshifts and resampled to match the spectral resolution and coverage of the instrument. The architecture thus reflects the astrophysical dependencies of a data-generating process that exhibits two fundamental degrees of freedom for each galaxy, namely its redshift and the characteristics of its rest-frame spectrum, and learns a compressed data-driven parameterization of the latter. We train this model on 100,000 optical spectra from SDSS, and find that it generates highly realistic galaxy spectra and excellent representations of the inputs. However, the desired redshift-independent encoding is possible only by augmenting the training spectra with artificially altered redshifts. Doing so establishes redshift invariance at the price of restricting the utilized spectral features to a consensus set that is accessible for any redshift covered by the training data, thereby limiting the information extracted from all spectra. 
Peter Melchior · ChangHoon Hahn · Yan Liang 🔗 
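The explicit redshift-and-resample step described in the abstract is mechanically simple: observed wavelengths are the rest-frame wavelengths stretched by $(1+z)$, after which the spectrum is interpolated onto the instrument grid. A minimal numpy sketch, with a toy one-line spectrum and a made-up instrument grid (the helper name and grids are assumptions, not the paper's code):

```python
import numpy as np

# Toy rest-frame spectrum: flat continuum plus one emission line at 5000 A.
wave_rest = np.linspace(3000.0, 7000.0, 2000)
flux_rest = 1.0 + 5.0 * np.exp(-0.5 * ((wave_rest - 5000.0) / 10.0) ** 2)

def redshift_and_resample(wave_rest, flux_rest, z, wave_instrument):
    """Shift a rest-frame spectrum to redshift z, then resample it onto the
    instrument's wavelength grid."""
    wave_obs = wave_rest * (1.0 + z)
    return np.interp(wave_instrument, wave_obs, flux_rest, left=0.0, right=0.0)

wave_inst = np.linspace(3800.0, 9200.0, 3000)   # coarse stand-in for SDSS coverage
flux_obs = redshift_and_resample(wave_rest, flux_rest, 0.1, wave_inst)
```

At $z=0.1$ the toy line lands at $5000 \times 1.1 = 5500$ Å; note how parts of the rest-frame spectrum can shift outside the instrument coverage, which is exactly why only a "consensus set" of features is accessible at all training redshifts.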


A Convolutional Neural Network for Supernova Time-Series Classification
(Poster)
link »
SlidesLive Video » Among the brightest objects in the universe, supernovae (SNe) are powerful explosions marking the end of a star's lifetime. Supernova (SN) type is defined by spectroscopic emission lines, but obtaining spectroscopy is often logistically unfeasible. Thus, the ability to identify SNe by type using time-series image data alone is crucial, especially in light of the increasing breadth and depth of upcoming telescopes. We present a convolutional neural network method for fast supernova time-series classification, with observed brightness data smoothed in both the wavelength and time directions with Gaussian process regression. We apply this method to full-duration and truncated SN time series, to simulate retrospective as well as real-time classification performance. Retrospective classification is used to differentiate cosmologically useful Type Ia SNe from other SN types, and this method achieves >99% accuracy on this task. We are also able to differentiate between 6 SN types with 60% accuracy given only two nights of data, and 98% accuracy retrospectively. 
Helen Qu 🔗 
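The Gaussian-process smoothing step mentioned in the abstract turns irregularly sampled, noisy photometry into a regular grid suitable for a CNN. A minimal 1D numpy sketch of the GP posterior mean (the real method smooths in both time and wavelength; the kernel length scale and noise level here are assumed values):

```python
import numpy as np

rng = np.random.default_rng(6)

# Noisy photometric points from a toy light curve (flux vs. time in days).
t_obs = np.sort(rng.uniform(0.0, 50.0, 25))
flux = np.exp(-0.5 * ((t_obs - 20.0) / 8.0) ** 2) + rng.normal(0.0, 0.05, 25)

def rbf(a, b, length=5.0):
    """Squared-exponential kernel between two sets of times."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

# GP posterior mean on a regular grid, assuming measurement noise sigma=0.05;
# this smoothed, regularly sampled curve is what a CNN classifier would ingest.
t_grid = np.linspace(0.0, 50.0, 200)
K = rbf(t_obs, t_obs) + 0.05 ** 2 * np.eye(25)
flux_smooth = rbf(t_grid, t_obs) @ np.linalg.solve(K, flux)
```

Truncating `t_obs` to the first two nights before fitting is how the real-time (as opposed to retrospective) classification setting is simulated.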


Neural Posterior Estimation with Differentiable Simulator
(Poster)
link »
SlidesLive Video » Simulation-Based Inference (SBI) is a promising Bayesian inference framework that alleviates the need for analytic likelihoods to estimate parameter distributions. Recent advances using neural density estimators in SBI algorithms have demonstrated the ability to achieve high-fidelity posteriors, at the expense of a large number of simulations, which makes their application potentially very time-consuming when complex physical simulations are involved. In this work we focus on boosting the posterior density estimation using the gradients of the simulator. We present a new method to perform Neural Posterior Estimation (NPE) with a differentiable simulator and demonstrate the accelerated convergence of the NPE, with a speed-up factor of up to 2 when fewer than 100 simulations are used, on classical SBI benchmark problems. 
Justine Zeghal · Francois Lanusse · Alexandre Boucaud · Benjamin Remy · Eric Aubourg 🔗 


Learning useful representations for radio astronomy “in the wild” with contrastive learning
(Poster)
link »
SlidesLive Video » Unknown class distributions in unlabelled astrophysical training data have previously been shown to detrimentally affect model performance due to dataset shift between training and validation sets. For radio galaxy classification, we demonstrate in this work that removing low angular extent sources from the unlabelled data before training produces qualitatively different training dynamics for a contrastive model. By applying the model on an unlabelled dataset with unknown class balance and subpopulation distribution to generate a representation space of radio galaxies, we show that with an appropriate cut threshold we can find a representation with FRI/FRII class separation approaching that of a supervised baseline explicitly trained to separate radio galaxies into these two classes. Furthermore we show that an excessively conservative cut threshold blocks any increase in validation accuracy. We then use the learned representation for the downstream task of performing a similarity search on rare hybrid sources, finding that the contrastive model can reliably return semantically similar samples, with the added bonus of finding duplicates which remain after preprocessing. 
Inigo Val Slijepcevic 🔗 


Probabilistic Dalek: Emulator framework with probabilistic prediction for supernova tomography
(Poster)
link »
SlidesLive Video » Supernova spectral time series can be used to reconstruct a spatially resolved model of the explosion, known as supernova tomography. In addition to an observed spectral time series, a supernova tomography requires a radiative transfer model to perform the inverse problem with uncertainty quantification. The smallest parametrizations of supernova tomography models have roughly a dozen parameters, while a realistic one requires more than 100. Realistic radiative transfer models require tens of CPU minutes for a single evaluation, making the problem computationally intractable by traditional means, which would require millions of MCMC samples. Surrogate models (emulators) that accelerate simulations using machine learning techniques provide a solution for such problems and a way to understand progenitors and explosions from spectral time series. Emulators exist for the \textsc{tardis} supernova radiative transfer code, but they only perform well on simplistic low-dimensional models (roughly a dozen parameters), with a small number of applications for knowledge gain in the supernova field. In this work, we present a new emulator for the radiative transfer code \textsc{tardis} that not only outperforms existing emulators but also provides uncertainties in its prediction. It lays the foundation for future active-learning-based machinery that will be able to emulate very high dimensional spaces of hundreds of parameters, crucial for unraveling urgent questions in supernovae and related fields. 
Wolfgang Kerzendorf · Nutan Chen · Patrick van der Smagt 🔗 


On Estimating ROC Arc Length and Lower Bounding Maximal AUC for Imbalanced Classification
(Poster)
link »
SlidesLive Video »
Many astrophysical datasets have extremely imbalanced classes, and ROC curves are often used to measure the performance of classifiers on imbalanced datasets due to their insensitivity to class distributions. This paper studies the arc length of ROC curves and provides a novel way of lower bounding the maximal AUC. We show that when the data likelihood ratio is used as the score function, the arc length of the corresponding ROC curve gives rise to a novel $f$-divergence. This $f$-divergence can be expressed using a variational objective and estimated only using samples from the positive and negative data distributions. Moreover, we show the space below the optimal ROC curve can be expressed as a similar variational objective depending on the arctangent likelihood ratio. These new insights lead to a novel two-step procedure for finding a good score function by lower bounding the maximal AUC. Experiments on RR Lyrae datasets show the proposed two-step procedure achieves good AUC performance in imbalanced binary classification tasks while being less computationally demanding.

Song Liu 🔗 
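The two geometric quantities the abstract revolves around, AUC and ROC arc length, are easy to estimate empirically. This numpy sketch builds an ROC curve for an imbalanced two-Gaussian score distribution on a fixed threshold grid and computes both; the class distributions are assumptions for illustration, and the empirical arc length here is a crude plug-in, not the paper's variational $f$-divergence estimator.

```python
import numpy as np

rng = np.random.default_rng(7)

# Imbalanced toy scores: 1000 negatives, 50 positives shifted upward.
neg = rng.normal(0.0, 1.0, 1000)
pos = rng.normal(2.0, 1.0, 50)

# Empirical ROC curve on a fixed threshold grid (coarse enough to give
# diagonal segments rather than a pure staircase).
scores = np.concatenate([neg, pos])
thr = np.linspace(scores.max() + 1e-6, scores.min() - 1e-6, 100)
tpr = np.array([(pos >= t).mean() for t in thr])
fpr = np.array([(neg >= t).mean() for t in thr])

# AUC by the trapezoidal rule, and the curve's arc length.  Any ROC curve's
# arc length lies in [sqrt(2), 2]: sqrt(2) for a useless classifier (the
# diagonal), approaching 2 as the curve hugs the top-left corner.
auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0)
arc = np.sum(np.hypot(np.diff(fpr), np.diff(tpr)))
```

The bounded range of the arc length is what makes it a natural handle for lower bounding the maximal AUC, the link the paper develops.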