Workshop
2nd ICML Workshop on Machine Learning for Astrophysics
Francois Lanusse · Marc Huertas-Company · Brice Menard · Laurence Perreault-Levasseur · J. Xavier Prochaska · Uros Seljak · Francisco Villaescusa-Navarro · Ashley Villar
Meeting Room 317 B
As modern astrophysical surveys deliver an unprecedented amount of data, from the imaging of hundreds of millions of distant galaxies to the mapping of cosmic radiation fields at ultra-high resolution, conventional data analysis methods are reaching their limits in both computational complexity and optimality. Deep Learning has rapidly been adopted by the astronomical community as a promising way of exploiting these forthcoming big-data datasets and of extracting the physical principles that underlie these complex observations. This has led to an unprecedented exponential growth of publications combining Machine Learning and astrophysics. Yet, many of these works remain at an exploratory level and have not been translated into real scientific breakthroughs.

Following a successful initial iteration of this workshop at ICML 2022, our continued goal for this workshop series is to bring together Machine Learning researchers and domain experts in the field of Astrophysics to discuss the key open issues which hamper the use of Deep Learning for scientific discovery.
Schedule
Sat 12:00 p.m. – 12:05 p.m.

Welcome
(Opening Remarks)
Francois Lanusse
Sat 12:05 p.m. – 12:35 p.m.

Keynote I: Detecting and Adapting to Distribution Shift
(Keynote presentation)
Chelsea Finn
Sat 12:35 p.m. – 12:50 p.m.

Shared Stochastic Gaussian Process Decoders: A Probabilistic Generative Model for Quasar Spectra
(Oral)
This work proposes a scalable probabilistic latent variable model based on Gaussian processes (Lawrence, 2004) in the context of multiple observation spaces. We focus on an application in astrophysics where it is typical for data sets to contain both observed spectral features as well as scientific properties of astrophysical objects such as galaxies or exoplanets. In our application, we study the spectra of very luminous galaxies known as quasars, together with their properties, such as the mass of their central supermassive black hole, their accretion rate, and their luminosity; hence, there can be multiple observation spaces. A single data point is then characterised by different classes of observations which may have different likelihoods. Our proposed model extends the baseline stochastic variational Gaussian process latent variable model (GPLVM) to this setting, proposing a seamless generative model where the quasar spectra and the scientific labels can be generated simultaneously when modelled with a shared latent space acting as input to different sets of Gaussian process decoders, one for each observation space. Further, this framework allows training in the missing-data setting, where a large number of dimensions per data point may be unobserved. We demonstrate high-fidelity reconstructions of the spectra and the scientific labels during test-time inference and briefly discuss the scientific interpretations of the results along with the significance of such a generative model.
Vidhi Ramesh · Anna-Christina Eilers
Sat 12:50 p.m. – 1:05 p.m.

Disentangling gamma-ray observations of the Galactic Center using differentiable probabilistic programming
(Oral)
We motivate the use of differentiable probabilistic programming techniques in order to account for the large model space inherent to astrophysical $\gamma$-ray analyses. Targeting the long-standing Galactic Center $\gamma$-ray Excess (GCE) puzzle, we construct a differentiable forward model and likelihood that makes liberal use of GPU acceleration and vectorization in order to simultaneously account for a continuum of possible spatial morphologies consistent with the Excess emission in a fully probabilistic manner. Our setup allows for efficient inference over the large model space using variational methods. Beyond application to $\gamma$-ray data, a goal of this work is to showcase how differentiable probabilistic programming can be used as a tool to enable flexible analyses of astrophysical datasets.
Yitian Sun · Siddharth Mishra-Sharma · Tracy Slatyer · Yuqing Wu
Sat 1:05 p.m. – 1:30 p.m.

Morning Coffee Break
(Break)
Sat 1:30 p.m. – 2:00 p.m.

Keynote II: Foundation Models for Radio Astronomy
(Keynote presentation)
The Square Kilometre Array (SKA) will be the world's largest radio telescope, producing data volumes approaching exascale within a few years of operation. Extracting scientific value from those data in a timely manner will be a challenge that quickly goes beyond traditional analyses and instead requires robust domain-specific AI solutions. Here I will discuss how we have been building foundation models that can be adapted across different SKA precursor instruments, by applying self-supervised learning with instance differentiation to learn a multi-purpose representation for use in radio astronomy. For a standard radio astronomy use case, our models exceed baseline supervised classification performance by a statistically significant margin for most label volumes in the in-distribution classification case and for all label volumes in the out-of-distribution case. I will also show how such learned representations can be more widely scientifically useful, for example in similarity searches that allow us to find hybrid radio galaxies without any pre-labelled examples.
Anna Scaife
Sat 2:00 p.m. – 2:15 p.m.

Positional Encodings for Light Curve Transformers: Playing with Positions and Attention
(Oral)
We conducted empirical experiments to assess the transferability of a light curve transformer to datasets with different cadences and flux distributions using various positional encodings (PEs). We proposed a new approach to incorporate the temporal information directly into the output of the last attention layer. Our results indicated that using trainable PEs leads to significant improvements in transformer performance and training times. Our proposed PE on attention can be trained faster than the traditional non-trainable PE transformer while achieving competitive results when transferred to other datasets.
Guillermo Cabrera-Vives · Daniel Moreno-Cartagena · Pavlos Protopapas · Cristobal Donoso · Manuel Perez-Carrasco · Martina Cádiz-Leyton
Sat 2:15 p.m. – 2:30 p.m.

Detecting Tidal Features using Self-Supervised Learning
(Oral)
Low surface brightness substructures around galaxies, known as tidal features, are a valuable tool in the detection of past or ongoing galaxy mergers. Their properties can answer questions about the progenitor galaxies involved in the interactions. This paper presents promising results from a self-supervised machine learning model, trained on data from the Ultradeep layer of the Hyper Suprime-Cam Subaru Strategic Program optical imaging survey, designed to automate the detection of tidal features. We find that self-supervised models are capable of detecting tidal features and that our model outperforms previous automated tidal feature detection methods. The previous state-of-the-art method achieved 76% completeness for 22% contamination, while our model achieves considerably higher (96%) completeness for the same level of contamination.
Alice Desmons · Sarah Brough · Francois Lanusse
Sat 2:30 p.m. – 2:45 p.m.

Flow Matching for Scalable Simulation-Based Inference
(Oral)
Neural posterior estimation methods based on discrete normalizing flows have become established tools for simulation-based inference (SBI), but scaling them to high-dimensional problems can be challenging. Building on recent advances in generative modeling, we here present flow matching posterior estimation (FMPE), a technique for SBI using continuous normalizing flows. Like diffusion models, and in contrast to discrete flows, flow matching allows for unconstrained architectures, providing enhanced flexibility for complex data modalities. Flow matching, therefore, enables exact density evaluation, fast training, and seamless scalability to large architectures, making it ideal for SBI. To showcase the improved scalability of our approach, we apply it to a challenging astrophysics problem: for gravitational-wave inference, FMPE outperforms methods based on comparable discrete flows, reducing training time by 30% with substantially improved accuracy.
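For context, flow matching trains a continuous normalizing flow by regressing a vector field onto per-sample target velocities; a standard form of the conditional flow-matching objective for posterior estimation (our notation, not taken from the abstract; the paper's exact parameterization may differ) is

```latex
\mathcal{L}(\phi) \;=\;
\mathbb{E}_{t,\; (\theta_1, x) \sim p(\theta, x),\; \theta_t \sim p_t(\theta_t \mid \theta_1)}
\big\| v_\phi(\theta_t, t, x) - u_t(\theta_t \mid \theta_1) \big\|^2 ,
```

where $u_t$ is the velocity field of a simple conditional probability path (e.g. Gaussian) and the learned $v_\phi$, conditioned on the observation $x$, defines an ODE whose flow transports a base density to the posterior $p(\theta \mid x)$.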
Jonas Wildberger · Maximilian Dax · Simon Buchholz · Stephen R. Green · Jakob Macke · Bernhard Schölkopf
Sat 2:45 p.m. – 3:00 p.m.

Time Delay Cosmography with a Neural Ratio Estimator
(Oral)
We explore the use of a Neural Ratio Estimator (NRE) to determine the Hubble constant ($H_0$) in the context of time delay cosmography. Assuming a Singular Isothermal Ellipsoid (SIE) mass profile for the deflector, we simulate time delay measurements, image position measurements, and modeled lensing parameters. We train the NRE to output the posterior distribution of $H_0$ given the time delay measurements, the relative Fermat potentials (calculated from the modeled parameters and the measured image positions), the deflector redshift, and the source redshift. We compare the accuracy and precision of the NRE with traditional explicit likelihood methods in the limit where the latter is tractable and reliable, using Gaussian noise to emulate measurement uncertainties in the input parameters. The NRE posteriors closely track those from the conventional method and, while they show a slight tendency to overestimate uncertainties for the quad lensing configuration, they can be combined in a population inference without bias.
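For readers outside lensing, the link between the measured delays, the Fermat potentials, and $H_0$ is the standard time-delay-cosmography relation (our addition, in common notation):

```latex
\Delta t_{ij} \;=\; \frac{D_{\Delta t}}{c}\,\bigl(\phi_i - \phi_j\bigr),
\qquad
D_{\Delta t} \;=\; (1 + z_d)\,\frac{D_d D_s}{D_{ds}} \;\propto\; H_0^{-1},
```

where $\phi_i$ are the Fermat potentials at the image positions, $z_d$ is the deflector redshift, and $D_d$, $D_s$, $D_{ds}$ are angular diameter distances to the deflector, to the source, and between them; measuring $\Delta t_{ij}$ and modeling $\phi_i$ therefore constrains $H_0$.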

Ève Campeau-Poirier · Laurence Perreault-Levasseur · Adam Coogan · Yashar Hezaveh
Sat 3:00 p.m. – 4:00 p.m.

Lunch Break
(Break)
Sat 4:00 p.m. – 4:30 p.m.

Keynote III: Astrophysics Meets MLOps
(Keynote presentation)
Harnessing the power of machine learning (ML) for astrophysical discovery necessitates not only sophisticated models but also the implementation of robust operations, or MLOps. This talk will highlight the potential of MLOps to streamline the deployment of ML in astrophysics. We'll delve into the iterative cycle of data acquisition, model retraining, evaluation, deployment, and monitoring/telemetry, which collectively forms the engine of successful AI ventures. We'll explore key MLOps practices from industry, emphasizing the critical role of experiment tracking, reproducibility, data/model provenance and versioning, and effective collaboration in this process.
Dmitry Duev
Sat 4:30 p.m. – 4:45 p.m.

Diffusion generative modeling for galaxy surveys: emulating clustering for inference at the field level
(Oral)
We introduce a diffusion generative model to describe the distribution of galaxies in our Universe directly as a collection of points in 3D space, without resorting to binning or voxelization. The custom diffusion model, which employs graph neural networks as the backbone score function, can be used as an emulator that accurately reproduces essential summary statistics of the galaxy distribution and enables cosmological parameter estimation using gradient-based inference techniques. This approach allows for a comprehensive analysis of cosmological data by circumventing limitations inherent to summary-statistics-based as well as likelihood-free methods.
Carolina Cuesta · Siddharth Mishra-Sharma
Sat 4:45 p.m. – 5:00 p.m.

Field-Level Inference with Microcanonical Langevin Monte Carlo
(Oral)
Field-level inference provides a means to optimally extract information from upcoming cosmological surveys, but requires efficient sampling of a high-dimensional parameter space. This work applies Microcanonical Langevin Monte Carlo (MCLMC) to sample the initial conditions of the Universe, as well as the cosmological parameters $\sigma_8$ and $\Omega_m$, from simulations of cosmic structure. MCLMC is shown to be over an order of magnitude more efficient than traditional Hamiltonian Monte Carlo (HMC) for a $\sim 2.6 \times 10^5$-dimensional problem. Moreover, the efficiency of MCLMC compared to HMC greatly increases as the dimensionality increases, suggesting gains of many orders of magnitude for the dimensionalities required by upcoming cosmological surveys.
Adrian Bayer · Uros Seljak · Chirag Modi
Sat 5:00 p.m. – 5:15 p.m.

Spotting Hallucinations in Inverse Problems with Data-Driven Priors
(Oral)
Hallucinations are an inescapable consequence of solving inverse problems with deep neural networks. The expressiveness of recent generative models is the reason why they can yield results far superior to conventional regularizers; it can also lead to realistic-looking but incorrect features, potentially undermining trust in important aspects of the reconstruction. We present a practical and computationally efficient method to determine which regions in the solutions of inverse problems with data-driven priors are prone to hallucinations. By computing the diagonal elements of the Fisher information matrix of the likelihood and the data-driven prior separately, we can flag regions where the information is prior-dominated. Our diagnostic can be directly compared to the reconstructed solutions and enables users to decide if measurements in such regions are robust for their application. Our method scales linearly with the number of parameters and is thus applicable in high-dimensional settings, allowing it to be rolled out broadly for the large-volume data products of future wide-field surveys.
Matt Sampson · Peter Melchior
Sat 5:15 p.m. – 5:45 p.m.

Keynote IV: Teaching LLMs to Reason
(Keynote presentation)
What is required to make LLMs useful scientific assistants? In this talk we cover how LLMs perform on scientific reasoning tasks, including some recent results.
Ross Taylor
Sat 5:45 p.m. – 7:00 p.m.

Poster Session
(Poster session)
Sat 7:00 p.m. – 7:55 p.m.

Panel: How will new technologies such as foundation models/generative models/LLMs change the way we do scientific discoveries?
(Discussion Panel)
Peter Melchior · Yashar Hezaveh · Megan Ansdell · Yuan-Sen Ting · David W. Hogg · Irina Rish
Sat 7:55 p.m. – 8:00 p.m.

Workshop Wrap-Up
(Closing Remarks)


Learning the galaxy-environment connection with graph neural networks
(Poster)
Galaxies co-evolve with their host dark matter halos. Models of the galaxy-halo connection, calibrated using cosmological hydrodynamic simulations, can be used to populate dark matter halo catalogs with galaxies. We present a new method for inferring baryonic properties from dark matter subhalo properties using message-passing graph neural networks (GNNs). After training on subhalo catalog data from the IllustrisTNG300-1 hydrodynamic simulation, our GNN can infer stellar mass from the host and neighboring subhalo positions, kinematics, masses, and maximum circular velocities. We find that GNNs can also robustly estimate stellar mass from subhalo properties in 2D projection. While other methods typically model the galaxy-halo connection in isolation, our GNN incorporates information from galaxy environments, leading to more accurate stellar mass inference.
John F. Wu · Christian Jespersen


Multi-fidelity Emulator for Cosmological Large-Scale 21 cm Lightcone Images: A Few-shot Transfer Learning Approach with GAN
(Poster)
Large-scale numerical simulations ($\gtrsim 500\,\mathrm{Mpc}$) of cosmic reionization are required to match the large survey volume of the upcoming Square Kilometre Array (SKA). We present a multi-fidelity emulation technique for generating large-scale lightcone images of cosmic reionization. We first train generative adversarial networks (GANs) on small-scale simulations and transfer that knowledge to large-scale simulations with hundreds of training images. Our method achieves high accuracy in generating lightcone images, as measured by various statistics with errors mostly below 10%. This approach saves 90% of the computational resources compared to conventional training methods. Our technique enables efficient and accurate emulation of large-scale images of the Universe.
Kangning Diao · Yi Mao


PPDONet: Deep Operator Networks for Fast Prediction of Steady-State Solutions in Disk-Planet Systems
(Poster)
We have created a tool called the Protoplanetary Disk Operator Network (PPDONet) that quickly predicts disk-planet interactions in protoplanetary disks. Our tool uses Deep Operator Networks (DeepONets), a type of neural network that learns nonlinear operators to accurately represent both deterministic and stochastic differential equations. PPDONet maps three key parameters in a disk-planet system, namely the Shakura & Sunyaev viscosity $\alpha$, the disk aspect ratio $h_\mathrm{0}$, and the planet-star mass ratio $q$, to the steady-state solutions for disk surface density, radial velocity, and azimuthal velocity. We have validated the accuracy of PPDONet's solutions with an extensive array of tests. Our tool can calculate the result of a disk-planet interaction for a given system in under a second using a standard laptop. PPDONet is publicly accessible for use.
Shunyuan Mao · Ruobing Dong · Lu Lu · Kwang Moo Yi · Sifan Wang · Paris Perdikaris


Cosmology with Galaxy Photometry Alone
(Poster)
We present the first cosmological constraints from only the observed photometry of galaxies using neural density estimation (NDE). Villaescusa-Navarro et al. (2022) recently demonstrated that the internal physical properties of a single galaxy contain a significant amount of cosmological information. These physical properties, however, cannot be directly measured from observations. In this work, we present how we can go beyond theoretical demonstrations to infer cosmological constraints from actual galaxy observables. We use ensembled NDE and the CAMELS suite of hydrodynamical simulations to infer cosmological parameters from galaxy photometry. We find that the cosmological information in the photometry of a single galaxy is severely limited. However, since NDE dramatically reduces the cost of evaluating the posterior, we can feasibly combine the constraining power of photometry from many galaxies using hierarchical population inference and place significant cosmological constraints. With the observed photometry of $\sim$15,000 NASA-Sloan Atlas galaxies, we constrain $\Omega_m = 0.310^{+0.080}_{-0.098}$ and $\sigma_8 = 0.792^{+0.099}_{-0.090}$.
ChangHoon Hahn · Peter Melchior · Francisco Villaescusa-Navarro · Romain Teyssier


Cosmological Data Compression and Inference with Self-Supervised Machine Learning
(Poster)
The influx of massive amounts of new data from current and upcoming cosmological surveys necessitates compression schemes that can efficiently summarize the data with minimal loss of information. We investigate the potential of self-supervised machine learning to construct optimal summaries of cosmological datasets. Using a particular self-supervised method, VICReg (Variance-Invariance-Covariance Regularization), deployed on lognormal random fields as well as hydrodynamical cosmological simulations, we find that self-supervised learning can deliver highly informative summaries which can be used for downstream tasks, including providing precise and accurate constraints when used for parameter inference. Our results indicate that self-supervised machine learning techniques offer a promising new approach for cosmological data compression and analysis.
Aizhan Akhmetzhanova · Siddharth Mishra-Sharma · Cora Dvorkin


Neural Astrophysical Wind Models
(Poster)
The bulk kinematics and thermodynamics of hot supernovae-driven galactic winds are critically dependent on both the amount of swept-up cool clouds and non-spherical collimated flow geometry. However, accurately parameterizing these physics is difficult because their functional forms are often unknown, and because the coupled nonlinear flow equations contain singularities. We show that deep neural networks embedded as individual terms in the governing coupled ordinary differential equations (ODEs) can robustly discover both of these physics, without any prior knowledge of the true function structure, as a supervised learning task. We optimize a loss function based on the Mach number, rather than the three explicitly solved-for conserved variables, and apply a penalty term towards near-diverging solutions. The same neural network architecture is used for learning both the hidden mass-loading and surface area expansion rates. This work further highlights the feasibility of neural ODEs as a promising discovery tool with mechanistic interpretability for nonlinear inverse problems.
Dustin Nguyen


Assessing Summary Statistics with Mutual Information for Cosmological Inference
(Poster)
The ability to compress observational data and accurately estimate physical parameters relies heavily on informative summary statistics. In this paper, we introduce the use of mutual information (MI) as a means of evaluating the quality of summary statistics in inference tasks. MI can assess the sufficiency of summaries and provide a quantitative basis for comparison. We show that commonly adopted metrics for comparing statistics can be considered as processes of MI estimation, but with different assumptions. Based on this, we propose to estimate MI using the Barber-Agakov lower bound and normalizing-flow-based variational distributions. To demonstrate the effectiveness of our method, we compare three different summary statistics (namely the power spectrum, bispectrum, and scattering transform) in the context of inferring reionization parameters from mock SKA images. We find that this approach is able to correctly assess the informativeness of different summary statistics and allows us to select the optimal statistic for our inference task.
Ce Sui · Xiaosheng Zhao · Tao Jing · Yi Mao


Multi-Class Deep SVDD: Anomaly Detection Approach in Astronomy with Distinct Inlier Categories
(Poster)
With the increasing volume of astronomical data generated by modern survey telescopes, automated pipelines and machine learning techniques have become crucial for analyzing and extracting knowledge from these datasets. Anomaly detection, the task of identifying irregular or unexpected patterns in the data, is a complex challenge in astronomy. In this paper, we propose Multi-Class Deep SVDD (MCDSVDD), an extension of the state-of-the-art anomaly detection algorithm One-Class Deep SVDD, specifically designed to handle different inlier categories with distinct data distributions. MCDSVDD uses a neural network to map the data into hyperspheres, where each hypersphere represents a specific inlier category. The distance of each sample from the centers of these hyperspheres determines the anomaly score. We evaluate the effectiveness of MCDSVDD by comparing its performance with several anomaly detection algorithms on a large dataset of astronomical light curves obtained from the Zwicky Transient Facility (ZTF). Our results demonstrate the efficacy of MCDSVDD in detecting anomalous sources while leveraging the presence of different inlier categories.
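A minimal sketch of the scoring rule described above (our illustration with toy values, not the authors' code): given learned embeddings and per-class hypersphere centers, the anomaly score is the distance to the nearest center.

```python
import numpy as np

def mcdsvdd_score(z, centers):
    """Anomaly score: distance from each embedding to the nearest
    inlier-class hypersphere center (smaller = more inlier-like)."""
    # z: (n_samples, d) embeddings; centers: (n_classes, d) learned centers
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)  # squared distances
    return np.sqrt(d2.min(axis=1))  # nearest-center distance per sample

# toy usage: two inlier classes; one point near a center, one far from both
centers = np.array([[0.0, 0.0], [10.0, 0.0]])
z = np.array([[0.1, 0.0], [5.0, 5.0]])
scores = mcdsvdd_score(z, centers)
```

In the paper the embeddings come from a trained network and samples with large scores are flagged as anomaly candidates.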
Manuel Perez-Carrasco · Guillermo Cabrera-Vives · Lorena Hernandez-García · Paula Sanchez-Saez · Amelia Bayo · Alejandra Muñoz-Arancibia · Nicolás Astorga


Bayesian Uncertainty Quantification in High-dimensional Stellar Magnetic Field Models
(Poster)
Spectropolarimetric inversion techniques, known as Zeeman Doppler imaging (ZDI), have become the standard tools for reconstructing surface magnetic field maps of stars. Accurate and efficient uncertainty quantification of such magnetic field maps is an open problem in current research, and the high dimensionality of the spherical-harmonic magnetic field parameterization makes inference inherently difficult. We propose a probabilistic machine learning framework for stellar surface magnetic field reconstruction using a gradient-based Metropolis-adjusted Langevin algorithm. By efficient implementation in JAX, our framework allows for reliable uncertainty quantification of the global stellar magnetic field topology. We test the proposed scheme on the bright, massive star Tau Scorpii, and show that our approach enables accurate computation of the posterior magnetic field distribution with fast convergence.
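To make the sampler concrete, here is a generic single step of the Metropolis-adjusted Langevin algorithm (MALA) on a toy Gaussian target; this is our NumPy sketch of the standard algorithm, not the authors' JAX implementation.

```python
import numpy as np

def mala_step(theta, log_p, grad_log_p, eps, rng):
    """One MALA step targeting exp(log_p): Langevin proposal + MH correction."""
    prop = theta + 0.5 * eps**2 * grad_log_p(theta) + eps * rng.standard_normal(theta.shape)

    def log_q(x, y):
        # log of the (asymmetric) Gaussian proposal density q(x | y), up to a constant
        m = y + 0.5 * eps**2 * grad_log_p(y)
        return -((x - m) ** 2).sum() / (2 * eps**2)

    log_alpha = log_p(prop) - log_p(theta) + log_q(theta, prop) - log_q(prop, theta)
    if np.log(rng.uniform()) < log_alpha:
        return prop, True   # accept
    return theta, False     # reject

# toy target: standard 2D Gaussian
rng = np.random.default_rng(0)
log_p = lambda t: -0.5 * (t ** 2).sum()
grad_log_p = lambda t: -t
theta = np.zeros(2)
samples = []
for _ in range(2000):
    theta, _ = mala_step(theta, log_p, grad_log_p, 0.8, rng)
    samples.append(theta)
samples = np.array(samples)
```

The gradient step biases proposals toward high-probability regions, which is what makes the method practical in high-dimensional parameter spaces such as spherical-harmonic field coefficients.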
Jennifer Andersson · Oleg Kochukhov · Zheng Zhao · Jens Sjölund


A Comparative Study on Generative Models for High Resolution Solar Observation Imaging
(Poster)
Solar activity is one of the main drivers of variability in our solar system and the key source of space weather phenomena that affect Earth and near-Earth space. The extensive record of high-resolution extreme ultraviolet (EUV) observations from the Solar Dynamics Observatory (SDO) offers an unprecedented, very large dataset of solar images. In this work, we make use of this comprehensive dataset to investigate the capabilities of current state-of-the-art generative models to accurately capture the data distribution behind the observed solar activity states. Starting from StyleGAN-based methods, we uncover severe deficits of this model family in handling fine-scale details of solar images when training on high-resolution samples, contrary to training on natural face images. When switching to the diffusion-based generative model family, we observe strong improvements in fine-scale detail generation. For the GAN family, we are able to achieve similar improvements in fine-scale generation when turning to ProjectedGANs, which use multi-scale discriminators with a pretrained frozen feature extractor. We conduct ablation studies to clarify the mechanisms responsible for proper fine-scale handling. Using distributed training on supercomputers, we are able to train generative models at up to 1024x1024 resolution that produce high-quality samples indistinguishable to human experts, as suggested by the evaluation we conduct. We make all code, models, and workflows used in this study publicly available.
Mehdi Cherti · Alexander Czernik · Stefan Kesselheim · Frederic Effenberger · Jenia Jitsev


3D ScatterNet: Inference from 21 cm Lightcones
(Poster)
The Square Kilometre Array (SKA) will have the sensitivity to measure the 3D lightcones of the 21 cm signal from the epoch of reionization. This signal, however, is highly non-Gaussian and cannot be fully interpreted by the traditional power spectrum. In this work, we introduce 3D ScatterNet, which combines normalizing flows with the solid harmonic wavelet scattering transform, a 3D CNN featurizer with inductive bias, to perform implicit likelihood inference (ILI) from 21 cm lightcones. We show that 3D ScatterNet outperforms ILI with a 3D CNN in the literature. It also reaches better performance than ILI with the power spectrum for varied lightcone effects and varied signal contaminations.
Xiaosheng Zhao · Yi Mao


Weisfeiler-Lehman Graph Kernel Method: A New Approach to Weak Chemical Tagging
(Poster)
Stars' chemical signatures provide invaluable insights into stellar cluster formation. This study utilized the Weisfeiler-Lehman (WL) graph kernel to examine a 15-dimensional elemental abundance space. Simulating chemical distributions with normalizing flows affirmed the effectiveness of our algorithm. The results highlight the capability of the WL algorithm, coupled with Gaussian process regression, to identify patterns within elemental abundance point clouds correlated with various cluster mass functions. Notably, the WL algorithm exhibits superior interpretability, efficacy, and robustness compared to deep sets and graph convolutional neural networks, and enables optimal training with significantly fewer simulations ($\mathcal{O}(10)$), a reduction of at least two orders of magnitude relative to graph neural networks.
Yuan-Sen Ting · Bhavesh Sharma


Population-Level Inference for Galaxy Properties from Broadband Photometry
(Poster)
We present a method to infer galaxy properties and redshifts at the population level from photometric data using normalizing flows. Our method, PopSED, can reliably recover the redshift and stellar mass distribution of $10^{5}$ galaxies using SDSS ugriz photometry in <1 GPU-hour, being $10^{6}$ times faster than the traditional SED modeling method. The approach can also be applied to spectroscopic data, including DESI and Gaia XP spectra. Our method provides an efficient and self-consistent way to learn the population posterior without deriving the posteriors for every individual object and then combining them.
Jiaxuan Li · Peter Melchior · ChangHoon Hahn · Song Huang


A cross-modal adversarial learning method for estimating photometric redshifts of quasars
(Poster)
Quasars play a crucial role in studying various important physical processes. We propose a cross-modal contrastive learning method for estimating the photometric redshifts of quasars. Our model utilizes adversarial training to enable the conversion between photometric data features (magnitudes, colors, etc.) and photometric image features in five bands (u, g, r, i, z), in order to extract modality-invariant features. We used $\Delta z=(z_{photo}-z_{spec})/(1+z_{spec})$ as the evaluation metric. The latest SOTA method, which implements cross-modal generation of simulated spectra from photometric data, was chosen as the baseline. First, the proposed method was tested on the same SDSS DR17 dataset of 415,930 quasars ($1 \le z_{spec} \le 5$) as the baseline method. Compared to the baseline, the RMSE of our $\Delta z$ decreased from 0.1235 to 0.1031. Further evaluation on a larger dataset of 465,292 quasars achieved a lower RMSE of $\Delta z$ of 0.0861. The method can also be generalized to other tasks such as galaxy classification and redshift estimation.
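As a concrete reading of the evaluation metric (our sketch with made-up redshift values, not the paper's data), the normalized residual and its RMSE can be computed as:

```python
import numpy as np

def delta_z(z_photo, z_spec):
    """Normalized photometric-redshift residual, (z_photo - z_spec) / (1 + z_spec)."""
    return (z_photo - z_spec) / (1.0 + z_spec)

# toy values for illustration only
z_spec = np.array([1.0, 2.0, 3.5])
z_photo = np.array([1.1, 1.9, 3.6])
dz = delta_z(z_photo, z_spec)
rmse = np.sqrt(np.mean(dz ** 2))  # aggregate accuracy over the sample
```

The $(1 + z_{spec})$ normalization makes a fixed absolute redshift error count for less at high redshift, which is the standard convention for photometric-redshift accuracy.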

Chen Zhang · Yanxia Zhang · Bin Jiang · Meixia Qu · Wenyu Wang


Harnessing the Power of Adversarial Prompting and Large Language Models for Robust Hypothesis Generation in Astronomy
(Poster)
This study investigates the application of Large Language Models (LLMs), specifically GPT-4, within Astronomy. We employ in-context prompting, supplying the model with up to 1000 papers from the NASA Astrophysics Data System, to explore the extent to which performance can be improved by immersing the model in domain-specific literature. Our findings point towards a substantial boost in hypothesis generation when using in-context prompting, a benefit that is further accentuated by adversarial prompting. We illustrate how adversarial prompting empowers GPT-4 to extract essential details from a vast knowledge base to produce meaningful hypotheses, signaling an innovative step towards employing LLMs for scientific research in Astronomy.
Ioana Ciuca · Yuan-Sen Ting · Sandor Kruk · Kartheik Iyer


Diffusion Models for Probabilistic Deconvolution of Galaxy Images
(Poster)
Telescopes capture images with a particular point-spread function (PSF). Inferring what an image would have looked like with a much sharper PSF, a problem known as PSF deconvolution, is ill-posed because PSF convolution is not an invertible transformation. Deep generative models are appealing for PSF deconvolution because they can infer a posterior distribution over candidate images that, if convolved with the PSF, could have generated the observation. However, classical deep generative models such as VAEs and GANs often provide inadequate sample diversity. As an alternative, we propose a classifier-free conditional diffusion model for PSF deconvolution of galaxy images. We empirically demonstrate that this diffusion model captures a greater diversity of possible deconvolutions compared to a conditional VAE.
Zhiwei Xue · Yuhang Li · Yash Patel · Jeffrey Regier


Closing the stellar labels gap: An unsupervised, generative model for Gaia BP/RP spectra
(
Poster
)
The recent release of 220+ million BP/RP spectra in Gaia DR3 presents an opportunity to apply deep learning models to an unprecedented number of stellar spectra, at extremely low resolution. The BP/RP dataset is so massive that no previous spectroscopic survey can provide enough stellar labels to cover the BP/RP parameter space. We present an unsupervised, deep, generative model for BP/RP spectra: a scatter variational autoencoder. We design a non-traditional variational autoencoder which is capable of modeling both (i) BP/RP coefficients and (ii) intrinsic scatter. Our model learns a latent space from which to generate BP/RP spectra (with scatter) directly from the data itself without requiring any stellar labels. We demonstrate that our model accurately reproduces BP/RP spectra in regions of parameter space where supervised learning fails or cannot be implemented. 
Alex Laroche · Joshua Speagle 🔗 
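The role of the modeled intrinsic scatter can be sketched with a heteroscedastic Gaussian likelihood: coefficients the decoder cannot reproduce are absorbed into a larger per-dimension scatter rather than inflating the reconstruction loss. A toy NumPy illustration (the numbers and the decoder stand-in are hypothetical, not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(1)

def gaussian_nll(x, mean, log_scatter):
    """Per-dimension Gaussian negative log-likelihood with intrinsic scatter."""
    var = np.exp(2 * log_scatter)
    return 0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

x = rng.normal(size=5)                             # toy BP/RP coefficients
mean = x + np.array([0.0, 0.0, 0.0, 0.0, 2.0])     # last coefficient poorly fit
tight = gaussian_nll(x, mean, log_scatter=np.zeros(5))
loose = gaussian_nll(x, mean, log_scatter=np.array([0.0, 0.0, 0.0, 0.0, 1.0]))

# a larger learned scatter on the badly-fit coefficient lowers its NLL,
# letting the model flag dimensions it cannot explain instead of overfitting
print(tight[-1], loose[-1])
```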


Graph Representation of the Magnetic Field Topology in HighFidelity Plasma Simulations for Machine Learning Applications
(
Poster
)
Topological analysis of the magnetic field in simulated plasmas allows the study of various physical phenomena in a wide range of settings. One such application is magnetic reconnection, a phenomenon related to the dynamics of the magnetic field topology, which is difficult to detect and characterize in three dimensions. We propose a scalable pipeline for topological data analysis and spatiotemporal graph representation of three-dimensional magnetic vector fields. We demonstrate our methods on simulations of the Earth's magnetosphere produced by Vlasiator, a supercomputer-scale Vlasov-theory-based simulation for near-Earth space. The purpose of this work is to challenge the machine learning community to explore graph-based machine learning approaches to address a largely open scientific problem with wide-ranging potential impact. 
Ioanna Bouri · Fanni Franssila · Markku J. Alho · Giulia Cozzani · Ivan Zaitsev · Minna Palmroth · Teemu Roos 🔗 


SimBIG: Field-level Simulation-based Inference of Large-scale Structure
(
Poster
)
Traditional methods for cosmological parameter inference from Large-Scale Structure (LSS) rely on summary statistics, such as power spectra, which may not fully capture the complex non-linear and non-Gaussian features of the LSS. Simulation-Based Inference (SBI), which uses forward models of the observables and machine learning to learn a posterior distribution over the parameters, can provide more robust inferences. This work presents novel constraints using SBI on LSS at the field level using Convolutional Neural Networks (CNNs) and Bayesian Neural Networks. We use the SimBIG forward modeling pipeline to generate realistic mock observations of the Baryon Oscillation Spectroscopic Survey (BOSS) at different cosmologies. We show that our method provides tighter constraints on cosmological parameters than methods based on compressing the data to the power spectrum, likely due to the CNN's ability to exploit non-Gaussian information. Furthermore, we validate our pipeline on out-of-distribution data generated using different forward models and show that our constraints generalize well, providing some robustness against model misspecification. This paper not only presents field-level parameter constraints from real LSS observations, but also introduces methods that will be useful for future analyses on larger boxes and smaller scales with SDSS data and future surveys like DESI. 
Pablo Lemos · Liam Parker · Chang-Hoon Hahn · Bruno Régaldo-Saint Blancard · Elena Massara · Shirley Ho · David Spergel · Chirag Modi · Azadeh Moradinezhad Dizgah · Michael Eickenberg · Jiamin Hou
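The core idea of amortized simulation-based inference — learn a mapping from simulated data back to parameters, then apply it to an observation — can be sketched with a linear estimator standing in for the CNN. The forward model, parameter names, and noise level below are all hypothetical:

```python
import numpy as np

rng = np.random.default_rng(5)

# toy forward model: three summary statistics as noisy linear functions
# of two parameters (hypothetical stand-ins for e.g. Omega_m, sigma_8)
A = np.array([[1.0, 0.5], [0.2, 1.3], [0.7, 0.7]])

def simulate(theta):
    return theta @ A.T + 0.1 * rng.normal(size=(len(theta), 3))

theta_train = rng.uniform(0, 1, size=(50_000, 2))   # draws from the prior
x_train = simulate(theta_train)

# amortized estimator: least-squares regression of parameters on data
# (a linear stand-in for the CNN posterior estimator)
W, *_ = np.linalg.lstsq(x_train, theta_train, rcond=None)

# once trained, inference on any new observation is a single forward pass
theta_test = rng.uniform(0, 1, size=(1000, 2))
x_test = simulate(theta_test)
err = np.abs(x_test @ W - theta_test).mean()
print(err)  # small: the estimator generalizes across observations
```

A neural posterior estimator replaces the linear map with a network that outputs a full posterior distribution rather than a point estimate.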



Domain Adaptation via Minimax Entropy for Real/Bogus Classification of Astronomical Alerts
(
Poster
)
Time-domain astronomy is advancing towards the analysis of multiple massive datasets in real time, prompting the development of multi-stream machine learning models. In this work, we study Domain Adaptation (DA) for real/bogus classification of astronomical alerts using four different datasets: HiTS, DES, ATLAS, and ZTF. We study the domain shift between these datasets, and improve a naive deep learning classification model using a fine-tuning approach and semi-supervised deep DA via Minimax Entropy (MME). We compare the balanced accuracy of these models for different source-target scenarios. We find that both the fine-tuned and MME models significantly improve the base model with as few as one labeled item per class coming from the target dataset, and that MME does not compromise its performance on the source dataset. 
Guillermo Cabrera-Vives · César Bolívar · Francisco Förster · Alejandra Muñoz-Arancibia · Manuel Pérez-Carrasco · esteban reyes · Larry Denneau 🔗 
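The quantity at the heart of Minimax Entropy can be written down compactly: the prediction entropy on unlabeled target alerts is maximized with respect to the classifier and minimized with respect to the feature extractor (Saito et al., 2019). A NumPy sketch of the entropy term itself, with toy logits:

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def mean_entropy(logits):
    """Mean prediction entropy over unlabeled target examples; in MME this
    is the quantity played adversarially between classifier and features."""
    p = softmax(logits)
    return float(-(p * np.log(p + 1e-12)).sum(axis=1).mean())

confident = np.array([[5.0, -5.0], [-4.0, 4.0]])   # toy real/bogus logits
uncertain = np.array([[0.1, 0.0], [0.0, 0.1]])
print(mean_entropy(confident), mean_entropy(uncertain))
```

In training, the sign flip between the two players is typically implemented with a gradient-reversal layer.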


SimBIG: Galaxy Clustering beyond the Power Spectrum
(
Poster
)
The study of the Universe revolves around understanding the fundamental parameters that describe the model of our Universe. These fundamental parameters are usually constrained by analyzing what we can observe from the sky such as galaxy distributions, the cosmic microwave background, etc. The paper uses the SimBIG framework, which leverages machine learning techniques and simulation-based inference to improve the constraints on these fundamental parameters by analyzing galaxy clustering. When we apply SimBIG to a fraction of the BOSS galaxy survey, we achieve significantly (1.2× and 2.7×) tighter constraints on cosmological parameters such as Ωm and σ8 compared to standard power spectrum analyses. Using only 10% of the BOSS volume, we obtain constraints on H0 and S8 that are competitive with those from other probes. Future work will extend SimBIG to upcoming galaxy surveys for even stronger constraints. 
Chang-Hoon Hahn · Pablo Lemos · Bruno Régaldo-Saint Blancard · Liam Parker · Michael Eickenberg · Shirley Ho · Jiamin Hou · Elena Massara · Chirag Modi · Azadeh Moradinezhad Dizgah · David Spergel



FLORAH: A generative model for halo assembly histories
(
Poster
)
Dark matter accounts for 85% of the matter in our Universe. The mass assembly history (MAH) of dark matter halos plays a leading role in shaping the formation and evolution of galaxies. MAHs are used extensively in semi-analytic models of galaxy formation, yet current analytical methods to generate them are unable to capture their relationship with the halo internal structure and large-scale environment. This paper introduces FLORAH, a machine-learning framework for generating assembly histories of dark matter halos. We train FLORAH on the assembly histories from the MultiDark N-body simulations and demonstrate its ability to recover key properties such as the time evolution of mass and dark matter concentration. By applying the Santa Cruz semi-analytic model on FLORAH-generated assembly histories, we show that FLORAH correctly captures assembly bias, which cannot be reproduced with current analytical methods. FLORAH is the first step towards a machine-learning-based framework for planting merger trees; this will allow the exploration of different galaxy formation scenarios with great computational efficiency at unprecedented accuracy. 
Tri Nguyen · Chirag Modi · Rachel Somerville · L. Y. Aaron Yung 🔗 


A Multi-input Convolutional Neural Network to Automate and Expedite Bright Transient Identification for the Zwicky Transient Facility
(
Poster
)
The Bright Transient Survey (BTS) relies on visual inspection ("scanning") to select sources for accomplishing its mission of spectroscopically classifying all bright extragalactic transients found by the Zwicky Transient Facility (ZTF). We present a multi-input convolutional neural network to provide a bright transient score to individual ZTF detections using their image data and 16 extracted features. Our model has the ability to eliminate the need for human scanning by automatically identifying and requesting spectroscopic observations of new bright (m<18.5 mag) transient candidates. In validation, the model is 92% pure and 97% complete, outperforming human scanners. The model is now running in real time on all new ZTF alert packets, allowing for real-time and real-world validation. 
Nabeel Rehemtulla · Adam Miller · Michael Coughlin · Theophile Jegou Du Laz 🔗 


A Novel Application of Conditional Normalizing Flows: Stellar Age Inference with Gyrochronology
(
Poster
)
Age data are critical building blocks of stellar evolutionary models, but challenging to measure for low-mass main-sequence (MS) stars. An unexplored solution in this regime is the application of probabilistic machine learning methods to gyrochronology, a stellar dating technique that is uniquely well suited for these stars. While accurate analytical gyrochronological models have eluded the field, here we demonstrate that a data-driven approach can be successful, by applying conditional normalizing flows to photometric data from open star clusters. We evaluate the flow results in the context of a Bayesian framework, and show that our inferred ages recover literature values well. This work demonstrates the potential of a probabilistic data-driven solution to significantly improve the effectiveness of gyrochronological stellar dating. 
Phil VanLane · Joshua Speagle 🔗 
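In one dimension, a conditional normalizing flow reduces to a change of variables whose parameters depend on the conditioning data. A minimal sketch with a single affine transform and toy linear conditioners (the paper's flow is of course deeper):

```python
import numpy as np

def affine_flow_logprob(age, cond, params):
    """Log-density of a 1-D conditional affine flow with a standard-normal
    base: z = (age - mu(cond)) / sigma(cond). The conditioners mu and
    log-sigma are toy linear functions of the photometric condition."""
    w_mu, b_mu, w_ls, b_ls = params
    mu = w_mu * cond + b_mu
    log_sigma = w_ls * cond + b_ls
    z = (age - mu) * np.exp(-log_sigma)
    log_base = -0.5 * (z**2 + np.log(2 * np.pi))
    # change of variables: log p(age) = log N(z; 0, 1) + log |dz/d(age)|
    return log_base - log_sigma

params = (0.5, 1.0, 0.1, 0.0)           # hypothetical conditioner weights
ages = np.linspace(-10.0, 12.0, 4401)
density = np.exp(affine_flow_logprob(ages, cond=1.0, params=params))
integral = density.sum() * (ages[1] - ages[0])
print(integral)  # ≈ 1: the flow defines a proper conditional density
```

Stacking several such transforms, each with learned conditioners, yields a flexible conditional density over age given photometry.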


A Hierarchy of Normalizing Flows for Modelling the Galaxy-Halo Relationship
(
Poster
)
Using a large sample of galaxies taken from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project, a suite of hydrodynamic simulations varying both cosmological and astrophysical parameters, we train a normalizing flow (NF) to map the probability of various galaxy and halo properties conditioned on astrophysical and cosmological parameters. By leveraging the learnt conditional relationships we can explore a wide range of interesting questions, whilst enabling simple marginalization over nuisance parameters. We demonstrate how the model can be used as a generative model for arbitrary values of our conditional parameters; we generate halo masses and matched galaxy properties, and produce realizations of the halo mass function as well as a number of galaxy scaling relations and distribution functions. The model represents a unique and flexible approach to modelling the galaxy-halo relationship. 
Chris Lovell · Sultan Hassan · Francisco Villaescusa-Navarro · Shy Genel · Chang-Hoon Hahn · Daniel Angles-Alcazar · James Kwon · Natalí Soler Matubaro de Santi · Kartheik Iyer · Giulio Fabbian · Greg Bryan



Toward a Spectral Foundation Model: An Attention-Based Approach with Domain-Inspired Fine-Tuning and Wavelength Parameterization
(
Poster
)
Astrophysical explorations are underpinned by large-scale stellar spectroscopy surveys, necessitating a paradigm shift in spectral fitting techniques. Our study proposes three enhancements to transcend the limitations of current spectral emulation models. We implement an attention-based emulator, adept at unveiling long-range information between wavelength pixels. We leverage a domain-specific fine-tuning strategy where the model is pre-trained on spectra with fixed stellar parameters and variable elemental abundances, followed by fine-tuning on the entire domain. Moreover, by treating wavelength as an autonomous model parameter, akin to neural radiance fields, the model can generate spectra on any wavelength grid. In the case of a training set of O(1000), our approach exceeds current leading methods by a factor of 5-10 across all metrics. 
Tomasz Różański · Yuan-Sen Ting · Maja Jablonska 🔗 
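Treating wavelength as an input coordinate, as in neural radiance fields, usually means encoding it with sinusoidal features before feeding it to the network together with the stellar labels. A sketch of such an encoding (the frequency choices here are illustrative, not the paper's):

```python
import numpy as np

def encode_wavelength(lam, n_freq=4):
    """NeRF-style positional encoding of (normalized) wavelength, so an
    emulator can predict flux on an arbitrary wavelength grid."""
    freqs = 2.0 ** np.arange(n_freq)
    ang = np.outer(lam, freqs)
    return np.concatenate([np.sin(ang), np.cos(ang)], axis=1)

coarse = encode_wavelength(np.linspace(0.0, 1.0, 7))
fine = encode_wavelength(np.linspace(0.0, 1.0, 1001))
# the same featurization works for any requested grid; an MLP taking these
# features plus stellar parameters can then emulate the spectrum pointwise
print(coarse.shape, fine.shape)
```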


Using Multiple Vector Channels Improves $E(n)$-Equivariant Graph Neural Networks
(
Poster
)
We present a natural extension to $E(n)$-equivariant graph neural networks that uses multiple equivariant vectors per node. We formulate the extension and show that it improves performance across benchmark tasks on different physical systems, with minimal differences in runtime or number of parameters. The proposed multi-channel EGNN outperforms the standard single-channel EGNN on N-body charged particle dynamics, molecular property predictions, and predicting the trajectories of solar system bodies. Given the additional benefits and minimal additional cost of the multi-channel EGNN, we suggest that this extension may be of practical use to researchers working in machine learning for astrophysics and cosmology.

Daniel Levy · Sékou-Oumar Kaba · Carmelo Gonzales · Santiago Miret · Siamak Ravanbakhsh 🔗 
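The multi-channel construction stays equivariant because each vector channel is updated by relative positions weighted by rotation-invariant coefficients. A NumPy sketch with fixed weights standing in for the learned (distance-dependent, hence invariant) messages:

```python
import numpy as np

rng = np.random.default_rng(2)

def multichannel_update(pos, vecs, weights):
    """Toy update for several equivariant vector channels per node: each
    channel accumulates relative positions scaled by invariant per-edge
    coefficients."""
    rel = pos[:, None, :] - pos[None, :, :]        # (n, n, dim)
    msg = np.einsum('ijc,ijd->icd', weights, rel)  # (n, channels, dim)
    return vecs + msg

n, c, d = 5, 3, 3
pos = rng.normal(size=(n, d))
vecs = rng.normal(size=(n, c, d))
w = rng.normal(size=(n, n, c))   # stand-ins for learned invariant messages

Q, _ = np.linalg.qr(rng.normal(size=(d, d)))   # a random orthogonal matrix

out_then_rot = multichannel_update(pos, vecs, w) @ Q
rot_then_out = multichannel_update(pos @ Q, vecs @ Q, w)
print(np.allclose(out_then_rot, rot_then_out))  # the update commutes with rotation
```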


Multiscale Flow for Robust and Optimal Cosmological Analysis
(
Poster
)
We propose Multiscale Flow, a generative Normalizing Flow that creates samples and models the field-level likelihood of two-dimensional cosmological data such as weak lensing, thus enabling simulation-based likelihood inference. Multiscale Flow uses a hierarchical decomposition of cosmological fields via a wavelet basis, and then models different wavelet components separately as Normalizing Flows. This decomposition allows us to separate the information from different scales and identify distribution shifts in the data such as unknown scale-dependent systematics. The resulting likelihood analysis can not only identify these types of systematics, but can also be made optimal, in the sense that the Multiscale Flow can learn the full likelihood at the field level without any dimensionality reduction. 
Biwei Dai · Uros Seljak 🔗 
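The claim that no information is lost rests on the wavelet decomposition being invertible. A one-level Haar split of a toy field, showing that the coarse and detail bands reconstruct the input exactly (the flows modeling each band are not sketched here):

```python
import numpy as np

rng = np.random.default_rng(3)

def haar2d(x):
    """One level of a 2-D Haar transform: coarse band + 3 detail bands."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]
    c, d = x[1::2, 0::2], x[1::2, 1::2]
    coarse = (a + b + c + d) / 2
    details = ((a + b - c - d) / 2, (a - b + c - d) / 2, (a - b - c + d) / 2)
    return coarse, details

def inv_haar2d(coarse, details):
    h, v, dg = details
    a, b = (coarse + h + v + dg) / 2, (coarse + h - v - dg) / 2
    c, d = (coarse - h + v - dg) / 2, (coarse - h - v + dg) / 2
    out = np.empty((2 * coarse.shape[0], 2 * coarse.shape[1]))
    out[0::2, 0::2], out[0::2, 1::2] = a, b
    out[1::2, 0::2], out[1::2, 1::2] = c, d
    return out

field = rng.normal(size=(8, 8))   # stand-in for a weak-lensing map
coarse, details = haar2d(field)
print(np.allclose(inv_haar2d(coarse, details), field))  # lossless split
```

Applying the split recursively to the coarse band gives the hierarchy of scales, each of which can be modeled by its own flow.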


Towards Unbiased Gravitational-Wave Parameter Estimation using Score-Based Likelihood Characterization
(
Poster
)
Gravitational wave (GW) parameter estimation has conventionally relied on the assumption of Gaussian and stationary noise. However, noise from real-world detectors, such as LIGO, Virgo and KAGRA, often deviates considerably from these assumptions. In this paper, we use score-based diffusion models to learn an empirical noise distribution directly from detector data, which can then be combined with the forward simulator of the physical model to provide an unbiased model of the likelihood function. We validate the method by performing inference on a simulated gravitational wave event injected into real detector noise from LIGO, demonstrating its potential for providing accurate and scalable GW parameter estimation. 
Ronan Legin · Kaze Wong · Maximiliano Isi · Alexandre Adam · Laurence PerreaultLevasseur · Yashar Hezaveh 🔗 
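The construction reduces to evaluating the learned noise density at the residual between data and waveform template. A toy NumPy version with a Gaussian noise model standing in for the score-based diffusion model, and a hypothetical damped-sinusoid signal:

```python
import numpy as np

rng = np.random.default_rng(6)

def log_likelihood(data, template, noise_logpdf):
    """p(data | theta) = p_noise(data - h(theta)): the noise model
    evaluated at the residual after subtracting the template."""
    return noise_logpdf(data - template)

def gaussian_noise_logpdf(r):
    # stand-in for the empirical, score-based noise model of the paper
    return -0.5 * np.sum(r**2 + np.log(2 * np.pi))

t = np.linspace(0.0, 1.0, 1000)
signal = 3.0 * np.sin(2 * np.pi * 30 * t) * np.exp(-4 * t)  # toy burst
data = signal + rng.normal(size=t.size)                     # injected event

ll_signal = log_likelihood(data, signal, gaussian_noise_logpdf)
ll_noise_only = log_likelihood(data, np.zeros_like(t), gaussian_noise_logpdf)
print(ll_signal > ll_noise_only)  # the injected template is preferred
```

Replacing the Gaussian density with a learned score-based one keeps the same likelihood structure while dropping the Gaussianity and stationarity assumptions.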


nbi: the Astronomer's Package for Neural Posterior Estimation
(
Poster
)
Despite the growing popularity of Neural Posterior Estimation (NPE) methods in astronomy, the adoption of such techniques into routine data analysis has been slow. We identify three critical issues: the steep learning curve of NPE for domain scientists, the inexactness of inference, and the under-specification of physical forward models. To address the first two issues, we introduce a new framework and open-source software \textit{nbi}: Neural Bayesian Inference, which implements both amortized and sequential NPE. First, \textit{nbi} provides built-in ``featurizer'' networks with demonstrated efficacy on sequential data, such as light curves and spectra, thus eliminating the need for customization on the user end. Second, we introduce a modified algorithm, SNPE-IS, which facilitates asymptotically exact inference by using the surrogate posterior only as a proposal distribution for importance sampling. These features allow \textit{nbi} to be applied off-the-shelf to astronomical inference problems involving light curves and spectra, which would otherwise be tackled with MCMC and Nested Sampling. Our package\footnote{Anonymized for review.} is at \url{https://github.com/nbireview/nbi}. An application paper is concurrently submitted to this workshop and included in the appendix for reviewing purposes. 
Keming Zhang · Josh Bloom 🔗 
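Using the surrogate posterior only as a proposal for importance sampling means inexactness in the surrogate is corrected by the weights, provided the exact (unnormalized) posterior can be evaluated. A one-dimensional NumPy sketch with Gaussians standing in for both densities:

```python
import numpy as np

rng = np.random.default_rng(4)

def gauss_logpdf(x, mu, sigma):
    return -0.5 * (((x - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma**2))

# surrogate posterior (deliberately biased and too wide) -> proposal only
q_mu, q_sigma = 0.3, 1.5
# "exact" posterior, evaluable up to a constant (likelihood x prior)
p_mu, p_sigma = 0.0, 1.0

theta = rng.normal(q_mu, q_sigma, size=200_000)   # draws from the surrogate
log_w = gauss_logpdf(theta, p_mu, p_sigma) - gauss_logpdf(theta, q_mu, q_sigma)
w = np.exp(log_w - log_w.max())
w /= w.sum()                                      # self-normalized weights

# the raw surrogate mean is biased; the importance-weighted mean is not
print(theta.mean(), np.sum(w * theta))
```

As long as the proposal covers the posterior, such self-normalized estimates converge to the true posterior expectations, which is the sense in which the scheme is asymptotically exact.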


RealTime Stellar Spectra Fitting with Amortized Neural Posterior Estimation
(
Poster
)
In this paper, we demonstrate the utility of Amortized Neural Posterior Estimation (ANPE) for the problem of stellar spectra fitting. We introduce an effective approach to handle the measurement noise properties inherent in spectral data. This allows training-time data to resemble actual observed data, an aspect that is crucial for ANPE applications. We apply this approach to train an ANPE model for the APOGEE survey, which observed over 2 million spectra, and demonstrate its efficacy on both mock and real APOGEE spectra. To train the ANPE model, we applied our new NPE framework, Neural Bayesian Inference (\textit{nbi}), which is concurrently submitted to this workshop as an NPE framework optimized for stress-free astronomical applications. Application of this framework allowed us to train the ANPE model with minimal customization and coding effort. Given the association of spectral data properties with the observing instrument, we propose the idea of an ANPE ``model zoo,'' where models are trained for specific instruments and distributed with the \textit{nbi} framework to facilitate real-time stellar parameter inference. 
Keming Zhang · Tharindu Jayasinghe · Josh Bloom 🔗 