The Synergy of Scientific and Machine Learning Modelling (SynS & ML) Workshop
Antoine Wehenkel · Jörn Jacobsen · Emily Fox · Anuj Karpatne · Victoriya Kashtanova · Xuan Di · Emmanuel de Bézenac · Naoya Takeishi · Gilles Louppe
Meeting Room 320
The Synergy of Scientific and Machine Learning Modelling Workshop (“SynS & ML”) is an interdisciplinary forum for researchers and practitioners interested in the challenges of combining scientific and machine-learning models. The goal of the workshop is to bring together machine learning researchers eager to include scientific models in their pipelines, domain experts working on augmenting their scientific models with machine learning, and researchers looking for opportunities to incorporate ML into widely used scientific models.
The power of machine learning (ML), its ability to build models by leveraging real-world data, is also a major limitation: the quality and quantity of the training data bound the validity domain of ML models. Expert models, on the other hand, are designed from first principles or accumulated experience, and are called scientific when validated on curated real-world data, often harvested for that specific purpose, as the scientific method has prescribed since Galileo. However, expert models describe only idealized versions of the world, which can hinder their deployment for important tasks such as accurate forecasting or parameter inference. This workshop focuses on the combination of these two modelling paradigms. Sometimes called hybrid learning or grey-box modelling, this combination should (1) unlock new applications for expert models, and (2) leverage the knowledge compressed within scientific models to improve the quality of modern ML models. In this spirit, the workshop focuses on the symbiosis between these two complementary modelling approaches: it aims to be a “rendezvous” between the communities involved, spanning sub-fields of science, engineering and health as well as ML, allowing them to present their respective problems and solutions and to foster new collaborations. The workshop invites researchers to contribute to these topics; see the Call for Papers and the Call for Scientific Models for more details.
Schedule
Fri 12:00 p.m. - 12:30 p.m.
|
Two for One: Diffusion Models and Force Fields for Coarse-Grained Molecular Dynamics - Rianne van den Berg
(Talk)
Coarse-grained (CG) molecular dynamics enables the study of biological processes at temporal and spatial scales that would be intractable at an atomistic resolution. However, accurately learning a CG force field remains a challenge. In this work, we leverage connections between score-based generative models, force fields and molecular dynamics to learn a CG force field without requiring any force inputs during training. Specifically, we train a diffusion generative model on protein structures from molecular dynamics simulations, and we show that its score function approximates a force field that can directly be used to simulate CG molecular dynamics. While having a vastly simplified training setup compared to previous work, we demonstrate that our approach leads to improved performance across several small- to medium-sized protein simulations, reproducing the CG equilibrium distribution, and preserving dynamics of all-atom simulations such as protein folding events. |
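The key identity behind this talk is that, for samples from a Boltzmann density exp(-U(x)/kT), the score equals -∇U(x)/kT, so a diffusion model's score network can stand in (up to the kT scale) for a coarse-grained force field in Langevin dynamics. A minimal sketch, using a toy analytic score (a harmonic well, exact score -x) in place of the trained diffusion model:

```python
import numpy as np

def langevin_step(x, score_fn, step=1e-3, kT=1.0, rng=None):
    """One overdamped Langevin step using a learned score as the force.

    For p(x) ~ exp(-U(x)/kT), grad log p(x) = -grad U(x)/kT, so the
    score network acts (up to the kT scale) as a CG force field.
    """
    rng = np.random.default_rng() if rng is None else rng
    force = kT * score_fn(x)                    # -grad U(x)
    noise = np.sqrt(2.0 * kT * step) * rng.standard_normal(x.shape)
    return x + step * force + noise

# Toy stand-in for a trained score network: p(x) ~ N(0, 1), exact score -x.
# (Hypothetical; in the paper the score comes from a diffusion model
# trained on protein structures.)
score = lambda x: -x

rng = np.random.default_rng(0)
x = np.full((1000, 3), 5.0)                     # 1000 CG "beads" in 3D
for _ in range(5000):
    x = langevin_step(x, score, step=1e-2, rng=rng)

# After equilibration the empirical distribution should approach N(0, 1).
print(abs(x.mean()), x.std())
```

With a real trained score model, the same loop simulates CG molecular dynamics without ever having seen force labels.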
Fri 12:30 p.m. - 1:00 p.m.
|
The Domain Generalization Issue in Data-Based Dynamical Models - Patrick Gallinari
(Talk)
Generalizing beyond the training domain remains a critical challenge in the adoption of machine learning (ML) for modeling the physical world. While explicit physical models, which establish causal relationships between model variables, offer guarantees and can be applied in any valid environment, statistical models learn correlations solely from samples, limiting their applicability to the training-domain context. In this presentation, we will discuss the main challenges posed by this issue when modeling spatio-temporal phenomena and highlight recent advancements aimed at resolving it. |
Fri 1:00 p.m. - 1:30 p.m.
|
Coffee Break
(Break)
|
Fri 1:30 p.m. - 2:00 p.m.
|
Climate modeling with AI: Hype or Reality? - Laure Zanna
(Talk)
Climate simulations remain one of the best tools to understand and predict global and regional climate change. Yet, the accuracy of numerical climate models is constrained by computing power. Uncertainties in climate predictions originate partly from a poor or lacking representation of processes, such as ocean turbulence and clouds, that are not resolved in global climate models but impact the large-scale temperature, rainfall, sea level, etc. Representing these unresolved processes has been a bottleneck in improving climate simulations and projections. The explosion of climate data and the power of machine learning (ML) algorithms are suddenly offering new opportunities: can we deepen our understanding of these unresolved processes and simultaneously improve their representation in climate models to reduce climate projections uncertainty? This talk discusses the advantages and challenges of using machine learning for climate projections. The focus will be on recent work in which we leverage ML tools to learn representations of unresolved ocean processes – in particular, learning symbolic expressions. Some of this work suggests that machine learning could open the door to discovering new physics from data and enhance climate predictions. Yet, many questions remain unanswered, making the next decade exciting and challenging for ML + climate modeling for robust and actionable climate projections. |
Fri 2:00 p.m. - 3:00 p.m.
|
Poster Session 1
(Poster Session)
|
Fri 3:00 p.m. - 4:00 p.m.
|
Lunch
(Break)
|
Fri 4:00 p.m. - 4:30 p.m.
|
AI-Augmented Epidemiology for Covid-19 - Sercan Arik
(Talk)
The COVID-19 pandemic has highlighted the global need for reliable models of disease spread. We propose a novel AI-augmented forecast modeling framework that is based on integrating machine-learned mapping of informative covariates into the compartmental models. Via prospective evaluation, we demonstrate that our framework yields very accurate forecasts, outperforming other models, as well as explainable insights into the disease dynamics and what-if simulation capabilities. |
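The kind of hybrid the abstract describes, a compartmental model whose transmission rate is produced by a learned mapping of covariates, can be sketched in a few lines. The SEIR rates, covariates, and softplus link below are illustrative stand-ins, not the authors' actual architecture:

```python
import numpy as np

def seir_step(state, beta, sigma=1/5.2, gamma=1/14, dt=1.0):
    """One Euler step of a standard SEIR compartmental model."""
    S, E, I, R = state
    N = S + E + I + R
    new_inf = beta * S * I / N
    dS = -new_inf
    dE = new_inf - sigma * E
    dI = sigma * E - gamma * I
    dR = gamma * I
    return state + dt * np.array([dS, dE, dI, dR])

def learned_beta(covariates, w, b=0.0):
    """Hypothetical ML component: maps covariates (mobility, mandates, ...)
    to the time-varying transmission rate.  A linear model with a softplus
    link stands in for whatever learned mapping the framework uses."""
    return np.log1p(np.exp(covariates @ w + b))

# Simulate 60 days with a covariate-driven transmission rate.
rng = np.random.default_rng(0)
covs = rng.normal(size=(60, 3))         # 3 made-up daily covariates
w = np.array([0.2, -0.1, 0.05])
state = np.array([9990.0, 5.0, 5.0, 0.0])
traj = []
for t in range(60):
    state = seir_step(state, beta=learned_beta(covs[t], w))
    traj.append(state.copy())
traj = np.array(traj)
print(traj[-1])  # final S, E, I, R; the total population is conserved
```

Fitting `w` against observed case counts is what turns this from a fixed compartmental model into an AI-augmented one.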
Fri 4:30 p.m. - 5:00 p.m.
|
Underspecification, inductive bias, and hybrid modeling - Andrew C. Miller
(Talk)
Modern machine learning models can be so flexible that multiple parameter configurations — sometimes with diverging properties — can explain the observed data equally well. This "underspecification" leads to unpredictable and sometimes undesirable behavior. We review examples of underspecified pipelines and their consequences in a variety of domains. We survey approaches to resolve this problem, which typically introduce an inductive bias to constrain the solution space and better control model behavior. We compare these techniques, with a particular focus on hybrid models — approaches that blend highly structured models (i.e., built from first principles) with flexible pattern recognition components (i.e., deep, data-driven) to produce expressive models with sensible generalization behavior. |
Fri 5:00 p.m. - 5:15 p.m.
|
ADEPT - Automatic Differentiation Enabled Plasma Transport
(Spotlight)
link
Fusion and astrophysical plasmas are often modeled as charged fluids. To understand their dynamical behavior, the Euler partial differential equations for a charged fluid can be solved as an initial value problem or as an externally driven system. However, the fluid equations do not always capture the full richness of the plasma dynamics, for example, in scenarios where microphysics governs macroscopic behavior. Here, we present ADEPT, an Automatic Differentiation Enabled Plasma Transport code written in JAX that has been tested to reproduce known physics. ADEPT provides the user with the ability to train deep models for missing microphysics that improve the solver's ability to reproduce experimental data and/or first-principles simulations. Other applications include the ability to learn improved numerical methods, to perform parameter estimation and parameter discovery [1], and to perform sensitivity analyses. The GitHub repo includes the source code, installation and testing instructions, and an ab-initio simulation-generated dataset on which we have trained a microphysics model [2]. [1] A. S. Joglekar and A. G. R. Thomas, “Unsupervised Discovery of Nonlinear Plasma Physics using Differentiable Kinetic Simulations,” Journal of Plasma Physics, Dec. 2022. [2] A. S. Joglekar and A. G. R. Thomas, IOP Machine Learning: Science & Technology, in preparation. |
Archis Joglekar |
Fri 5:15 p.m. - 5:30 p.m.
|
ChemGymRL: An Interactive Framework for Reinforcement Learning for Digital Chemistry
(Spotlight)
link
(arXiv paper: https://arxiv.org/abs/2305.14177) The ChemGymRL Open Source Library enables the use of Reinforcement Learning (RL) algorithms to train agents towards the target of operating individual chemistry benches given specific material targets. The environment can be thought of as a virtual chemistry laboratory consisting of different stations (or benches) where a variety of tasks can be completed. The laboratory consists of three basic elements: vessels, shelves, and benches. Vessels contain materials, in pure or mixed form, with each vessel tracking the hidden internal state of their contents. Whether an agent can determine this state, through measurement or reasoning, is up to the design of each bench and the user’s goals. A shelf can hold any vessels not currently in use, as well as the resultants (or output vessels) of previous experiments. Benches are sub-environments which enact various physical or chemical processes on the vessels. Each bench recreates a simplified version of one task in a material design pipeline and has an observation and action space specific to the task at hand. ChemGymRL is designed in a modular fashion so that new benches can be added or modified with minimal difficulty or changes to the source code. A bench must be able to receive a set of initial experimental supplies, possibly including vessels, and return the results of the intended experiment, also including modified vessels. The details and methods of how the benches interact with the vessels between these two points are completely up to the user, including the goal of the bench. In this initial version of ChemGymRL we have implemented some core benches, which we describe in the following sections and which will allow us to demonstrate an example workflow. |
Fri 5:30 p.m. - 5:45 p.m.
|
Repurposing Density Functional Theory to Suit Deep Learning
(Spotlight)
link
Density Functional Theory (DFT) accurately predicts the properties of molecules given their atom types and positions, and often serves as ground truth for molecular property prediction tasks. Neural Networks (NN) are popular tools for such tasks and are trained on DFT datasets, with the aim to approximate DFT at a fraction of the computational cost. Research in other areas of machine learning has shown that generalisation performance of NNs tends to improve with increased dataset size; however, the computational cost of DFT limits the size of DFT datasets. We present PySCFIPU, a DFT library that allows us to iterate on both dataset generation and NN training. We create QM10X, a dataset with 10^8 conformers, in 13 hours, on which we subsequently train SchNet in 12 hours. We show that the predictions of SchNet improve solely by increasing training data without incorporating further inductive biases. |
Alexander Mathiasen |
Fri 5:45 p.m. - 6:00 p.m.
|
ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback
(Spotlight)
Daniel McNeela |
Fri 6:00 p.m. - 6:30 p.m.
|
Coffee Break
(Break)
|
Fri 6:30 p.m. - 6:45 p.m.
|
ClimaX: A Foundation Model for Weather and Climate
(Spotlight)
link
Recent data-driven approaches based on machine learning aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of currently used physics-informed numerical models for weather and climate modeling. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatiotemporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute and data while maintaining general utility. ClimaX is pretrained with a self-supervised learning objective on climate datasets derived from CMIP6. The pretrained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatiotemporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections. Our source code is available at https://github.com/microsoft/ClimaX. |
Tung Nguyen |
Fri 6:45 p.m. - 7:00 p.m.
|
Titanium 3D Microstructure for Physics-based Generative Models: A Dataset and Primer
(Spotlight)
link
When engineers design components, they rely on accurate property descriptions of the materials being used to predict performance. Most materials used for engineering applications are composed of an arrangement of atomic constituents into crystalline phases, which control the properties of that material. The crystal orientations embedded in this microstructural information differ from the information in conventional light optical images, and are critical for developing and designing materials for a range of applications. However, collecting microstructure information through experimental methods is expensive and time-consuming, especially when 3D information is needed. In order to model material properties under different material processing conditions (resulting in different microstructural arrangements), physics-based generative models are needed to create realistic synthetic microstructures. This research releases microstructural data of a titanium alloy, Ti-6Al-4V, and discusses their information modalities and the physics needed to be incorporated to enable the design of physics-based generative models for generating synthetic microstructures. |
Devendra Jangid |
Fri 7:00 p.m. - 8:00 p.m.
|
Poster Session 2
(Poster Session)
|
-
|
Knowledge-Guided Additive Modeling For Supervised Regression
(Poster)
link
Several hybrid approaches, incorporating prior domain knowledge within machine learning (ML), have recently been introduced to improve generalization and robustness. However, such hybrid methods were mostly tested on dynamical systems, with only limited study of the influence of each model component on global performance and parameter identification. In this work, we assess the performance of hybrid modeling on standard regression problems: we compare, on synthetic problems, several approaches for training such hybrid models, focusing on model-agnostic methods that additively combine a parametric physical term with an ML term. We also introduce a new hybrid approach based on partial dependence functions. Experiments are carried out with different types of ML models, including tree-based models and neural networks. |
Yann Claes · Van Anh Huynh-Thu · Pierre Geurts |
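The model-agnostic additive scheme this poster studies, a parametric physical term plus an ML term fitted to the residual, can be sketched with simple stand-ins (a one-parameter physical model a·sin(x) and a histogram regressor in place of a tree ensemble or neural network; all names here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, size=500)
# Ground truth: physical part 2*sin(x) plus an unknown residual 0.5*x^2.
y = 2.0 * np.sin(x) + 0.5 * x**2 + 0.1 * rng.standard_normal(500)

def fit_physics(x, r):
    """Least-squares fit of the parametric physical term a*sin(x) to r."""
    s = np.sin(x)
    return (s @ r) / (s @ s)

def fit_ml(x, r, bins=30):
    """Deliberately simple stand-in for the ML term: a piecewise-constant
    (histogram) regressor of the residual r."""
    edges = np.linspace(-3, 3, bins + 1)
    idx = np.clip(np.digitize(x, edges) - 1, 0, bins - 1)
    means = np.array([r[idx == k].mean() if (idx == k).any() else 0.0
                      for k in range(bins)])
    return lambda q: means[np.clip(np.digitize(q, edges) - 1, 0, bins - 1)]

# Alternate: fit physics on (y - ML part), then ML on (y - physics part).
a, g = 0.0, lambda q: np.zeros_like(q)
for _ in range(10):
    a = fit_physics(x, y - g(x))
    g = fit_ml(x, y - a * np.sin(x))

print(round(a, 2))  # the physical parameter should come out near 2.0
```

The alternating fit illustrates the parameter-identification question the poster raises: whether the physical parameter stays interpretable once an ML term is added.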
-
|
A language-based recommendation system for material discovery
(Poster)
link
Data-driven approaches for material discovery have been accelerated by emerging efforts in machine learning. We introduce a material discovery framework that uses natural language embeddings derived from pretrained language models as generalized representations of inorganic materials. The discovery framework consists of a joint scheme that first recalls relevant candidates, and next ranks the candidates based on multiple target properties. Leveraging the contextual knowledge encoded in language representations, the discovery framework enables both representational similarity analysis for candidate generation, and multi-task learning to share information across related properties for ranking. Our language-based framework provides a generalized means of embedding structure for effective material recommendation, which is task-agnostic and can be applied to various material systems. |
Jiaxing Qu · Yuxuan Xie · Elif Ertekin |
-
|
RANS-PINN based Simulation Surrogates for Predicting Turbulent Flows
(Poster)
link
Physics-informed neural networks (PINNs) provide a framework to build surrogate models for dynamical systems governed by differential equations. During the learning process, PINNs incorporate a physics-based regularization term within the loss function to enhance generalization performance. Since simulating dynamics controlled by partial differential equations (PDEs) can be computationally expensive, PINNs have gained popularity in learning parametric surrogates for fluid flow problems governed by Navier-Stokes equations. In this work, we introduce RANS-PINN, a modified PINN framework, to predict flow fields (i.e., velocity and pressure) in high Reynolds number turbulent flow regime. To account for the additional complexity introduced by turbulence, RANS-PINN employs a 2-equation eddy viscosity model based on a Reynolds-averaged Navier-Stokes (RANS) formulation. Furthermore, we adopt a novel training approach that ensures effective initialization and balance among the various components of the loss function. The effectiveness of RANS-PINN framework is then demonstrated using a parametric PINN. |
Shinjan Ghosh · Amit Chakraborty · Georgia Olympia Brikis · Biswadip Dey |
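The PINN-style composite loss (a data term plus a physics-residual term, with weights to balance them) can be illustrated without the RANS machinery. A minimal sketch for -u'' = f on (0,1), representing u as a sine series so derivatives are analytic and no autodiff framework is needed; the basis stands in for the neural network:

```python
import numpy as np

# Composite loss, mirroring a PINN:
#   L = w_data * ||u(x_d) - y_d||^2 + w_pde * ||-u'' - f||^2
K = 8
k = np.arange(1, K + 1)

def phi(x):        # basis values: sin(k pi x), satisfies u(0)=u(1)=0
    return np.sin(np.outer(x, k) * np.pi)

def phi_pp(x):     # second derivative: -(k pi)^2 sin(k pi x)
    return -(k * np.pi)**2 * phi(x)

xc = np.linspace(0, 1, 101)[1:-1]             # collocation points
f = np.pi**2 * np.sin(np.pi * xc)             # exact solution: sin(pi x)

rng = np.random.default_rng(0)
xd = rng.uniform(0, 1, 10)                    # sparse noisy observations
yd = np.sin(np.pi * xd) + 0.01 * rng.standard_normal(10)

w_data, w_pde = 1.0, 0.1                      # loss-balance weights
A = np.vstack([np.sqrt(w_data) * phi(xd),
               np.sqrt(w_pde) * (-phi_pp(xc))])
b = np.concatenate([np.sqrt(w_data) * yd,
                    np.sqrt(w_pde) * f])
c, *_ = np.linalg.lstsq(A, b, rcond=None)     # minimise the weighted loss

u = phi(xc) @ c
print(np.max(np.abs(u - np.sin(np.pi * xc))))  # error vs exact solution
```

Balancing `w_data` and `w_pde` is exactly the initialization/weighting issue the abstract says RANS-PINN's training approach addresses.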
-
|
Multi-Objective PSO-PINN
(Poster)
link
PSO-PINN is a class of algorithms for training physics-informed neural networks (PINN) using particle swarm optimization (PSO). PSO-PINN can mitigate the well-known difficulties presented by gradient descent training of PINNs when dealing with PDEs with irregular solutions. Additionally, PSO-PINN is an ensemble approach to PINN that yields reproducible predictions with quantified uncertainty. In this paper, we introduce Multi-Objective PSO-PINN, which treats PINN training as a multi-objective problem. The proposed multi-objective PSO-PINN represents a new paradigm in PINN training, which thus far has relied on scalarizations of the multi-objective loss function. A full multi-objective approach allows on-the-fly compromises in the trade-off among the various components of the PINN loss function. Experimental results with a diffusion PDE problem demonstrate the promise of this methodology. |
Caio Davi · Ulisses Braga-Neto |
-
|
How to Select Physics-Informed Neural Networks in the Absence of Ground Truth: A Pareto Front-Based Strategy
(Poster)
link
Physics-informed neural networks (PINNs) are a promising method that have been recently proposed as a potential mesh-free alternative to conventional numerical methods for solving partial differential equations (PDEs) that are key to our model of the physical world. However, these problems typically lack ground truth, making selection of more accurate PINN models difficult, especially as PINN training involves processes such as hyper-parameter tuning, as is common in machine learning. This is exacerbated as PINNs need to balance multiple objectives, comprising the governing PDE, and associated boundary or initial conditions. Hence, a Pareto front-based model selection strategy is proposed to guide the selection of better performing models. In this strategy, an approximation to the Pareto set of solutions with minimal PINN loss is first constructed for different balances of loss weights. A loss weight located on the convex part of the Pareto front is then selected to rescale the training loss across all solutions. Across our experiments, this rescaling demonstrates a strong correlation between the rescaled PINN loss and mean squared error (MSE) relative to simulated ground truth, thereby illustrating the effectiveness of this proposed strategy for PINN model selection. |
Zhao Wei · Jian Cheng Wong · Nicholas Sung · Abhishek Gupta · Chin Chun Ooi · Pao-Hsiung Chiu · My Ha Dao · Yew Soon ONG |
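The first step of the strategy, building the set of non-dominated trade-offs between the competing loss terms, is easy to sketch. The (pde_loss, bc_loss) pairs below are hypothetical runs; the paper's full method additionally picks a weight on the convex part of the front and rescales the training loss with it:

```python
import numpy as np

def pareto_front(points):
    """Indices of non-dominated points (minimisation in every coordinate)."""
    pts = np.asarray(points)
    keep = []
    for i, p in enumerate(pts):
        dominated = np.any(np.all(pts <= p, axis=1) &
                           np.any(pts < p, axis=1))
        if not dominated:
            keep.append(i)
    return keep

# Hypothetical (pde_loss, bc_loss) pairs from PINNs trained with
# different loss-weight balances.
runs = np.array([[1.0, 9.0], [2.0, 4.0], [4.0, 2.0], [9.0, 1.0],
                 [5.0, 5.0], [3.0, 8.0]])
front = pareto_front(runs)
print(front)  # the dominated runs (5,5) and (3,8) are excluded
```

Only models on this front are worth comparing; the rescaled loss then serves as the ground-truth-free selection criterion among them.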
-
|
Estimation of Physical Coefficients for CO$_2$ Sequestration using Deep Generative Priors based Inverse Modeling Framework
(Poster)
link
Estimation of permeability plays a crucial role in the forecast and risk evaluation of carbon storage operations. In real-world scenarios, direct measurements of permeability and CO$_2$ plume extent are typically sparse due to their high cost. Although inverse modeling approaches allow the subsurface properties, including permeability, to be estimated from observations of indirect data such as pressure, saturation, and geophysical measurements, they suffer from expensive computation for large-scale problems. In this work, we test a deep generative prior to sample 3D permeability realizations from a low-dimensional latent space. We then incorporate the constructed deep generative model into the inverse modeling framework and use observations of saturation to reconstruct the permeability field.
|
Jiawei Shen · Harry Lee · Hongkyu Yoon |
-
|
Reliable coarse-grained turbulent simulations through combined offline learning and neural emulation
(
Poster
)
link
Integration of machine learning (ML) models of unresolved dynamics into numerical simulations of fluid dynamics has been demonstrated to improve the accuracy of coarse resolution simulations. However, when trained in a purely offline mode, integrating ML models into the numerical scheme can lead to instabilities. In the context of a 2D, quasi-geostrophic turbulent system, we demonstrate that including an additional network in the loss function, which emulates the state of the system into the future, produces offline-trained ML models that capture important subgrid processes, with improved stability properties. |
Chris Pedersen · Laure Zanna · Joan Bruna · Pavel Perezhogin |
-
|
Learning from Topology: Cosmological Parameter Estimation from the Large-scale Structure
(
Poster
)
link
The topology of the large-scale structure of the universe contains valuable information on the underlying cosmological parameters. While persistent homology can be applied to extract this topological information, the optimal method for parameter estimation from this tool remains an open question. To address this, we propose a neural network model to map persistence images to cosmological parameters. Through a parameter recovery test, we demonstrate that our model provides accurate and precise estimates, considerably outperforming a Bayesian inference approach. |
Jacky H. T. Yip · Adam Rouhiainen · Gary Shiu |
-
|
Coupling Self-Attention Generative Adversarial Network and Bayesian Inversion for Carbon Storage System
(
Poster
)
link
Characterization of geologic heterogeneity at a geological carbon storage (GCS) system is crucial for cost-effective carbon injection planning and reliable carbon storage. With recent advances in computational power and sensor technology, large-scale fine-resolution simulations of multiphase flow and reactive transport processes have been available. However, traditional large-scale inversion approaches have limited utility for sites with complex subsurface structures such as faults and microfractures within the host rock matrix. In this work, we present a Bayesian inversion method with deep generative priors tailored for the computationally efficient and accurate characterization of GCS sites. Self-attention generative adversarial network (SAGAN) is used to learn the approximate subsurface property (e.g., permeability and porosity) distribution from discrete fracture network models as a prior and accelerated stochastic inversion is performed on the low-dimensional latent space in a Bayesian framework. Numerical examples with a synthetic fracture field with pressure and heat tracer data sets are presented to test the accuracy, speed, and uncertainty quantification capability of our proposed joint data inversion method. |
Jichao Bao · Harry Lee · Hongkyu Yoon |
-
|
NuCLR: Nuclear Co-Learned Representations
(
Poster
)
link
We introduce Nuclear Co-Learned Representations (NuCLR), a deep learning model that predicts various nuclear observables, including binding and decay energies, and nuclear charge radii. The model is trained using a multi-task approach with shared representations and obtains state-of-the-art performance, achieving levels of precision that are crucial for understanding fundamental phenomena in nuclear (astro)physics. We also report an intriguing finding that the learned representation of NuCLR exhibits the prominent emergence of crucial aspects of the nuclear shell model, namely the shell structure, including the well-known magic numbers, and the Pauli Exclusion Principle. This suggests that the model is capable of capturing the underlying physical principles, and that our approach has the potential to offer valuable insights into nuclear theory. |
Niklas Nolte · Ouail Kitouni · Mike Williams · Sokratis Trifinopoulos · Subhash Kantamneni |
-
|
What if We Enrich day-ahead Solar Irradiance Time Series Forecasting with Spatio-Temporal Context?
(
Poster
)
link
The global integration of solar power into the electrical grid could have a crucial impact on climate change mitigation, yet poses a challenge due to solar irradiance variability. We present a deep learning architecture which uses spatio-temporal context from satellite data for highly accurate day-ahead time-series forecasting, in particular Global Horizontal Irradiance (GHI). We provide a multi-quantile variant which outputs a prediction interval for each time-step, serving as a measure of forecasting uncertainty. In addition, we suggest a testing scheme that separates easy and difficult scenarios, which appears useful to evaluate model performance in varying cloud conditions. Our approach exhibits robust performance in solar irradiance forecasting, including zero-shot generalization tests at unobserved solar stations, and holds great promise in promoting the effective use of solar power and the resulting reduction of CO$_{2}$ emissions.
|
Oussama Boussif · Ghait Boukachab · Dan Assouline · Stefano Massaroli · Tianle Yuan · Loubna Benabbou · Yoshua Bengio |
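The multi-quantile variant described above is usually trained with the quantile (pinball) loss, whose minimiser over a constant prediction is the empirical q-quantile. A small self-check with made-up "irradiance" samples (the gamma distribution is an assumption, not the paper's data):

```python
import numpy as np

def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss: the objective behind multi-quantile
    forecast heads that output a prediction interval per time-step."""
    diff = y_true - y_pred
    return np.mean(np.maximum(q * diff, (q - 1) * diff))

# Minimising the pinball loss over a constant prediction recovers the
# empirical q-quantile:
rng = np.random.default_rng(0)
y = rng.gamma(shape=2.0, scale=100.0, size=5000)  # hypothetical GHI values
grid = np.linspace(0, 1000, 2001)
best = grid[np.argmin([pinball_loss(y, c, 0.9) for c in grid])]
print(best, np.quantile(y, 0.9))   # the two should be close
```

Training one head per quantile (e.g. q = 0.1, 0.5, 0.9) yields the per-time-step prediction interval used as the uncertainty measure.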
-
|
How important are specialized transforms in Neural Operators?
(
Poster
)
link
Computational forward simulations of physical systems constrained by systems of PDEs and initial and boundary values are proving to provide tremendous value for a variety of industrial domains. Transform-based Neural Operators like Fourier Neural Operators and Wavelet Neural Operators have received a lot of attention for their potential to provide fast, scale-free simulations. In the traditional analysis of signals, the optimal choice of transform depends critically on the nature of the data; ideally, therefore, all transformations should be learnable. Given that most of the considered transforms are linear, we investigate in this work what the cost in performance would be, if any, if all the transform layers were replaced by simple linear layers. We make the surprising observation that linear layers suffice to provide performance comparable to the best-known transform-based Operators, and seem to do so at a possible compute-time advantage as well. This raises questions about the importance of transform-based Operators. |
Ritam Majumdar · Shirish Karande · Lovekesh Vig |
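The observation motivating the question is that a fixed Fourier transform is itself one particular linear map, so a dense learnable layer of the same shape can in principle represent it (or a data-adapted alternative). A quick sketch of the DFT as a matrix multiply; the "token-mixing" layer `W` is an illustrative stand-in:

```python
import numpy as np

n = 64
# The DFT matrix: applying it is exactly a dense linear layer.
F = np.exp(-2j * np.pi * np.outer(np.arange(n), np.arange(n)) / n)

x = np.random.default_rng(0).standard_normal(n)
assert np.allclose(F @ x, np.fft.fft(x))   # DFT == dense matrix multiply

# A real-valued learnable mixing layer of the same shape could, in
# principle, learn this mixing (or a better one) from data.
W = np.random.default_rng(1).standard_normal((n, n)) / np.sqrt(n)
y = W @ x
print(y.shape)
```

The catch, which the paper's empirical study speaks to, is whether such a layer actually learns a useful mixing as easily as the fixed transform provides one.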
-
|
Accelerating Molecular Graph Neural Networks via Knowledge Distillation
(
Poster
)
link
Recent advances in graph neural networks (GNNs) have allowed molecular simulations with accuracy on par with conventional gold-standard methods at a fraction of the computational cost. Nonetheless, as the field has been progressing to bigger and more complex architectures, state-of-the-art GNNs have become largely prohibitive for many large-scale applications. In this paper, we, for the first time, explore the utility of knowledge distillation (KD) for accelerating molecular GNNs. To this end, we devise KD strategies that facilitate the distillation of hidden representations in directional and equivariant GNNs and evaluate their performance on the regression task of energy and force prediction. We validate our protocols across different teacher-student configurations and demonstrate that they can boost the predictive accuracy of student models without altering their architecture. Using our KD protocols, we manage to close as much as 59% of the gap in predictive accuracy between models like GemNet-OC and PaiNN with zero additional cost at inference. |
Filip Ekström Kelvinius · Dimitar Georgiev · Artur Toshev · Johannes Gasteiger |
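Feature-based distillation of the kind described, matching a student's hidden representations to a teacher's through a projection head, can be sketched as a loss function. The names, shapes, and weighting below are illustrative assumptions, not the paper's exact protocol:

```python
import numpy as np

def kd_loss(student_out, teacher_out, student_feat, teacher_feat,
            proj, alpha=0.5):
    """Sketch of feature-based KD for regression (energies/forces):
    an output-matching term plus a hidden-representation term, with the
    student's features projected into the teacher's width.  (The usual
    supervised term on ground-truth labels is omitted for brevity.)"""
    out_term = np.mean((student_out - teacher_out) ** 2)
    feat_term = np.mean((student_feat @ proj - teacher_feat) ** 2)
    return out_term + alpha * feat_term

rng = np.random.default_rng(0)
s_out, t_out = rng.normal(size=32), rng.normal(size=32)
s_feat = rng.normal(size=(32, 64))       # student hidden width 64
t_feat = rng.normal(size=(32, 128))      # teacher hidden width 128
proj = rng.normal(size=(64, 128)) / 8.0  # learnable projection head
print(kd_loss(s_out, t_out, s_feat, t_feat, proj))
```

At inference the projection head is dropped, which is why the accuracy gain comes at zero additional inference cost.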
-
|
Hybrid Diffusions for Stable Molecular Structure Generation via Explicit Energy-based Model
(
Poster
)
link
Generation of 3D molecules utilizing diffusion models often encounters difficulties in producing stable structures, primarily due to the emergence of unstable intermediate structures during diffusion steps. To account for this issue, we introduce a diffusion-based molecule generation model that incorporates an energy-based model (EBM), pretrained on density functional theory (DFT) data. Specifically, we propose three strategic use of EBM: 1) guided exploration using the EBM, 2) stability evaluation to accept the structure or to reject and restart the generation at the end of diffusion steps, and 3) performing post-relaxation refinement. With these three strategies, we demonstrate that the energy estimator significantly enhances the generated molecule’s stability. |
Youngwoo Cho · Seunghoon Yi · Sookyung Kim · Hongkee Yoon · Joonseok Lee 🔗 |
-
|
CAAFE: Combining Large Language Models with Tabular Predictors for Semi-Automated Data Science
(
Poster
)
link
As the field of automated machine learning (AutoML) advances, it becomes increasingly important to incorporate domain knowledge into these systems. Our approach combines the advantages of classical ML classifiers (robustness, predictability and a level of interpretability) and LLMs (domain knowledge and creativity). We introduce Context-Aware Automated Feature Engineering (CAAFE), a feature engineering method for tabular datasets that utilizes an LLM to iteratively generate additional semantically meaningful features based on the description of the dataset. The method produces both Python code for creating new features and explanations for the utility of the generated features. Despite being methodologically simple, CAAFE improves performance on 11 out of 14 datasets - boosting mean ROC AUC performance from 0.798 to 0.822 across all datasets - similar to the improvement achieved by using a random forest instead of logistic regression on our datasets. Furthermore, CAAFE is interpretable, providing a textual explanation for each generated feature. CAAFE paves the way for more extensive semi-automation in data science tasks and emphasizes the significance of context-aware solutions that can extend the scope of AutoML systems to semantic AutoML. We release our code, a simple demo and a Python package. |
Noah Hollmann · Samuel Gabriel Müller · Frank Hutter |
-
|
Titanium 3D Microstructure for Physics-based Generative Models: A Dataset and Primer
(
Poster
)
link
When engineers design components, they rely on accurate property descriptions of the materials being used to predict performance. Most materials used for engineering applications are composed of an arrangement of atomic constituents into crystalline phases, which control the properties of that material. The crystal orientations embedded in this microstructural information differ from the information in conventional light optical images, and are critical for developing and designing materials for a range of applications. However, collecting microstructure information through experimental methods is expensive and time-consuming, especially when 3D information is needed. In order to model material properties under different material processing conditions (resulting in different microstructural arrangements), physics-based generative models are needed to create realistic synthetic microstructures. This research releases microstructural data of a titanium alloy, Ti-6Al-4V, and discusses their information modalities and the physics needed to be incorporated to enable the design of physics-based generative models for generating synthetic microstructures. |
Devendra Jangid · Neal Brodnik · McLean Echlin · Samantha Daly · Tresa Pollock · B.S. Manjunath 🔗 |

Adaptive Bias Correction for Improved Subseasonal Forecasting (Poster)
Subseasonal forecasting, the prediction of temperature and precipitation 2 to 6 weeks ahead, is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skill remains poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. Here, to counter these errors, we introduce an adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. We show that, when applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60-90% (over baseline skills of 0.18-0.25) and precipitation forecasting skill by 40-69% (over baseline skills of 0.11-0.15) in the contiguous U.S. We couple these performance improvements with a practical workflow to explain ABC skill gains and identify higher-skill windows of opportunity based on specific climate conditions.
Soukayna Mouatadid · Paulo Orenstein · Genevieve Flaspohler · Judah Cohen · Miruna Oprescu · Ernest Fraenkel · Lester Mackey
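The simplest version of the idea above can be sketched as an online correction that subtracts a running estimate of the model's recent error from each new forecast. The smoothing rate and the constant 2-degree warm bias below are illustrative; the actual ABC method learns richer corrections from many predictors:

```python
# Minimal adaptive bias-correction sketch (illustrative numbers only):
# issue a debiased forecast, then update the bias estimate once the
# verifying observation arrives.

def make_abc(rate=0.5):
    bias = 0.0
    def correct(forecast, observation=None):
        nonlocal bias
        corrected = forecast - bias          # debiased forecast issued now
        if observation is not None:          # adapt once verification arrives
            bias = (1 - rate) * bias + rate * (forecast - observation)
        return corrected
    return correct

correct = make_abc()
errors = []
for obs in [10.0, 12.0, 11.0, 13.0]:         # verifying observations
    forecast = obs + 2.0                     # model runs 2 degrees too warm
    errors.append(abs(correct(forecast, obs) - obs))
```

The absolute error shrinks from 2.0 toward zero as the bias estimate converges.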

Understanding Energy-Based Modeling of Proteins via an Empirically Motivated Minimal Ground Truth Model (Poster)
Energy-based models (EBMs) of sequences from evolutionarily related protein families can learn the generic constraints necessary to generate novel functional sequences, which have been validated by in vivo experiments. However, these learned energy functions require rescaling by a temperature parameter in order to sample novel functional sequences. Here, we generate data from a minimal model motivated by a wide array of empirical evidence for a synergistic cluster of amino acids, or sector, within a sequence. We find that our setting captures salient learning behaviors similar to those exhibited by EBMs fitted to real proteins, namely the necessity of temperature tuning to increase generative performance. We discuss how this guides insight into the functional sequence space of proteins and suggest how our model may be exploited to further our understanding of the essential functional features within protein sequences.
Peter Fields · Wave Ngampruetikorn · Rama Ranganathan · David Schwab · Stephanie Palmer
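Why a temperature parameter changes what an EBM samples can be seen in a three-state toy example: sampling probabilities p_T(s) ∝ exp(-E(s)/T) concentrate on low-energy states as T decreases. The energies here are illustrative, not taken from any fitted protein model:

```python
from math import exp

# Toy Boltzmann distribution over three states with made-up energies.
# Lowering T concentrates probability on the low-energy (more
# "functional") state, which is why temperature rescaling matters.

def boltzmann(energies, T):
    weights = [exp(-e / T) for e in energies]
    z = sum(weights)                     # normalization constant
    return [w / z for w in weights]

energies = [0.0, 1.0, 2.0]
p_hot = boltzmann(energies, T=2.0)       # closer to uniform
p_cold = boltzmann(energies, T=0.5)      # concentrated on the minimum
```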

Diffusion model based data generation for partial differential equations (Poster)
In a preliminary attempt to address the problem of data scarcity in physics-based machine learning, we introduce a novel methodology for data generation in physics-based simulations. Our motivation is to overcome the limitations posed by the limited availability of numerical data. To achieve this, we leverage a diffusion model that allows us to generate synthetic data samples and test them for two canonical cases: (a) the steady 2-D Poisson equation, and (b) the forced unsteady 2-D Navier-Stokes (NS) vorticity-transport equation in a confined box. By comparing the generated data samples against outputs from classical solvers, we assess their accuracy and examine their adherence to the underlying physics laws. In this way, we emphasize the importance of not only satisfying visual and statistical comparisons with solver data but also ensuring the generated data's conformity to physics laws, thus enabling their effective utilization in downstream tasks.
Rucha Apte · Sheel Nidhan · Rishikesh Ranade · Jay Pathak
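The physics-conformity check the abstract emphasizes can be made concrete with a finite-difference residual: a candidate 2-D field u is scored by how well it satisfies the Poisson equation ∇²u = f on a grid. The grid size and the test fields below are illustrative, not the paper's setup:

```python
# Score a 2-D field by the mean finite-difference residual of the
# Poisson equation: a generated sample that respects the physics should
# have a residual near zero, regardless of how plausible it looks.

def poisson_residual(u, f, h):
    """Mean |discrete Laplacian of u - f| over interior grid points."""
    n = len(u)
    total, count = 0.0, 0
    for i in range(1, n - 1):
        for j in range(1, n - 1):
            lap = (u[i + 1][j] + u[i - 1][j] + u[i][j + 1] + u[i][j - 1]
                   - 4.0 * u[i][j]) / h ** 2
            total += abs(lap - f[i][j])
            count += 1
    return total / count

n, h = 11, 0.1
f4 = [[4.0] * n for _ in range(n)]
# u = x^2 + y^2 solves the Poisson equation with f = 4, residual ~ 0 ...
u_good = [[(i * h) ** 2 + (j * h) ** 2 for j in range(n)] for i in range(n)]
# ... while a field that ignores the physics scores poorly.
u_bad = [[0.0] * n for _ in range(n)]
res_good = poisson_residual(u_good, f4, h)
res_bad = poisson_residual(u_bad, f4, h)
```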

Neural Modulation Fields for Conditional Cone Beam Neural Tomography (Poster)
Conventional Computed Tomography (CT) methods require large numbers of noise-free projections for accurate density reconstructions, limiting their applicability to the more complex class of Cone Beam Geometry CT (CBCT) reconstruction. Recently, deep learning methods have been proposed to overcome these limitations. Our focus is on improving methods based on neural fields (NFs), which have shown strong results by approximating the reconstructed density as a continuous field parameterized by a neural network. Unlike previous work, which requires training an NF from scratch for each new set of projections, we instead propose to leverage anatomical consistencies across different scans by training a single conditional NF on a dataset of projections. We propose a novel conditioning method in which local modulations are modeled per patient as a field over the input domain through a Neural Modulation Field (NMF). The resulting Conditional Cone Beam Neural Tomography (CondCBNT) shows improved performance for both high and low numbers of available projections on noise-free and noisy data.
Samuele Papa · David Knigge · Riccardo Valperga · Nikita Moriakov · Miltiadis (Miltos) Kofinas · Jan-jakob Sonke · Efstratios Gavves

INFINITY: Neural Field Modeling for Reynolds-Averaged Navier-Stokes Equations (Poster)
For numerical design, the development of efficient and accurate surrogate models is paramount. They allow us to approximate complex physical phenomena, thereby reducing the computational burden of direct numerical simulations. We propose INFINITY, a deep learning model that utilizes implicit neural representations (INRs) to address this challenge. Our framework encodes geometric information and physical fields into compact representations and learns a mapping between them to infer the physical fields. We use an airfoil design optimization problem as an example task and evaluate our approach on the challenging AirfRANS dataset, which closely resembles real-world industrial use cases. The experimental results demonstrate that our framework achieves state-of-the-art performance by accurately inferring physical fields throughout the volume and surface. Additionally, we demonstrate its applicability in contexts such as design exploration and shape optimization: our model can correctly predict drag and lift coefficients while adhering to the equations.
Louis Serrano · Léon Migus · Yuan Yin · Jocelyn Mazari · Jean-Noël Vittaut · Patrick Gallinari

Integrating process-based models and machine learning for crop yield prediction (Poster)
Crop yield prediction typically involves either theory-driven, process-based crop growth models, which have proven difficult to calibrate for local conditions, or data-driven machine learning methods, which are known to require large datasets. In this work we investigate potato yield prediction using a hybrid modeling approach. A crop growth model is employed to generate synthetic data for (pre)training a convolutional neural network, which is then fine-tuned with observational data. When applied in silico, our hybrid approach yields better predictions than a purely data-driven baseline. When tested on real-world data from field trials (n=303) and commercial fields (n=77), our hybrid approach yields competitive results with respect to the crop growth model. On the latter set, however, both models perform worse than a simple linear regression with a hand-picked feature set and dedicated preprocessing designed by domain experts. Our findings indicate the potential of hybrid modeling for accurate crop yield prediction; however, further advancements and validation using extensive real-world datasets are recommended to solidify its practical effectiveness.
Michiel Kallenberg · Bernardo Maestrini · Ron van Bree · Paul Ravensbergen · Christos Pylianidis · Frits van Evert · Ioannis N. Athanasiadis
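The pretrain-then-fine-tune recipe can be sketched with a one-parameter linear model standing in for the CNN: fit on plentiful synthetic data from a crop growth model, then fine-tune briefly on scarce observations. All data, rates, and epoch counts are illustrative:

```python
# Pretrain on synthetic (crop-model) data, fine-tune on scarce real data.
# A single slope parameter stands in for the CNN weights; the fine-tuned
# value moves from the crop-model slope toward the observed slope.

def sgd_fit(w, data, lr, epochs):
    for _ in range(epochs):
        for x, y in data:
            w -= lr * 2.0 * (w * x - y) * x   # squared-error gradient step
    return w

synthetic = [(x, 2.0 * x) for x in (1.0, 2.0, 3.0)]   # crop model: slope 2.0
observed = [(1.0, 2.5), (2.0, 5.0)]                    # field data: slope 2.5

w_pre = sgd_fit(0.0, synthetic, lr=0.05, epochs=200)   # pretraining
w_fine = sgd_fit(w_pre, observed, lr=0.01, epochs=20)  # brief fine-tuning
```

The fine-tuned parameter ends between the crop-model value and the value supported by the observations, retaining the pretrained starting point as a prior.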

Open Source Infrastructure for Differentiable Density Functional Theory (Poster)
Learning exchange correlation functionals, used in quantum chemistry calculations, from data has become increasingly important in recent years, but training such a functional requires sophisticated software infrastructure. For this reason, we build open source infrastructure to train neural exchange correlation functionals. We aim to standardize the processing pipeline by adapting state-of-the-art techniques from work done by multiple groups. We have open sourced the model in the DeepChem library to provide a platform for additional research on differentiable quantum chemistry methods.
Advika Vidhyadhiraja · Arun Pa Thiagarajan · Shang Zhu · Venkatasubraman Viswanathan · Bharath Ramsundar

Generating observation guided ensembles for data assimilation with denoising diffusion probabilistic model (Poster)
This paper presents an ensemble data assimilation method using pseudo-ensembles generated by a denoising diffusion probabilistic model. Since the model is trained against noisy and sparse observation data, it can produce divergent ensembles consistent with observations. Thanks to the variance in the generated ensembles, our proposed method displays better performance than a well-established ensemble data assimilation method when the simulation model is imperfect.
Yuuichi Asahi · Yuta Hasegawa · Naoyuki Onodera · Takashi Shimokawabe · Hayato Shiba · Yasuhiro Idomura

Synergizing Deep Reinforcement Learning and Biological Pursuit Behavioral Rule for Robust and Interpretable Navigation (Poster)
Integrating theoretical models within machine learning models holds considerable promise for constructing efficient and robust models. In biology, however, the integration can be challenging because the behavioral rules described by theoretical models are not necessarily invariant, in contrast to problems in physics. Here we propose a hybrid architecture that hierarchically integrates biological pursuit models into deep reinforcement learning. Our approach facilitates seamless agent mode switching and rule-based action selection, demonstrating efficient navigation in a predator-prey environment. Interestingly, our results parallel the hunting behavior observed in nature, offering novel insights into biology. As our framework can be integrated with existing hybrid or gray-box models, it paves the way for further exploration at this exciting intersection of machine learning and biology.
Kazushi Tsutsui · Kazuya Takeda · Keisuke Fujii

ClimaX: A Foundation Model for Weather and Climate (Poster)
Recent data-driven approaches based on machine learning aim to directly solve a downstream forecasting or projection task by learning a data-driven functional mapping using deep neural networks. However, these networks are trained using curated and homogeneous climate datasets for specific spatiotemporal tasks, and thus lack the generality of currently used physics-informed numerical models for weather and climate modeling. We develop and demonstrate ClimaX, a flexible and generalizable deep learning model for weather and climate science that can be trained using heterogeneous datasets spanning different variables, spatiotemporal coverage, and physical groundings. ClimaX extends the Transformer architecture with novel encoding and aggregation blocks that allow effective use of available compute and data while maintaining general utility. ClimaX is pretrained with a self-supervised learning objective on climate datasets derived from CMIP6. The pretrained ClimaX can then be fine-tuned to address a breadth of climate and weather tasks, including those that involve atmospheric variables and spatiotemporal scales unseen during pretraining. Compared to existing data-driven baselines, we show that this generality in ClimaX results in superior performance on benchmarks for weather forecasting and climate projections.
Tung Nguyen · Johannes Brandstetter · Ashish Kapoor · Jayesh K. Gupta · Aditya Grover

Learning to Optimize Non-Convex Sum-Rate Maximization Problems (Poster)
Solving optimization problems through machine learning is a promising research direction. In this position paper, we sketch a general framework motivated by first-order necessary conditions to solve non-convex sum-rate optimization problems arising from practical resource allocation problems in cellular networks. We construct two parameter matrices to update matrix-form decision variables of the given objective function. We inherently enhance the learning efficiency by increasing the dimensionality of decision variables with a learnable parameter matrix. Our preliminary evaluation shows that our approach achieves up to 98% optimality over state-of-the-art numerical algorithms while being up to 38× faster in various settings.
Qingyu Song · Guochen Liu · Hong Xu

Task-Linear Deep Representation of Physical Systems (Poster)
Machine learning methods can be a valuable aid in the scientific process, but they need to face challenging settings where data come from inhomogeneous experimental conditions. Recent meta-learning methods have made significant progress in multi-task learning, but they rely on black-box neural networks and suffer from a lack of interpretability. We introduce Task-Linear Deep Representation (TLDR), a new meta-learning architecture capable of learning efficiently from multiple environments by incorporating the linear structure observed in many problems. Unlike other approaches, we prove that TLDR is able to learn the physical parameters of the system, hence enhancing interpretability. We show that our method performs competitively by comparing it to state-of-the-art algorithms on two systems derived from scientific modeling.
Matthieu Blanke · Marc Lelarge

Good Lattice Accelerates Physics-Informed Neural Networks (Poster)
Physics-informed neural networks (PINNs) can solve partial differential equations (PDEs) by minimizing the physics-informed loss, which ensures the neural network satisfies the PDE at given points. However, the solutions to a PDE are infinite-dimensional, and the physics-informed loss is a finite approximation to a certain integral over the domain. This indicates that selecting appropriate points is essential. This paper proposes "good lattice training" (GLT), a technique inspired by number-theoretic methods. GLT provides an optimal set of collocation points and can train PINNs to achieve competitive performance at smaller computational cost.
Takashi Matsubara · Takaharu Yaguchi
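The number-theoretic construction behind such "good lattice" point sets is the rank-1 lattice: n collocation points x_i = frac(i·z/n) for a well-chosen generating vector z. A minimal sketch follows; n = 13 with z = (1, 8) is the classical 2-D Fibonacci lattice, and the paper's own generating vectors and dimensionality may differ:

```python
# Rank-1 lattice point set in [0, 1)^d: x_i = frac(i * z / n).
# These points are far more evenly spread than i.i.d. uniform samples,
# which makes the finite physics-informed loss a better integral estimate.

def rank1_lattice(n, z):
    d = len(z)
    return [[(i * z[j] % n) / n for j in range(d)] for i in range(n)]

pts = rank1_lattice(n=13, z=(1, 8))   # candidate collocation points
```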

Predicting Properties of Amorphous Solids with Graph Network Potentials (Poster)
Graph neural networks (GNNs) provide an architecture consistent with the physical nature of molecules and crystals, and have proven capable of efficiently learning their properties, particularly from density functional theory (DFT) calculations. When used in atomistic modeling, general-purpose GNNs can unlock new research directions in materials science and chemistry. In this paper, we present an end-to-end molecular dynamics workflow coupled with a large-scale E(3)-equivariant GNN-based general-purpose interatomic potential to model amorphous solids in any inorganic chemistry. Using this approach in high-throughput, we predict the structures and energetics of a large number of inorganic binary amorphous systems, with close to 28,800 unique compositions. By comparing the predicted energies of amorphous solids to DFT, we show that general-purpose GNN potentials provide strong zero-shot capability in modeling these systems.
Muratahan Aykol · Jennifer Wei · Simon Batzner · Amil Merchant · Ekin Dogus Cubuk

Exploring the Existence of Atmospheric Blocking’s Precursor Patterns with Physics-Informed Explainable AI (Poster)
Atmospheric blocking is a quasi-stationary, self-sustaining, and long-lasting atmospheric flow pattern that effectively blocks the prevailing westerly atmospheric flows. Blocking is directly linked to large-scale extreme events such as heat waves, yet there is no confirmed study on the precursor patterns that signal atmospheric blocking's evolution. In this paper, we combine physics, a Convolutional Neural Network (CNN), and eXplainable Artificial Intelligence (XAI) to form a scientific hypothesis: precursor patterns of atmospheric blocking do exist. To investigate predictability and search for signals of the existence of precursor blocking patterns, we integrate the Two-Layer Quasi-Geostrophic (QG) Model, an idealized model of atmospheric evolution, into the training process of the CNN and predict atmospheric blocking, reaching prediction accuracies of 95%, 88%, and 72% at lead times of 1, 5, and 12 days, respectively. Next, we employ XAI to highlight the spatial patterns that guide the CNN's predictions. The resulting composite patterns highlighted by the XAI algorithms are physically consistent with the composite ground-truth observations at different lead times. This work hypothesizes the existence of atmospheric blocking's precursor patterns, motivating future fundamental research focused specifically on these precursors.
Anh Nhu · Lei Wang

Physics-Informed Neural Operator for Coupled Forward-Backward Partial Differential Equations (Poster)
This paper proposes a physics-informed neural operator (PINO) framework to solve a system of coupled forward-backward partial differential equations (PDEs) arising from mean field games (MFGs). The MFG system incorporates a forward PDE to model the propagation of population dynamics and a backward PDE for a representative agent's optimal control. The PINO is developed to tackle the forward PDE efficiently, particularly when the initial population density varies. A learning algorithm is devised and its performance is evaluated on one application domain, which is autonomous driving velocity control. The PINO exhibits both memory efficiency and generalization capabilities, compared to physics-informed neural networks (PINN).
Xu Chen · Yongjie FU · Shuo Liu · Xuan Di

Evaluating the diversity and utility of materials proposed by generative models (Poster)
Generative machine learning models can use data generated by scientific modeling to create large quantities of novel material structures. Here, we assess how one state-of-the-art generative model, the physics-guided crystal generation model (PGCGM), can be used as part of the inverse design process. We show that the default PGCGM's input space is not smooth with respect to parameter variation, making material optimization difficult and limited. We also demonstrate that most generated structures are predicted to be thermodynamically unstable by a separate property-prediction model, partially due to out-of-domain data challenges. Our findings suggest how generative models might be improved to enable better inverse design.
Alexander New · Michael Pekala · Elizabeth Pogue · Nam Q. Le · Janna Domenico · Christine Piatko · Christopher Stiles

Unbinned Profiled Unfolding (Poster)
Unfolding is an important procedure in particle physics experiments that corrects for detector effects and provides differential cross section measurements usable for a number of downstream tasks, such as extracting fundamental physics parameters. Traditionally, unfolding is done by discretizing the target phase space into a finite number of bins, and it is limited in the number of unfolded variables. Recently, there have been a number of proposals to perform unbinned unfolding with machine learning. However, none of these methods (like most unfolding methods) allows for simultaneously constraining (profiling) nuisance parameters. We propose a new machine learning-based unfolding method that results in an unbinned differential cross section and can profile nuisance parameters. The machine learning loss function is the full likelihood function, based on binned inputs at detector level. We demonstrate the method and show its impact on a simulated Higgs boson cross section measurement.
Jay Chan · Benjamin Nachman
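The "full likelihood based on binned detector-level inputs" is, at its core, a product of Poisson terms, one per detector bin. A minimal version of that negative log-likelihood follows, dropping the constant log(n!) terms; the bin expectations and observed counts are illustrative:

```python
from math import log

# Binned Poisson negative log-likelihood (up to constants): it is
# minimized when each bin's expectation matches its observed count.

def poisson_nll(expected, observed):
    return sum(lam - n * log(lam) for lam, n in zip(expected, observed))

observed = [5, 3]                        # counts in two detector bins
nll_matched = poisson_nll([5.0, 3.0], observed)
nll_shifted = poisson_nll([6.0, 2.0], observed)
```

In the profiled setting, the expectations become functions of both the unfolded quantities and the nuisance parameters, and all are minimized jointly.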

An A-adaptive Loop Unrolled Architecture for Solving Inverse Problems with Forward Model Mismatch (Poster)
In inverse problems (IPs), we aim to recover the underlying signal from noisy measurements generated according to a known forward model. Classical methods for solving IPs usually minimize a least-squares data fidelity term together with a predetermined regularization function, which often leads to unsatisfactory reconstructions. The loop unrolling (LU) architecture addresses this issue by unrolling the optimization iterations into a sequence of neural networks that, in effect, learn a regularization function from data. While LU is currently a state-of-the-art method in many applications, the accuracy of the forward model is crucial to its success. This assumption can be limiting in many physical applications due to model simplifications or uncertainties in the apparatus. To address forward model mismatch, this work introduces a forward model residual network; with an extra variable-splitting step, the proposed method can adapt to uncertain forward models accordingly. The method achieves an approximately 2 dB PSNR improvement in image blind deblurring and seismic blind deconvolution tasks by jointly learning the reconstruction and forward model updates.
Peimeng Guan · Naveed Iqbal · Mark Davenport · Mudassir Masood
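Plain loop unrolling can be sketched in a few lines: a fixed number of gradient steps on the data-fidelity term, each followed by a learned network, replaced here by simple soft-thresholding (an ISTA-like stand-in). The operator, measurements, and hyperparameters are illustrative, and the paper's forward-model residual network is not modeled:

```python
# Loop-unrolled reconstruction sketch for y = A x: unrolled gradient
# steps on ||A x - y||^2, each followed by a soft-threshold standing in
# for the learned per-iteration network.

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def soft_threshold(x, lam):
    # Stand-in for the learned regularization network.
    return [max(abs(v) - lam, 0.0) * (1.0 if v >= 0 else -1.0) for v in x]

def unrolled_recon(A, y, steps=30, eta=0.2, lam=0.001):
    At = [list(col) for col in zip(*A)]
    x = [0.0] * len(A[0])
    for _ in range(steps):
        residual = [a - b for a, b in zip(matvec(A, x), y)]   # A x - y
        grad = matvec(At, residual)                           # data-fit gradient
        x = soft_threshold([xi - eta * g for xi, g in zip(x, grad)], lam)
    return x

A = [[2.0, 0.0], [0.0, 1.0]]
x_hat = unrolled_recon(A, y=[2.0, 3.0])   # true signal is [1.0, 3.0]
```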

Using machine learning and 3D geophysical modelling for mineral exploration (Poster)
New and innovative methods are required to find the critical mineral deposits needed for the transition from fossil fuels to renewable energy. Geophysical modelling and inversion have been crucial in finding new deposits over the last few decades, but success rates are declining as the easy-to-find deposits have been discovered and new deposits lie deeper below the surface. Machine learning may offer a new way to ingest and interpret geophysical and geological data, and to improve exploration success rates. The synergy of geophysical modelling and machine learning has not yet been well explored; thus far, machine learning has predominantly been used in mineral exploration to identify patterns in disparate geophysical datasets that are not otherwise easy to observe. In this paper I examine a new approach to achieve better synergy between geophysical and machine learning modelling. The approach relies on generating an ensemble of geophysical inversion results by varying some of the subjective inversion parameters, such as damping and regularisation, and using logged drilling information as training labels to predict future drilling success. I show an application of the method in an active exploration program in Western Australia, where ambient seismic noise surface wave tomography ensemble models were used as features and laboratory zinc mineralisation assay results were used as labels. The method achieved an out-of-box accuracy of 97% and identified new drill targets which are currently being investigated. Although relatively little training data was available for this project, the method shows promise as a new way to synergise geophysical and machine learning modelling.
Gerrit Olivier

ChatGPT-powered Conversational Drug Editing Using Retrieval and Domain Feedback (Poster)
Recent advancements in conversational large language models (LLMs), such as ChatGPT, have demonstrated remarkable promise in various domains, including drug discovery. However, existing works mainly focus on investigating the capabilities of conversational LLMs on chemical reactions and retrosynthesis, while drug editing, a critical task in the drug discovery pipeline, remains largely unexplored. To bridge this gap, we propose ChatDrug, a framework to facilitate the systematic investigation of drug editing using LLMs. ChatDrug jointly leverages a prompt module, a retrieval and domain feedback (ReDF) module, and a conversation module to streamline effective drug editing. We empirically show that ChatDrug reaches the best performance on 33 out of 39 drug editing tasks, encompassing small molecules, peptides, and proteins. We further demonstrate, through 10 case studies, that ChatDrug can successfully identify the key substructures (e.g., molecular functional groups, peptide motifs, and protein structures) for manipulation, generating diverse and valid suggestions for drug editing. Promisingly, we also show that ChatDrug can offer insightful explanations from a domain-specific perspective, enhancing interpretability and enabling informed decision-making. This research sheds light on the potential of ChatGPT and conversational LLMs for drug editing. It paves the way for a more efficient and collaborative drug discovery pipeline, contributing to the advancement of pharmaceutical research and development.
Shengchao Liu · Jiongxiao Wang · Yijin Yang · Chengpeng Wang · Ling Liu · Hongyu Guo · Chaowei Xiao

Physics-Constrained Random Forests for Turbulence Model Uncertainty Estimation (Poster)
To achieve virtual certification for industrial design, quantifying the uncertainties in simulation-driven processes is crucial. We discuss a physics-constrained approach to account for the epistemic uncertainty of turbulence models. In order to eliminate user input, we incorporate a data-driven machine learning strategy. In addition, our study focuses on developing an a priori estimate of prediction confidence when accurate data is scarce.
Marcel Matha

Understanding the Efficacy of U-Net & Vision Transformer for Groundwater Numerical Modelling (Poster)
This paper presents a comprehensive comparison of various machine learning models, namely U-Net, U-Net integrated with Vision Transformers (ViT), and Fourier Neural Operator (FNO), for time-dependent forward modelling in groundwater systems. Through testing on synthetic datasets, it is demonstrated that U-Net and U-Net + ViT models outperform FNO in accuracy and efficiency, especially in sparse data scenarios. These findings underscore the potential of U-Net-based models for groundwater modelling in real-world applications where data scarcity is prevalent.
Maria Luisa Taccari · Oded Ovadia · He Wang · Xiaohui Chen · Adar Kahana · Peter Jimack

Infinite-Fidelity Surrogate Learning via High-order Gaussian Processes (Poster)
Multi-fidelity learning is popular in computational physics. While the fidelity is often determined by the choice of mesh spacing and hence is continuous in nature, most methods model only finite, discrete fidelities. Recent work (Li et al., 2022) proposed the first continuous-fidelity surrogate model, infinite-fidelity coregionalization (IFC), which uses a neural ordinary differential equation (ODE) to capture the rich information within the infinite, continuous fidelity space. While showing state-of-the-art predictive performance, IFC is computationally expensive to train and is difficult to use for uncertainty quantification. To overcome these limitations, we propose Infinite-Fidelity High-Order Gaussian Process (IF-HOGP), based on the recent GP high-dimensional output regression model HOGP. By tensorizing the output and using a product kernel at each mode, HOGP can highly efficiently estimate the mapping from the PDE parameters to the high-dimensional solution output, without the need for any low-rank approximation. We make a simple extension by injecting the continuous fidelity variable into the input and applying a neural network transformation before feeding the input into the kernel. On three benchmark PDEs, IF-HOGP achieves prediction accuracy better than or close to IFC, while gaining a 380× speed-up and a 7/8 reduction in memory. Meanwhile, uncertainty calibration for IF-HOGP is straightforward.
Shibo Li · Li Shi · Shandian Zhe

Simulation-based Inference with the Generalized Kullback-Leibler Divergence (Poster)
In simulation-based inference, the goal is to solve the inverse problem when the likelihood is only known implicitly. Fitting a normalized density estimator to act as a surrogate model for the posterior is known as Neural Posterior Estimation. In its current form it cannot fit unnormalized surrogates because it optimizes the Kullback-Leibler divergence. We propose to (1) optimize a generalized Kullback-Leibler divergence that accounts for the normalization constant of unnormalized distributions; (2) show that this objective recovers Neural Posterior Estimation when the model class is normalized; (3) unify it with Neural Ratio Estimation, combining both under a single objective; (4) investigate a hybrid model, offering the best of both worlds by learning a normalized base distribution with a learned ratio; and (5) present benchmark results.
Benjamin Kurt Miller · Marco Federici · Christoph Weniger · Patrick Forré
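For discrete distributions, the standard generalized (extended) KL divergence between a normalized p and a possibly unnormalized q adds mass-difference terms to the usual sum: D(p, q) = Σ p·log(p/q) − Σ p + Σ q. It is nonnegative and reduces to ordinary KL when q is normalized. A minimal numeric sketch follows; the paper's exact objective and parameterization may differ:

```python
from math import log

# Generalized KL for a normalized p and a possibly unnormalized q.
# When q sums to 1 the extra terms cancel and this is ordinary KL.

def generalized_kl(p, q):
    return sum(pi * log(pi / qi) for pi, qi in zip(p, q)) - sum(p) + sum(q)

p = [0.5, 0.5]
d_normalized = generalized_kl(p, [0.5, 0.5])    # ordinary KL: zero
d_unnormalized = generalized_kl(p, [1.0, 1.0])  # q has total mass 2: penalized
```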

Combining Thermodynamics-based Model of the Centrifugal Compressors and Active Machine Learning for Enhanced Industrial Design Optimization (Poster)
The design process of centrifugal compressors requires an optimization process that is computationally expensive due to the complex analytical equations underlying the compressor's dynamics. Although regression surrogate models could drastically reduce the computational cost of such a process, the major challenge is the scarcity of data for training the surrogate model. Aiming to strategically exploit the labeled samples, we propose the ActiveCompDesign framework, which combines a thermodynamics-based compressor model (i.e., our internal software for compressor design) with a Gaussian-process surrogate model in a deployable active learning (AL) setting. We first conduct experiments in an offline setting and then extend to an online AL framework in which real-time interaction with the thermodynamics-based compressor model allows deployment in production. ActiveCompDesign shows a significant improvement in surrogate modeling by leveraging an uncertainty-based query function for sample selection within the AL framework, compared to random selection of data points. Moreover, our framework in production has made compressor design optimization around 46% faster than relying on the internal thermodynamics-based simulator alone, while achieving the same performance.
Shadi Ghiasi · Guido Pazzi · Concettina Del Grosso · Giovanni De Magistris · Giacomo Veneri

Learning Green's Function Efficiently Using Low-Rank Approximations (Poster)
Learning the Green's function with deep learning models makes it possible to solve different classes of partial differential equations. A practical limitation of using deep learning for the Green's function is the repeated, computationally expensive Monte Carlo approximation of integrals. We propose to learn the Green's function via a low-rank decomposition, which yields a novel architecture that removes redundant computations by learning separately from domain data for evaluation and from Monte Carlo samples for integral approximation. Our experiments show that the proposed method reduces computational time compared to MOD-Net while achieving accuracy comparable to both PINNs and MOD-Net.
Kishan Wimalawarne · Taiji Suzuki · Sophie Langer
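The computational saving from a low-rank decomposition can be sketched directly: if G(x, y) ≈ Σ_k a_k(x)·b_k(y), then u(x) = ∫ G(x, y) f(y) dy ≈ Σ_k a_k(x)·c_k with c_k = ∫ b_k(y) f(y) dy, so each Monte Carlo integral c_k is computed once and reused at every evaluation point x. The rank-2 factors and source term below are illustrative, not a true Green's function:

```python
import random

# Low-rank trick: the integration-side factors b_k are integrated once
# (Monte Carlo), and evaluating u at any new x then costs only the rank.

random.seed(0)
f = lambda y: 1.0                          # constant source term
a = [lambda x: x, lambda x: 1.0 - x]       # evaluation-side factors
b = [lambda y: 1.0 - y, lambda y: y]       # integration-side factors

ys = [random.random() for _ in range(20000)]
c = [sum(bk(y) * f(y) for y in ys) / len(ys) for bk in b]  # shared integrals

def u(x):
    # No fresh Monte Carlo integral per evaluation point.
    return sum(ak(x) * ck for ak, ck in zip(a, c))
```

For these factors the exact answer is u(x) = 0.5 for every x, so the shared Monte Carlo estimates can be checked directly.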
-
|
Predicting the stabilization quantity with neural networks for Singularly Perturbed Partial Differential Equations
(
Poster
)
link
We propose \textit{SPDE-Net}, an artificial neural network (ANN) to predict the stabilization parameter for the streamline upwind/Petrov-Galerkin (SUPG) stabilization technique for solving singularly perturbed differential equations (SPDEs). The prediction task is modeled as a regression problem and is solved using ANN. Three training strategies for the ANN have been proposed, i.e. supervised, $L^2$ error minimization (global) and $L^2$ error minimization (local). The proposed method has been observed to yield accurate results and even outperform some of the existing state-of-the-art ANN-based partial differential equation (PDE) solvers, such as Physics Informed Neural Network (PINN).
|
Sangeeta Yadav 🔗 |
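For the supervised strategy, a natural regression target (assumed here for illustration; the abstract does not spell out its labels) is the classical one-dimensional optimal SUPG parameter $\tau = \frac{h}{2|b|}\left(\coth(Pe) - \frac{1}{Pe}\right)$ with the element Péclet number $Pe = \frac{|b|h}{2\varepsilon}$:

```python
import numpy as np

def supg_tau(h, b, eps):
    """Classical 1-D optimal SUPG stabilization parameter:
    tau = h/(2|b|) * (coth(Pe) - 1/Pe),  Pe = |b| h / (2 eps)."""
    pe = abs(b) * h / (2 * eps)
    return h / (2 * abs(b)) * (1 / np.tanh(pe) - 1 / pe)

# Strongly convection-dominated regime: tau approaches h/(2|b|)
tau = supg_tau(h=0.01, b=1.0, eps=1e-8)
print(tau)
```

In higher dimensions and on general meshes no such closed form exists, which is precisely why predicting the parameter with a network is attractive.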
-
|
Reinstating Continuous Climate Patterns From Small and Discretized Data
(
Poster
)
link
Wind energy is a leading renewable energy source. It does not pollute the environment and reduces the greenhouse gas emissions that contribute to global warming. However, current wind characterization is performed at a resolution insufficient for assessing renewable energy resources under different climate scenarios. In this paper, we advocate the use of deep generative models for wind field representation learning. In contrast to existing approaches, we formulate the generative model as an explicit function of the spatial coordinates, thereby learning a continuous representation of the wind field that can extrapolate from discretized data with demonstrated generalizability. We extend the concept of conditional neural fields by encoding local turbulent wind properties into latent variables. The resulting resolution enhancement enables essential localized analyses of the long-term economic sustainability of renewable energy resources. |
Xihaier Luo · Xiaoning Qian · Nathan Urban · Byung-Jun Yoon 🔗 |
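The idea of a continuous representation queried at arbitrary coordinates can be sketched with random Fourier features standing in for the paper's conditional neural field: fit on a coarse grid, then evaluate at a resolution never seen in training. The wind-field function, grid sizes and feature count below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def features(xy, W, b):
    # Random Fourier features: a crude stand-in for a coordinate-based network
    return np.cos(xy @ W + b)

# "Wind field" observed only on a coarse 8x8 grid
field = lambda x, y: np.sin(3 * x) * np.cos(2 * y)
gc = np.linspace(0, 1, 8)
Xc = np.array([(x, y) for x in gc for y in gc])
yc = field(Xc[:, 0], Xc[:, 1])

W = rng.normal(0, 4.0, (2, 256))
b = rng.uniform(0, 2 * np.pi, 256)
Phi = features(Xc, W, b)
w = np.linalg.solve(Phi.T @ Phi + 1e-6 * np.eye(256), Phi.T @ yc)  # ridge fit

# Continuous representation: query on a finer 32x32 grid
gf = np.linspace(0, 1, 32)
Xf = np.array([(x, y) for x in gf for y in gf])
pred = features(Xf, W, b) @ w
rmse = np.sqrt(np.mean((pred - field(Xf[:, 0], Xf[:, 1])) ** 2))
print(rmse)
```

Because the model is an explicit function of the coordinates, the query grid is entirely decoupled from the training discretization, which is what enables the resolution enhancement described above.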
-
|
Repurposing Density Functional Theory to Suit Deep Learning
(
Poster
)
link
Density Functional Theory (DFT) accurately predicts the properties of molecules given their atom types and positions, and often serves as ground truth for molecular property prediction tasks. Neural networks (NNs) are popular tools for such tasks and are trained on DFT datasets, with the aim of approximating DFT at a fraction of the computational cost. Research in other areas of machine learning has shown that the generalisation performance of NNs tends to improve with dataset size; however, the computational cost of DFT limits the size of DFT datasets. We present PySCFIPU, a DFT library that allows us to iterate on both dataset generation and NN training. We create QM10X, a dataset with 100M conformers, in 13 hours, and subsequently train SchNet on it in 12 hours. We show that the predictions of SchNet improve solely by increasing the training data, without incorporating further inductive biases. |
Alexander Mathiasen · Hatem Helal · Paul Balanca · Kerstin Klaeser · Josef Dean · Carlo Luschi · Dominique Beaini · Andrew Fitzgibbon · Dominic Masters 🔗 |
-
|
A Machine Learning Pressure Emulator for Hydrogen Embrittlement
(
Poster
)
link
A recent alternative for hydrogen transportation is blending it into natural gas pipelines as a mixture with natural gas. However, hydrogen embrittlement of the pipe material is a major concern for scientists and gas installation designers seeking to avoid process failures. In this paper, we propose a physics-informed machine learning model to predict the gas pressure on the pipes' inner walls. Current PDE-based simulators deliver high-fidelity results but are time- and compute-intensive. Using simulation data, we train an ML model to predict the pressure on the pipelines' inner walls, a first step towards pipeline system surveillance. We found that the physics-based method outperforms the purely data-driven method and satisfies the physical constraints of the gas flow system. |
Minh Chau · João Almeida · Elie Alhajjar · Alberto Costa Nogueira Junior 🔗 |
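A generic physics-informed loss of the kind the abstract describes combines a data-fit term with a penalty on the residual of a physical law. The linear pressure-drop law, the two-parameter model and all constants below are hypothetical stand-ins for illustration, not the paper's pipe model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical steady pipe segment: pressure drops linearly, dp/dx = -c_true
c_true = 2.0
xs = np.linspace(0, 1, 20)
p_obs = 10 - c_true * xs + rng.normal(0, 0.3, xs.size)  # noisy "simulation" data

# Model p(x) = a + b*x; the physics term penalizes the residual of dp/dx = -c_true
def loss(a, b, lam):
    data = np.mean((a + b * xs - p_obs) ** 2)
    physics = (b + c_true) ** 2       # residual of dp/dx - (-c_true)
    return data + lam * physics

# Crude grid search over the two parameters, for illustration only
grid = np.linspace(-5, 15, 201)
best = min((loss(a, b, lam=10.0), a, b) for a in grid for b in grid)
print(best[1], best[2])
```

The physics penalty pulls the fitted slope toward the value dictated by the conservation law, which is how such hybrid models stay consistent with the flow physics even under noisy data.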
-
|
Speeding up Fourier Neural Operators via Mixed Precision
(
Poster
)
link
The Fourier neural operator (FNO) is a powerful technique for learning surrogate maps for partial differential equation (PDE) solution operators. For many real-world applications, which often require high-resolution data points, training time and memory usage are significant bottlenecks. While mixed-precision training techniques exist for standard neural networks, they apply to real-valued datatypes and therefore cannot be directly applied to FNO, which crucially operates in (complex-valued) Fourier space. On the other hand, since the Fourier transform is already an approximation (due to discretization error), we do not need to perform the operation at full precision. In this work, we (i) profile memory and runtime for FNO with full- and mixed-precision training, (ii) study the numerical stability of mixed-precision training of FNO, and (iii) devise a training routine that substantially decreases training time and memory usage (up to 27%), with little or no reduction in accuracy, on the Navier-Stokes and Darcy flow equations. Combined with the recently proposed tensorized FNO (Kossaifi et al., 2023), the resulting model has far better performance while also being significantly faster than the original FNO. |
Renbo Tu · Colin White · Jean Kossaifi · Kamyar Azizzadenesheli · Gennady Pekhimenko · Anima Anandkumar 🔗 |
-
|
Physics-based deep learning framework to learn and forecast cardiac electrophysiology dynamics
(
Poster
)
link
Biophysically detailed mathematical modeling of cardiac electrophysiology is often computationally demanding, for example when solving problems for various patient pathological conditions. Furthermore, it remains difficult to reduce the discrepancy between the output of idealised mathematical models and clinical measurements, which are usually noisy. In this work, we propose a fast physics-based deep learning framework to learn complex cardiac electrophysiology dynamics from data. This novel framework has two components, decomposing the dynamics into a physical term and a data-driven term, respectively. This construction allows the framework to learn from data of varying complexity. Using in silico data, we demonstrate that the framework can reproduce the complex dynamics of transmembrane potential even in the presence of noise in the data. This combined physics-based, data-driven approach may improve cardiac electrophysiology modeling by providing a robust biophysical tool for predictions. |
Victoriya Kashtanova · Maxime Sermesant · patrick gallinari 🔗 |
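The two-component decomposition can be sketched on a toy ODE: the physical term is assumed known, and the data-driven term is fit to the residual between observed derivatives and the physics. The dynamics and the polynomial fit below are illustrative assumptions, not the paper's cardiac model.

```python
import numpy as np

# True dynamics: a known physics term plus an unknown residual to be learned
physics = lambda u: -u            # assumed-known biophysical term
residual = lambda u: 0.5 * u**2   # hidden effect the data-driven term must capture
rhs = lambda u: physics(u) + residual(u)

# Simulate trajectory data with forward Euler
dt, n = 1e-3, 5000
u = np.empty(n)
u[0] = 1.0
for k in range(n - 1):
    u[k + 1] = u[k] + dt * rhs(u[k])

# Learn the residual as a polynomial fit to  du/dt - physics(u)
dudt = np.gradient(u, dt)
target = dudt - physics(u)
coeffs = np.polyfit(u, target, deg=3)

err = abs(np.polyval(coeffs, 0.8) - residual(0.8))
print(err)
```

Because the physics term already explains most of the dynamics, the data-driven component only has to model a small residual, which is what makes the decomposition robust to noise.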
-
|
OL-Transformer: A Fast and Universal Surrogate Simulator for Optical Multilayer Thin Film Structures
(
Poster
)
link
Deep learning-based methods have recently been established as fast and accurate surrogate simulators for optical multilayer thin film structures. However, existing methods only work for limited types of structures with different material arrangements, preventing their application to diverse and universal structures. Here, we propose the Opto-Layer (OL) Transformer to act as a universal surrogate simulator for an enormous variety of structures. Combined with the technique of structure serialization, our model can predict accurate reflection and transmission spectra for up to $10^{25}$ different multilayer structures, while still achieving a six-fold time speedup compared to physical solvers. Further investigation reveals that this general learning ability arises because our model first learns physical embeddings and then uses the self-attention mechanism to capture the hidden light-matter interactions between layers.
|
Taigao Ma · Haozhu Wang · L. Jay Guo 🔗 |
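Structure serialization can be sketched as turning a stack of (material, thickness) layers into a flat token sequence that a transformer can consume. The materials table and the 10 nm thickness binning below are hypothetical choices for illustration, not the paper's vocabulary.

```python
# Hypothetical serialization of an optical multilayer stack into tokens
MATERIALS = {"Air": 0, "SiO2": 1, "TiO2": 2, "Ag": 3}

def serialize(stack):
    """Encode [(material, thickness_nm), ...] as a flat token sequence:
    one material-id token and one discretized-thickness token per layer."""
    tokens = []
    for material, thickness in stack:
        tokens.append(MATERIALS[material])
        tokens.append(len(MATERIALS) + int(thickness) // 10)  # 10 nm bins
    return tokens

stack = [("SiO2", 120), ("TiO2", 85), ("Ag", 15)]
print(serialize(stack))  # -> [1, 16, 2, 12, 3, 5]
```

With a shared vocabulary over materials and thickness bins, one model can represent arbitrarily composed stacks, which is what makes a single universal surrogate possible.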
-
|
Meta-Learning Deep Kernels for Latent Force Inference
(
Poster
)
link
Latent force models offer an interpretable alternative to purely data-driven inference in dynamical systems. Uncertainty in the output variables is handled by deriving the kernel function of the low-dimensional latent forces directly from the dynamics. However, exact computation of posterior kernel terms is rarely tractable, requiring approximations in complex scenarios such as nonlinear dynamics. In this paper, we overcome these issues by posing the problem as meta-learning a general class of latent force models. By employing a deep kernel and a sensible embedding, we achieve extrapolation from a synthetic dataset to real experimental datasets. Moreover, our model is the first of its kind to scale to large datasets. |
Jacob Moss · Felix Opolka · Jeremy England · Pietro Lió 🔗 |
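A deep kernel in the sense used here is an ordinary kernel applied to a learned embedding of the inputs. The sketch below uses a random `tanh` network in place of a trained one, to show that the construction still yields a valid (symmetric, positive semi-definite) kernel matrix; the architecture and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Deep kernel: an RBF kernel applied to a nonlinear embedding g of the inputs
# (the embedding weights are random here; in practice they are learned)
W1 = rng.normal(size=(1, 16))
W2 = rng.normal(size=(16, 4))
g = lambda x: np.tanh(np.tanh(x @ W1) @ W2)

def deep_rbf(X1, X2, ls=1.0):
    Z1, Z2 = g(X1), g(X2)
    d2 = ((Z1[:, None, :] - Z2[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

X = np.linspace(-2, 2, 5)[:, None]
K = deep_rbf(X, X)
# A valid kernel matrix: symmetric and positive semi-definite
print(np.allclose(K, K.T), np.linalg.eigvalsh(K).min())
```

Because positive semi-definiteness is preserved for any embedding, the deep kernel can be dropped into standard GP machinery while the embedding absorbs the nonlinear structure of the dynamics.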
-
|
Convolutional Neural network for local stabilization parameter prediction for Singularly Perturbed PDEs
(
Poster
)
link
Singularly perturbed partial differential equations are challenging to solve with conventional numerical techniques, such as finite element methods, due to the presence of boundary and interior layers. The standard numerical solution often exhibits spurious oscillations in the vicinity of these layers, and stabilization techniques are employed to eliminate them. The accuracy of a stabilization technique depends on a user-chosen stabilization parameter whose optimal value is challenging to find. In this work, we focus on predicting an optimal value of the stabilization parameter for the Streamline Upwind Petrov-Galerkin (SUPG) technique for solving singularly perturbed partial differential equations. This paper proposes \textit{SPDE-ConvNet}, a convolutional neural network that predicts the stabilization parameter by minimizing a loss based on the cross-wind derivative term. The proposed technique is compared with state-of-the-art variational form-based neural network schemes. |
Sangeeta Yadav 🔗 |
-
|
Neural Polytopes
(
Poster
)
link
We find that simple neural networks with ReLU activation generate polytopes as approximations of a unit sphere in various dimensions. The species of polytope is regulated by the network architecture, such as the number of units and layers. For a variety of activation functions, we obtain a generalization of polytopes, which we call neural polytopes. They are a smooth analogue of polytopes and exhibit geometric duality. This finding opens up the study of discrete geometry via machine learning, as well as a means of visualizing trained networks. |
Koji Hashimoto · Tomoya Naito · Hisashi Naito 🔗 |
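The polytope phenomenon is easy to reproduce: a one-hidden-layer ReLU network is piecewise linear, so with evenly spaced weight directions its unit level set is a polygon rather than a circle. The width $n=6$ below is an arbitrary illustrative choice.

```python
import numpy as np

# One-hidden-layer ReLU network with n evenly spaced unit weight directions:
# f(x) = sum_i relu(w_i . x) is piecewise linear, so its level sets are polygons
n = 6
theta = np.linspace(0, 2 * np.pi, n, endpoint=False)
W = np.stack([np.cos(theta), np.sin(theta)])       # shape (2, n)

f = lambda x: np.maximum(0, x @ W).sum(axis=-1)

# Evaluate on the unit circle: for a true sphere approximation f would be
# constant; the spread between min and max measures the "polygonality"
phi = np.linspace(0, 2 * np.pi, 360, endpoint=False)
circle = np.stack([np.cos(phi), np.sin(phi)], axis=1)
vals = f(circle)
print(vals.min(), vals.max())
```

For $n=6$ the value on the circle oscillates between $\sqrt{3}$ (at the polygon edges) and $2$ (at the vertices); as $n$ grows the ratio tends to 1 and the polygon converges to the circle.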
-
|
Improving the Lipschitz stability in Spectral Transformer through Nearest Neighbour Coupling
(
Poster
)
link
Statistical physics has played a pivotal role in the formulation of neural networks and in understanding their behaviour. However, efforts to exploit physical principles in the transformer architecture remain limited. In our work, we first show that spectral feature learning with self-attention is prone to instability. Inspired by the Ising model, we then propose a transformer-based network that uses adjacently coupled spectral attention to learn the spectral mapping from RGB images. We further analyse its stability using the theory of Lipschitz constants. The method is evaluated and compared with state-of-the-art methods on multiple standard datasets. |
Abhishek Sinha 🔗 |
-
|
Optimization or Architecture: What Matters in Non-Linear Filtering?
(
Poster
)
link
In non-linear filtering, it is traditional to compare non-linear architectures such as neural networks to the standard linear Kalman Filter (KF). We observe that this methodology mixes the evaluation of two separate components: the non-linear architecture, and the numeric optimization method. In particular, the non-linear model is often optimized, whereas the reference KF model is not. We argue that both should be optimized similarly. We suggest the Optimized KF (OKF), which adjusts numeric optimization to the positive-definite KF parameters. We demonstrate how a significant advantage of a neural network over the KF may entirely vanish once the KF is optimized using OKF. This implies that experimental conclusions of certain previous studies were derived from a flawed process. The benefits of OKF over the non-optimized KF are further studied theoretically and empirically, where OKF demonstrates consistently improved accuracy in a variety of problems. |
Ido Greenberg · Netanel Yannay · Shie Mannor 🔗 |
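The paper's point, that the baseline KF should be optimized just like its neural competitors, can be illustrated on a scalar random walk: tuning the assumed process noise already closes much of the gap. The grid search below is a stand-in for OKF's gradient-based optimization of positive-definite parameters, and all constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n=2000, q=0.1, r=1.0):
    # Random-walk state with noisy measurements
    x = np.cumsum(rng.normal(0, q, n))
    return x, x + rng.normal(0, r, n)

def kf_mse(z, x, q, r):
    # Scalar Kalman filter for a random walk, parameterized by noise levels
    m, p, err = 0.0, 1.0, 0.0
    for zk, xk in zip(z, x):
        p += q**2                             # predict
        k = p / (p + r**2)                    # gain
        m, p = m + k * (zk - m), (1 - k) * p  # update
        err += (m - xk) ** 2
    return err / len(z)

x, z = simulate()                             # true process noise is q=0.1
mse_default = kf_mse(z, x, q=1.0, r=1.0)      # mis-specified, un-optimized KF
# "Optimize" the KF: a simple grid search over the process-noise parameter
qs = np.linspace(0.01, 1.0, 50)
mse_opt = min(kf_mse(z, x, q, 1.0) for q in qs)
print(mse_default, mse_opt)
```

Comparing a tuned neural network against `mse_default` rather than `mse_opt` is exactly the evaluation asymmetry the abstract warns about.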
-
|
Predictive Modeling of Engine-out Emissions using a Combination of Computational Fluid Dynamics and Machine Learning
(
Poster
)
link
Analysis-driven design of Internal Combustion Engines (ICE) is extremely valuable for significantly reducing hardware investments and accelerating the development of low Greenhouse Gas (GHG) emitting vehicles compliant with strict emissions regulations. Advanced physics-based engine modeling tools use system-level models coupled with Computational Fluid Dynamics (CFD) simulations to predict engine-out emissions, and the success of this methodology largely relies on the accuracy of those analytical predictions. Results show excellent agreement in the prediction of engine performance parameters, oxides of nitrogen (NOx) emissions and combustion noise, while predictions of Carbon Monoxide (CO), Unburned Hydrocarbons (HC) and Smoke emissions remain a challenge even with large chemical kinetics solvers and refined mesh resolution. In this study, a hybrid approach combining CFD analysis with Machine Learning (ML) for the prediction of engine-out CO, HC and Smoke emissions is demonstrated. A deep Convolutional Neural Network (CNN) model was trained on input features generated by physics-based CFD simulations, with experimentally measured emissions data as labels. This approach led to a significant improvement in prediction accuracy for all three emissions species and captured the qualitative trends as well. The ML model could augment the engine modeling toolkit, yielding significantly more accurate predictions of engine-out emissions, lower computational costs and reduced turnaround times for engine simulations. |
Alok Warey · Jian Gao · Ronald Grover 🔗 |