Mathématiques et Informatique Appliquées
du Génome à l'Environnement

 

 

JOUANAUD Yanis

Type
Subject
Predicting Microbial Community Interactions using Physics Informed Neural Networks.
Start date
End date
Supervisor(s)
Lorenzo Sala (Dynenvie), Beatrice Laroche (Dynenvie), Hugo Gangloff (MIA Paris-Saclay), Nicolas Jouvin (MIA Paris-Saclay)
Team(s)
Description/abstract

The gut microbiota comprises hundreds of microbial species that are crucial for functions such as digestion, metabolism, immune response, and neurological processes. Disruptions of this complex ecosystem have been linked to autoimmune and inflammatory conditions. The gut microbiota also acts as a defense against pathogens introduced through ingested food. Recent advances in sequencing technologies allow the precise identification of the bacterial species present in fecal samples and of their abundances. The goal of our work is to understand the relationships and interactions among these bacteria, their associations with pathogens, and their roles within the ecosystem. A common approach in the literature is to describe these interactions with the Generalized Lotka-Volterra (GLV) model. A significant challenge in this setting is that the data contain far fewer samples than bacterial species: only a small number of individuals are sampled, over a limited timeframe. Directly estimating the parameters of the GLV model, whether through maximum likelihood estimation, Bayesian estimation, or genetic algorithms, is therefore challenging.
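To make the GLV model concrete, here is a minimal sketch of how such a system could be simulated. It assumes the usual form dx_i/dt = x_i (mu_i + sum_j A_ij x_j); the names `mu` (growth rates) and `A` (interaction matrix), the two-species values, and the Runge-Kutta integrator are illustrative assumptions and do not come from the project itself.

```python
def glv_rhs(x, mu, A):
    """GLV right-hand side: dx_i/dt = x_i * (mu_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    return [x[i] * (mu[i] + sum(A[i][j] * x[j] for j in range(n)))
            for i in range(n)]


def rk4_step(x, mu, A, dt):
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def shifted(base, k, h):
        return [b + h * ki for b, ki in zip(base, k)]

    k1 = glv_rhs(x, mu, A)
    k2 = glv_rhs(shifted(x, k1, dt / 2), mu, A)
    k3 = glv_rhs(shifted(x, k2, dt / 2), mu, A)
    k4 = glv_rhs(shifted(x, k3, dt), mu, A)
    return [xi + dt / 6 * (a + 2 * b + 2 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]


def simulate(x0, mu, A, dt, n_steps):
    """Return the list of states x(0), x(dt), ..., x(n_steps * dt)."""
    traj = [list(x0)]
    for _ in range(n_steps):
        traj.append(rk4_step(traj[-1], mu, A, dt))
    return traj


# Hypothetical two-species competitive system (illustrative values only).
mu = [1.0, 0.5]
A = [[-1.0, -0.5],
     [-0.2, -1.0]]
trajectory = simulate([0.1, 0.1], mu, A, dt=0.01, n_steps=5000)
```

With these illustrative values the two abundances settle at the coexistence equilibrium that solves mu + A x = 0. In the real problem the number of species is much larger than two, so `A` has many more entries than there are observations, which is what makes direct estimation difficult.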
In previous work we adopted the Generalised Smoothing Algorithm (Ramsay et al. 2007). Specifically, we used splines to represent the abundances of bacterial species over time; these splines should be close to the experimental data while also being a solution of the GLV model with unknown parameters. This approach therefore estimates the spline coefficients and the model parameters jointly by iteratively minimizing an objective function. The objective function combines three terms: the distance between the splines and the data, a penalty on the deviation of the splines from a solution of the GLV model under the parameters being estimated, and a sparsity penalty on these parameters.
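The three-term objective described above can be sketched as follows. This is only an illustration under stated assumptions: the candidate trajectory is stored as values on a uniform time grid with central finite differences standing in for the derivative of a spline, the sparsity penalty is taken to be an L1 norm on the interaction matrix, and all names (`objective`, `lam_ode`, `lam_sparse`, `mu`, `A`) are hypothetical.

```python
import math


def glv_rhs(x, mu, A):
    """GLV right-hand side: dx_i/dt = x_i * (mu_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    return [x[i] * (mu[i] + sum(A[i][j] * x[j] for j in range(n)))
            for i in range(n)]


def objective(times, curve, obs_idx, obs, mu, A, lam_ode, lam_sparse):
    """Data misfit + ODE-residual penalty + sparsity penalty.

    times   : uniform time grid
    curve   : curve[k][i] = candidate abundance of species i at times[k]
              (a grid-based stand-in for the spline, assumed here)
    obs_idx : grid indices at which measurements exist
    obs     : obs[m][i] = measured abundance of species i at times[obs_idx[m]]
    """
    n = len(mu)
    # 1) Distance between the candidate trajectory and the data.
    data_term = sum((curve[k][i] - y[i]) ** 2
                    for k, y in zip(obs_idx, obs) for i in range(n))
    # 2) Deviation from the GLV dynamics at interior grid points,
    #    with the time derivative approximated by central differences.
    dt = times[1] - times[0]
    ode_term = 0.0
    for k in range(1, len(times) - 1):
        f = glv_rhs(curve[k], mu, A)
        for i in range(n):
            dxdt = (curve[k + 1][i] - curve[k - 1][i]) / (2 * dt)
            ode_term += (dxdt - f[i]) ** 2
    ode_term *= dt  # Riemann-sum approximation of the time integral
    # 3) L1 sparsity penalty on the interaction coefficients (assumed form).
    sparse_term = sum(abs(a) for row in A for a in row)
    return data_term + lam_ode * ode_term + lam_sparse * sparse_term


# Illustrative check on one species (logistic growth), where the exact
# solution x(t) = 1 / (1 + 9 exp(-t)) is known for mu = 1, A = -1, x(0) = 0.1.
times = [0.01 * k for k in range(1001)]
curve = [[1.0 / (1.0 + 9.0 * math.exp(-t))] for t in times]
obs_idx = [0, 250, 500, 750, 1000]
obs = [curve[k] for k in obs_idx]
good = objective(times, curve, obs_idx, obs, [1.0], [[-1.0]], 1.0, 0.0)
bad = objective(times, curve, obs_idx, obs, [2.0], [[-1.0]], 1.0, 0.0)
```

Here `good` is evaluated at the parameters that generated the curve and `bad` at a wrong growth rate, so the ODE-residual term alone separates the two. In the approach described above, the trajectory coefficients and the parameters would be optimized together rather than evaluated at fixed values.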
In this context, an alternative to splines is the use of Physics Informed Neural Networks (PINNs). We will investigate this hybrid machine-learning technique as a parametric approximation of the trajectories describing the abundances of bacterial species over time.
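As a rough illustration of the idea, the sketch below evaluates a PINN-style loss in which a small neural network replaces the splines as the trajectory approximation. Everything in it is an assumption made for illustration: the one-hidden-layer tanh architecture, the hand-written time derivative (an automatic differentiation framework would normally provide it), the weight `lam`, and the names `init_network`, `network` and `pinn_loss`. It only evaluates the loss and does not train the network.

```python
import math
import random


def glv_rhs(x, mu, A):
    """GLV right-hand side: dx_i/dt = x_i * (mu_i + sum_j A[i][j] * x_j)."""
    n = len(x)
    return [x[i] * (mu[i] + sum(A[i][j] * x[j] for j in range(n)))
            for i in range(n)]


def init_network(n_hidden, n_species, seed=0):
    """Random weights for a network t -> x(t) with one tanh hidden layer."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(n_hidden)
    return {
        "w1": [rng.gauss(0, 1) for _ in range(n_hidden)],
        "b1": [rng.gauss(0, 1) for _ in range(n_hidden)],
        "W2": [[rng.gauss(0, 1) * scale for _ in range(n_hidden)]
               for _ in range(n_species)],
        "b2": [0.5] * n_species,
    }


def network(params, t):
    """Return the network output x(t) and its exact time derivative dx/dt."""
    h = [math.tanh(w * t + b) for w, b in zip(params["w1"], params["b1"])]
    # d/dt tanh(w t + b) = w * (1 - tanh(w t + b)^2)
    dh = [w * (1.0 - hj * hj) for w, hj in zip(params["w1"], h)]
    x = [sum(wij * hj for wij, hj in zip(row, h)) + b
         for row, b in zip(params["W2"], params["b2"])]
    dx = [sum(wij * dhj for wij, dhj in zip(row, dh))
          for row in params["W2"]]
    return x, dx


def pinn_loss(params, mu, A, t_obs, y_obs, t_col, lam):
    """Mean data misfit plus lam times the mean squared GLV residual."""
    n = len(mu)
    data = 0.0
    for t, y in zip(t_obs, y_obs):
        x, _ = network(params, t)
        data += sum((x[i] - y[i]) ** 2 for i in range(n))
    data /= len(t_obs)
    physics = 0.0
    for t in t_col:  # collocation points where the ODE is enforced
        x, dx = network(params, t)
        f = glv_rhs(x, mu, A)
        physics += sum((dx[i] - f[i]) ** 2 for i in range(n))
    physics /= len(t_col)
    return data + lam * physics


# Hypothetical two-species setup with made-up observations.
params = init_network(n_hidden=8, n_species=2)
mu = [1.0, 0.5]
A = [[-1.0, -0.5], [-0.2, -1.0]]
t_obs = [0.0, 1.0, 2.0]
y_obs = [[0.1, 0.1], [0.22, 0.15], [0.4, 0.2]]
t_col = [0.1 * k for k in range(21)]
loss = pinn_loss(params, mu, A, t_obs, y_obs, t_col, lam=1.0)
```

In a joint estimation setting, the network weights and the GLV parameters `mu` and `A` would all be updated by gradient descent on this loss, possibly with a sparsity penalty added on `A`. The collocation points `t_col` can be much denser than the observation times, which is one way the physics term compensates for scarce data.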

School/university (for theses and internships)
Université Paris-Saclay
Level/degree (for internships)
M1