Project title
Advancing metabolic function prediction through integrated genomic and high-throughput phenotyping
Type of funding
Ferments du Futur
Project status
Submitted
Year of submission
2026
Programme / call + year
AAP2026
Team(s) involved in the project
Migale
StatInfOmics
Coordinator (last name and first name)
Mariadassou Mariadassou
Role of MaIAGE in the project
Coordinator
Work package leader
Partner (multi-partner project)
Name(s) of participant(s) - MaIAGE
M. Mariadassou, V. Loux, S. Schbath, C. Hennequet-Antier, A. Barnabé
Name(s) of partner(s) (name, lab and location) - outside MaIAGE
E. Guedon, N. Roland - STLO - INRAE Rennes; J. Auber, N. Jouvin - MIA-PS - INRAE/AgroParisTech
Project start date
Project end date
Abstract
Who hasn’t dreamed of predicting the metabolic capabilities of a single bacterium or a bacterial consortium directly from genomes? Although genome annotation tools exist, genome-based inference of metabolic functions remains incomplete, especially in food matrices. Traditional phenotyping is costly, which highlights the need for predictive, data-driven approaches. Our project aims to bridge genomics, AI, and open science to improve microbial phenotype prediction in fermented foods and to stimulate community-driven efforts. We will curate a benchmark dataset of 300 LAB/PAB strains, integrating public (BacDive, metaTraits, DSMZ) and internal (CIRM, STLO) data, enriched with kinetic phenotypes (e.g., sugar consumption rates). Unlike existing datasets, ours will focus on food-relevant traits while ensuring broad species coverage and phenotypic diversity. We propose two complementary modeling approaches: (1) interpretable pan-genome ML using biologically grounded features (genes, orthologs) and (2) AI-based models leveraging a DNA foundation model (Evo2) for high-accuracy predictions.
The project will deliver a FAIR-compliant public database, pre-trained models, and R/Python packages. These will serve as the basis for a data challenge, hosted in partnership with DataIA, to evaluate model generalization, including cross-kingdom predictions (e.g., yeast). Together, these resources will facilitate rational strain selection and advance genome-to-phenotype prediction.
Thematic field of the contract (MathNum)
Major objective concerned - primary - (MathNum)
Does this project fall within the scientific scope of the MathNum department?