Mathématiques et Informatique Appliquées du Génome à l'Environnement

SKEL

Project title
Scientific Knowledge Extraction and Linking
Funding type
Public (EPIC, local authorities…)
Project status
Submitted
Submission year
2026
Programme / call + year
AAP Multitrack GS ISN
Team(s) involved in the project
Bibliome
Coordinator (last name, first name)
Ferré Arnaud
Role of MaIAGE in the project
Coordinator
Participant name(s) - MaIAGE
A. Ferré, L. Deléger
Partner name(s) (name, lab and location) - Outside MaIAGE
O. Ferret - LASTI (CEA) - Paris-Saclay
Project start date
Project end date
Summary
The project aims to enable the transformation of scientific articles into structured and exploitable knowledge by automatically analyzing their textual content, beyond simple metadata. It focuses on developing methods for entity recognition and entity linking, that is, identifying relevant entities in full-text articles and associating them with reference concepts in knowledge bases, while accounting for the strong variability of scientific terminology. A central objective is to design approaches that are sufficiently generic to adapt rapidly to different scientific domains without requiring costly annotation or retraining.

The main scientific challenge lies in jointly addressing entity recognition and linking, leveraging knowledge bases as a primary source of information, and overcoming the limitations of current approaches, in particular their lack of robustness and generalization, as well as the constraints of large language models (LLMs). The project explores hybrid methods combining LLMs and lighter models, with an emphasis on reducing reliance on annotated data and improving efficiency. Ultimately, it aims to contribute to large-scale systems for extracting, structuring, and connecting knowledge from scientific literature.

This proposal requests half of the funding for a PhD position, complementing the other half already secured through the AIKO program. The AIKO program, funded under the France 2030 initiative and coordinated by INRIA, supports the methodological foundations of this project (https://maiage.inrae.fr/node/3495).
Main objective concerned - secondary - (MathNum)
Does this project fall within the scientific scope of the MathNum department?