Mathématiques et Informatique Appliquées
du Génome à l'Environnement


Nédellec Claire



 Email :
Address: INRA - Unité MaIAGE
                 Bâtiment 233
                 Domaine de Vilvert
                 78352 JOUY-EN-JOSAS CEDEX
Tel: +33 (0)1 34 65 28 78
Fax: +33 (0)1 34 65 00 00
Secretariat: +33 (0)1 34 65 28 86


Claire Nédellec is a research director at INRAE and leader of the Bibliome team.

She was a researcher at LRI, Université Paris-Sud, from 1994 to 2001, after a PhD in Inductive Logic Programming applied to cooperative machine learning. She joined the MaIAGE unit in 2001, where she created the Bibliome team. She obtained her HDR in 2013.

Research interests

Machine Learning and Natural Language Processing for Information Extraction and Ontology building.
Open Science and text mining
Applications to scientific and technical domains, Life Sciences


Ongoing projects

FAIROmics H2020 -  FAIRification of multiOmics data to link databases and create knowledge graphs for fermented foods (2024-2027).

HoloOligo - Structure diversity, functionality and modulation of milk oligosaccharides in monogastric livestock species: towards optimal development of rabbit and pig holobionts (2022-2025).

TyDI - Terminology Design Interface (2021-2025). Coordination.

BEYOND - ANR PPR Cultiver et protéger autrement - Building epidemiological surveillance and prophylaxis with observations both near and distant (2021-2025). Task leader.

D2KAB  Data to Knowledge in Agriculture and Biodiversity - ANR (2019-2024). WP leader.

Recent projects

TIERS-ESV  Traitement de l’Information et Expertise des Risques Sanitaires pour l’Epidémiosurveillance en Santé Végétale - IB2021 Départements INRAE MathNum et SPE (2021-2023). Coordination.

OntoBedding  Amélioration de plongements lexicaux par des ontologies pour leur adaptation aux domaines de spécialité DIM IdF RFSI, sept STIC Université Paris-Saclay (2019).

ENovFood  Linking a phenotypic and a network food microbe data bases: an application for food microbial ecology and food innovation Métaprogramme MEM (2018-2020). WP leader.

Visa TM  - BSN, CoSO  Vers une infrastructure de services avancés pour le text mining (2017-2018). Coordination.

OpenMinTeD H2020 Open Mining Infrastructure for Text and Data (2015-2018). Task leader.

D-ONT - Exploitation optimisée des bases de données phénotypiques - Des ontologies pour le partage d’information, ACI Phase 2016-2018. Coordination.

Florilege  - A database gathering microbial phenotypes of food interest. Métaprogramme MEM  - Action ciblée  (2016-2018). WP leader.

OntoBiotope : Métaprogramme INRA MEM (Métagénomique des écosystèmes microbiens). (2012-2013). Coordination.

Quaero : Automatic multimedia content processing Oséo. (2008-2013). WP leader.

FSOV SAM Blé : Sélection du blé tendre assistée par marqueur, Fonds de soutien à l'obtention végétale (2010-2013). WP leader.


Co-leader of the D2K working group (De la Donnée à la Connaissance), Labex DigiCosme.

Member of the scientific committee of the INRAE MathNum research department and of the Graduate School ISN (Université Paris-Saclay). Member of the Research and Teaching committees. GS ISN deputy on the open science steering committee (comité de pilotage de la science ouverte).

Artificial Intelligence Institute DataIA, Université Paris-Saclay. Executive committee member. 

Organization of international challenges in NLP: Genic Interaction Extraction Challenge at LLL'05 (Learning Language in Logic), BioNLP Shared Task (2011, 2013, 2016), then BioNLP Open Shared Task (2019).


Ongoing supervision

Myriam Dulor, Definition and evaluation of bibliographic, linguistic and biological criteria for the relevance of information on insect vectors of plant diseases from a historical perspective. Co-supervision with Nicolas Sauvion (PHIM) and Robert Bossy.
Mariya Borovikova, Information extraction from textual data for epidemiosurveillance for plant health. PhD thesis in preparation, Université Paris-Saclay, ANR BEYOND project. Co-supervision with Mathieu Roche (Tetis), Arnaud Ferré and Robert Bossy (MaIAGE).


Recent supervision

Anfu Tang, Extraction d'informations relationnelles à partir de textes en domaine spécialisé - adaptabilité et passage à l'échelle. PhD thesis in preparation, Université Paris-Saclay DigiCosme. Co-supervision with Pierre Zweigenbaum (LISN) and Louise Deléger (MaIAGE).

Catalina Garcia, engineer, TIERS-ESV project.

Estelle Chaix, post-doc, OpenMinTeD project, Visa TM project.

Clara Sauvion, engineer, TIERS-ESV and D2KAB projects.

Arnaud Ferré, Représentations vectorielles et apprentissage automatique pour l’alignement d’entités textuelles et de concepts d’ontologie : application à la biologie. PhD thesis defended on 24 May 2019, Université Paris-Saclay IDI. Co-supervision with Pierre Zweigenbaum (LIMSI). Post-doc on the OntoBedding project.

Mouhamadou Ba, post-doc, OpenMinTeD project, Visa TM project.


Recent publications


  • Mathilde Rumeau, François Fenaille, Agnès Girard, Valentin Loux, Mouhamadou Ba, Claire Nédellec, Louise Deléger, Robert Bossy, Sophie Aubin, Christelle Knudsen, Sylvie Combes. (2024) MilkOligoThesaurus, A mammalian milk oligosaccharide thesaurus for automatic annotation and text data mining of scientific articles: a dataset of synonyms from the scientific literature. Data in Brief. 2024, 110404, ISSN 2352-3409,

  • Cindy E. Morris, Andrea Radici, Christine N. Meynard, Nicolas Sauvion, Claire Nédellec, et al. More than food: Why restoring the cycle of organic matter in sustainable plant production is essential for the One Health nexus. CAB Reviews Perspectives in Agriculture Veterinary Science Nutrition and Natural Resources, 2024, 19, pp.1. 10.1079/cabireviews.2024.0008. hal-04524528

  • Dérozier S, Bossy R, Deléger L, Ba M, Chaix E, Harlé O, Loux V., Falentin H., Nédellec C. (2023) Omnicrobe, an open-access database of microbial habitats and phenotypes using a comprehensive text mining and data fusion approach. PLoS ONE 18(1): e0272473. 

  • Anfu Tang, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec. (2022) Do syntactic trees enhance domain-specific BERT models for relation extraction? Database, Volume 2022.

  • Morris, C.E., Géniaux, G., Nédellec, C., Sauvion, N. & Soubeyrand, S. (2021) One Health concepts and challenges for surveillance, forecasting, and mitigation of plant disease beyond the traditional scope of crop production. Plant Pathology, 00, 1–12.

  • Claire Nédellec, Liliana Ibanescu, Robert Bossy, Pierre Sourdille (2020) WTO, an ontology for wheat traits and phenotypes in scientific publications. Genomics & Informatics, 18(2), June 2020. doi: 10.5808/GI.2020.18.2.e14

  • Ferré, A., Deléger, L., Bossy, R., Zweigenbaum, P., Nédellec, C. C-Norm: a neural approach to few-shot entity normalization. BMC Bioinformatics 21, 579 (2020).

International conferences

  • Tang A., Bossy R., Nédellec C., Deléger L. Exploiting Graph Embeddings from Knowledge Bases for Neural Biomedical Relation Extraction. In Proceedings of the 29th Annual International Conference on Natural Language & Information Systems (NLDB 2024), Torino, Italy, June 2024.

  • Mariya Borovikova, Arnaud Ferré, Robert Bossy, Mathieu Roche and Claire Nédellec. Semantically-Informed Domain Adaptation for Named Entity Recognition. ISMIS-2024 (International Symposium on Methodologies for Intelligent Systems), Poitiers, 2024.

  • Mariya Borovikova, Arnaud Ferré, Robert Bossy, Mathieu Roche, Claire Nédellec. Could keyword masking strategy improve language model? In: Proceedings of the 28th International Conference on Natural Language & Information Systems (NLDB 2023), Métais, E., Meziane, F., Sugumaran, V., Manning, W., Reiff-Marganiec, S. (eds). Lecture Notes in Computer Science, vol 13913. Springer, Cham. University of Derby, United Kingdom, 21-23 June 2023.

  • Anfu Tang, Louise Deléger, Robert Bossy, Pierre Zweigenbaum, Claire Nédellec. Does constituency analysis enhance domain-specific pre-trained BERT models for relation extraction? Proceedings of the BioCreative VII Challenge Evaluation Workshop, 8-10 Feb. 2021. ISBN: 978-0-578-32368-8

  • Arnaud Ferré, Robert Bossy, Mouhamadou Ba, Louise Deléger, Thomas Lavergne, Pierre Zweigenbaum, Claire Nédellec. Handling Entity Normalization with no Annotated Corpus: Weakly Supervised Methods Based on Distributional Representation and Ontological Information. Proceedings of the 12th International Conference on Language Resources and Evaluation (LREC-2020), pages 1959–1966. European Language Resources Association (ELRA), May 2020.

Complete list of publications

Full list of publications on HAL