Knowledge Graph chatbot – toward Large Language Model based Interaction with Metabolomics Knowledge Graphs

About the project

The KGBot project (Knowledge Graph chatBot) aims to enhance an AI-powered chemistry chatbot prototype designed to improve the accessibility and usability of metabolomics knowledge graphs (KGs).

By leveraging mass spectrometry data, the chatbot employs a natural language interface to generate queries (using the SPARQL language), allowing chemists to intuitively explore complex metabolomics knowledge graphs (represented in RDF).
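
As an illustration of the kind of SPARQL query such an interface might produce for a question like "Which features have a mass between 300 and 301?", here is a minimal sketch. It uses a fixed template in place of the LLM generation step. The `ex:` namespace, the class `ex:Feature`, the property `ex:parentMass`, and the function `build_mass_range_query` are all hypothetical placeholders, not the vocabulary or code of the actual project.

```python
from string import Template

# Hypothetical vocabulary: the ex: namespace, ex:Feature and ex:parentMass are
# placeholders for illustration, not the schema of any actual metabolomics KG.
QUERY_TEMPLATE = Template("""\
PREFIX ex: <http://example.org/metabolomics#>
SELECT ?feature ?mz WHERE {
  ?feature a ex:Feature ;
           ex:parentMass ?mz .
  FILTER(?mz >= $low && ?mz <= $high)
}
LIMIT $limit""")


def build_mass_range_query(low: float, high: float, limit: int = 10) -> str:
    """Return a SPARQL SELECT query for features whose mass lies in [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    # Fill the numeric bounds and the result limit into the fixed template.
    return QUERY_TEMPLATE.substitute(low=low, high=high, limit=limit)


query = build_mass_range_query(300.0, 301.0)
print(query)
```

The resulting string could then be sent to a SPARQL endpoint exposing the RDF graph. In a chatbot setting the template would be replaced by model-generated text, which would need validation before execution.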

Key objectives include broadening the chatbot’s compatibility with various large language models (LLMs) and KGs, integrating dynamic tools for data extraction and visualization, and enabling extended dialogical interactions to support iterative queries. The project also seeks to enrich user interactions by providing features such as result visualization, hypothesis generation, and analysis recommendations.

Building on the interdisciplinary expertise of the project partners, this initiative fosters transdisciplinary collaboration and aims to deliver scalable solutions applicable across multiple domains. Anticipated outcomes include enhanced access to scientific data and the development of a robust open-source framework to support future academic and industrial applications. Funding will be directed towards supporting postdoctoral researchers and student contributions to the project.

Principal investigators
  • Louis-Félix Nothias (Institut de Chimie de Nice, Université Côte d’Azur, Nice)
  • Fabien Gandon (Wimmics UniCA, Inria, CNRS, i3S)
Project partners
  • Swiss Institute of Bioinformatics, Lausanne, Switzerland
  • School of Pharmaceutical Sciences, Univ. of Geneva, Switzerland
Duration
  • November 2024 - November 2025
Total amount
  • 70 000 euros
Publications
  • Emma Tysinger, Marco Pagni, Olivier Kirchhoffer, Florence Mehl, Fabien Gandon, et al. An Artificial Intelligence Agent for Navigating Knowledge Graph Experimental Metabolomics Data. 2023 Swiss Metabolomics Society Annual Meeting, Swiss Metabolomics Society Zurich; ETH Zurich, Sep 2023, Zurich, Switzerland. https://inria.hal.science/hal-04381448

  • Best paper award for "User Interface and Agent Interface for Online Generation of Knowledge Graph’s Competency Questions and Question-Query Training Sets" by Yousouf Taghzouti, Franck Michel, Tao Jiang, Louis-Felix Nothias and Fabien Gandon, at the RAGE-KG 2025 workshop https://2025.rage-kg.org/

    More and more fields and professions want to use the latest generative AI techniques to facilitate access to their databases. However, there are few datasets of question-query pairs that can be used to fine-tune large language models for tasks such as translating natural language questions into SPARQL queries over graph databases.

    This paper presents Q2Forge, a web application designed to make it easier to create question-query pairs for any RDF knowledge graph. The tool enables users to generate, test, and refine competency questions and their SPARQL equivalents directly within the interface. It leverages a retrieval augmented generation (RAG) architecture for contextual enrichment. The result is an open-source solution for creating reusable question-query datasets, applicable to any knowledge graph.

    Find out more: https://hal.science/hal-05289962

Leveraged projects

Contribution to the proof of concept, as part of the preparatory work for MetabolinkAI, an international research project funded by the Agence Nationale de la Recherche (ANR) and the Swiss National Science Foundation (SNSF) (2025-2029).
Project website: www.metabolinkai.net