Data Infrastructure and Augmented Intelligence for Sustainable Development
Status: Active
Project Duration: December 2025 - September 2029
Acronym: PiPi
About the Project
The project develops an infrastructure, built on generative artificial intelligence, for storing, managing and using data in the environment and renewable energy domains.
The project focuses on:
(I) creating a data space that securely connects heterogeneous data sources by cataloguing them and standardizing their semantics, with a flexible architecture that adapts to different user needs,
(II) developing a Retrieval-Augmented Generation (RAG) system built on large language models that draw on the collected data.
The RAG system is optimized for specific domains through improved document search strategies, the selection and integration of domain-specific language models, and a comprehensive evaluation framework that checks the reliability and accuracy of the generated information. Key innovations of the project include a flexible data space architecture with clearly defined semantic relationships, advanced algorithms for fast retrieval of relevant information, and evaluation methodologies adapted to the specifics of the environmental and energy domains. The expected results will enable researchers, decision-makers and practitioners to access complex data and generate insights grounded in reliable sources of knowledge, thereby accelerating informed decision-making in critical areas of sustainability.
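To make the retrieval step of a RAG pipeline concrete, the following is a minimal sketch under stated assumptions, not the project's implementation. The catalogue entries, the identifiers (DOCUMENTS, retrieve, build_prompt) and the bag-of-words cosine scoring are all hypothetical stand-ins; a real system would more likely use learned embeddings and pass the resulting prompt to a language model, which is omitted here.

```python
import math
import re
from collections import Counter

# Hypothetical catalogue entries (assumption: the project's real
# data sources and their descriptions are not given in this text).
DOCUMENTS = {
    "wind-2023": "Annual wind farm energy output by turbine and site",
    "air-quality": "Hourly air quality measurements of particulate matter in cities",
    "solar-forecast": "Solar irradiance forecast for photovoltaic energy production",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, documents, k=1):
    """Return the ids of the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        documents.items(),
        key=lambda item: cosine(q, Counter(tokenize(item[1]))),
        reverse=True,
    )
    return [doc_id for doc_id, _ in ranked[:k]]


def build_prompt(query, documents, k=1):
    """Assemble a prompt that grounds the question in retrieved sources."""
    ids = retrieve(query, documents, k)
    context = "\n".join(f"[{i}] {documents[i]}" for i in ids)
    return f"Answer using only the sources below.\n{context}\nQuestion: {query}"
```

Calling build_prompt("particulate matter air quality in cities", DOCUMENTS) yields a prompt whose context section cites the air-quality entry, which illustrates how generated answers can be tied back to catalogued sources.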