Interdisciplinary Data Discovery: Approaches and Good Practices from Around the World
Session Organisers: Anna-Lena Fluegel, Claudia Martens
A vital precondition for researchers to use “Data to Improve our World” is that they can search for and find this data in order to reuse and build upon it. New fields of interdisciplinary research such as AI and robotics, or a global pandemic that required insights not only from virology and respiratory medicine but also from behavioural psychology, sociology and economics to tackle it in all its dimensions, are but two examples of the need for researchers to think outside the box and discover data beyond their go-to discipline-specific repositories. For research endeavours like these, interdisciplinary discovery portals for research data are vital.
This session aims to exchange experiences and good practices on the challenges of enabling metadata discovery across disciplines. Our own portal, B2FIND, is a cross-disciplinary discovery portal for research data, developed within the pan-European Collaborative Data Infrastructure EUDAT CDI. The portal is a comprehensive joint metadata catalogue referencing metadata records for data stored in various data centres, which use different metadata schemas and describe datasets at different levels of granularity. From our experience, the major challenge an interdisciplinary discovery portal faces is bridging the semantic and structural gaps between discipline-specific data management practices and standards on the one hand and a common search space yielding useful search results on the other.
While B2FIND references data from around the world, from an organisational point of view most of the integrated repositories belong to European infrastructures or projects. Therefore, in this session we would like to leave our Eurocentric perspective and learn how discovery portals in other parts of the world deal with challenges related to metadata and standards, as well as organisational and technical hurdles, in their everyday practice. The aim of the session is to learn about good practices and to connect key players in data discovery around the globe. We are especially interested in hearing about the following questions:
- In order to enable and support FAIR data principles, close cooperation and coordination with scientific communities, research infrastructures and other initiatives dealing with metadata standardisation are essential. How does your discovery portal manage to be part of this dialogue?
- What is your good-practice approach to managing semantic mappings across a number of ever-evolving metadata schemas? How do you choose controlled vocabularies? How do you deal with multilingual search?
- How does your discovery portal find a balance between community-specific metadata that serve discipline-specific needs on the one hand, and a metadata schema that is generic enough to capture interdisciplinary research data yet specific enough to enable useful searches with satisfying results on the other?
- What are the crucial challenges and solutions (for example technical, political, organisational, regarding the use of the standards etc.) you encounter in enabling interdisciplinary data discovery?
Building on the insights gathered, we would like to have a structured discussion about the future of interdisciplinary search, centred on two (exaggerated) theses:
- The future of data discovery is Google. Everything we do should aim for search engine optimisation.
- The future of data discovery depends on national (European / Pan-African / Pan-American / international) infrastructures that decide which standards and technical solutions to use.
Suggested agenda for the session:
- Welcome, overview (5 min)
- Lightning Talks (45 min)
- Questions and Answers (20 min)
- Structured Discussion (20 min)
The desired outcome of this session is an exchange of ideas on possibilities and opportunities for developing better discovery functions and methods for research data across disciplines.