As the world becomes ever more digitalised, new challenges arise. One of these, which has gained more attention over the last 6 years, is how to handle data. The ‘FAIR Guiding Principles for scientific data management and stewardship’ were published in Scientific Data in 2016, laying out a set of principles to ensure that the large amounts of data created every year can be found and reused by the right people, bringing continued benefits to society.
What is FAIR?
FAIR is an acronym for Findable, Accessible, Interoperable and Reusable.
The FAIR principles apply to three kinds of entity: data (or any digital object), metadata (information about that digital object), and infrastructure. In practice, ‘data’ is the term used most often.
Findable: The first step in (re)using data is to find them. Metadata and data should be easy to find for both humans and computers. Machine-readable metadata are essential for the automatic discovery of datasets and services, which makes them a key component of the FAIRification process.
Accessible: Once users find the data they need, they must know how the data can be accessed, possibly including authentication and authorisation.
Interoperable: The data usually need to be integrated with other data. In addition, the data need to interoperate with applications or workflows for analysis, storage, and processing.
Reusable: The ultimate goal of FAIR is to optimise the reuse of data. To achieve this, metadata and data should be well described so that they can be replicated and/or combined in different settings.
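To make the idea of machine-readable metadata more concrete, here is a minimal sketch in Python. It assumes the schema.org `Dataset` vocabulary expressed as JSON-LD, which is one common choice and not something the FAIR principles prescribe. The function names, the DOI, and the URLs are hypothetical placeholders used only for illustration.

```python
import json


def build_dataset_metadata(title, description, doi, licence_url,
                           download_url, media_type):
    """Return a schema.org Dataset description as a JSON-LD style dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": title,                            # Findable: clear title
        "description": description,               # Findable: searchable text
        "identifier": f"https://doi.org/{doi}",   # Findable: persistent identifier
        "license": licence_url,                   # Reusable: explicit licence
        "distribution": {                         # Accessible: where to get the data
            "@type": "DataDownload",
            "contentUrl": download_url,
            "encodingFormat": media_type,         # Interoperable: declared format
        },
    }


def missing_fields(record,
                   required=("name", "identifier", "license", "distribution")):
    """List required keys that are absent or empty in a metadata record."""
    return [key for key in required if not record.get(key)]


# All values below are hypothetical placeholders, not real identifiers.
record = build_dataset_metadata(
    title="Example river temperature measurements",
    description="Hourly water temperature readings from an example site.",
    doi="10.1234/example-dataset",
    licence_url="https://creativecommons.org/licenses/by/4.0/",
    download_url="https://repository.example.org/files/river-temperature.csv",
    media_type="text/csv",
)

metadata_json = json.dumps(record, indent=2)
print(metadata_json)
print("Missing fields:", missing_fields(record))
```

A record like this can be published alongside the dataset so that search services can harvest it automatically. The `missing_fields` check is only a rough completeness test and not a full assessment of FAIRness.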
Why is this practically important?
In Research & Development (R&D), over €2 trillion is spent globally every year. This R&D often produces data and datasets that are useful for further research. If these data cannot be found, accessed or made to interoperate, they cannot be reused, which represents a huge financial loss for global research as well as a setback to scientific progress.
Has progress been made?
Even though the FAIRness of data is so important, only 28% of researchers are familiar with the principles, according to the 2021 State of Open Data report by Digital Science. Although this number has grown steadily over the last 5 years, it is still only about a quarter of researchers. FAIR data matters in 2022 because too few researchers know what FAIR means or how to apply the principles to their datasets.
How does EUDAT contribute?
EUDAT provides a series of services for Research Data Management (RDM). These include services for Data Access and Reuse.
- Data storage: B2SHARE (primarily for storing very large sets of data) and B2DROP (mainly for storing long-tail research data during ongoing research collaborations)
- Searching for data: B2FIND (searching based on metadata descriptions of data stored in EUDAT data centres and in other data repositories)
- Persistent identifiers: B2HANDLE (the distributed service for storing, managing and accessing persistent identifiers (PIDs) and essential metadata (PID records), as well as managing PID namespaces)
- Authentication and authorisation of users: B2ACCESS (enables users to log in using various identities, for example an identity from the research organisation they work for, or a Google account)
Find out more about how the EUDAT Collaborative Data Infrastructure (EUDAT CDI) services can help you make your research data FAIR!
Check out our 30-minute webinar and slides, Introduction to the EUDAT CDI and its Services.