User documentation about the advanced search capabilities of the EUDAT B2SHARE service. Modified: 2 February 2020
B2SHARE is a web-based service for storing and publishing data sets, intended for European scientists. The service utilises other EUDAT services for reliability and data retention, while storing the data at trusted repositories with national backing, in order to provide a professionally managed and supported IT environment.
B2SHARE supports a default set of metadata fields that are defined in the root schema. Every record must provide the mandatory fields in this schema. In addition, the service allows communities to define their own set of metadata fields, aggregated in the so-called community metadata schema. Every record that is published as part of such a community must also provide the mandatory fields of that community-specific schema. All metadata field values are indexed by B2SHARE and are therefore searchable using the web interface and REST API.
How to access the B2SHARE service
B2SHARE is available from the following URL: https://b2share.eudat.eu.
Searching in B2SHARE
Search is available directly on the front page of the web interface or after clicking on the 'Search' button on that same page. Searching for any keyword always leads to the advanced search page of B2SHARE.
To search using the REST API, append the query normally entered in the search box to the search API URL as the value of the parameter 'q': https://b2share.eudat.eu/api/records?q=<query-string>. For more information on searching through the REST API, see the relevant section in the B2SHARE REST API documentation.
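As an illustration of the URL pattern above, the following minimal Python sketch builds a search URL from a query string. The function name `build_search_url` is hypothetical and not part of B2SHARE. The sketch only constructs the URL and does not send a request. It relies on the standard library to percent-encode characters such as spaces, quotes and colons.

```python
from urllib.parse import urlencode

# Search API URL as given in the text above.
SEARCH_API_URL = "https://b2share.eudat.eu/api/records"


def build_search_url(query: str) -> str:
    """Return the search API URL with the query as the value of parameter 'q'.

    Hypothetical helper for illustration only.
    """
    # urlencode percent-encodes reserved characters and writes spaces as '+'.
    return f"{SEARCH_API_URL}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(build_search_url("container technologies"))
```

The resulting URL can be opened in a browser or passed to any HTTP client.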
B2SHARE indexes all the metadata of each record using Elasticsearch search technology. After each record creation, versioning or metadata update, this index will be updated automatically and therefore all changes can immediately be found using the search functionality.
Elasticsearch does not index the exact values of the metadata fields, but more generic equivalents of them. In practice this means that when a singular keyword is entered, its plural equivalents are matched as well.
Registered and unregistered users can use the search field on the B2SHARE home page (Figure 1). Enter a search query in the search field and click the "Search" button. The text entered can be part of a title, keyword, abstract or any other metadata, including values for metadata fields defined in community-specific schemas.
Unregistered users can only search for data sets that are publicly accessible. The default search mode is simple search, which provides an input box where queries can be typed. Usually it is sufficient to type the keywords you are interested in and hit return. Please note that only the latest version of each matching record is shown in the search results.
Figure 1. The B2SHARE front page with search bar.
Multiple keyword search
You can enter multiple keywords at once to find records that match either all or any of the keywords. To search for an exact multi-keyword match, enclose the keywords in double quotes. Compare, for example, the results for '"container technologies"' and 'container technologies'.
Logical search operators
To distinguish between matching all or any of the keywords, or to exclude a specific word, use the logical operators 'AND', 'OR' or 'NOT'. The default operator between multiple keywords is 'OR'. Always write the operators in capitals, otherwise they will be interpreted as keywords. To see the difference between the operators, compare the results of the searches 'lofar AND singularity', 'lofar OR singularity' and 'lofar NOT singularity'.
It is also possible to use the character equivalent for each operator. For an overview, see Table 1.
Table 1. Search operators in B2SHARE.
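The quoting and operator rules described above can be sketched in a few lines of Python. The helper names `exact_phrase` and `combine` are hypothetical and only assemble query strings for the search box or the 'q' parameter. They do not call B2SHARE.

```python
# Logical operators named in the text; they must be written in capitals.
OPERATORS = ("AND", "OR", "NOT")


def exact_phrase(*keywords: str) -> str:
    """Enclose keywords in double quotes to ask for an exact multi-keyword match."""
    return '"' + " ".join(keywords) + '"'


def combine(left: str, operator: str, right: str) -> str:
    """Join two search terms with a logical operator in capitals."""
    op = operator.upper()  # a lowercase operator would be read as a keyword
    if op not in OPERATORS:
        raise ValueError(f"unsupported operator: {operator!r}")
    return f"{left} {op} {right}"


if __name__ == "__main__":
    print(exact_phrase("container", "technologies"))
    print(combine("lofar", "not", "singularity"))
```

Normalising the operator to capitals in one place avoids the easy mistake of typing 'and' and unintentionally searching for it as a keyword.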
To get to the advanced search page, hit the search button on the front page and select the additional options in the new form below the search text field (Figure 2). Using the form on this page, the search can be narrowed down to a specific community, sorted by most recent or best matching records and the page size or number of records returned per page can be adjusted.
Figure 2. The B2SHARE advanced search page.
Besides simply entering one or more keywords in the search bar's text field, it is also possible to restrict your search to a metadata field in the B2SHARE root schema or in one of the communities' metadata schemas. To search in a specific metadata field, prefix your keyword with the metadata field's internal name and structure, followed by a colon (':'). Thus, if you want to limit your search to the title(s) of records only, use the 'titles.title' prefix. See for example the difference between 'titles.title:container' and 'container'.
To find the corresponding internal field names and structure, you can use the information provided for the schema of a community, e.g. the EUDAT community. This information is also available through the B2SHARE REST API; see for example the schema definition of the EUDAT community. A non-exhaustive overview of the root schema fields is given in Table 2.
|Field name||Internal name|
Table 2. Some examples of metadata fields and their internal name and structure equivalents.
If you want to use logical operators within a specific field search, enclose the value for the field in parentheses. For example, 'titles.title:(container NOT technologies)' returns records that contain the keyword 'container' in the title field, but not 'technologies'.
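The field prefix and parentheses rules can be illustrated with a small Python sketch. The function `field_query` is a hypothetical helper. The only field name used, 'titles.title', comes from the examples in the text.

```python
def field_query(field: str, value: str) -> str:
    """Prefix a search value with a metadata field's internal name and a colon.

    Hypothetical helper for illustration only.
    """
    value = value.strip()
    # A value made of several terms or operators is wrapped in parentheses,
    # so that the whole expression applies to the given field.
    if " " in value:
        value = f"({value})"
    return f"{field}:{value}"


if __name__ == "__main__":
    print(field_query("titles.title", "container"))
    print(field_query("titles.title", "container NOT technologies"))
```

Without the parentheses, only the first term would be tied to the field and the remaining terms would be searched in all metadata.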
Community-specific metadata field searches
Searching for values in metadata fields that are defined by a specific community works much like the field-specific search described above.
Finding the community's metadata schema identifier
One important difference is that the community metadata schema's internal identifier is required. This value can be found by examining the community's definition in the REST API response for all communities. Take the link to the specific community from the 'links' section of that community and append '/schemas/last' to the URL to retrieve the community's entire metadata schema definition (including the default fields); see for example the schema for the InGrid community. The properties of the 'community_specific' structure contain a reference to the community-specific metadata block that defines the community-specific fields for this community. The identifier is the value of the 'id' field.
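The URL step above amounts to a simple string operation, sketched here in Python. The function name `latest_schema_url` is hypothetical, and the link in the example is a placeholder rather than a real community link. Take the actual link from the 'links' section of the community in the REST API response.

```python
def latest_schema_url(community_link: str) -> str:
    """Append '/schemas/last' to a community link from the 'links' section.

    Hypothetical helper for illustration only.
    """
    # Strip a trailing slash first to avoid a double slash in the result.
    return community_link.rstrip("/") + "/schemas/last"


if __name__ == "__main__":
    # Placeholder link, not a real B2SHARE community URL.
    print(latest_schema_url("https://example.org/api/communities/COMMUNITY-ID"))
```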
Searching for a community-specific metadata field value
Using the identifier of the InGrid community schema, 'fccd46c7-db79-460b-ad34-abf078d194a3', it is possible to search specifically in one of its metadata fields, e.g. 'Timespan'. This works as follows: start with 'community_specific', followed by the schema identifier and the field's internal name, each separated by a dot, and close with a colon followed by the value. For the value '2016' this leads to 'community_specific.fccd46c7-db79-460b-ad34-abf078d194a3.time_span:2016'. Since this field is only defined for the InGrid community, all records found are part of this community.
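The construction of a community-specific query can be sketched in Python as follows. The function name `community_field_query` is hypothetical. The schema identifier, field name and value are the InGrid example values from the text.

```python
def community_field_query(schema_id: str, field: str, value: str) -> str:
    """Build a query for a community-specific metadata field.

    Hypothetical helper for illustration only.
    """
    # Pattern from the text: community_specific.<schema id>.<field>:<value>
    return f"community_specific.{schema_id}.{field}:{value}"


if __name__ == "__main__":
    # InGrid community schema identifier, as given in the text.
    ingrid_schema_id = "fccd46c7-db79-460b-ad34-abf078d194a3"
    print(community_field_query(ingrid_schema_id, "time_span", "2016"))
```

The resulting string can be entered in the search box or passed as the value of the 'q' parameter of the search API.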
Please visit our training site on GitHub for B2SHARE and other hands-on training material.
Our B2SHARE presentations offer training material for the service.
Support for B2SHARE is available via the EUDAT ticketing system through the webform.
If you have comments on this page, please submit them through the EUDAT ticketing system.
Hans van Piggelen, email@example.com