The EUDAT team is pleased to announce the release of the beta version of the Generic Execution Framework (GEF). The trend towards larger data volumes exacerbates the problem of costly and time-consuming data transfers to off-site data processing locations. The EUDAT GEF enables you to deploy your containerised scientific tools to computational resources in close proximity to the input data in order to minimise overall data transfer costs for your research community.
Move the tools, not the data, to avoid costly data transfers!
The EUDAT GEF enables the execution of containerised software tools on data stored in the EUDAT Collaborative Data Infrastructure (CDI). The idea behind it is simple: you manually containerise the software tools required for a specific data processing job and run them as close to the input data as the infrastructure permits, which avoids or minimises data transfers and their cost. This approach also fosters a high degree of reproducibility, as the same containerised tools and their configuration can be run repeatedly at different sites without requiring new installation or configuration procedures.
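The "move the tools, not the data" idea can be illustrated with a small sketch. It is not part of the GEF and does not describe how the GEF schedules jobs; the site names, dataset sizes and both function names are hypothetical. It only shows the reasoning: given where the input datasets are stored, run the container at the site that requires the least data to be moved.

```python
def transfer_volume_gb(site, datasets):
    """Total size of the datasets that are NOT already stored at `site`.

    `datasets` is a list of (storage_site, size_in_gb) pairs.
    """
    return sum(size for location, size in datasets if location != site)


def pick_execution_site(sites, datasets):
    """Return the candidate site that needs the least data moved to it."""
    if not sites:
        raise ValueError("no candidate sites given")
    return min(sites, key=lambda site: transfer_volume_gb(site, datasets))


# Hypothetical example: most of the input data lives at "site-a".
datasets = [("site-a", 500), ("site-b", 20), ("site-a", 100)]
sites = ["site-a", "site-b", "site-c"]
best_site = pick_execution_site(sites, datasets)
```

In this made-up scenario, running the tool at "site-a" means only the 20 GB dataset has to travel, whereas any other site would require moving at least 600 GB.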
The GEF employs the Docker containerisation technology and uses specifically annotated Docker container images, called GEF services, for scientific data processing. You can configure the GEF to interface with Docker Server and Docker Swarm installations on various platforms. To use the GEF, you containerise your software tools by uploading a set of containerisation instructions in the form of a Dockerfile to the GEF. Input data to GEF services can be specified via Persistent Identifiers (PIDs) or URLs and are automatically transferred to the processing location. The GEF is integrated with the EUDAT AAI, so users who wish to build and run GEF services need a B2ACCESS account. Note that the GEF beta is only integrated with the B2ACCESS development instance, for which you require a separate account. The aim of GEF development is full integration with all EUDAT data services of the CDI; B2SHARE and B2DROP are currently supported.
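As a rough illustration of what accepting input "via PID or URL" can mean in practice, the sketch below turns either kind of reference into a downloadable URL. It is not GEF code. It assumes that PIDs are Handle System identifiers of the form `prefix/suffix`, which can be resolved through the public proxy at hdl.handle.net; the function name and the example identifiers are placeholders.

```python
from urllib.parse import urlparse

# Public Handle System proxy; resolving a PID through it is an assumption
# of this sketch, not a documented GEF behaviour.
HANDLE_PROXY = "https://hdl.handle.net/"


def to_download_url(reference: str) -> str:
    """Map an input reference (URL or Handle-style PID) to a URL."""
    ref = reference.strip()
    parsed = urlparse(ref)
    # Plain web URLs are used as they are.
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return ref
    # Accept an optional "hdl:" scheme in front of the handle.
    if ref.lower().startswith("hdl:"):
        ref = ref[4:]
    # A handle consists of a prefix and a suffix separated by "/".
    prefix, separator, suffix = ref.partition("/")
    if separator and prefix and suffix:
        return HANDLE_PROXY + ref
    raise ValueError(f"not a URL or Handle-style PID: {reference!r}")


# Placeholder identifiers for illustration only.
pid_url = to_download_url("12345/abc-123")
plain_url = to_download_url("https://example.org/data.csv")
```

A service front end could normalise all inputs this way before fetching them to the processing location, so that the containerised tool itself only ever sees local files.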