This report details the development of a demonstration tool for CINECA Task T5.1, which integrates WP1-3 CINECA services to advance federated biobank search. UML diagrams and mockups were used to model the search process and create a user interface. The architecture of the system and strategies for accessing and storing data were also described. The report highlights efforts to integrate pre-analytical metadata, which are important for determining the quality of biospecimens, and a synthetic dataset was prepared to develop search services for accessing and visualizing such data. The evaluation of the robustness and search effectiveness of the implemented services is also discussed.
https://doi.org/10.5281/zenodo.6783295
In collaboration with other GA4GH-associated projects, CINECA is developing infrastructure that will permit effective use of widely dispersed data, increasing the size and quality of datasets available for disease research. In alignment with community standards and using standardised interfaces, data analysis will be federated and migrated to the data, respecting data access restrictions.
Solutions CINECA is adopting from the Discovery Work Stream include the Data Connect and Beacon v2 API, while from the DURI and Data Security Work Streams the GA4GH Passports, AAI and DUO are being utilised.
WP4 recently delivered a simple demonstrator pipeline that performs a federated joint variant genotyping analysis. The goal of this use case is to demonstrate how a simple metric (in this case, allele frequency) can be computed in a federated manner, without ever collecting the individual-level data in a central location.
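As a rough illustration of the idea (not the WP4 pipeline itself), the sketch below assumes each site reduces its genotypes to two summary numbers, the alternate allele count and the number of called alleles, and shares only those. The names `SiteCounts`, `local_counts` and `pooled_allele_frequency` are hypothetical and chosen for this example only.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Sequence, Tuple

# A genotype is a tuple of allele codes: 0 = reference, 1 = alternate,
# None = missing call. This encoding is an assumption made for the sketch.
Genotype = Tuple[Optional[int], ...]


@dataclass(frozen=True)
class SiteCounts:
    """Summary statistics a site shares instead of individual-level data."""
    alt_alleles: int    # number of alternate alleles observed
    total_alleles: int  # number of called (non-missing) alleles


def local_counts(genotypes: Sequence[Genotype]) -> SiteCounts:
    """Runs inside each site: reduce genotypes to aggregate counts."""
    called = [allele for gt in genotypes for allele in gt if allele is not None]
    alt = sum(1 for allele in called if allele != 0)
    return SiteCounts(alt_alleles=alt, total_alleles=len(called))


def pooled_allele_frequency(counts: Iterable[SiteCounts]) -> float:
    """Runs at the coordinator: combine per-site counts into one frequency."""
    counts = list(counts)
    alt = sum(c.alt_alleles for c in counts)
    total = sum(c.total_alleles for c in counts)
    if total == 0:
        raise ValueError("no called alleles across sites")
    return alt / total


# Two hypothetical sites with made-up genotypes for a single variant.
site_a = [(0, 1), (1, 1), (0, 0)]
site_b = [(0, 0), (0, 1), (None, None), (0, 0)]

shared = [local_counts(site_a), local_counts(site_b)]
frequency = pooled_allele_frequency(shared)
```

Only the `SiteCounts` objects leave each site, so the coordinator never sees a genotype. Summing counts before dividing gives the same result as pooling all samples, which a plain average of per-site frequencies would not when the sites differ in size.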
This video describes a common framework for designing portable federated pipelines. The joint cohort genotyping pipeline is provided as a specific implementation example. The ability of the pipeline to run in different environments, accessing the data via different protocols, and applying the appropriate normalisations is demonstrated.
The CINECA project aims to develop a common infrastructure to support federated data analysis across national cohorts in Europe, Canada, and Africa. In this report, the progress made over the past four years is discussed, which involves the development of six modular workflows to quantify and normalize molecular traits, pre-process genotype data, and test for associations between molecular traits and genotypes. The approach improves on the previous state of the art by packaging software dependencies into Docker/Singularity containers, using the Nextflow language to orchestrate complex multi-step workflows, and using the HASE(1) framework to reduce the amount of data that needs to be transferred between cohorts. The project provides training materials and open access datasets to encourage adoption and demonstrates how the workflows can be used to perform federated analysis across multiple real cohorts located in Switzerland, Germany, the Netherlands, and Estonia.
https://doi.org/10.5281/zenodo.7464116
Authors - Álvaro González (CSC), Shubham Kapoor (CSC), Kirill Tsukanov (EMBL-EBI)
The federated analysis platform defined by this task aims to provide technological solutions for three exemplar use cases: federated joint cohort genotyping; a Polygenic Risk Scores (PRS) workflow across two sample sets of similar ethnic background; and federated QTL analysis for molecular phenotypes. In this deliverable, we gathered the technical requirements based on these use case descriptions and wrote a short design document which explains the requirements and lists the different options for a solution.
Three distinct frameworks were considered to address the requirements from the use-cases. The chosen framework supports different computing environments, which is a requirement for true federated analysis. The framework also supports extending compatibility with GA4GH standards, such as WES, htsget, and AAI / Passports. Plans to extend this proposed solution beyond these initial sites will be carried out after the initial phase of validation.
https://doi.org/10.5281/zenodo.4609356
In this deliverable document, we report on the activities in task 6.4 - Training Programme, describe the CINECA training activities in the first 24 months of the project and provide the Training Plan for the next 12-24 months. For training interventions targeted at a broader audience, we have set up a webinar series, providing quarterly online learning interventions. We ran a total of 6 webinars (3 in 2019 and 3 in 2020), with an average of 23 attendees, which corresponds on average to 68% of those who registered. In addition, a series of short training videos (https://www.cineca-project.eu/short-videos) was created to facilitate the uptake of CINECA outputs. Eight short videos were produced by work packages on different topics. The short videos were submitted to ELIXIR’s training portal to increase engagement and disseminated via CINECA’s various communication channels.
https://doi.org/10.5281/zenodo.6223125
Authors - Éloïse Gennet, Melanie Goisauf, Delphine Pichereau, Emmanuelle Rial-Sebbag
The liberties that GDPR still leaves to EU Member States, as well as remaining ambiguities in GDPR interpretation, continue to feed debates in the ethical and legal literature. Projects like CINECA, which is seeking to facilitate health data exchanges between cohorts in Europe, Canada and Africa, offer valuable experience and input on essential ethical and legal gaps between countries and cohorts on questions such as the ethical and lawful basis for international health data sharing and secondary processing for research purposes.
The focus of this deliverable will be on answering, both from a legal and an ethical point of view, two priority questions: How to choose a legal basis for CINECA’s data processing? And how should CINECA apprehend broad consent to further data processing? The goal will be to study how the CINECA project could be efficiently conducted (especially data sharing) while being legally compliant with relevant laws and regulations across all member states, and most of all, being compliant with established ethical guidelines and practices across three continents.
https://doi.org/10.5281/zenodo.4298450