Meta-analysis of splicing and expression QTLs across three cohorts D4.4

Authors: Kaur Alasoo, Robert Warmerdam, Harm-Jan Westra
Contributing Partners: SIB, UTartu, BIOS, UOxf, UCT, EMBL-EBI, CS

The CINECA project aims to develop a common infrastructure to support federated data analysis across national cohorts in Europe, Canada, and Africa. This report describes the progress made over the past four years: the development of six modular workflows that quantify and normalize molecular traits, pre-process genotype data, and test for associations between molecular traits and genotypes. The approach improves on the previous state of the art in three ways: it packages software dependencies into Docker/Singularity containers, uses the Nextflow language to orchestrate complex multi-step workflows, and uses the HASE (1) framework to reduce the amount of data that needs to be transferred between cohorts. The project also provides training materials and open access datasets to encourage adoption, and demonstrates how the workflows can be used to perform federated analysis across multiple real cohorts located in Switzerland, Germany, the Netherlands, and Estonia.
