Date: 31 January 2023
Time: 16:00 - 17:00 CET
Slides: Federated analysis for polygenic risk scores
Contact: Marta Lloret Llinares, Mamana Mbiyavanga
Overview
Polygenic risk scores (PRS) estimate an individual’s genetic predisposition to a trait or complex disease. A PRS is calculated as the sum of an individual’s risk alleles, each weighted by its effect size estimate from genome-wide association study (GWAS) data on the phenotype. PRS help in understanding the shared aetiology of certain traits, and in risk prediction and prevention of certain diseases.
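The weighted sum described above can be illustrated with a minimal sketch. This is not the CINECA pipeline; the function name `polygenic_risk_score`, the allele dosages and the effect sizes are all hypothetical values chosen for illustration, and the sketch assumes one dosage (0, 1 or 2 copies of the risk allele) and one GWAS effect size per variant.

```python
from typing import Sequence


def polygenic_risk_score(dosages: Sequence[float], effect_sizes: Sequence[float]) -> float:
    """Return the sum of risk-allele dosages weighted by per-variant effect sizes.

    Hypothetical helper for illustration only: dosages[i] is the number of
    risk alleles carried at variant i, effect_sizes[i] is its GWAS effect size.
    """
    if len(dosages) != len(effect_sizes):
        raise ValueError("dosages and effect_sizes must have the same length")
    return sum(d * b for d, b in zip(dosages, effect_sizes))


# Made-up example: three variants for a single individual.
dosages = [0, 1, 2]
effect_sizes = [0.10, -0.05, 0.20]
score = polygenic_risk_score(dosages, effect_sizes)
print(round(score, 2))
```

Production PRS workflows typically add further steps, such as variant quality control and handling of correlated variants, which this sketch leaves out.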
As part of the CINECA project, we have been developing a demonstrator of federated genetic analysis built on a computational pipeline for PRS analysis. We have implemented this workflow in Nextflow, a modular and reproducible workflow manager that can be deployed and autoscaled across many computing environments, including Slurm-based clusters and Kubernetes clusters deployed on AWS. In this webinar, we will provide an overview of this PRS pipeline, using the CINECA UK1 synthetic dataset, derived from the 1000 Genomes Project, as a demonstrator. As part of the work package, we will further extend the demonstrator to other datasets and GWAS workflows, and adapt it to run in a federated setting following the GA4GH guidelines, using Beacons for data discovery along with AAI passports for data access authorisation and deployments.
Target audience
Computational biologists or similar interested in federated analysis and polygenic risk scores.
Format
Public webinar
About the speakers
Will Rayner is the head of the Data and Analytics group at the Institute of Translational Genomics in the Computational Health Department at Helmholtz Munich. He is interested in all aspects of data management and data privacy and has been leading the CINECA WP4 PRS use case.
Anshika Chowdhary is a data informatician at the Institute of Translational Genomics at Helmholtz Zentrum Munich. She has been working on the development of the workflows for the PRS use case in WP4, and on the analysis of the eQTL catalog on the datasets at HMGU.
About CINECA
The CINECA (Common Infrastructure for National Cohorts in Europe, Canada, and Africa) project aims to develop a federated, cloud-enabled infrastructure to make population-scale genomic and biomolecular data accessible across international borders, to accelerate research, and to improve the health of individuals across continents. CINECA will leverage international investment in human cohort studies from Europe, Canada, and Africa to deliver a paradigm shift of federated research and clinical applications. The CINECA consortium will create one of the largest cross-continental implementations of human genetic and phenotypic data federation and interoperability, with a focus on common (complex) disease, one of the world’s most significant health burdens. CINECA has assembled a virtual cohort of 1.4M individuals from population, longitudinal and disease studies. Federated analyses will deliver new scientific knowledge, harmonisation strategies and the necessary ELSI framework supporting data exchange across legal jurisdictions, enabling federated analyses in the cloud. CINECA will provide a template to achieve virtual longitudinal and disease-specific cohorts of millions of samples, to advance benefits to patients. It will leverage partner membership of standards bodies and infrastructures such as the Global Alliance for Genomics and Health (GA4GH), BBMRI, ELIXIR, and EOSC, driving the state of the art in standards development, technical implementation and FAIR data.