Height

Bioinformatics: Chatting about polygenic risk scores

Listen here.

Polygenic risk scores (PRS) rely on genome-wide association studies (GWAS) to predict phenotype from genotype. However, prediction accuracy suffers when GWAS from one population are used to calculate PRS in a different population, which is a problem because the majority of GWAS are done on cohorts of European ancestry.

In this episode, Bárbara Bitarello helps us understand how PRS work and why they don’t transfer well across populations.
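To make the idea concrete, here is a minimal sketch of a PRS as a weighted sum of effect-allele counts, with GWAS effect sizes as the weights. The function name, the toy genotypes, and the effect sizes are all made up for illustration and do not come from the episode or from any real GWAS.

```python
import numpy as np

def polygenic_score(dosages, effect_sizes):
    """Return one score per individual: sum over SNPs of dosage * effect size.

    dosages:      (n_individuals, n_snps) effect-allele counts (0, 1 or 2)
    effect_sizes: (n_snps,) per-allele effect estimates taken from a GWAS
    """
    dosages = np.asarray(dosages, dtype=float)
    effect_sizes = np.asarray(effect_sizes, dtype=float)
    return dosages @ effect_sizes

# Toy data (hypothetical): 3 individuals genotyped at 4 SNPs
dosages = np.array([[0, 1, 2, 0],
                    [1, 1, 0, 2],
                    [2, 0, 1, 1]])
betas = np.array([0.10, -0.05, 0.20, 0.03])

scores = polygenic_score(dosages, betas)
```

Because the weights are estimated in one cohort, anything that makes them less accurate in another cohort reduces how well the resulting score predicts the phenotype there.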

Low transferability of height polygenic risk scores in admixed ancestry populations

Abstract

Polygenic risk scores (PRS) summarize the results of GWAS into a single number that can predict quantitative phenotype or disease risk. One barrier to the use of PRS in clinical practice is that the majority of GWAS come from cohorts of European ancestry, and predictive power is lower in non-European ancestry cohorts. There are many possible reasons for this decrease; here we show that differences in allele frequencies, LD patterns, and phenotypic variance across ancestries are unlikely to be driving this pattern. We focus on PRS for height in cohorts with admixed African and European ancestry, which allows us to test for ancestry-related differences in PRS prediction while controlling for environment. We first show that the predictive power of height PRS increases linearly with European ancestry (partial R² ranges from 0.02 to 0.12 for 0-100% European ancestry). We replicate this pattern with effect sizes re-estimated within sibling pairs, ruling out residual population structure. This pattern persists when PRS is computed using subsets of SNPs in regions of both high and low LD, and ancestry-related differences in effect size are not correlated with local recombination rate. This suggests that differences in LD are not a major driver of low transferability. Next, we show that frequency differences of associated variants between African and European ancestry backgrounds explain only up to 11% of the observed reduction in predictive power and that there is no association between ancestry and phenotypic variance, indicating that the reduction in PRS predictive power cannot be explained by causal variants that are specific to the African ancestry background. Finally, we see a modest improvement in prediction when using a multi-PRS approach that includes ancestry-specific effect sizes in the PRS. We conclude that the reduced predictive power in non-European ancestry populations is largely explained by differences in causal effect sizes across these ancestries.
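The partial R² quoted in this abstract measures how much phenotypic variance the PRS explains beyond the covariates. The sketch below uses one standard definition, the proportional drop in residual sum of squares when the PRS is added to a covariate-only linear model. The abstract does not spell out its exact formula, so that choice is an assumption, and the covariates, effect sizes, and simulated data are hypothetical.

```python
import numpy as np

def residual_ss(X, y):
    """Residual sum of squares of an ordinary least-squares fit with intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return float(resid @ resid)

def partial_r2(y, covariates, prs):
    """Share of the covariate-only residual variance removed by adding the PRS."""
    ss_reduced = residual_ss(covariates, y)
    ss_full = residual_ss(np.column_stack([covariates, prs]), y)
    return (ss_reduced - ss_full) / ss_reduced

# Simulated data (hypothetical; not from the study)
rng = np.random.default_rng(0)
n = 2000
covars = rng.normal(size=(n, 2))   # stand-ins for covariates such as age and sex
prs = rng.normal(size=n)
height = covars @ np.array([0.5, -0.3]) + 0.3 * prs + rng.normal(size=n)

pr2 = partial_r2(height, covars, prs)
```

Repeating this calculation within bins of European ancestry proportion is one way to examine the kind of trend the abstract describes.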

Polygenic risk scores perform poorly across populations

Abstract

The vast majority of genome-wide association studies (GWAS) are performed in cohorts of European ancestry. Systematic differences in polygenic risk scores (PRS) between European and non-European ancestry populations are believed to be largely spurious. However, it is not clear whether they are completely inaccurate or how much individual-level predictive power is lost by applying PRS based on European-ancestry GWAS to non-European ancestry populations. Finally, a quantitative understanding of the biological or statistical basis for the poor performance of PRS in non-European-ancestry populations is lacking. To test this, we explored how well PRS predict a well-studied and highly polygenic trait: height. We calculated PRS using 41 sets of independent SNPs based on distance (physical or genetic) or LD clumping and pruning methods. We first compared PRS based on effect sizes from two independent GWAS: GIANT and UK Biobank (UKB), both of which were performed in individuals of European ancestry. We replicate previous observations of population-level differences in PRS, but these results are significantly different depending on datasets and clumping strategies. Depending on clumping strategy, the average difference between 1000 Genomes European and African PRS varies from 0.57 to 6.75 standard deviations (SD) using GIANT and from 0.48 to 2.16 SD using UKB summary statistics. This dependence on clumping strategy supports the idea that most of these differences are spurious. We then investigated individual-level prediction in ~7,300 African American (AA) and ~7,400 European American (EA) individuals. Using UKB effect sizes, we found that PRS explain ~1.7% of height variation in AA individuals, compared to about 5.5% of variation in EA individuals. In both AA and EA, PRS explains more of the variance in height in individuals with a higher proportion of European ancestry. Interestingly, although we find a significant positive correlation (r=0.262, p=7.3e-115) between PRS and European ancestry among AA individuals, the correlation between height and European ancestry is extremely low (r=-0.02, p=0.056) and genome-wide ancestry explains only 0.05% of the variance in height, confirming that cross-population differences in PRS do not correlate with phenotypic differences. Finally, we evaluated whether local ancestry improves prediction for non-European populations, investigated dependence on other genomic features and extended our model to other phenotypes.
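The abstract mentions building sets of independent SNPs by distance. As a rough sketch of what a physical-distance strategy can look like, the greedy procedure below keeps the most significant SNP first and drops any later SNP on the same chromosome within a fixed window of one already kept. The 250 kb window, the SNP identifiers, and the p-values are hypothetical; the authors' actual procedures and parameters are not given here.

```python
def clump_by_distance(snps, window_bp=250_000):
    """Greedy distance-based selection of approximately independent SNPs.

    snps: iterable of (snp_id, chromosome, position_bp, p_value) tuples
    Returns the IDs of retained SNPs, most significant first.
    """
    kept = []
    for snp_id, chrom, pos, p_value in sorted(snps, key=lambda s: s[3]):
        # Keep the SNP only if no retained SNP on the same chromosome
        # lies within window_bp of it.
        if all(c != chrom or abs(pos - q) > window_bp for _, c, q, _ in kept):
            kept.append((snp_id, chrom, pos, p_value))
    return [s[0] for s in kept]

# Toy summary statistics (hypothetical)
toy_snps = [
    ("rs_a", 1, 100_000, 1e-8),
    ("rs_b", 1, 200_000, 1e-12),
    ("rs_c", 1, 900_000, 1e-6),
    ("rs_d", 2, 150_000, 1e-5),
]
selected = clump_by_distance(toy_snps)
```

Changing `window_bp` changes which SNPs enter the score, which illustrates how population-level PRS differences can depend on the clumping strategy.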