Differential Dropout among SNP Genotypes and Impacts on Association Tests
Publisher: Karger
E-ISSN: 1423-0062
ISSN: 0001-5652
Source: Human Heredity, Vol. 63, Iss. 3-4, 2007-03, pp. 219-228
Abstract
Background: Current biotechnologies are able to achieve high accuracy and call rates. However, concerns have been raised about how differential performance on various genotypes may bias association tests. Quantitatively, we define the differential dropout rate as the ratio of the no-call rate among heterozygotes to that among homozygotes. Methods: The hazard of differential dropout is examined for population- and family-based association tests through a simulation study. We also investigate detection approaches such as testing for Hardy-Weinberg equilibrium (HWE) and testing for correlation between sample call rate and sample heterozygosity. Finally, we analyze two public datasets and evaluate the magnitude of differential dropout in each. Results: In case-control settings, differential dropout has a negligible effect on power and odds ratio (OR) estimation. However, the impact on family-based tests ranges from minor to severe depending on the disease parameters. The impact is most prominent when the disease allele frequency is relatively low (e.g., 5%), where a differential dropout rate of 2.5 can dramatically bias OR estimation and reduce power even at a decent 98% overall call rate and a moderate effect size (e.g., ORtrue = 2.11). Both public datasets follow HWE; however, the HapMap data carry detectable differential dropout that may endanger family-based studies. Conclusions: The case-control approach appears to be robust to differential dropout; however, family-based association tests can be heavily biased. Both public genotype datasets show high call rates, but differential dropout is detected in the HapMap data. We suggest that researchers carefully control for this potential confounder even when using data of high accuracy and high overall call rate.
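To make the definition of the differential dropout rate concrete, the following Python sketch splits an overall no-call rate into heterozygote and homozygote no-call rates and shows how the genotype frequencies among called samples shift. It is an illustration only, not the authors' simulation code: the function names are hypothetical, the population is assumed to be in HWE, and both homozygote classes are assumed to share one no-call rate.

```python
def genotype_dropout_rates(allele_freq, overall_call_rate, dropout_ratio):
    """Return (het_no_call_rate, hom_no_call_rate).

    Assumptions (not from the source): genotypes are in HWE, and both
    homozygote classes have the same no-call rate.
    """
    p = allele_freq
    het_freq = 2 * p * (1 - p)  # HWE heterozygote frequency
    overall_no_call = 1 - overall_call_rate
    # overall_no_call = hom_rate * (1 - het_freq) + dropout_ratio * hom_rate * het_freq
    hom_rate = overall_no_call / (1 + (dropout_ratio - 1) * het_freq)
    het_rate = dropout_ratio * hom_rate
    return het_rate, hom_rate


def observed_genotype_freqs(allele_freq, het_rate, hom_rate):
    """Genotype frequencies among successfully called samples."""
    p = allele_freq
    q = 1 - p
    # Frequency of each genotype that survives calling
    called = {
        "AA": p * p * (1 - hom_rate),
        "Aa": 2 * p * q * (1 - het_rate),
        "aa": q * q * (1 - hom_rate),
    }
    total = sum(called.values())  # equals the overall call rate
    return {genotype: freq / total for genotype, freq in called.items()}


# Scenario from the abstract: 5% allele frequency, 98% overall call rate,
# differential dropout rate of 2.5.
het_rate, hom_rate = genotype_dropout_rates(0.05, 0.98, 2.5)
print(het_rate, hom_rate)
print(observed_genotype_freqs(0.05, het_rate, hom_rate))
```

Under these assumptions, a 2% overall no-call rate corresponds to roughly 4.4% of heterozygotes and 1.75% of homozygotes being dropped, so heterozygotes are under-represented among the called genotypes even though the overall call rate looks high.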