Choosing Haplotype-Tagging SNPs Based on Unphased Genotype Data Using a Preliminary Sample of Unrelated Subjects with an Example from the Multiethnic Cohort Study

Publisher: Karger

E-ISSN: 1423-0062

ISSN: 0001-5652

Source: Human Heredity, Vol. 55, Iss. 1, 2003-08, pp. 27-36

Abstract

We describe an approach for selecting haplotype-tagging single nucleotide polymorphisms (htSNPs) that is currently being used in two large nested case-control studies within a multiethnic cohort (MEC). These studies are searching for associations between the risk of prostate and breast cancer and common genetic variation in candidate genes. Using a preliminary sample of 70 control subjects chosen at random from each of the 5 ethnic groups in the MEC, we estimate haplotype frequencies with a variant of the Excoffier-Slatkin E-M algorithm after genotyping a high density of SNPs, selected every 3–5 kb in and around a candidate gene. To evaluate the performance of a candidate set of htSNPs (which will be genotyped in the much larger case-control sample), we treat the haplotype frequencies estimated above as known and formally calculate the uncertainty in the number of copies of each common haplotype carried by an individual, summarizing this calculation as a coefficient of determination, $R_h^2$. A candidate set of htSNPs of a given size is chosen so as to maximize the minimum value of $R_h^2$ over the common haplotypes, h.
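The selection criterion in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions the abstract does not state: haplotypes pair at random (Hardy-Weinberg equilibrium), SNPs are biallelic and coded as strings of 0s and 1s, the haplotype frequencies are already available (the E-M estimation step is omitted), and "common" means a frequency of at least 5%. The names `r2_h`, `choose_htsnps`, and `common` are hypothetical. Here $R_h^2$ is computed as the variance of the expected copy number of haplotype h given the unphased htSNP genotype, divided by the total variance of the copy number, $2p_h(1-p_h)$.

```python
from collections import defaultdict
from itertools import combinations, product


def r2_h(hap_freqs, snps, target):
    """R^2 between the true copy number (0, 1 or 2) of `target` and its
    expectation given the unphased genotype at positions `snps`.
    Assumes random pairing of haplotypes and 0 < frequency of target < 1."""
    p = hap_freqs[target]
    prob = defaultdict(float)      # P(genotype)
    weighted = defaultdict(float)  # sum of P(pair) * copies of target
    for (h1, f1), (h2, f2) in product(hap_freqs.items(), repeat=2):
        w = f1 * f2
        if w == 0:
            continue
        # Unphased genotype: allele count (0, 1, 2) at each chosen SNP.
        g = tuple(int(h1[i]) + int(h2[i]) for i in snps)
        prob[g] += w
        weighted[g] += w * ((h1 == target) + (h2 == target))
    # Var(E[copies | G]) = sum_G P(G) * E[copies | G]^2 - (2p)^2
    var_expected = sum(weighted[g] ** 2 / prob[g] for g in prob) - (2 * p) ** 2
    return var_expected / (2 * p * (1 - p))


def choose_htsnps(hap_freqs, n_snps, size, common=0.05):
    """Exhaustively pick the SNP subset of the given size that maximizes
    the minimum R^2_h over haplotypes with frequency >= `common`
    (the 5% default is an illustrative assumption)."""
    common_haps = [h for h, f in hap_freqs.items() if f >= common]

    def worst(snps):
        return min(r2_h(hap_freqs, snps, h) for h in common_haps)

    best = max(combinations(range(n_snps), size), key=worst)
    return best, worst(best)


if __name__ == "__main__":
    # Made-up frequencies for four haplotypes over three SNPs.
    freqs = {"000": 0.4, "011": 0.3, "110": 0.2, "101": 0.1}
    for k in (2, 3):
        print(k, choose_htsnps(freqs, n_snps=3, size=k))
```

The exhaustive search over subsets is only practical for a small number of SNPs; with dense genotyping, a greedy or otherwise restricted search would be needed, which this sketch does not attempt.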