Elucidating the role of 8q24 in colorectal cancer
Depth of sequencing coverage in the CG panel was high across each of the target regions, 48×–58× (Fig. ). Concordance between Illumina OmniExpress genotype and sequencing data in 84 samples was 99%; 139 668 SNPs and 16 173 indels and substitutions were catalogued within the 16.2 Mb region. Of these, 96 195 were also present in the 1000 genomes panel, and a further 11 653 were monomorphic in the five GWASs.
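As an illustration of how a concordance figure of this kind can be obtained, the following is a minimal sketch that compares array genotypes with sequencing genotypes site by site. The function name, the 0/1/2 genotype coding, the handling of missing calls and the toy data are all assumptions made for the example; the text does not describe the actual pipeline used.

```python
def genotype_concordance(array_calls, seq_calls):
    """Fraction of sites with identical genotype calls in both data sets.

    Genotypes are coded as alternate-allele counts (0, 1 or 2). None marks
    a missing call; sites missing in either data set are skipped.
    (Coding and missing-data handling are assumptions for this sketch.)
    """
    compared = 0
    matching = 0
    for array_gt, seq_gt in zip(array_calls, seq_calls):
        if array_gt is None or seq_gt is None:
            continue  # cannot compare a site without both calls
        compared += 1
        if array_gt == seq_gt:
            matching += 1
    if compared == 0:
        raise ValueError("no sites with calls in both data sets")
    return matching / compared


# Toy genotypes for one hypothetical sample (not data from the study)
array_calls = [0, 1, 2, 1, None, 0, 2, 1]
seq_calls = [0, 1, 2, 0, 1, 0, 2, None]

concordance = genotype_concordance(array_calls, seq_calls)
print(f"concordance = {concordance:.3f}")
```

In this toy example six sites carry calls in both data sets and five of them agree. In practice the same comparison would be made per sample across all overlapping sites and then summarised over samples.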
We excluded the Xp22.2 locus from the analysis due to the low density of GWAS SNPs on the X chromosome. In total, 44 478 of all variants mapping to the 16 regions of association had frequencies ≥1%, 4859 (11%) of which had not been catalogued by dbSNP132. Figure 2 shows the number and minor allele frequency (MAF) distribution of variants in the 1000 genomes and CG panels. In total, 46 829 of all variants mapping to the 16 regions had frequencies ≥1%, 4658 (10%) of which were not referenced in dbSNP132. In addition to using 1000 genomes data, we made use of deep sequencing (30×) data generated on 253 individuals, 199 of whom had been diagnosed with early-onset CRC (henceforth referred to as the CG panel). To ensure recovery of all variants contributing to CRC risk at these loci through imputation, in addition to utilizing 1000 Genomes Project data (13) as a reference panel, we made use of high-coverage sequencing data on 253 individuals, 199 of whom had familial CRC.
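The frequency threshold and the novelty percentages quoted above can be illustrated with a short sketch. The variant records, the field names (`id`, `ac`, `an`) and the set of known identifiers below are hypothetical; only the 1% threshold and the counts in the final two lines are taken from the text.

```python
def minor_allele_frequency(alt_allele_count, n_chromosomes):
    """MAF: the smaller of the alternate and reference allele frequencies."""
    freq = alt_allele_count / n_chromosomes
    return min(freq, 1 - freq)


def summarise_novel(variants, known_ids, maf_threshold=0.01):
    """Count variants at or above the MAF threshold, and how many are novel.

    `variants` is a list of dicts with hypothetical keys 'id', 'ac'
    (alternate allele count) and 'an' (chromosomes genotyped).
    Returns (n_passing, n_novel).
    """
    passing = [
        v for v in variants
        if minor_allele_frequency(v["ac"], v["an"]) >= maf_threshold
    ]
    novel = [v for v in passing if v["id"] not in known_ids]
    return len(passing), len(novel)


# Toy variants in a panel of 253 individuals (506 chromosomes); not study data
toy_variants = [
    {"id": "var1", "ac": 3, "an": 506},    # MAF below 1%, filtered out
    {"id": "var2", "ac": 6, "an": 506},    # MAF just above 1%
    {"id": "var3", "ac": 500, "an": 506},  # alternate allele is the major allele
    {"id": "var4", "ac": 100, "an": 506},
    {"id": "var5", "ac": 1, "an": 506},    # singleton, filtered out
]
known_ids = {"var2", "var4"}  # stand-in for identifiers already in a database

n_passing, n_novel = summarise_novel(toy_variants, known_ids)
print(n_passing, n_novel)

# Arithmetic check of the percentages reported in the text
pct_first = round(100 * 4859 / 44478)   # 11
pct_second = round(100 * 4658 / 46829)  # 10
```

Taking the minimum of the two allele frequencies ensures that a variant whose alternate allele is the common one (such as `var3`) is still classified by the frequency of its rarer allele.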
We studied five non-overlapping case–control series of Northern European ancestry, which post-QC provided GWAS data on 5626 CRC cases and 7817 controls.
The 1000 genomes Phase I Interim reference panel, based on low-coverage (4–6×) sequencing of 1094 individuals from Africa (AFR; n = 181), catalogued 203 047 SNPs mapping to the 16.2 Mb region.
A total of 92 095 SNPs were monomorphic in all five GWASs.
To enhance our ability to discover low-frequency risk variants, in addition to using 1000 Genomes Project data as a reference panel, we made use of high-coverage sequencing data on 253 individuals, 199 with early-onset familial CRC.
For 13 of the regions, it was possible to refine the association signal, identifying a smaller region of interest likely to harbour the functional variant.
Approximately 86% of these variants were shared by both reference panels.