Genomic duplications are important sources of structural change and gene innovation. In humans, the most recent duplications with high sequence identity (>90% identity, >1 kb long) are known as segmental duplications (SDs). Single-nucleotide variants (SNVs), including single-nucleotide polymorphisms, within SDs have not been systematically assessed because short-read sequencing data are difficult to map unambiguously within duplicated regions. The SNV rs62486260 was flagged in a study of familial renal stone disease, but it was unclear whether it was a real variant or an artifact resulting from the presence of an SD. We describe in silico and wet-lab approaches to investigate this, using segment-specific long-PCR assays followed by short PCR for Sanger sequencing. We conclude that rs62486260 is an artifact. Our approach can be generalized to other apparent variants that lie within duplicated regions.
Keywords: PCR; SNV; Sanger sequencing; T2T; TCAF2; genotype; in silico; pseudogene; segmental duplication.
The method described is a two-step procedure for determining whether an apparent single-nucleotide polymorphism may be an artifact resulting from the presence of a duplicated genomic region or pseudogene. Step 1 involves identifying sequence differences between the two duplicated regions and designing a long-PCR assay that specifically amplifies each region separately. Step 2 involves amplifying a short PCR product that spans the single-nucleotide polymorphism of interest, using the long-PCR products generated in step 1 as template.
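
The in silico part of step 1 can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes the two duplicated regions have already been aligned to equal length by some other tool, and the function names, the toy sequences, and the 5-base window at the primer 3' end are all hypothetical choices made for illustration. It lists the alignment columns where the two copies differ, then checks whether a candidate forward primer has at least one such difference near its 3' end, which is one common heuristic for making a primer specific to a single copy.

```python
from typing import List, Tuple

Diff = Tuple[int, str, str]  # (alignment column, base in copy A, base in copy B)


def paralog_differences(aligned_a: str, aligned_b: str) -> List[Diff]:
    """List every alignment column where the two duplicated copies differ.

    Both inputs must be rows of the same pairwise alignment (equal length).
    Gap characters are compared like any other symbol, so an indel column
    is reported as a difference.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    return [(col, a, b) for col, (a, b) in enumerate(pairs) if a != b]


def primer_is_copy_specific(primer_start: int, primer_len: int,
                            diffs: List[Diff], window: int = 5) -> bool:
    """Return True if a forward primer has a copy-specific base near its 3' end.

    primer_start is the 0-based alignment column of the primer's 5' base.
    window is an assumed heuristic: the number of 3'-terminal bases in which
    a mismatch against the other copy is expected to hinder extension.
    """
    three_prime = primer_start + primer_len - 1
    return any(three_prime - window < col <= three_prime for col, _, _ in diffs)


if __name__ == "__main__":
    # Toy sequences invented for illustration; they are not taken from any real locus.
    copy_a = "ACGTACGTTAGCCGATAGCTAGGCTA"
    copy_b = "ACGTACGTTAGCCGTTAGCTAGGCTA"
    diffs = paralog_differences(copy_a, copy_b)
    print("differences:", diffs)
    # A 15-base primer starting at column 0 ends exactly on the differing column.
    print("primer specific:", primer_is_copy_specific(0, 15, diffs))
```

In practice the candidate primers would still need to be checked for melting temperature, secondary structure, and off-target matches elsewhere in the genome, none of which this sketch attempts.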