BMC Med Genomics. 2018 Nov 20;11(Suppl 5):105.
doi: 10.1186/s12920-018-0416-0.

Bi-stream CNN Down Syndrome Screening Model Based on Genotyping Array

Free PMC article

Bing Feng et al. BMC Med Genomics.

Abstract

Background: Human Down syndrome (DS) is usually caused by genomic micro-duplications and dosage imbalances of human chromosome 21. It is associated with many genomic and phenotypic abnormalities. Although human DS occurs in about 1 per 1,000 births worldwide, which is a very high rate, researchers have not found any effective method to cure it. Currently, the most efficient means of preventing human DS are screening and early detection.

Methods: In this study, we applied deep learning techniques to a set of Illumina genotyping array data. We built a bi-stream convolutional neural network (CNN) model to screen for and predict the occurrence of DS. First, we built image input data by converting the intensities of each SNP site into chromosome SNP maps. Next, we proposed a bi-stream CNN architecture with nine layers and two branch models. We then merged the two CNN branch models into one model in the fourth convolutional layer and output the prediction in the last layer.
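The conversion of SNP intensities into a chromosome SNP map can be illustrated with a minimal Python sketch. It follows the layout described in the Fig. 1 caption (one column per gene, rows holding adjacent SNP sites within that gene). The function name `build_snp_map`, the record format, the gene names, the intensity values, and the zero padding of genes with fewer SNP sites are all assumptions made for illustration; the abstract does not specify them, so this is not the authors' implementation.

```python
def build_snp_map(records, pad_value=0.0):
    """Arrange SNP intensities into a 2D map: one column per gene,
    rows are adjacent SNP sites within that gene.

    records: iterable of (gene, position, intensity) tuples (hypothetical format).
    pad_value: filler for genes with fewer SNPs (an assumption, not from the paper).
    """
    # Group SNPs by gene.
    genes = {}
    for gene, position, intensity in records:
        genes.setdefault(gene, []).append((position, intensity))
    if not genes:
        return [], []

    # Within each gene, order SNPs by chromosomal position.
    for snps in genes.values():
        snps.sort()

    # Order genes (columns) by the position of their first SNP.
    gene_order = sorted(genes, key=lambda g: genes[g][0][0])
    n_rows = max(len(genes[g]) for g in gene_order)

    # snp_map[row][col] is the intensity of the row-th SNP in the col-th gene.
    snp_map = [[pad_value] * len(gene_order) for _ in range(n_rows)]
    for col, gene in enumerate(gene_order):
        for row, (_, intensity) in enumerate(genes[gene]):
            snp_map[row][col] = intensity
    return gene_order, snp_map


# Toy input with invented gene names and intensities.
toy_records = [
    ("GENE_B", 300, 0.9),
    ("GENE_A", 100, 0.2),
    ("GENE_A", 150, 0.4),
    ("GENE_B", 350, 0.7),
    ("GENE_B", 320, 0.8),
]
order, snp_map = build_snp_map(toy_records)
for row in snp_map:
    print(row)
```

In a bi-stream setting, two such maps (one per input chromosome map) would each be passed to its own CNN branch before the branches are merged.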

Results: Our bi-stream CNN model achieved an average accuracy of 99.3%, with very low false-positive and false-negative rates, which is necessary for further applications in disease prediction and medical practice. We also visualized the feature maps and learned filters from intermediate convolutional layers, which showed the genomic patterns and correlated SNP variations in human DS genomes. We compared our method with other CNN and traditional machine learning models, and analyzed and discussed the characteristics and strengths of our bi-stream CNN model.
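For readers unfamiliar with the metrics named above, the following sketch shows how accuracy, false-positive rate, and false-negative rate are conventionally computed from a binary confusion matrix. The function name `screening_metrics` and the counts in the usage example are invented for illustration and are not the study's data or evaluation code.

```python
def screening_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fn: DS samples predicted as DS / as non-DS.
    tn/fp: non-DS samples predicted as non-DS / as DS.
    """
    def ratio(numerator, denominator):
        # Return NaN rather than raising when a class has no samples.
        return numerator / denominator if denominator else float("nan")

    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "false_negative_rate": ratio(fn, fn + tp),
    }


# Invented counts, for illustration only.
metrics = screening_metrics(tp=48, fp=1, tn=50, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In a screening context the false-negative rate is usually the more costly error, since a missed case receives no follow-up testing.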

Conclusions: Our bi-stream model used two branch CNN models to learn the local genome features and regional patterns among adjacent genes and SNP sites from two chromosomes simultaneously. It achieved the best performance on all evaluation metrics when compared with two single-stream CNN models and three traditional machine-learning algorithms. The visualized feature maps also provided opportunities to study the genomic markers and pathway components associated with human DS, which provided insights for the development of gene therapy and genomic medicine.

Keywords: Convolutional neural networks; Deep learning; Genotyping; Human Down syndrome.

Conflict of interest statement

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Figures

Fig. 1
Chromosome SNP maps representing the intensities of all SNP sites on HSA21. Each column represents the information of a single gene located on the chromosome. Each row represents adjacent SNP sites within the same gene. Therefore, each pixel of the chromosome SNP map represents the intensity of one SNP site of a gene.
Fig. 2
Bi-stream CNN architecture taking two chromosome SNP maps as inputs. The upper and lower CNN branch models each take one chromosome SNP map as the input image. We merged the two branch CNN models into one CNN model in the fourth convolutional layer C4, which was also followed by a max-pooling layer. Detailed CNN architecture construction and configurations are available in the Method section.
Fig. 3
Visualization of feature maps and trained filter weights from convolutional layer C1 (shown in Fig. 2). Panels a, b, c and d in figure (a) show four feature maps from convolutional layer C1 of the lower branch CNN model (shown in Fig. 2). Panels e, f, g and h in figure (a) are the corresponding 3x3 filter weights for panels a, b, c and d. Panels a, b, c and d in figure (b) show four feature maps from convolutional layer C1 of the upper branch CNN model. Panels e, f, g and h in figure (b) are the corresponding 3x3 filter weights for panels a, b, c and d.
Fig. 4
Detailed configurations and structures for each layer of the bi-stream CNN DS prediction/screening model
