An improved PSO algorithm for generating protective SNP barcodes in breast cancer

PLoS One. 2012;7(5):e37018. doi: 10.1371/journal.pone.0037018. Epub 2012 May 18.

Abstract

Background: Possible single nucleotide polymorphism (SNP) interactions in breast cancer are usually not investigated in genome-wide association studies. Previously, we proposed a particle swarm optimization (PSO) method to compute these kinds of SNP interactions. However, this PSO is not guaranteed to find the best result in every run, especially when high-dimensional data are investigated for SNP-SNP interactions.

Methodology/principal findings: In this study, we propose an improved PSO (IPSO) algorithm to increase the reliability of PSO in identifying the best protective SNP barcodes (SNP combinations and genotypes with the maximum difference between cases and controls) associated with breast cancer. SNP barcodes containing different numbers of SNPs were computed. At each processing step, the top five SNP barcode results are retained and used to compute the next SNP barcode, which contains one additional SNP. The performance of the proposed IPSO algorithm is evaluated on simulated data for 23 SNPs in six steroid hormone metabolism and signalling-related genes. Among the 23 SNPs, 13 displayed significant odds ratio (OR) values (1.268 to 0.848; p<0.05) for breast cancer. With the IPSO algorithm, the joint effects of SNP barcodes with two to seven SNPs show significantly decreasing OR values (0.84 to 0.57; p<0.05 to 0.001). With the PSO algorithm, only barcodes with two to four SNPs show significantly decreasing OR values (0.84 to 0.77; p<0.05 to 0.001). Based on the results of 20 simulations, the medians of the maximum differences for each SNP barcode generated by IPSO are higher than those generated by PSO. The interquartile ranges of the boxplots, as well as the upper and lower hinges, for each n-SNP barcode (n = 3 to 10) are narrower for IPSO than for PSO, suggesting that IPSO is highly reliable for SNP barcode identification.
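The two quantities the abstract relies on, the case-control difference for a SNP barcode and its odds ratio, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the encoding of each sample as a tuple of genotype codes, and the sign convention (controls carrying the barcode minus cases carrying it, so that a protective barcode scores positive) are assumptions made for illustration only. The PSO/IPSO search itself is not shown.

```python
def barcode_difference(cases, controls, snps, genotypes):
    """Count carriers of a SNP barcode in each group.

    Hypothetical helper: `snps` are column indices and `genotypes` the
    genotype code required at each of those columns. Returns
    (controls_with - cases_with, cases_with, controls_with).
    """
    def carries(sample):
        return all(sample[s] == g for s, g in zip(snps, genotypes))

    cases_with = sum(carries(sample) for sample in cases)
    controls_with = sum(carries(sample) for sample in controls)
    return controls_with - cases_with, cases_with, controls_with


def odds_ratio(cases_with, cases_total, controls_with, controls_total):
    """Odds ratio from a 2x2 table: (a * d) / (b * c).

    a, b = cases with / without the barcode;
    c, d = controls with / without the barcode.
    """
    a = cases_with
    b = cases_total - cases_with
    c = controls_with
    d = controls_total - controls_with
    if b == 0 or c == 0:
        raise ValueError("odds ratio is undefined when b or c is zero")
    return (a * d) / (b * c)


# Toy data (invented): each sample holds genotype codes for two SNPs.
cases = [(1, 0), (0, 1), (1, 1), (2, 0)]
controls = [(1, 0), (1, 0), (1, 0), (0, 1)]

# Barcode: SNP 0 has genotype 1 and SNP 1 has genotype 0.
diff, n_case, n_ctrl = barcode_difference(cases, controls, (0, 1), (1, 0))
ratio = odds_ratio(n_case, len(cases), n_ctrl, len(controls))
print(diff, ratio)
```

Under these assumptions, an OR below 1 means the barcode is more common among controls than cases, which is the sense in which the abstract calls a barcode protective. A search procedure such as PSO would treat the difference as the fitness to maximise over candidate SNP and genotype combinations.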

Conclusions/significance: Overall, the proposed IPSO algorithm provides robust and exact identification of the best protective SNP barcodes for breast cancer.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Breast Neoplasms / diagnosis*
  • Breast Neoplasms / genetics*
  • Computer Simulation
  • Female
  • Genotype
  • Gonadal Steroid Hormones / metabolism
  • Humans
  • Metabolic Networks and Pathways / genetics
  • Multilocus Sequence Typing / methods*
  • Odds Ratio
  • Polymorphism, Single Nucleotide / genetics*

Substances

  • Gonadal Steroid Hormones