J Virol. 2020 Jun 16;94(13):e00014-20.
doi: 10.1128/JVI.00014-20. Print 2020 Jun 16.

Inferring Transmission Bottleneck Size from Viral Sequence Data Using a Novel Haplotype Reconstruction Method

Mahan Ghafari et al. J Virol. 2020.

Abstract

The transmission bottleneck is defined as the number of viral particles that transmit from one host to establish an infection in another. Genome sequence data have been used to evaluate the size of the transmission bottleneck between humans infected with the influenza virus; however, the methods used to make these estimates have some limitations. Specifically, viral allele frequencies, which form the basis of many calculations, may not fully capture a process which involves the transmission of entire viral genomes. Here, we set out a novel approach for inferring viral transmission bottlenecks; our method combines an algorithm for haplotype reconstruction with maximum likelihood methods for bottleneck inference. This approach allows for rapid calculation and performs well when applied to data from simulated transmission events; errors in the haplotype reconstruction step did not adversely affect inferences of the population bottleneck. Applied to data from a previous household transmission study of influenza A infection, we confirm the result that the majority of transmission events involve a small number of viruses, albeit with slightly looser bottlenecks being inferred, with between 1 and 13 particles transmitted in the majority of cases. While influenza A transmission involves a tight population bottleneck, the bottleneck is not so tight as to universally prevent the transmission of within-host viral diversity.

IMPORTANCE Viral populations undergo a repeated cycle of within-host growth followed by transmission. Viral evolution is affected by each stage of this cycle. The number of viral particles transmitted from one host to another, known as the transmission bottleneck, is an important factor in determining how the evolutionary dynamics of the population play out, restricting the extent to which the evolved diversity of the population can be passed from one host to another. Previous study of viral sequence data has suggested that the transmission bottleneck size for influenza A transmission between human hosts is small. Reevaluating these data using a novel and improved method, we largely confirm this result, albeit that we infer a slightly higher bottleneck size in some cases, of between 1 and 13 virions. While a tight bottleneck operates in human influenza transmission, it is not extreme in nature; some diversity can be meaningfully retained between hosts.

Keywords: influenza A; population bottleneck; transmission.

Figures

FIG 1
(A) Simulated system of viral transmission. A population comprising seven viral genotypes transmits to a new host, leading to a population in the recipient which includes six of the seven genotypes. A plot shows the sampled frequencies of the distinct genotypes, or haplotypes, before and after transmission, reported to four significant figures. Our explicit model of viral transmission based on haplotype frequencies (described in the text) infers a population bottleneck of 17 viruses from these data. (B) An alternative analysis of the same population measures allele frequencies from the population before and after the transmission event; these are shown in an equivalent plot. A calculation of the population bottleneck from these data infers a value nearly 2 orders of magnitude larger than that of our previous calculation.
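The difference between the haplotype view in panel A and the allele-frequency view in panel B can be illustrated with a short sketch. Allele frequencies are marginal sums over haplotype frequencies, so information about which variants travel together on the same genome is lost. The haplotype sequences and frequencies below are hypothetical and are not the seven genotypes simulated in the figure; the function name is likewise an illustrative choice.

```python
def allele_frequencies(haplotypes):
    """Collapse haplotype frequencies into per-site allele frequencies.

    haplotypes: dict mapping an equal-length sequence string to its frequency.
    Returns a list with one dict per site, mapping nucleotide -> frequency.
    """
    length = len(next(iter(haplotypes)))
    sites = [dict() for _ in range(length)]
    for sequence, freq in haplotypes.items():
        for pos, base in enumerate(sequence):
            # Each haplotype contributes its whole frequency to the
            # nucleotide it carries at this site.
            sites[pos][base] = sites[pos].get(base, 0.0) + freq
    return sites


# Hypothetical three-site population (an assumption for illustration only).
population = {"AAG": 0.5, "ACG": 0.3, "TCG": 0.2}
per_site = allele_frequencies(population)
```

Here the first site carries A and T at summed frequencies of 0.5 + 0.3 and 0.2, but the per-site table no longer records that T only ever occurs together with C at the second site. That lost linkage is the information a haplotype-based calculation retains.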
FIG 2
(A) Simulated system of viral transmission. A population consists of eight viral segments. For each segment, two haplotypes exist in the pretransmission population at a frequency of exactly 50%. In seven segments, these haplotypes differ by a single genetic variant, while in the eighth, the haplotypes differ by ten genetic variants. Posttransmission, the haplotype frequencies in each of the eight segments are described by eight independent random binomial samples. The 17 allele frequencies are similarly described by 17 random binomial samples, albeit that these statistics are not independent of each other. (B) Inferred population bottlenecks from 5,000 simulations of this transmission process, calculated with haplotype-based and allele frequency-based methods. A method based upon independent transmission of alleles has an increased variance relative to the haplotype-based method. (C) Likelihood function for each model in the case in which transmission results in a 45/55 split in haplotype frequencies in each segment. The black circle and line indicate the correct transmission bottleneck and an analytical confidence interval based upon a window of two likelihood units. The inference in each case is correct, but the allele-frequency method, which treats the allele frequencies as being statistically independent, has a false level of confidence in the inferred value.
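The false confidence described in panel C can be reproduced in a minimal sketch. It assumes a Gaussian approximation to binomial sampling, in which a frequency q transmitted through a bottleneck of N viruses has variance q(1 - q)/N; this approximation, the grid search, and the function names are assumptions made for illustration and are not the likelihood used in the paper. The 45/55 split is scored either as 8 independent haplotype observations or as 17 allele observations wrongly treated as independent.

```python
import math


def log_likelihood(n, q_before, q_after, n_obs):
    """Log likelihood of bottleneck size n under a Gaussian approximation
    to binomial sampling, for n_obs identical observations that are
    treated as statistically independent."""
    var = q_before * (1.0 - q_before) / n
    delta = q_after - q_before
    per_obs = -0.5 * math.log(2.0 * math.pi * var) - delta * delta / (2.0 * var)
    return n_obs * per_obs


def infer_bottleneck(q_before, q_after, n_obs, n_max=2000, window=2.0):
    """Grid search for the maximum likelihood bottleneck, plus the range of
    sizes whose log likelihood lies within `window` units of the maximum."""
    sizes = list(range(1, n_max + 1))
    scores = [log_likelihood(n, q_before, q_after, n_obs) for n in sizes]
    best = max(scores)
    n_hat = sizes[scores.index(best)]
    inside = [n for n, s in zip(sizes, scores) if s >= best - window]
    return n_hat, min(inside), max(inside)


# 8 segments, each observed once as a haplotype frequency.
haplotype_fit = infer_bottleneck(0.5, 0.45, n_obs=8)
# 17 allele frequencies, (incorrectly) treated as independent observations.
allele_fit = infer_bottleneck(0.5, 0.45, n_obs=17)
```

Under this approximation both calls return the same maximum likelihood estimate, because multiplying the log likelihood by the number of observations does not move its peak. The interval from the 17-observation fit is narrower only because linked alleles are counted as if each carried independent evidence.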
FIG 3
Numbers of inferred and correctly inferred haplotypes given simulated sequence data. A total of 6 haplotypes were included in each of 800 simulations tested.
FIG 4
Transmission bottleneck sizes inferred from simulated data using different input data and methodologies. Inferences are shown in color according to the data and method used. Calculations with inferred haplotypes took as input data generated from a haplotype reconstruction method applied to simulated sequence data in which both the haplotypes and their frequencies before and after transmission were inferred. Calculations with the correct haplotypes took as input data from a haplotype reconstruction in which the identities of the correct haplotypes were given, with only their frequencies being inferred. Inferences from the explicit method were only calculated for smaller population bottleneck sizes, as the method does not scale well to evaluating larger bottlenecks. Results from the explicit method were so accurate as to not have a meaningful interquartile range; numbers displayed in these cases indicate the number of inferences giving a precisely correct inference of the population bottleneck. Horizontal dashed lines indicate the simulated bottleneck sizes.
FIG 5
Bottleneck sizes inferred from the data presented in reference . Dots indicate the maximum likelihood bottleneck size inferred for each of the 38 systems in this work for which we were able to infer a bottleneck. Vertical bars represent confidence intervals of 2 log likelihood units from the maximum.
FIG 6
Notation in the transmission model. Transmission of the population qB with bottleneck NT results in the founder population qF. The founder population grows under the influence of genetic drift, the effects of which are described by the effective population size NG. Growth results in the population qA. The populations qB and qA are observed, producing data sets represented by xB and xA, which are used to reconstruct the original populations in terms of haplotypes. In order to calculate the variance of the reconstructed populations q*B and q*A, data sets equivalent to xB and xA, denoted x*B and x*A, are generated and used to infer sets q**B and q**A.
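The chain of populations in this notation can be sketched as a forward simulation. The sketch makes several simplifying assumptions that are not taken from the paper: the bottleneck is a multinomial draw of NT genomes, drift during growth is a single multinomial resampling of size NG, and sequencing is a multinomial sample of a fixed read depth. The names `q_b`, `q_f`, `q_a`, `x_b` and `x_a` mirror qB, qF, qA, xB and xA; `depth` and `seed` are hypothetical parameters.

```python
import random
from collections import Counter


def multinomial_freqs(freqs, n, rng):
    """Draw n items according to freqs and return the sampled frequencies."""
    draws = rng.choices(range(len(freqs)), weights=freqs, k=n)
    counts = Counter(draws)
    return [counts.get(i, 0) / n for i in range(len(freqs))]


def simulate_transmission(q_b, n_t, n_g, depth, seed=0):
    """Simulate one transmission event on haplotype frequencies q_b."""
    rng = random.Random(seed)
    q_f = multinomial_freqs(q_b, n_t, rng)    # founders after the bottleneck
    q_a = multinomial_freqs(q_f, n_g, rng)    # one round of drift (assumption)
    x_b = multinomial_freqs(q_b, depth, rng)  # observed donor sample
    x_a = multinomial_freqs(q_a, depth, rng)  # observed recipient sample
    return {"q_f": q_f, "q_a": q_a, "x_b": x_b, "x_a": x_a}


# Hypothetical donor population of three haplotypes.
result = simulate_transmission([0.5, 0.3, 0.2], n_t=5, n_g=1000, depth=500, seed=1)
```

A haplotype that fails to pass the bottleneck has frequency zero in `q_f` and therefore stays absent from `q_a`, which is how a small NT removes diversity in the recipient.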

References

    1. Sidorenko Y, Reichl U. 2004. Structured model of influenza virus replication in MDCK cells. Biotechnol Bioeng 88:1–14. doi:10.1002/bit.20096.
    2. Zwart MP, Elena SF. 2015. Matters of size: genetic bottlenecks in virus infection and their potential impact on evolution. Annu Rev Virol 2:161–179. doi:10.1146/annurev-virology-100114-055135.
    3. McCrone JT, Woods RJ, Martin ET, Malosh RE, Monto AS, Lauring AS. 2018. Stochastic processes constrain the within and between host evolution of influenza virus. Elife 7:e35962. doi:10.7554/eLife.35962.
    4. Biek R, Pybus OG, Lloyd-Smith JO, Didelot X. 2015. Measurably evolving pathogens in the genomic era. Trends Ecol Evol 30:306–313. doi:10.1016/j.tree.2015.03.009.
    5. Stack JC, Murcia PR, Grenfell BT, Wood JLN, Holmes EC. 2013. Inferring the inter-host transmission of influenza A virus using patterns of intra-host genetic variation. Proc Biol Sci 280:20122173. doi:10.1098/rspb.2012.2173.
