Curr Protoc Bioinformatics. 2014 Sep 8;47:11.12.1-34.
doi: 10.1002/0471250953.bi1112s47.

BEDTools: The Swiss-Army Tool for Genome Feature Analysis

Aaron R Quinlan. Curr Protoc Bioinformatics. 2014.

Abstract

Technological advances have enabled the use of DNA sequencing as a flexible tool to characterize genetic variation and to measure the activity of diverse cellular phenomena such as gene isoform expression and transcription factor binding. Extracting biological insight from the experiments enabled by these advances demands the analysis of large, multi-dimensional datasets. This unit describes the use of the BEDTools toolkit for the exploration of high-throughput genomics datasets. Several protocols are presented for common genomic analyses, demonstrating how simple BEDTools operations may be combined to create bespoke pipelines addressing complex questions.

Keywords: bioinformatics; genome analysis; genome features; genome intervals; genomics.

Figures

Figure 1. Examples of genome arithmetic operations
Each tool in the BEDTools suite performs a relatively simple operation on one, a pair, or multiple genome interval datasets. The examples presented here reflect genome arithmetic operations on two genome interval files (green and blue). For example, if blue intervals represent gene annotations and green intervals represent DNA sequence alignments, then the result of the intersect tool (gray interval) represents the genomic interval that is shared between a single gene annotation and sequence alignment.
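
The intersect operation described in this caption can be illustrated with a minimal Python sketch. It is not the BEDTools implementation; the `intersect` function, the tuple layout, and the example coordinates are assumptions made for illustration, with intervals treated as 0-based and half-open, as in the BED format.

```python
from typing import Optional, Tuple

# (chrom, start, end); assumed 0-based, half-open coordinates as in BED
Interval = Tuple[str, int, int]


def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the sub-interval shared by a and b, or None if they do not overlap."""
    if a[0] != b[0]:
        return None  # intervals on different chromosomes never overlap
    start = max(a[1], b[1])
    end = min(a[2], b[2])
    if start >= end:
        return None  # disjoint or merely adjacent intervals share no bases
    return (a[0], start, end)


gene = ("chr1", 100, 200)        # hypothetical gene annotation
alignment = ("chr1", 150, 250)   # hypothetical sequence alignment
print(intersect(gene, alignment))  # ('chr1', 150, 200)
```

The returned interval corresponds to the gray interval in the figure: the bases covered by both the annotation and the alignment.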
Figure 2. BEDTools scalability
Runtime in seconds (left panel) and memory usage (right panel) are compared for unsorted genome intervals (dashed red) and for genome intervals that have been pre-sorted in genome order (solid red). For reference, BEDTools performance is shown against the BEDOPS toolset, both with (solid gray) and without (dashed gray) automatic error checking.
Figure 3
Histogram of genome-wide sequencing coverage for NA19146.
Figure 4
Cumulative distribution of sequencing coverage observed among all exome-targeted bases for NA12891.
Figure 5. Schematic of the BEDTools map tool’s functionality
The map tool summarizes the overlaps observed between two interval files, A and B. The result is a summary of all intervals in B that overlapped each interval in A. The summary is computed based on operations such as mean or max, which compute the average or maximum value for a given column from all of the intersecting B intervals.
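
The summarization described in this caption can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the BEDTools algorithm: `map_intervals`, the `OPERATIONS` table, and the example data are hypothetical, B intervals are assumed to carry a single numeric score, and an A interval with no overlapping B interval is reported as `None`.

```python
from statistics import mean
from typing import List, Optional, Tuple

Interval = Tuple[str, int, int]       # (chrom, start, end), half-open
Scored = Tuple[str, int, int, float]  # B interval plus the column to summarize

# Only the two operations named in the caption are sketched here.
OPERATIONS = {"mean": mean, "max": max}


def map_intervals(
    a: List[Interval], b: List[Scored], op: str = "mean"
) -> List[Tuple[Interval, Optional[float]]]:
    """For each A interval, summarize the scores of all overlapping B intervals."""
    summarize = OPERATIONS[op]
    results = []
    for chrom, start, end in a:
        values = [
            score
            for b_chrom, b_start, b_end, score in b
            if b_chrom == chrom and b_start < end and start < b_end
        ]
        results.append(((chrom, start, end), summarize(values) if values else None))
    return results


a = [("chr1", 0, 100), ("chr1", 200, 300)]
b = [("chr1", 10, 20, 5.0), ("chr1", 50, 150, 15.0), ("chr1", 400, 500, 7.0)]
print(map_intervals(a, b, "mean"))  # [(('chr1', 0, 100), 10.0), (('chr1', 200, 300), None)]
print(map_intervals(a, b, "max"))   # [(('chr1', 0, 100), 15.0), (('chr1', 200, 300), None)]
```

The nested scan is quadratic and chosen only for readability; it makes no attempt to exploit sorted input.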
Figure 6. Transcription factor binding occupancy at transcription start sites
Depicted are the consensus binding profiles of the Sp1 transcription factor (red) and a reverse cross-linked negative control (gray) as observed among all transcription start sites (TSS) in the human genome.
Figure 7
Histogram of the fraction of DNase I hypersensitivity sites common to the 20 cell types compared.
Figure 8. Schematic of the BEDTools jaccard tool’s functionality
See main text for details.
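
As a rough illustration of the statistic behind this tool, the Jaccard index of two interval sets can be taken as the number of base pairs covered by both sets divided by the number covered by either set. The Python sketch below assumes that definition; it is not the BEDTools implementation. The helper names `merge`, `total_length`, and `jaccard` are hypothetical, and all intervals are assumed to lie on a single chromosome.

```python
from typing import List, Sequence, Tuple

Span = Tuple[int, int]  # (start, end), half-open, single chromosome assumed


def merge(intervals: Sequence[Span]) -> List[List[int]]:
    """Collapse overlapping or touching intervals into disjoint spans."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def total_length(intervals: Sequence[Sequence[int]]) -> int:
    """Sum the lengths of a set of disjoint spans."""
    return sum(end - start for start, end in intervals)


def jaccard(a: Sequence[Span], b: Sequence[Span]) -> float:
    """Shared base pairs divided by base pairs covered by either set."""
    a_len = total_length(merge(a))
    b_len = total_length(merge(b))
    union_len = total_length(merge(list(a) + list(b)))
    # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
    intersection_len = a_len + b_len - union_len
    return intersection_len / union_len if union_len else 0.0


print(jaccard([(0, 100)], [(50, 150)]))  # 50 shared bp out of 150 covered bp
```

Identical sets score 1.0 and disjoint sets score 0.0, which is what makes the statistic usable as a pairwise similarity measure across samples.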
Figure 9
Heatmap of Jaccard similarities observed for 20 fetal tissue samples based upon DNase I hypersensitivity profiles.
