Chromosome-Conformation-Capture-Carbon-Copy (5C) is a molecular technology based on proximity ligation that enables high-resolution and high-coverage interrogation of long-range looping interactions. Computational pipelines for analyzing 5C data involve a series of interdependent normalization procedures and statistical methods that markedly influence downstream biological results. A detailed analysis of the trade-offs inherent to all stages of 5C data analysis has not been reported. Here, we provide a comparative assessment of method performance at each step in the 5C analysis pipeline, including sequencing depth and library complexity correction, bias mitigation, spatial noise reduction, estimation of the distance-dependent expected signal and its variance, statistical modeling, and loop detection. We discuss methodological advantages and disadvantages at each step and provide a full suite of algorithms, lib5C, to allow investigators to test the range of approaches on their own 5C data. Principles learned from our comparative analyses can be applied to data from protein-independent proximity ligation-based methods, including Hi-C, 4C, and Capture-C.
Keywords: 5C; chromosome conformation capture; epigenetics; higher-order genome architecture; looping interactions; statistical modeling.
Copyright © 2019 Elsevier Inc. All rights reserved.