We argue that published results demonstrate that new insights into human brain function may be obscured by poor and/or limited choices in the data-processing pipeline, and review the work on performance metrics for optimizing pipelines: prediction, reproducibility, and related empirical Receiver Operating Characteristic (ROC) curve metrics. Using the NPAIRS split-half resampling framework for estimating prediction/reproducibility metrics (Strother et al., 2002), we illustrate its use by testing the relative importance of selected pipeline components (interpolation, in-plane spatial smoothing, temporal detrending, and between-subject alignment) in a group analysis of BOLD-fMRI scans from 16 subjects performing a block-design, parametric-static-force task. Large-scale brain networks were detected using a multivariate linear discriminant analysis (canonical variates analysis, CVA) that was tuned to fit the data. We found that tuning the CVA model and spatial smoothing were the most important processing parameters. Temporal detrending was essential to remove low-frequency, reproducing time trends; the number of cosine basis functions for detrending was optimized by assuming that separate epochs of baseline scans have constant, equal means, and this assumption was assessed with prediction metrics. Higher-order polynomial warps compared to affine alignment had only a minor impact on the performance metrics. We found that both prediction and reproducibility metrics were required for optimizing the pipeline and give somewhat different results. Moreover, the parameter settings of components in the pipeline interact so that the current practice of reporting the optimization of components tested in relative isolation is unlikely to lead to fully optimized processing pipelines.