We review several popular statistical models for correlated binary outcomes, present maximum likelihood estimates of the model parameters, and discuss model selection using a variety of goodness-of-fit test statistics. We apply computationally efficient bootstrap strategies to evaluate the performance of these statistics and observe that their power and type I error rate generally depend on the model under investigation. Our simulation results show that the choice of goodness-of-fit statistic matters, especially when the sample is small and the outcomes are highly correlated. Two biomedical applications are included.