Statistical tests of heterogeneity and bias, in particular publication bias, are very popular in meta-analyses. These tests rely on statistical approaches whose limitations are often not recognized. Moreover, it is often implied, with inappropriate confidence, that these tests can provide reliable answers to questions that are, in essence, not statistical in nature. Statistical heterogeneity is only a correlate of clinical and pragmatic heterogeneity, and the correlation may sometimes be weak. Similarly, statistical signals may hint at bias, but seen in isolation they cannot fully prove or disprove bias in general, let alone specific causes of bias such as publication bias. Both false-positive and false-negative signals of heterogeneity and bias can be common, and their prevalence may be anticipated based on rational considerations. Here I discuss the major common challenges and flaws that emerge in using and interpreting statistical tests of heterogeneity and bias in meta-analyses. I discuss misinterpretations that can occur at the levels of statistical inference, clinical/pragmatic inference, and attribution to specific causes. Suggestions are made on how to avoid these flaws, use these tests properly, and learn from them.
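As a concrete illustration of the kind of statistical heterogeneity test discussed here, the following sketch computes Cochran's Q and the derived I² statistic, two of the most widely used heterogeneity measures in meta-analysis. The study effect sizes and variances are made-up numbers for demonstration only, not data from any real meta-analysis, and the snippet is an illustrative sketch rather than a recommended analysis pipeline.

```python
import math

def cochran_q_and_i2(effects, variances):
    """Return (Q, degrees of freedom, I^2 in %) for study effect
    estimates and their within-study variances.

    Q is the weighted sum of squared deviations of each study's
    estimate from the fixed-effect pooled estimate, with inverse-
    variance weights; I^2 = max(0, (Q - df) / Q) expresses the
    proportion of total variation attributed to heterogeneity.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Hypothetical meta-analysis of 5 studies (e.g., log odds ratios
# with their within-study variances) -- illustrative values only.
effects = [0.10, 0.30, 0.35, 0.65, 1.20]
variances = [0.04, 0.09, 0.05, 0.12, 0.25]
q, df, i2 = cochran_q_and_i2(effects, variances)
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.1f}%")
```

Note that, as the abstract argues, a non-significant Q or a low I² in a handful of small studies is weak evidence of clinical homogeneity, and a significant Q pinpoints no specific cause; the number alone settles neither question.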