Several scores based on symptoms and signs have been developed to assess the presence of heart failure. The goal of this study was to compare six heart failure scores in non-hospitalised subjects and to determine their usefulness in population-based research. The scores were applied to 54 participants of a population-based study. All underwent a complete medical examination, including chest X-ray, electrocardiography and Doppler echocardiography. Using all available information, a cardiologist, unaware of the results of the scores, clinically classified participants as having no, possible or definite heart failure. Sensitivity, specificity, predictive values and receiver operating characteristic (ROC) curves were calculated, using the cardiologist's assessment as the gold standard. The cardiologist judged definite or possible heart failure to be present in 17 persons. All scores had a high sensitivity for the detection of definite heart failure, whereas the score from the study of men born in 1913 and Walma's score had the highest sensitivity for the combination of possible and definite heart failure. Gheorghiade's score and the Boston score had the highest positive predictive values. In conclusion, five of the six scores we studied performed broadly similarly in the detection of heart failure. The score from the study of men born in 1913 relies heavily on the assessment of dyspnea, resulting in a relatively large number of false positives. Although the scores are useful in detecting manifest heart failure, objective measurements of cardiac function appear necessary to reduce the false positive rate and to detect early stages of heart failure accurately.
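
The test characteristics named above follow from a 2x2 table that cross-classifies a score's result against the cardiologist's assessment. The Python sketch below is illustrative only and is not the authors' analysis code: the function names (`diagnostic_metrics`, `roc_auc`) are hypothetical, and the counts in the usage note are invented for demonstration and are not results from this study. The area under the ROC curve is computed here as the proportion of diseased/non-diseased pairs in which the diseased subject has the higher score, with ties counted as one half.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class DiagnosticMetrics:
    sensitivity: float  # TP / (TP + FN)
    specificity: float  # TN / (TN + FP)
    ppv: float          # positive predictive value, TP / (TP + FP)
    npv: float          # negative predictive value, TN / (TN + FN)


def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> DiagnosticMetrics:
    """Test characteristics from a 2x2 table against a reference standard."""
    if min(tp, fp, fn, tn) < 0:
        raise ValueError("counts must be non-negative")
    if 0 in (tp + fn, tn + fp, tp + fp, tn + fn):
        raise ValueError("every row and column total must be non-zero")
    return DiagnosticMetrics(
        sensitivity=tp / (tp + fn),
        specificity=tn / (tn + fp),
        ppv=tp / (tp + fp),
        npv=tn / (tn + fn),
    )


def roc_auc(scores_diseased: Sequence[float],
            scores_healthy: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    if not scores_diseased or not scores_healthy:
        raise ValueError("both groups must contain at least one score")
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))
```

As a usage example with invented counts, `diagnostic_metrics(tp=8, fp=5, fn=2, tn=35)` gives a sensitivity of 0.8 and a specificity of 0.875. Because the positive predictive value divides by all positive results, a score with many false positives has a low positive predictive value even when its sensitivity is high.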