This report summarizes the proceedings of the September 9-10, 2005 meeting of the Expert Working Group on Hazard Identification and Risk Assessment in Relation to In Vitro Testing, part of an initiative on genetic toxicology. The objective of the Working Group was to develop recommendations for interpretation of results from tests commonly included in regulatory genetic toxicology test batteries, and to propose an appropriate strategy for follow-up testing when positive in vitro results were obtained in these assays. The Group noted the high frequency of positive in vitro findings in the genotoxicity test batteries with agents found not to be carcinogenic and thought not to pose a carcinogenic health hazard to humans. The Group agreed that a set of consensus principles for appropriate interpretation and follow-up testing when initial in vitro tests are positive was needed. Current differences in emphasis and policy among regulatory agencies were recognized as a basis for this need. Using a consensus process among a balanced group of recognized international authorities from industry, government, and academia, it was agreed that a strategy based on these principles should include guidance on: (1) interpretation of initial results in the "core" test battery; (2) criteria for determining when follow-up testing is needed; (3) criteria for selecting appropriate follow-up tests; (4) definition of when the evidence is sufficient to define the mode of action and the relevance to human exposure; and (5) definition of approaches to evaluate the degree of health risk under conditions of exposure of the species of concern (generally humans). A framework for addressing these issues was discussed, and a general "decision tree" was developed that included criteria for assessing the need for further testing, selecting appropriate follow-up tests, and determining a sufficient weight of evidence to attribute a level of risk and stop testing.
The discussion included case studies based on actual test results that illustrated common situations encountered, and consensus opinions were developed based on group analysis of these cases. The Working Group defined circumstances in which the pattern and magnitude of positive results were such that there was very low or no concern (e.g., non-reproducible or marginal responses), and no further testing would be needed. This included a discussion of the importance of the use of historical control data. The criteria for determining when follow-up testing is needed included factors such as evidence of reproducibility, the level of cytotoxicity at which increased DNA damage or mutation frequency is observed, the relationship of results to the historical control range of values, and the total weight of evidence across assays. When the initial battery is negative, further testing might be required based on information from the published literature, structure-activity considerations, or the potential for significant human metabolites not generated in the test systems. Additional testing might also be needed retrospectively when an increase in tumors or evidence of pre-neoplastic change is seen. When follow-up testing is needed, it should be based on knowledge of the mode of action, drawn from reports in the literature or learned from the nature of the responses observed in the initial tests. The initial findings, and available information about the biochemical and pharmacological nature of the agent, are generally sufficient to conclude that the responses observed are consistent with certain molecular mechanisms and inconsistent with others. Follow-up tests should be sensitive to the types of genetic damage known to be capable of inducing the response observed initially.
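The factors listed above for judging a positive in vitro result can be sketched in code. This is only an illustration and not the Working Group's decision tree: the class `InVitroResult`, its field names, and the function `follow_up_considerations` are hypothetical. The report names the factors but gives no rule for weighting them, so the sketch does not weight them. It only screens out the low-concern patterns the report mentions (non-reproducible or marginal responses) and then lists which of the named factors point toward follow-up testing.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InVitroResult:
    """Hypothetical summary of one positive in vitro finding."""
    reproducible: bool                     # response seen again on repeat testing
    marginal: bool                         # response judged marginal in magnitude
    outside_historical_control_range: bool
    only_at_high_cytotoxicity: bool        # effect seen only at highly cytotoxic doses
    supported_by_other_assays: bool        # weight of evidence across assays


def follow_up_considerations(result: InVitroResult) -> List[str]:
    """Return the factors that argue for follow-up testing.

    An empty list corresponds to the low-concern patterns named in the
    report (non-reproducible or marginal responses). How the remaining
    factors should be weighed is an expert judgement not modelled here.
    """
    if not result.reproducible or result.marginal:
        return []

    factors = ["response is reproducible"]
    if result.outside_historical_control_range:
        factors.append("response exceeds the historical control range")
    if not result.only_at_high_cytotoxicity:
        factors.append("response occurs without excessive cytotoxicity")
    if result.supported_by_other_assays:
        factors.append("other assays support the finding")
    return factors
```

Returning a list of factors rather than a single yes/no answer reflects the weight-of-evidence approach described in the text, where the final decision rests with expert reviewers.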
It was recognized that genotoxic events might arise from processes other than direct reactivity with DNA, that these mechanisms may have a non-linear, or threshold, dose-response relationship, and that in such cases it may be possible to determine an exposure level below which there is negligible concern about an effect due to human exposures. When a test result is clearly positive, consideration of relevance to human health includes whether other assays for the same endpoint support the results observed, whether the mode or mechanism of action is relevant to humans, and, most importantly, whether the effect observed is likely to occur in vivo at concentrations expected as a result of human exposure. Although general principles were agreed upon, time did not permit the development of recommendations for the selection of specific tests beyond those commonly employed in initial test batteries.
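The threshold idea discussed above, comparing expected human exposure with a level below which concern is negligible, can be sketched as a simple ratio. This is a minimal illustration under stated assumptions: the function name `exposure_ratio` and its parameters are hypothetical, both quantities are assumed to be in the same units, and the report specifies no numerical values or acceptable ratio.

```python
def exposure_ratio(no_effect_level: float, expected_human_exposure: float) -> float:
    """Ratio of a no-effect level to the expected human exposure.

    Both arguments must be positive and in the same units (an assumption
    of this sketch). A ratio above 1 means the expected exposure lies
    below the no-effect level; how large the ratio must be for negligible
    concern is a judgement the report does not quantify.
    """
    if no_effect_level <= 0 or expected_human_exposure <= 0:
        raise ValueError("both levels must be positive")
    return no_effect_level / expected_human_exposure
```

For example, `exposure_ratio(50.0, 0.5)` gives a ratio of 100, meaning the hypothetical expected exposure is one hundredth of the hypothetical no-effect level.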