Molecular docking computationally screens thousands to millions of organic molecules against protein structures, looking for those with complementary fits. Many approximations are made, often resulting in low "hit rates." A strategy to overcome these approximations is to rescore top-ranked docked molecules using a better but slower method. One such method is the molecular mechanics-generalized Born surface area (MM-GBSA) technique. These more physically realistic methods have improved models for solvation, electrostatic interactions, and conformational change compared to most docking programs. To investigate MM-GBSA rescoring, we re-ranked docking hit lists in three small buried sites: a hydrophobic cavity that binds apolar ligands, a slightly polar cavity that binds aryl and hydrogen-bonding ligands, and an anionic cavity that binds cationic ligands. These sites are simple; consequently, incorrect predictions can be attributed to particular errors in the method, and many likely ligands may actually be tested. In retrospective calculations, MM-GBSA techniques with binding-site minimization better distinguished the known ligands for each cavity from the known decoys compared to the docking calculation alone. This encouraged us to test rescoring prospectively on molecules that ranked poorly by docking but that ranked well when rescored by MM-GBSA. A total of 33 molecules highly ranked by MM-GBSA for the three cavities were tested experimentally. Of these, 23 were observed to bind; these are docking false negatives rescued by rescoring. The 10 remaining molecules are true negatives by docking and false positives by MM-GBSA. X-ray crystal structures were determined for 21 of these 23 molecules. In many cases, the geometry prediction by MM-GBSA improved the initial docking pose and more closely resembled the crystallographic result; yet in several cases, the rescored geometry failed to capture large conformational changes in the protein.
Intriguingly, rescoring not only rescued docking false negatives, but also introduced several new false positives into the top-ranking molecules. We consider the origins of the successes and failures in MM-GBSA rescoring in these model cavity sites and the prospects for rescoring in biologically relevant targets.
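
The prospective counts reported above can be tallied in a few lines. The following is only a minimal sketch of that arithmetic: the counts (33 tested, 23 binders, 21 crystal structures) come from the text, while the variable names and the `fraction` helper are illustrative assumptions and not part of the study's workflow.

```python
# Counts as reported in the text; all names here are illustrative.
TESTED = 33        # molecules highly ranked by MM-GBSA and tested experimentally
BINDERS = 23       # observed to bind (docking false negatives rescued by rescoring)
STRUCTURES = 21    # binders with a determined X-ray crystal structure


def fraction(part: int, whole: int) -> float:
    """Return part/whole, rejecting inconsistent counts."""
    if whole <= 0 or not 0 <= part <= whole:
        raise ValueError("counts must satisfy 0 <= part <= whole and whole > 0")
    return part / whole


# Molecules that did not bind: true negatives by docking, false positives by MM-GBSA.
non_binders = TESTED - BINDERS

# Fraction of tested molecules that bound, and fraction of binders with a structure.
rescue_rate = fraction(BINDERS, TESTED)
structure_coverage = fraction(STRUCTURES, BINDERS)

print(f"non-binders: {non_binders}")
print(f"binders among tested molecules: {rescue_rate:.1%}")
print(f"binders with crystal structures: {structure_coverage:.1%}")
```

Under these counts, roughly seven in ten of the tested molecules bound, and structures were obtained for about nine in ten of those binders.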