Rapid evolution and high sequence diversity enable Human Immunodeficiency Virus (HIV) populations to acquire mutations that escape antiretroviral drugs and host immune responses, and are thus major obstacles to controlling the pandemic. One strategy to overcome this problem is to focus drugs and vaccines on regions of the viral genome in which mutations are likely to cripple function by destabilizing viral proteins. Studies relying on sequence conservation alone have had only limited success in identifying critically important regions. We tested the ability of two structure-based computational models to identify sites in the HIV-1 capsid protein (CA) that would be refractory to mutational change. The destabilizing mutations predicted by these models were rarely found in a database of 5811 HIV-1 CA coding sequences, with none present at a frequency greater than 2%. Furthermore, 90% of variants with low predicted stability (from a set of 184 CA variants whose replication fitness or infectivity has been studied in vitro) had aberrant capsid structures and reduced viral infectivity. Based on the predicted stability, we identified 45 CA sites prone to destabilizing mutations. More than half of these sites are targets of one or more known CA inhibitors. The CA regions enriched in these sites also overlap with peptides shown to induce cellular immune responses associated with lower viral loads in infected individuals. Lastly, a joint scoring metric that takes into account both sequence conservation and protein structural stability performed better at identifying deleterious mutations than either sequence conservation or structural stability information alone. The computational sequence-structure stability approach proposed here might therefore be useful for identifying immutable sites in a protein for experimental validation as potential targets for drug and vaccine development.
Keywords: HIV-1; capsid protein (CA); destabilizing mutations; point mutation modeling; protein structural stability prediction; tolerated sequence space.