While predominant models of visual word form area (VWFA) function argue for its specific role in decoding written language, other accounts propose a more general role for the VWFA in complex visual processing. However, a comprehensive examination of structural and functional VWFA circuits and their relationship to behavior has been missing. Here, using high-resolution multimodal imaging data from a large Human Connectome Project cohort (N = 313), we demonstrate robust patterns of VWFA connectivity with both canonical language and attentional networks. Brain-behavior relationships revealed a striking double dissociation: structural connectivity of the VWFA with the lateral temporal language network predicted language, but not visuo-spatial attention, abilities, while VWFA connectivity with the dorsal fronto-parietal attention network predicted visuo-spatial attention, but not language, abilities. Our findings support a multiplex model of VWFA function characterized by distinct circuits for integrating language and attention, and point to connectivity-constrained cognition as a key principle of human brain organization.