Graph-based computational network analysis has proven to be a powerful tool for quantitatively characterizing the functional architecture of the brain. However, the test-retest (TRT) reliability of graph metrics of functional networks has not been systematically examined. Here, we investigated the TRT reliability of topological metrics of functional brain networks derived from resting-state functional magnetic resonance imaging data. Specifically, we evaluated both short-term (<1 hour apart) and long-term (>5 months apart) TRT reliability for 12 global and 6 local nodal network metrics. We found that the reliability of global network metrics was low overall, threshold-sensitive, and dependent on several factors: scanning time interval (TI; long-term > short-term), network membership (NM; networks excluding negative correlations > networks including negative correlations) and network type (NT; binarized networks > weighted networks). This dependence was modulated by a further factor, the node definition (ND) strategy. Local nodal reliability varied widely across nodal metrics and showed a spatially heterogeneous distribution. Nodal degree was the most reliable metric and varied the least across the factors above. Hub regions in association and limbic/paralimbic cortices showed moderate TRT reliability. Importantly, nodal reliability was robust to the four above-mentioned factors. Simulation analysis revealed that global network metrics were extremely sensitive (though to varying degrees) to noise in functional connectivity, and that weighted networks generated numerically more reliable results than binarized networks. Nodal network metrics showed high resistance to noise in functional connectivity, and no NT-related differences were found in this resistance. These findings have important implications for choosing reliable analytical schemes and network metrics of interest.
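As an illustration of how TRT reliability of a single network metric might be quantified, the sketch below computes a one-way random-effects intraclass correlation coefficient (ICC) across subjects and sessions. The abstract does not state which reliability measure was used, so the choice of ICC, the function name `icc_oneway` and the numerical values in `demo` are all assumptions made for illustration only; they are not data or methods from the study.

```python
import numpy as np


def icc_oneway(scores):
    """One-way random-effects ICC for an (n_subjects, k_sessions) array.

    Assumed reliability measure for illustration; the abstract does not
    specify the statistic actually used.
    """
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape

    grand_mean = scores.mean()
    subject_means = scores.mean(axis=1)

    # Between-subject mean square: spread of subject means around the grand mean.
    ms_between = k * np.sum((subject_means - grand_mean) ** 2) / (n - 1)
    # Within-subject mean square: spread of sessions around each subject's mean.
    ms_within = np.sum((scores - subject_means[:, None]) ** 2) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical values of one graph metric (rows: subjects; columns: scan 1, scan 2).
demo = np.array([
    [0.30, 0.32],
    [0.45, 0.44],
    [0.38, 0.40],
    [0.52, 0.50],
])

print(f"ICC = {icc_oneway(demo):.3f}")
```

Under this formulation, values near 1 indicate that between-subject differences dominate session-to-session variation, while values near or below 0 indicate that repeated scans of the same subject differ as much as scans of different subjects.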