Historical maps are unique sources of retrospective geographical information. Recently, several map archives containing map series covering large spatial and temporal extents have been systematically scanned and made available to the public. The geographical information contained in such data archives makes it possible to extend geospatial analysis retrospectively beyond the era of digital cartography. However, given the large data volumes of such archives (e.g., more than 200,000 map sheets in the United States Geological Survey topographic map archive) and the low graphical quality of older, manually-produced map sheets, the process to extract geographical information from these map archives needs to be automated to the highest degree possible. To understand the potential challenges (e.g., salient map characteristics and data quality variations) in automating large-scale information extraction tasks for map archives, it is useful to efficiently assess spatio-temporal coverage, approximate map content, and spatial accuracy of georeferenced map sheets at different map scales. Such preliminary analytical steps are often neglected or ignored in the map processing literature but represent critical phases that lay the foundation for any subsequent computational processes including recognition. Exemplified for the United States Geological Survey topographic map and the Sanborn fire insurance map archives, we demonstrate how such preliminary analyses can be systematically conducted using traditional analytical and cartographic techniques, as well as visual-analytical data mining tools originating from machine learning and data science.
Keywords: Sanborn fire insurance maps; USGS topographic maps; color moments; image information mining; low-level image descriptors; map processing; retrospective landscape analysis; t-distributed stochastic neighborhood embedding; visual data mining.