Pervasive infrastructures, such as cell phone networks, enable to capture large amounts of human behavioral data but also provide information about the structure of cities and their dynamical properties. In this article, we focus on these last aspects by studying phone data recorded during 55 days in 31 Spanish cities. We first define an urban dilatation index which measures how the average distance between individuals evolves during the day, allowing us to highlight different types of city structure. We then focus on hotspots, the most crowded places in the city. We propose a parameter free method to detect them and to test the robustness of our results. The number of these hotspots scales sublinearly with the population size, a result in agreement with previous theoretical arguments and measures on employment datasets. We study the lifetime of these hotspots and show in particular that the hierarchy of permanent ones, which constitute the 'heart' of the city, is very stable whatever the size of the city. The spatial structure of these hotspots is also of interest and allows us to distinguish different categories of cities, from monocentric and "segregated" where the spatial distribution is very dependent on land use, to polycentric where the spatial mixing between land uses is much more important. These results point towards the possibility of a new, quantitative classification of cities using high resolution spatio-temporal data.