How S2Vec learns the language of cities

When we think about artificial intelligence and geography, we often focus on navigation, or getting from point A to point B. But the built environment—the complex web of roads, buildings, businesses, and infrastructure that defines our world—contains much more than just coordinates on a map. These features tell stories about socio-economic health, environmental patterns, and urban development.

Until recently, converting these diverse geospatial features into a format that machine learning (ML) models can use has been a manual, labor-intensive process. Researchers often had to handcraft specific metrics for each new problem they wanted to solve. To close this gap, Google Research has developed a new method as part of the Google Earth AI initiative, which uses foundation models and advanced AI inference to transform planetary information into actionable intelligence.

In line with Earth AI’s vision, we recently introduced S2Vec, a self-supervised framework designed to learn general-purpose embeddings (i.e., compact numerical summaries) of the built environment. S2Vec lets AI understand neighborhood characteristics much as humans do: it recognizes the distribution patterns of gas stations, parks, and housing, and uses that knowledge to predict key metrics ranging from population density to environmental impact. In our evaluation, S2Vec performed competitively against image-based baselines on socio-economic prediction tasks, especially under geographic adaptation (extrapolation), while environmental tasks such as tree cover and elevation showed clear room for improvement.
