The "ChatGPT for Earth" takes off: Google AI compresses the planet into 64 dimensions, opening a 10-meter "God's-eye view" in seconds

36kr
07-31

Google DeepMind has launched its new AlphaEarth Foundations, a "God's eye view" project. With a 10-meter resolution, it creates an unprecedented digital portrait of Earth. Netizens are exclaiming, "Isn't this the 'ChatGPT for Earth'?"

The Alpha family has a new member!

This time, Google DeepMind has given humanity a "God's-eye view": the new AlphaEarth Foundations maps the entire Earth in remarkable detail.

It integrates petabyte-level Earth observation data to generate a unified data representation.

Specifically, AlphaEarth Foundations condenses the information in every 10x10 meter grid cell on Earth into a compact representation with 64 dimensions.

The 10-meter resolution is enough for you to see every corner of the earth clearly.

Google condenses the essence of a year's multi-source satellite data into each 10-meter square pixel.

The uniqueness of AlphaEarth Foundations lies in its powerful "feature learning" capability.

Through complex embedding technology, the model can extract key features from optical, radar, and 3D data, easily distinguishing between beaches and deserts, and forests and farmlands.

This capability allows it to outperform both traditional methods and other AI approaches, with error rates that are 24% lower on average.

On the same day, the Google team also released a 63-page comprehensive technical report.

Paper address: https://storage.googleapis.com/deepmind-media/DeepMind.com/Blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/alphaearth-foundations.pdf

AlphaEarth Foundations is like a "virtual satellite", giving humans a window onto the pulse of the Earth.

It allows scientists to analyze Earth's dynamics more quickly and efficiently, monitor crop health, track deforestation, and address global issues such as climate change.

One netizen praised Google for taking an important step towards building the "Earth version of ChatGPT".

The question is, why do we need an AI model of the Earth?

AI version of "virtual satellite" debuts, 64-dimensional ultra-high precision

Every day, satellites capture every inch of the Earth's changes from space, generating massive amounts of images and observation data.

These data provide scientists and decision makers with a near real-time panoramic view of the Earth.

Over the past 15 years, the Earth Engine platform has opened up Earth observation images and geospatial data, which have completely changed the way we understand the Earth.

However, the complexity, multimodality, and refresh rate of this data have also posed a new challenge: how to connect heterogeneous datasets and use them efficiently?

AlphaEarth Foundations was built to solve this problem.

This AI model works like a "virtual satellite": it integrates massive amounts of Earth observation data into a unified digital representation (i.e., an "embedding") that computer systems can process easily.

Ultimately, it mapped all of Earth's land and coastal waters in unprecedented detail.

AlphaEarth Foundations not only provides scientists with a more complete and consistent picture of Earth's evolution, but also helps them make more informed decisions in areas such as food security, deforestation, urban expansion, and water resource management.

How it works

By addressing the two major challenges of "data overload" and "information inconsistency", AlphaEarth Foundations provides us with a new perspective on understanding the Earth.

First, it integrates massive amounts of information from dozens of different public sources, including optical satellite imagery, radar, 3D laser mapping, climate simulations, and more.

It weaves all this information together and analyses global land and coastal waters in sharp 10x10 meter squares, tracking changes in the Earth over time.

Second, it makes this data easily accessible.

The system's key innovation is its ability to create a highly compact numerical summary for each square area.

These summaries require 16 times less storage space than those produced by other AI systems, significantly reducing the cost of planetary-scale analysis.

The breakthrough allows scientists to do something that has been impossible until now: create detailed, consistent maps of the world on demand.

AlphaEarth Foundations works by taking non-uniformly sampled frames from a video sequence so that it can index any position in time. This helps the model construct a continuous view of a location while interpreting a large number of measurements.

Whether scientists are monitoring crop health, tracking deforestation, or observing new construction, they no longer need to rely solely on a satellite flying overhead.

Now they have a new kind of "geospatial data foundation."

AlphaEarth Foundations has also demonstrated unparalleled accuracy after rigorous testing.

It excels at a range of tasks across different time periods, including identifying land use and estimating surface properties.

Crucially, even when labeled data is scarce, its error rate is 24% lower than other models on average, demonstrating remarkable learning efficiency.
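To picture why a good embedding helps when labels are scarce, here is a minimal sketch of a nearest-centroid classifier trained on only three labeled pixels per class. It is an illustration only, not the evaluation protocol from the technical report: the synthetic_pixels helper, the class names, and the noise level are all hypothetical, and the embeddings are synthetic unit vectors rather than real data.

```python
import numpy as np

def nearest_centroid_fit(embeddings, labels):
    """Average the labeled embeddings of each class into one centroid."""
    classes = np.unique(labels)
    centroids = np.stack([embeddings[labels == c].mean(axis=0) for c in classes])
    # Re-normalize so that each centroid is a unit vector again.
    centroids /= np.linalg.norm(centroids, axis=1, keepdims=True)
    return classes, centroids

def nearest_centroid_predict(embeddings, classes, centroids):
    """Assign each embedding to the class whose centroid has the largest dot product."""
    return classes[np.argmax(embeddings @ centroids.T, axis=1)]

def synthetic_pixels(prototype_axis, n, rng, noise=0.02):
    """Hypothetical stand-in for real embeddings: noisy unit vectors near one axis."""
    vectors = noise * rng.normal(size=(n, 64))
    vectors[:, prototype_axis] += 1.0
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

rng = np.random.default_rng(0)

# Only three labeled pixels per class are used for training.
train_x = np.vstack([synthetic_pixels(0, 3, rng), synthetic_pixels(1, 3, rng)])
train_y = np.array(["forest"] * 3 + ["farmland"] * 3)

# A larger held-out set checks how well the few labels generalize.
test_x = np.vstack([synthetic_pixels(0, 50, rng), synthetic_pixels(1, 50, rng)])
test_y = np.array(["forest"] * 50 + ["farmland"] * 50)

classes, centroids = nearest_centroid_fit(train_x, train_y)
predicted = nearest_centroid_predict(test_x, classes, centroids)
accuracy = float(np.mean(predicted == test_y))
print(accuracy)
```

When classes are already well separated in the embedding space, as they are by construction in this toy data, a handful of labels is enough to place the centroids. Real embeddings will be noisier.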

The global embedding field is decomposed into individual embeddings (from left to right). Each embedding has 64 components that together define a coordinate point on a 64-dimensional sphere.

In the maps below, three of the 64 dimensions of the AlphaEarth Foundations embedding are assigned to the colors red, green, and blue to visualize the rich details of our world.
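This kind of false-color rendering can be sketched in a few lines of NumPy. The example below is a minimal illustration on synthetic data: the choice of bands 0, 1, and 2 and the stretch range of -0.3 to 0.3 are assumptions made for the sketch, not values stated in this article.

```python
import numpy as np

def embedding_to_rgb(embeddings, bands=(0, 1, 2), vmin=-0.3, vmax=0.3):
    """Map three embedding dimensions to an 8-bit RGB image.

    embeddings: float array of shape (height, width, 64).
    bands, vmin, vmax: hypothetical choices for illustration.
    """
    selected = embeddings[..., list(bands)]
    # Linearly stretch [vmin, vmax] to [0, 1], clipping values outside the range.
    scaled = np.clip((selected - vmin) / (vmax - vmin), 0.0, 1.0)
    return (scaled * 255).astype(np.uint8)

# Synthetic 4x4 embedding field with unit-length vectors (not real data).
rng = np.random.default_rng(0)
field = rng.normal(size=(4, 4, 64))
field /= np.linalg.norm(field, axis=-1, keepdims=True)

rgb = embedding_to_rgb(field)
print(rgb.shape, rgb.dtype)
```

Any three of the 64 dimensions can be chosen. Different triplets highlight different aspects of the surface, because no single dimension corresponds to a physical band.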

In Ecuador, the model was able to penetrate persistent cloud cover to provide detailed images of farmland at various stages of development.

Elsewhere, it clearly maps the complex surface of Antarctica, a region notoriously difficult to map because satellite imaging there is irregular.

Furthermore, it reveals differences in Canadian agricultural land use that are invisible to the naked eye.

Next, we will break down in detail the power of the dataset generated by AlphaEarth Foundations.

10x10 meter pixels, condensing one year's data

The Google team used AlphaEarth Foundations to generate a global, pre-computed dataset of annual embeddings at 10-meter resolution, covering 2017 to the present.

These embedding images may look like ordinary Earth Engine image collections, but every pixel carries AI-powered feature extraction.

What information is contained in the "embedding" vector?

· Multi-source measurement data

Embedding vectors are learned from multiple data sources to capture the semantic information of surface attributes.

For example, the embedding of a pixel not only reflects its spectral characteristics, but also includes the surrounding environment, seasonal changes (such as vegetation phenology, snow cover), and terrain and climate characteristics.

· Space-time background

AlphaEarth Foundations was trained on over 3 billion independent image frames sampled from over 5 million locations worldwide.

The model treats satellite imagery of a location over time as successive frames in a video.

This enables learning across space, time, and measurement modalities, generating embeddings that capture spatial context while preserving temporal trajectories.

This means that each embedding vector in the Satellite Embedding Dataset provides a highly compact and semantically rich representation of the conditions of each 10-meter pixel (100 square meters) on the Earth’s land surface.

The embedding of each 10-meter pixel also captures information about its surrounding area.

Therefore, even if some areas (such as the asphalt surface of a parking lot and a highway) are very similar when viewed in isolation, their embedding vectors can be very different.

· 64-dimensional view of the Earth: coordinates and bands

The images in the Satellite Embedding dataset have 64 bands, but these are not classic optical reflectance values or radar returns.

Instead, the 64 "bands" of a single pixel in the AlphaEarth Foundations embedding represent a 64-dimensional coordinate on a 64-dimensional "sphere".

These coordinates are generated by a deep learning model and are mathematically interpretable, but they are not direct physical measurements. Instead, they are a compact representation of the high-dimensional measurement space.

A "satellite embedding" is essentially a coordinate point on the surface of a 64-dimensional "sphere"

With the Satellite Embedding dataset, scientists can perform a "similarity search."

Select a target pixel, and a simple dot product between its embedding vector and every other embedding quickly locates areas with similar surface and environmental conditions anywhere in the world.

The embedding vector for downtown New York City easily matches that of other highly urbanized areas.

Similarity search for coordinates -73.9812, 40.7628 (Midtown Manhattan, New York City, USA)
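The dot-product similarity search can be sketched as follows. This is a minimal illustration that assumes the embeddings are unit-length vectors, so that the dot product equals cosine similarity. The data is synthetic, and the function name and top_k parameter are hypothetical.

```python
import numpy as np

def similarity_search(embeddings, target_index, top_k=3):
    """Rank pixels by dot product with the target pixel's embedding.

    embeddings: array of shape (n_pixels, 64), rows assumed to be unit length.
    Returns the indices of the top_k most similar pixels (target excluded)
    and the full vector of similarity scores.
    """
    scores = embeddings @ embeddings[target_index]
    order = np.argsort(-scores)  # highest similarity first
    order = order[order != target_index][:top_k]
    return order, scores

# Synthetic stand-in for a table of pixel embeddings (not real data).
rng = np.random.default_rng(42)
pixels = rng.normal(size=(1000, 64))
pixels /= np.linalg.norm(pixels, axis=1, keepdims=True)

# Plant a near-duplicate of pixel 0 at index 1 so the search has something to find.
noisy = pixels[0] + 0.05 * rng.normal(size=64)
pixels[1] = noisy / np.linalg.norm(noisy)

matches, scores = similarity_search(pixels, target_index=0)
print(matches[0], round(float(scores[matches[0]]), 3))
```

Because the whole search is one matrix-vector product, it scales to very large pixel collections and needs no labels and no model training.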

Furthermore, similarity-based comparison is also applicable to the temporal dimension and can be used for embedding-driven change detection and stability monitoring.

The AlphaEarth Foundations embedding space is designed to be consistent over time.

By comparing the embedding vectors of the same pixel in different years, it is easy to monitor urban expansion, wildfire recovery, reservoir level changes, etc.

The following figure shows some of the changes observed in the embedding space between 2020 and 2024. The last panel in each row compares each pixel's embedding in the two years (brighter values indicate greater differences). The rows correspond to the following types of change:

Suburban sprawl

Land after a wildfire, interspersed with areas that had been deforested before the fire

Changes in an artificial reservoir between a drought period and its recovery

Variations in farmland between years, demonstrating how embeddings can capture intra-year dynamics such as crop cycles and fallowing
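Embedding-driven change detection of this kind can be sketched by comparing the same pixel across two years. The example below is a minimal illustration on synthetic unit-length embeddings. Scoring change as one minus the dot product and the 0.3 threshold are assumptions made for the sketch, not values from the source.

```python
import numpy as np

def change_score(emb_year_a, emb_year_b):
    """Per-pixel change between two annual embedding fields.

    Both inputs have shape (height, width, 64) with unit-length vectors.
    Returns 1 - dot product: 0 means unchanged, larger means more change.
    """
    similarity = np.sum(emb_year_a * emb_year_b, axis=-1)
    return 1.0 - similarity

# Synthetic 6x6 embedding field for the first year (not real data).
rng = np.random.default_rng(7)
field_2020 = rng.normal(size=(6, 6, 64))
field_2020 /= np.linalg.norm(field_2020, axis=-1, keepdims=True)

# The second year is identical except for a 2x2 block with a "new" surface.
field_2024 = field_2020.copy()
new_surface = rng.normal(size=(2, 2, 64))
field_2024[2:4, 2:4] = new_surface / np.linalg.norm(new_surface, axis=-1, keepdims=True)

scores = change_score(field_2020, field_2024)
changed = scores > 0.3  # hypothetical threshold for flagging change
print(int(changed.sum()))
```

This comparison is only meaningful because the embedding space is consistent from year to year. If each year had its own coordinate system, the dot product between years would carry no information.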

Furthermore, combined with Earth Engine's ee.Clusterer algorithms, the embedding vectors can be used to cluster pixels automatically into different land surface types (e.g., forest, soil, urban areas) without any pre-labeling.

This can reveal hidden geomorphological patterns and aid the study of topographic, hydrological and phenological characteristics.
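The idea behind unsupervised clustering of embeddings can be sketched outside Earth Engine with a small k-means written in NumPy. This is a stand-in for illustration, not the ee.Clusterer implementation: the farthest-point initialization, the synthetic data, and all function names are assumptions of the sketch.

```python
import numpy as np

def farthest_point_init(points, k):
    """Pick k starting centers: the first point, then repeatedly the point
    farthest from all centers chosen so far."""
    centers = [points[0]]
    for _ in range(k - 1):
        dist = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(dist)])
    return np.array(centers)

def kmeans(points, k, n_iter=10):
    """Plain k-means: alternate between assigning points and updating centers."""
    centers = farthest_point_init(points, k)
    labels = np.zeros(len(points), dtype=int)
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        dist = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=-1)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = points[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return labels, centers

def synthetic_surface(axis, n, rng, noise=0.02):
    """Hypothetical surface type: noisy unit vectors near one coordinate axis."""
    vectors = noise * rng.normal(size=(n, 64))
    vectors[:, axis] += 1.0
    return vectors / np.linalg.norm(vectors, axis=1, keepdims=True)

rng = np.random.default_rng(1)
# Three unlabeled "surface types", 30 pixels each, stacked in order.
pixels = np.vstack([synthetic_surface(a, 30, rng) for a in (0, 1, 2)])

labels, centers = kmeans(pixels, k=3)
print(np.bincount(labels))
```

The cluster numbers are arbitrary. Attaching names such as "forest" or "urban" to them still requires a person or a small amount of reference data.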

ChatGPT for Earth, creating maps on demand

Currently, the Satellite Embedding dataset, driven by AlphaEarth Foundations, is one of the largest datasets in Earth Engine.

It contains over 1.4 trillion embedding footprints per year.
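A back-of-the-envelope calculation gives a feel for that scale. The sketch below multiplies the footprint count by 64 dimensions under two assumed encodings. The bytes-per-dimension values are assumptions for illustration, not figures from this article, and the result ignores compression and file overhead.

```python
FOOTPRINTS_PER_YEAR = 1_400_000_000_000  # "over 1.4 trillion" from the article
DIMENSIONS = 64

def yearly_storage_terabytes(bytes_per_dimension):
    """Raw size of one year of embeddings in decimal terabytes (10**12 bytes)."""
    total_bytes = FOOTPRINTS_PER_YEAR * DIMENSIONS * bytes_per_dimension
    return total_bytes / 10**12

# Assumed encodings, for illustration only.
for name, size in [("1 byte per dimension", 1), ("4 bytes per dimension (float32)", 4)]:
    print(f"{name}: {yearly_storage_terabytes(size):.1f} TB")
```

Under these assumptions, one year of raw embeddings comes to roughly 90 TB at one byte per dimension and roughly 360 TB at four bytes, which shows why a compact per-pixel summary matters at planetary scale.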

Numerous organizations around the world, including the Food and Agriculture Organization of the United Nations, Harvard Forest, the Group on Earth Observations, MapBiomas, Oregon State University, and others, have used this dataset to create custom maps and gain insights into the real world.

In practical applications, AlphaEarth Foundations has achieved initial results.

The Global Ecosystems Atlas project uses the dataset to classify unmapped ecosystems into categories such as coastal shrublands and hyper-arid deserts.

This first-of-its-kind resource provides critical support for countries to prioritize protected areas, promote ecological restoration, and curb biodiversity loss.

In Brazil, the MapBiomas project is testing the dataset to gain a deeper understanding of agricultural and environmental change, which informs conservation strategies and sustainable development initiatives in key ecosystems such as the Amazon rainforest.

AlphaEarth Foundations represents an important step forward in humanity's understanding of Earth's dynamics.

The Google team is now using it to generate annual embeddings, and expects them to become even more useful when combined with general reasoning LLM agents such as Gemini.

As part of Google Earth AI, they will also continue to explore the best ways to apply the model's time series capabilities.

References:

https://x.com/bilawalsidhu/status/1950580970907648234

https://deepmind.google/discover/blog/alphaearth-foundations-helps-map-our-planet-in-unprecedented-detail/

https://medium.com/google-earth/ai-powered-pixels-introducing-googles-satellite-embedding-dataset-31744c1f4650

This article comes from the WeChat public account "Xinzhiyuan" , author: Xinzhiyuan, and is authorized to be published by 36Kr.
