
Ahead of SatSummit 2024, we recently convened a group of leading organizations and experts developing and deploying large earth models, commonly known as “foundation models.” This gathering aimed to foster connections, share insights, and identify potential collaboration opportunities among participants from academia, non-profits, government, and the private sector.

The meeting brought together a wide range of experts committed to open development of large earth foundation models. A huge thank you to the participating organizations for their willingness to engage in open and candid discussions.

With growing interest in the status and future of large earth foundation models, we wanted to share the key takeaways and future directions from our discussions with the broader community.

Establishing Benchmarking Standards

A central theme of the meeting was the need for effective benchmarking to evaluate and compare model performance. Benchmarking foundation models is harder than benchmarking more traditional models, for several reasons. First, foundation models have a wide range of possible applications and data sources, and testing across all of them at once is complicated. Second, there is no standard way to adapt a foundation model to downstream tasks. Fine-tuning methods differ in complexity and are not always comparable: a complex method is expected to perform better than a simpler one, but it is also harder to set up and train. Finally, there are qualitative elements that matter to the user community. How difficult a foundation model is for downstream users to work with, and how well it performs in similarity search, are not easily quantified.

To advance this effort, a working group was created that focuses on benchmarking large earth foundation models. Interested parties can join this group to contribute to this critical initiative.

Embeddings and Data Standards

Embeddings emerged as a key topic. They are integral to the use of foundation models because they transform raw data into dense, low-dimensional representations that downstream applications can use to process and understand complex patterns in the data efficiently. This led to a robust discussion on data standards and formats and their crucial role in establishing trust, legitimacy, and adoption. The geospatial community often experiments with data, but aligning on standards and formats is essential, particularly for large earth foundation models. Participants acknowledged the need to collaborate on open tools for model pipelines. In this vein, Ben Strong and Chris Holmes have initiated a survey to gather best practices for storing and sharing embeddings using GeoParquet.
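To make the idea of embeddings concrete, here is a minimal sketch of similarity search over a handful of embedding vectors, using cosine similarity in plain Python. The tile IDs, the four-dimensional vectors, and the function names are all made up for illustration; real embeddings from a foundation model typically have hundreds of dimensions, and this is not how any particular model or tool discussed at the meeting works.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings, top_k=2):
    """Return the top_k (tile_id, score) pairs, highest similarity first."""
    scored = [
        (tile_id, cosine_similarity(query, vector))
        for tile_id, vector in embeddings.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical embeddings keyed by invented tile IDs (illustrative values only).
embeddings = {
    "tile_forest_a": [0.9, 0.1, 0.0, 0.2],
    "tile_forest_b": [0.8, 0.2, 0.1, 0.3],
    "tile_urban_a": [0.1, 0.9, 0.7, 0.0],
}

# Hypothetical query embedding, e.g. for an image tile a user selected.
query = [0.85, 0.15, 0.05, 0.25]

results = most_similar(query, embeddings, top_k=2)
for tile_id, score in results:
    print(tile_id, round(score, 3))
```

In this toy setup the two forest-like vectors rank above the urban one because they point in nearly the same direction as the query. A shared storage format matters for exactly this kind of workflow: anyone running the search needs to know how the vectors and their tile identifiers are laid out.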

Policy & Collaboration

The conversation also touched on non-technical issues, emphasizing the need for shared policy frameworks for foundation models and conceptual frameworks for how they are operated. Transparency and shareability are critical at the policy level. David Saah from the University of San Francisco presented his perspective on building conceptual frameworks, contributing to a broader understanding of the role of foundation models in society.

The collaborative spirit and openness of the meeting were particularly productive. There was a strong consensus on the value of such gatherings, leading to plans for another meeting adjacent to AGU in December. Those interested in joining can reach out to Ian Schuler for more details.

By sharing these key takeaways, we hope to inform and inspire the broader community about the advancements and collaborative opportunities in developing large earth models. Together, we can drive progress and innovation in this exciting field.
