Scaling Science: A Guide

A guide to applying science in industry, written by Development Seed in collaboration with the Institute of Electrical and Electronics Engineers Geoscience and Remote Sensing Society (IEEE GRSS).

To run executable code within the chapters listed in the left side panel, go to the upper right-hand corner, select the rocket icon, and choose the option to launch in Binder. The rocket icon appears on executable pages (e.g. “Open Access to Data and Code”).

Introduction

This guide is a comprehensive resource designed to give researchers, scientists, and practitioners the knowledge and tools needed to conduct science at scale within industry settings, with a particular focus on how to adopt open science practices effectively. The contents aim to address the growing demand for transparent, reproducible, and collaborative research approaches across domains.

Intended Audience

“Scaling Science” is intended for those who seek to gain a comprehensive understanding of how to effectively harness and scale geospatial data and scientific research using modern, open-source tools and methodologies. It is designed for scientists, researchers and data practitioners who are looking to leverage cloud computing, machine learning and big data analytics to enhance their projects. By exploring this resource, you will learn best practices for managing large datasets, automating workflows and collaborating in a way that maximizes the impact and reach of your scientific endeavors.

The guide offers practical insights and step-by-step instructions, making it a valuable asset for anyone aiming to advance their data-driven research and achieve scalable, reproducible and impactful results. Specifically, the “Scaling Science” guide helps bridge the gap between academia and industry by introducing academics to the tools and techniques that are widely used in the commercial sector. Note: this guide focuses primarily on examples using the Python programming language.

Why Is This Needed?

In academia, researchers often work with limited computational resources and may rely on proprietary and/or time-consuming methods for data analysis. This guide provides a pathway to adopting industry-standard practices, such as using cloud platforms for large-scale data storage and processing, using online repositories to publish and collaborate on code, and implementing machine learning models to derive insights from data. It also covers the integration of geospatial data with other types of datasets to provide richer, more comprehensive analyses.

By learning these skills, academics can transition more smoothly into industry roles where there is a high demand for expertise in data science, geospatial analysis and scalable computing. The ability to work with big data and cloud infrastructure is highly valued in sectors such as environmental consulting, urban planning, agriculture and disaster response. This guide emphasizes collaboration and reproducibility, which are crucial for interdisciplinary projects and for maintaining the integrity and transparency of scientific work in a commercial setting.

Ultimately, the “Scaling Science” guide helps academics expand their skill set so that they are better prepared to tackle complex, real-world problems using advanced technological solutions.

Overview of Contents

Throughout, you will explore the ways in which industry is empowered by:

  1. Promoting open science culture: The guide seeks to emphasize a culture of openness, transparency, and collaboration within the scientific community. It provides insights into the principles and benefits of open science, encouraging participants to adopt open methodologies and share their research outputs openly.

  2. Building capacity: Through a combination of tutorials, example scenarios, and interactive activities, the guide equips participants with essential skills and techniques for working openly. It covers topics such as data management, code development, reproducibility, documentation, and collaborative workflows, enabling participants to implement open science practices effectively in their research projects.

  3. Navigating new tools and platforms: The guide introduces participants to a wide range of open science tools, platforms, and resources that facilitate collaborative research and knowledge sharing, and offers guidance on selecting and using them.

  4. Encouraging community engagement: By highlighting the importance of community engagement and participation in open science initiatives, the guide encourages participants to contribute to open-source projects, collaborate with peers, and engage with the broader scientific community. It emphasizes the value of sharing knowledge, expertise, and resources to drive collective innovation and impact.

  5. Facilitating scalability and sustainability: As the title suggests, “Scaling Science” emphasizes scalability and sustainability in open science practices. The guide offers strategies and best practices for scaling open science initiatives within collaborative and often interdisciplinary industry settings, ensuring long-term impact and adoption.

“Scaling Science” was developed to give researchers and practitioners the knowledge, skills, and resources needed to embrace open science principles, foster collaboration, and drive positive change in their respective fields. By promoting openness, transparency, and community engagement, we hope this guide helps advance scientific research and accelerate progress towards solving many complex challenges in the geosciences.


Built with the Jupyter Book 2.0 tool set, as part of the ExecutableBookProject.
