DGGS-Native Data Cubes: A Design Pattern for Scalable, Distortion-Aware Analytics
Keywords: Geoinformatics, Discrete Global Grid Systems, DGGS, Geospatial data, Multidimensional indexing
Abstract. The rapid growth of geospatial big data has intensified the need for efficient frameworks to store, process, and analyse large-scale, multidimensional datasets. Geospatial data cubes have emerged as a key paradigm for organising spatio-temporal information into analysis-ready structures, enabling scalable analytics across Earth observation and related domains.
This paper presents a synthesis-oriented analysis of recent mathematical and architectural advances in geospatial data cube infrastructures, with a particular focus on multidimensional indexing, sparse computation and compression, and spatial tessellation based on Discrete Global Grid Systems (DGGS). Rather than conducting a systematic review, the study integrates theoretical and system-level contributions to examine how these methods jointly address limitations of projection-dependent raster models, improve storage efficiency, and support consistent multi-resolution analysis.
We argue for DGGS-indexed data cubes, in which the spatial dimension is treated as a first-class, hierarchical global grid identifier. Such cubes serve as a unifying computational substrate that integrates multi-resolution spatial referencing with sparse tensor computation, enabling globally consistent, scalable, and actionable geospatial analytics. This perspective clarifies their role as a foundational component of reproducible geospatial analytics and outlines key challenges and research directions for their operational adoption. By highlighting points of convergence between DGGS-based spatial referencing, cloud-native storage formats, and scalable computational strategies, the paper reframes geospatial data cubes not merely as data storage abstractions but as integrated computational infrastructures.
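To make the central idea concrete, the following is a minimal illustrative sketch, not the paper's implementation. It assumes a hypothetical cell identifier scheme in which each identifier is a base-cell letter followed by one digit per refinement level, so that a parent cell is obtained by truncating the identifier. Real DGGS implementations define their own indexing schemes. The names `parent`, `coarsen`, and `cube` are illustrative assumptions. The cube is stored sparsely as a mapping from (cell identifier, time step) to a value, so cells without observations occupy no storage.

```python
from collections import defaultdict

# Hypothetical identifier scheme (an assumption for illustration only):
# a base-cell letter plus one digit per refinement level, e.g. "A20".
def parent(cell_id: str, levels: int = 1) -> str:
    """Return the ancestor `levels` resolutions coarser, by prefix truncation."""
    if levels < 0 or levels >= len(cell_id):
        raise ValueError("cannot coarsen past the base cell")
    return cell_id[: len(cell_id) - levels]

# Sparse cube: only observed (cell, time) pairs are stored.
cube = {
    ("A20", "2024-01"): 1.0,
    ("A21", "2024-01"): 3.0,
    ("A30", "2024-01"): 5.0,
    ("A20", "2024-02"): 2.0,
}

def coarsen(sparse_cube: dict, levels: int = 1) -> dict:
    """Aggregate to a coarser resolution by averaging the observed child cells.

    Missing children are ignored rather than treated as zero.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (cell, time), value in sparse_cube.items():
        key = (parent(cell, levels), time)
        sums[key] += value
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sums}

coarse = coarsen(cube)
```

In this sketch the hierarchy lives entirely in the identifier, so moving between resolutions is a key transformation followed by a group-wise reduction, with no reprojection or resampling step. Averaging only the observed children is one possible aggregation choice. An area-weighted or count-aware reduction may be more appropriate depending on the grid and the variable.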