Abstract
Estimating groundwater storage (GWS) anomalies by subtracting model-derived components from Gravity Recovery and Climate Experiment (GRACE) terrestrial water storage (TWS) anomalies is a common practice in hydrology. Typically, land surface model (LSM) simulated soil moisture (SM), snow water equivalent, and canopy water content are removed from GRACE-derived TWS, and the residual is interpreted as GWS. However, this method implicitly assumes that LSMs account for all non-groundwater storages within distinct, physically meaningful compartments. In this comment, we examine the assumptions and the semantic and structural challenges embedded in this approach, and suggest that users consider how model simplifications and conventions affect the results. We encourage careful interpretation and advocate for more sophisticated methods, including data assimilation and models that represent physical hydrological processes more completely.
Plain Language Summary
Scientists often use data from NASA's Gravity Recovery and Climate Experiment (GRACE) satellites to estimate changes in groundwater storage. A common method involves subtracting soil moisture and snow estimated by computer models from the total water measured by GRACE. The leftover amount is assumed to be groundwater. However, many of these models do not explicitly simulate groundwater processes, and some may represent deep soil water in ways that differ from physical groundwater systems. As a result, the “groundwater” estimated from this subtraction method may not accurately reflect actual subsurface water changes. In this commentary, we discuss potential limitations of this approach and highlight areas where model representations can be improved to match real-world behavior. We recommend using models that explicitly account for groundwater flow and storage, and clearly defining the physical meaning of each water storage component. These steps can support more accurate and meaningful assessments of groundwater changes from satellite data.
Key Points
- The GRACE-LSM residual method assumes that models account for all non-groundwater storages and that storage labels have physical meaning
- Models without groundwater modules may absorb long-memory storage signals in other compartments
- The bottom-layer soil moisture in many models behaves like shallow groundwater, yet it remains classified as “soil moisture”
1 Introduction
The Gravity Recovery and Climate Experiment (GRACE) satellite missions (Landerer et al., 2020; Tapley et al., 2004) have revolutionized our understanding of terrestrial water storage (TWS) variability at regional to global scales. The approach of estimating groundwater storage (GWS) anomalies from GRACE by balancing other water storage components was pioneered in early work that combined GRACE TWS with independent estimates of soil moisture (SM) and snow (e.g., M. Rodell & Famiglietti, 2002; Yeh et al., 2006). Since then, the method has been broadly used and implemented in many ways (e.g., Abdelmohsen et al., 2025; Castle et al., 2014; Famiglietti et al., 2011; Liu et al., 2022; Rodell et al., 2009).
A widely used sub-category of this method involves subtracting land surface model (LSM) water states—i.e., SM, snow water equivalent (SWE), and canopy water content (CWC)—from GRACE-derived TWS anomalies (e.g., Rodell et al., 2009). This “GRACE minus LSM” (or simply GRACE-LSM) residual approach is appealing due to its conceptual simplicity and the availability of global model output. However, it relies on questionable assumptions about how models represent and partition water storages and fluxes, which may influence the interpretation of results.
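The residual computation described above can be sketched in a few lines. This is a minimal illustration rather than any study's actual workflow: the function names, the use of 1-D basin-averaged monthly series, and the choice to re-reference the LSM storages to a user-supplied baseline period are all assumptions made for the example.

```python
import numpy as np

def to_anomaly(series, baseline=slice(None)):
    """Subtract the mean over a baseline period (slice or boolean mask)."""
    series = np.asarray(series, dtype=float)
    return series - series[baseline].mean()

def gws_residual(tws_anom, sm, swe, cwc, baseline=slice(None)):
    """GRACE-LSM residual: TWS anomaly minus LSM (SM + SWE + CWC) anomaly.

    Assumptions (not from the source text): all inputs are 1-D series on
    the same time axis and in the same units (e.g., mm of equivalent water
    height), and `baseline` selects the period the GRACE anomalies refer to.
    """
    # Sum the simulated non-groundwater storages as absolute states.
    lsm_storage = (np.asarray(sm, dtype=float)
                   + np.asarray(swe, dtype=float)
                   + np.asarray(cwc, dtype=float))
    # Express them as anomalies relative to the same baseline as GRACE.
    lsm_anom = to_anomaly(lsm_storage, baseline)
    # Whatever remains is what the method labels "groundwater".
    return np.asarray(tws_anom, dtype=float) - lsm_anom
```

Note that nothing in this calculation distinguishes groundwater from any other storage or error term that the LSM omits; the label attached to the output is purely an interpretation.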
2 Conceptual Deficiencies in Residual-Based Groundwater Estimation
In practice, the residual obtained by subtracting LSM storages from GRACE TWS may contain contributions from:
- Unconfined and confined groundwater (the target);
- Surface water (lakes, wetlands, floodplains) that is ignored;
- Model structural errors (e.g., inaccurate soil and landcover parameters, and numerical simplifications of physical processes);
- Modeling biases in the water budget partition and land surface states (e.g., rainfall inaccuracy, snow and SM misrepresentation, and canopy water biases);
- Inter-basin fluxes that are neglected (especially in karst or sedimentary basins).
The combined influence of model simplifications, misrepresented or missing components, and errors may be larger than the actual groundwater signal itself, making sole attribution of the residual to groundwater potentially problematic.
2.1 Groundwater Simulation, or Lack Thereof
The Global Land Data Assimilation System (GLDAS; Rodell et al., 2004) is likely the most popular data set used in the GRACE-LSM residual method (a Google Scholar search for “GRACE and GLDAS and groundwater” returned more than 4,400 results), although we acknowledge that a variety of LSMs have been employed (e.g., Abdelmohsen et al., 2025; Joodaki et al., 2014; Liu et al., 2022; M. Rodell & Famiglietti, 2002). LSMs are typically designed with a specific water balance structure that includes processes like evapotranspiration, surface runoff, and shallow subsurface flow. However, deep soil processes such as aquifer recharge, lateral subsurface flow, and discharge to external basins are often simplified or compensated for by other processes. When the GRACE-LSM approach is applied with these models, the residual may include signals from unrepresented processes. As a result, interpreting the residual strictly as a groundwater signal may be problematic if the model's water balance structure does not explicitly account for all relevant storages and fluxes. For example, three of the four LSMs in GLDAS (Noah, VIC, CLM) lack explicit representations of dynamic GWS or lateral subsurface flow (He et al., 2023; Koster et al., 2000). Their soil columns extend no deeper than 2–3 m, and the models treat the bottom layer as unsaturated. Consequently, they sometimes overestimate the range of variability of SM storage; stated another way, part of what is reported as SM change may actually be GWS variability.
2.2 Compensation Effects and Model Biases
Having originally been designed and tuned to serve as the lower boundary condition for an atmospheric model or global climate model, LSMs without groundwater modules may absorb long-memory storage signals in other compartments. For example, deep SM, snow processes, and water fluxes (e.g., evapotranspiration and runoff) may act to compensate for the absence of baseflow dynamics, especially in regions with significant groundwater contributions (Humphrey et al., 2016; Scanlon et al., 2018). Subtracting these compartments from GRACE TWS may therefore distort or misrepresent the underlying hydrologic behavior. It is well established that these simulated states and fluxes lack realism relative to ground measurements, although LSMs are generally more skillful at capturing temporal anomalies, and the true global climatology of states such as SM is unknown (Koster et al., 2009; Kumar et al., 2022). The biases stemming from the developmental legacy of each model are significant factors when computing the GRACE-LSM residual. Moreover, most LSMs do not account for human water management, which may lead to additional uncertainty in the residual.
2.3 Surface Water Storage
A common assumption of the GRACE-LSM approach is that changes in surface water storage (SWS) are negligible or limited to lakes and reservoirs, with the latter usually derived from independent data sets. In practice, the residual derived by subtracting SM, SWE, and CWC from TWS reflects the combined contributions of GWS, SWS, and biomass. Studies have shown that seasonal changes in biomass are small (typically less than about 0.5 mm), which falls below the detection threshold of GRACE (Rodell et al., 2005). However, overlooking SWS can introduce additional uncertainty when decomposing TWS into its components, particularly in regions with extensive surface water systems, such as river networks, floodplains, and wetlands, and in arid areas with dynamic surface-groundwater interactions (Getirana et al., 2017). Some authors suggest that SWS may, in certain contexts, represent groundwater where and when the water table intersects the land surface (Winter et al., 1998), a perspective we find reasonable. Nevertheless, SWS contributions should not be ignored, especially in regions where they are likely to be significant or where their impact has not previously been quantified.
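The decomposition described in this section can be written compactly as follows. The symbols $R$ (the residual), $\Delta\mathrm{BIO}$ (biomass), and $\varepsilon$ (a lumped term for model and observation errors) are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{equation}
R \;=\; \Delta\mathrm{TWS}_{\mathrm{GRACE}}
      - \left(\Delta\mathrm{SM} + \Delta\mathrm{SWE} + \Delta\mathrm{CWC}\right)_{\mathrm{LSM}}
  \;=\; \Delta\mathrm{GWS} + \Delta\mathrm{SWS} + \Delta\mathrm{BIO} + \varepsilon
\end{equation}
```

Interpreting $R$ as $\Delta\mathrm{GWS}$ is therefore equivalent to assuming that $\Delta\mathrm{SWS}$, $\Delta\mathrm{BIO}$, and $\varepsilon$ are all negligible compared with the groundwater signal.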
2.4 Nomenclature and Semantic Ambiguity
Semantic ambiguities also exist. The lowest SM layer in many models behaves hydrologically like shallow groundwater, with slow response times and deep infiltration, yet remains classified as “soil moisture.” This blurs the line between unsaturated and saturated zone dynamics, creating confusion about what is truly being subtracted from GRACE. Such ambiguity suggests that “groundwater” in residual-based approaches may be more a function of nomenclature than hydrological substance (Wood et al., 2011).
3 Towards Physically Consistent Approaches
More robust and physically consistent GWS estimates could be pursued by:
- Employing hydrological models that explicitly simulate groundwater dynamics and baseflow processes, such as ParFlow (Maxwell et al., 2015), Noah-MP (He et al., 2023), CLSM-F2.5 (Koster et al., 2000), and PCR-GLOBWB (Sutanudjaja et al., 2018). A certain level of physical realism of key land states (e.g., SM, snow, surface waters) must be ensured for each model. This could be accomplished by calibrating the models to available ground/satellite reference data sets;
- Assimilating GRACE data into physically based models that conserve mass and update all storage terms consistently (e.g., Kumar et al., 2016; Nie et al., 2024). GLDAS v2.2, for instance, includes CLSM-F2.5 with GRACE data assimilation (Li et al., 2019);
- Accounting for human activities within hydrological modeling frameworks (e.g., Getirana et al., 2025; Nie et al., 2019);
- Leveraging in situ groundwater observations and piezometric data to validate GRACE-derived trends (e.g., Getirana et al., 2020; Nie et al., 2019); and
- Reframing the interpretation of model storage outputs, explicitly stating how each compartment relates to observed or inferred physical processes (e.g., Koster et al., 2009).
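As one illustration of the validation step listed above, the sketch below compares the linear trend of a residual-based GWS series with the trend implied by well observations. It is a simplified example under stated assumptions, not a recommended protocol: the function names are hypothetical, the head-to-storage conversion assumes an unconfined aquifer with a single representative specific yield, and an ordinary least-squares fit stands in for whatever trend estimator a study would actually use.

```python
import numpy as np

def linear_trend(t_years, series):
    """Least-squares linear trend (series units per year), skipping NaN gaps."""
    t = np.asarray(t_years, dtype=float)
    y = np.asarray(series, dtype=float)
    ok = np.isfinite(t) & np.isfinite(y)
    slope, _intercept = np.polyfit(t[ok], y[ok], 1)
    return float(slope)

def head_to_storage_mm(head_anom_m, specific_yield):
    """Convert water-table anomalies (m) to storage anomalies (mm).

    Assumption: unconfined aquifer, so storage change = Sy * head change.
    """
    return np.asarray(head_anom_m, dtype=float) * specific_yield * 1000.0

def compare_trends(t_years, gws_residual_mm, head_anom_m, specific_yield):
    """Return residual-based and in situ storage trends side by side."""
    insitu_mm = head_to_storage_mm(head_anom_m, specific_yield)
    return {
        "residual_trend_mm_per_yr": linear_trend(t_years, gws_residual_mm),
        "insitu_trend_mm_per_yr": linear_trend(t_years, insitu_mm),
    }
```

A large mismatch between the two trends does not by itself identify the cause; it may reflect the specific yield assumption, the spatial representativeness of the wells, or the residual contamination discussed in Section 2.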
4 Conclusion
Given the widespread use of GRACE-LSM residuals to infer GWS anomalies, it is essential to recognize the limitations and assumptions embedded in this subset of approaches. Many studies have estimated GWS based on GRACE and ground or satellite measurements while accounting for inherent uncertainties. Here, we focus on the GRACE minus LSM residual approach and urge more careful consideration of structural limitations and other sources of uncertainty. Many LSMs do not explicitly simulate key processes, leading to incomplete partitioning of TWS when GWS is computed as a residual. For example, SWS may be missing, while simulated deep SM may have been tuned to behave more like groundwater, introducing ambiguity into the residuals. These issues suggest the need for greater care in interpreting the results and, better yet, the application of more advanced approaches. Continued refinement of modeling frameworks, including more explicit representations of human and subsurface processes, efforts to improve their physical realism, and clearer terminology, can help to improve the robustness and interpretability of residual-based groundwater estimates. We encourage the community to adopt physically consistent frameworks when using LSMs to support groundwater studies.
Conflict of Interest
The authors declare no conflicts of interest relevant to this study.
Open Research
Data Availability Statement
The authors did not use or create any new data.