Key features of EarthLab are given with respect to five elements as follows:
(1) The ESM will feature global horizontal resolutions of 10–25 km, enabling realistic simulation of the earth's spheres and their interactions: the atmosphere, the oceans (including sea ice), the land surface, vegetation (e.g., dynamic vegetation evolution), aerosol and atmospheric chemical processes, and biogeochemical processes on land and in the sea (e.g., the carbon cycle). A solar-terrestrial space environment model, an ice sheet model, and a solid earth model will initially be developed and applied separately.
(2) The spatial resolution of the regional high-precision environmental simulation system will reach 3 km over the whole of China and 1 km over some key areas of the country. The system is geared towards directly tackling key regional environmental problems through weather forecasting, air pollution forecasting and early warning, agricultural drought forecasting, and climate risk forecasting.
(3) There will be a dedicated support and management system to facilitate earth system modeling, which will provide support for program development, debugging, and performance tuning, as well as for user-oriented model testing, evaluation, and adjustment.
(4) EarthLab will include a data assimilation system and a related database, with a cutting-edge visualization system in support of the numerical simulation of the earth system in China.
(5) The high-performance computing system will have 15 PFLOPS of peak performance and 80 PB of storage capacity, which will provide fundamental support for the entire EarthLab facility.
Global earth numerical simulation system
The ESM, currently CAS-ESM version 2 (Zhang et al., 2020), is one of the key software systems of EarthLab and has been used for CMIP6 simulations. It consists of several component models: the Atmospheric General Circulation Model developed at IAP (IAP-AGCM version 5), the LASG/IAP Climate system Ocean Model (LICOM version 2), the Beijing Normal University/IAP Common Land Model (CoLM), the Los Alamos Sea-Ice Model (CICE version 4), and the IAP Aerosol and Atmospheric Chemistry Model (IAP-AACM). The CESM Coupler 7 (CPL7) integrates all of these components. The ESM also comprises the IAP Dynamic Global Vegetation Model (IAP-DGVM) and the IAP fire model, which are incorporated into the land model, and the IAP ocean biogeochemistry model, which is incorporated into the ocean model. In addition, the mesoscale Weather Research and Forecasting (WRF) model is coupled to the atmospheric model IAP-AGCM through CPL7, either with two-way interactions to enable regionally refined climate simulation or with one-way downscaling (He et al., 2013).
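As a rough illustration of the difference between two-way coupling and one-way downscaling, the toy Python sketch below exchanges a single scalar between a coarse "global" state and a fine "regional" state. It is not CPL7 or any CAS-ESM interface: the function name `run_exchange`, the relaxation weights, and the local forcing term are all hypothetical choices made only to show the direction of information flow.

```python
def run_exchange(steps, two_way):
    """Toy exchange loop between a coarse 'global' value and a fine 'regional' value.

    All numbers are illustrative assumptions, not model parameters.
    """
    global_state, regional_state = 10.0, 10.0
    for _ in range(steps):
        # Downscaling: the regional state is relaxed toward the global boundary
        # value and also receives its own local forcing (+1.0 per step).
        regional_state += 0.5 * (global_state - regional_state) + 1.0
        if two_way:
            # Feedback: the regional result is passed back to the global state.
            global_state += 0.25 * (regional_state - global_state)
    return global_state, regional_state


if __name__ == "__main__":
    for mode in (False, True):
        g, r = run_exchange(steps=50, two_way=mode)
        label = "two-way" if mode else "one-way"
        print(f"{label}: global = {g:.2f}, regional = {r:.2f}")
```

In the one-way case the global value never changes, so the regional value simply settles relative to a fixed boundary. In the two-way case the regional forcing also alters the global value, which is the behaviour that regionally refined climate simulation relies on.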
IAP-AGCM 5 is a grid-point model with a terrain-following coordinate (Zhang et al., 2013). It has three horizontal resolution configurations: 1.4° × 1.4°, 0.5° × 0.5°, and 0.31° × 0.23° (lon × lat). The default vertical configuration has 35 levels with a model top at 2.2 hPa, while the high-top version has 69/91 levels with a model top at 0.01 hPa (Chai et al., 2021a, b). LICOM 2.0 (Liu et al., 2012) uses $\eta$-coordinates with a horizontal resolution of ~1.0° × 1.0° globally, refined to 1.0° × 0.5° (lon × lat) between 10°S and 10°N. It has 30 vertical layers, with a grid spacing of 10 m in the upper 150 m. Additionally, the Ocean Biogeochemistry General Circulation Model (OBGCM) was developed and embedded into LICOM to represent the natural carbon cycle and the uptake and storage of anthropogenic CO2 in the ocean (Xu et al., 2013). CoLM (Dai et al., 2003) became well known as the Community Land Model (CLM) incorporated into the Community Climate System Model (CCSM). It has 15 vertical soil layers extending to a depth of 42.1 m and five snow layers, and uses the same horizontal resolution as IAP-AGCM. The latest version of CoLM, which is under development (e.g., Li et al., 2020), represents soil biogeochemical processes and provides the lower boundary conditions of greenhouse gases (carbon dioxide, methane, and nitrous oxide) for the global climate model. IAP-DGVM distinguishes 12 types of natural land vegetation, classified by physical, phylogenetic, and phenological parameters and by bioclimatic limitations (Zeng et al., 2014), and performs well in simulating vegetation distributions and carbon fluxes (Zhu et al., 2018). The IAP fire model is built on a fire parameterization that simulates burned areas, fire seasonality, and interannual variability, and further characterizes the carbon cycle and ecological processes (Li et al., 2012).
Both the IAP-DGVM and the IAP fire model are embedded within CoLM. The IAP-AACM calculates gaseous chemistry, aqueous chemistry, heterogeneous chemistry, and dry and wet deposition, and is coupled with IAP-AGCM through a coupler with two-way interactions (Chen et al., 2015; Wei et al., 2019).
For different research objectives, the above component models are coupled and integrated into two versions of the ESM as follows:
(1) A medium-resolution ESM, which consists of all the component models described above. The horizontal resolution is 140 km for the atmospheric model and 100 km for the ocean model, while the resolutions of the other component models are consistent with that of the atmosphere or ocean model. A series of numerical simulations following the CMIP6 experimental design for climate change has been completed (e.g., Zhang et al., 2020; Dong et al., 2021; Jin et al., 2021).
(2) A high-resolution climate system model, which consists of three sub-systems: atmosphere, ocean (including sea ice), and land surface. The horizontal resolution is 25 km for the atmosphere and land surface models, and 10 km for the ocean model. This version will also contribute to the simulations in the next generation of CMIP, such as HighResMIP, and will be directly used for short-term climate prediction.
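The kilometre figures quoted for the two versions are nominal values, whereas the component grids are specified in degrees. A minimal Python sketch of the conversion is given below, assuming a spherical Earth with a mean radius of 6371 km (an assumed constant, not taken from the text). The function name `grid_spacing_km` is hypothetical. It shows that the east-west spacing of a regular lon-lat grid shrinks with the cosine of latitude, so any single kilometre value is only approximate.

```python
import math

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius, not a value from the text


def grid_spacing_km(dlon_deg, dlat_deg, lat_deg=0.0):
    """Return (east-west, north-south) spacing in km of a lon-lat cell at lat_deg."""
    km_per_deg = math.pi * EARTH_RADIUS_KM / 180.0  # arc length of one degree
    dx = dlon_deg * km_per_deg * math.cos(math.radians(lat_deg))
    dy = dlat_deg * km_per_deg
    return dx, dy


# Horizontal configurations quoted for IAP-AGCM 5 (lon x lat, in degrees).
CONFIGS = {
    "1.4 x 1.4": (1.4, 1.4),
    "0.5 x 0.5": (0.5, 0.5),
    "0.31 x 0.23": (0.31, 0.23),
}

if __name__ == "__main__":
    for name, (dlon, dlat) in CONFIGS.items():
        for lat in (0.0, 45.0):
            dx, dy = grid_spacing_km(dlon, dlat, lat)
            print(f"{name} deg at {lat:4.1f}N: dx = {dx:6.1f} km, dy = {dy:6.1f} km")
```

Under these assumptions the 0.23° latitude spacing corresponds to roughly 25 km, in line with the resolution quoted for the high-resolution version.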
The ESM also comprises a solar-terrestrial space environment model, an ice sheet model, and a solid earth model, which are not coupled with the other seven component models during the current phase but will be integrated into the ESM during the second phase. The solar-terrestrial space environment model integrates a kinetic model and a three-dimensional magnetohydrodynamics model, and can be used to simulate the evolution of the solar wind from the Sun to the Earth. The ice sheet model simulates ice sheet flow at the continental scale, using satellite data and results from the atmosphere and ocean, by solving the governing equations of ice dynamics with the finite element method. The solid earth model is a multi-scale, multi-field system containing a mantle convection and plate motion module, a global viscoelastic and elastic module, a sea-storm tide and tsunami module, a landscape evolution module, and a regional porous elastic module. All three separate component models are under development. Figure 2 shows a schematic diagram of the ESM numerical simulation system. Most of the component models will be further improved and developed on the basis of the existing self-developed models. This project will build a software platform that integrates the latest scientific understanding of the earth system and facilitates understanding of the basic laws of physics, chemistry, and biogeochemistry for the various spheres of the earth system and their interaction mechanisms, while serving research and applications in earth system science.
Regional high-precision simulation system
The regional high-precision simulation system targets key environmental issues that have major impacts on the national economy and people’s livelihoods, such as weather forecasting, atmospheric pollution prediction and warnings, agricultural drought prediction, and climate risk evaluation. The system consists of four sub-systems: a regional cloud-resolving weather forecasting sub-system, a regional high-resolution air pollution modeling sub-system, a drought modeling sub-system for grain crops in major agricultural production areas in China and the world, and a regional high-precision long-term climate change risk modeling sub-system. Through model development and software system engineering, regional refinements in the simulation of these key environmental elements will be realized. The spatial resolution will reach 3 km over the whole of China and 1 km over some key areas, greatly enhancing the current level of high-resolution simulation technology and the capacity to meet scientific research and application needs concerning the regional environment. The relationships among the four sub-systems are shown in Fig. 3.
(1) The Regional Cloud-Resolving Weather Forecasting sub-system, based on the Weather Research and Forecasting (WRF) model, aims to provide accurate and precise forecasting products and to improve the forecasting accuracy of heavy rainfall, strong convection, typhoons, and other high-impact weather. It has an advanced multi-source observation data assimilation module, which integrates various conventional and irregular meteorological observations to establish the assimilation analysis field.
(2) The Regional High-Resolution Air Pollution Modeling sub-system is a new-generation air quality modeling system based on the Global Nested Air Quality Prediction Model System (GNAQPMS; Wang et al., 2017). It includes an aerosol microdynamic module, a secondary organic aerosol module, a module for the combined moisture absorption and growth of aerosol components, and a visibility module. It involves advanced modeling techniques, such as emissions inversion, ensemble forecasting, and source apportionment.
(3) The Regional High-Precision Long-term Climate Change Risk Modeling sub-system is designed to analyze the trends of global climate change, the trends and thresholds of East Asian climate, and the impacts of climate change on socioeconomics over China and East Asia. It uses climate change data, energy systems, and a human greenhouse gas emission assessment application module to calculate the greenhouse gas emissions generated by human socio-economic and energy activities as well as the surface characteristics of the climate change risk assessment area.
(4) The Drought Modeling subsystem for Grain Crops in Major Agricultural Regions enables the simulations of the global agricultural remote sensing drought index and the formation of agricultural drought. It can automatically extract the global farmland distribution, including paddies, drylands, and irrigated water areas. It includes a global database of six major cereal crops (rice, maize, wheat, sorghum, rice, and barley) and a growth modeling module. It enables the simulation of food-crop growth in China and major agricultural areas worldwide under conditions of real-time meteorological observations and dynamic analysis data.
Supercomputing support and management system
The supercomputing support and management system provides support functions for the development and porting of the underlying code, debugging and tuning, and optimization of the computing environment, as well as management functions for user-oriented model testing, evaluation, and tuning. It builds on the common C-Coupler, the flux interactive ensemble coupled platform, and the physical scheme parameter tuning platform (all developed independently in China). The system combines advanced domestic and foreign coupling techniques, quantitative uncertainty analysis methods, and data analysis techniques with modern software engineering practices to form an integrated supercomputing support and management system. It addresses the bottlenecks encountered by China’s ESMs in parallelization, componentization, coupler techniques, and big data analysis, and supports the full process of scientific research activities, covering ESM design, development, debugging, verification, release, and application, with flexibility and efficiency. The system comprises three sub-systems: computing and encapsulation, model resources, and model services. The hierarchical and parallel relationships among the sub-systems are shown in Fig. 4.
Database, data assimilation and visualization system
The database, data assimilation, and visualization system provides data support, assimilation tools, model initial and forcing data, database management, and scientific data visualization functions for the whole of EarthLab, and underpins the earth system numerical simulation. It consists of six sub-systems as follows: data assimilation, observation and simulation experiments, satellite key parameter retrieval, data merging, supporting database, and visualization, as shown in Fig. 5.
This system acquires massive amounts of routine observational data, radar data, and meteorological satellite remote sensing data for China, develops a data assimilation system employing advanced variational and ensemble data assimilation methods, and builds a supporting database and analysis dataset shared by the whole of EarthLab. The system provides high-precision, high-resolution input data and validation data for global-scale earth system modeling and regional high-precision environmental simulations, and serves as an efficient visualization tool that supports visual analysis of big data for EarthLab.
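To make the idea of combining variational and ensemble assimilation concrete, the following is a minimal Python sketch for a single scalar variable and one direct observation. It is an illustration under stated assumptions, not the EarthLab assimilation system. The analysis minimizes the quadratic cost J(x) = (x - x_b)^2/var_b + (y - x)^2/var_o, and the background mean and variance are estimated from a small ensemble. The function names, the ensemble values, and the observation error variance are all hypothetical.

```python
import statistics


def analysis_update(x_b, var_b, y, var_o):
    """Scalar analysis minimizing J(x) = (x - x_b)^2/var_b + (y - x)^2/var_o."""
    gain = var_b / (var_b + var_o)  # weight given to the observation
    x_a = x_b + gain * (y - x_b)    # analysis state
    var_a = (1.0 - gain) * var_b    # analysis error variance
    return x_a, var_a


def ensemble_background(members):
    """Estimate the background mean and error variance from ensemble forecasts."""
    return statistics.mean(members), statistics.variance(members)


if __name__ == "__main__":
    # Hypothetical ensemble of 2 m temperature forecasts (K) and one observation.
    members = [287.1, 288.4, 286.9, 288.0, 287.6]
    x_b, var_b = ensemble_background(members)
    x_a, var_a = analysis_update(x_b, var_b, y=288.5, var_o=0.25)
    print(f"background {x_b:.2f} K (var {var_b:.3f})")
    print(f"analysis   {x_a:.2f} K (var {var_a:.3f})")
```

The analysis always lies between the background and the observation, and its error variance is smaller than either input variance. Operational systems apply the same principle to very large state vectors with full error covariance matrices.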
The observation and simulation experiment sub-system will establish classified quality identifications of the observations through real-time monitoring and evaluation of the instruments on in-orbit satellites, including FY-3 and FY-4. Based on existing observations and simulations, it will also generate real-time parameters of the earth and space environment that affect the observations made by the in-orbit detectors. In addition, the sub-system will build a simulation system based on the requirements for designing new satellite payloads. Simulation performance tests help to obtain optimal parameters for in-orbit instruments, which provide a basis and essential parameters for the product algorithm, assist in the joint debugging of the ground system, and offer a scientific reference for issues such as the constellation layout and payload updates of subsequently launched satellites.
The satellite key parameter retrieval sub-system will use data from the FY-series geostationary and polar-orbiting meteorological satellites to establish a real-time data processing and retrieval system with rigorous quality control of the observations and appropriate retrieval algorithms. The system can efficiently and regularly receive, process, and retrieve key atmospheric parameters with high accuracy and high resolution, such as temperature, humidity, and cloud parameters, providing real-time satellite data for verification of the simulations. Moreover, it will reprocess the long-term time series of historical satellite data to obtain consistent and reliable level 1 data, which can be used to derive key atmospheric parameters that have significant effects on weather and climate, such as cloud, aerosol, and atmospheric precipitation. These parameters can provide input data for the ESM, deepen our understanding of weather processes, and offer observational facts for testing and improving ESMs.
The data merging sub-system will collect real-time observational data from automatic observation stations, radars, and satellites, together with model data from numerical forecasts, to yield multi-source, high-quality, and high-precision gridded meteorological products using data fusion and assimilation technology. All of the data will be preprocessed by the other sub-systems before entering the data merging sub-system. The products will include land surface temperature, wind, humidity, precipitation, and solar radiation data to drive land models, as well as 3-D cloud fields for the hot start of mesoscale atmosphere models.
High-performance computing system for earth sciences
As a core part of the EarthLab hardware, the HPC system for earth sciences will provide a high-performance and extensible hardware platform dedicated to ESM numerical simulations and regional high-precision environmental simulations, and will support the efficient operation of, and data exchange among, the various models. It also provides external services via high-speed networks, as well as the necessary hardware support for the supercomputing support and management system and for the database, data assimilation, and visualization system. The system consists of five sub-systems: model calculation, data storage, network switching, support and management, and infrastructure. The framework is shown in Fig. 6. Upon completion, it is intended to be a world-leading earth system simulation computer. The system has a high-speed network connection to the China Science and Technology Network (CSTNET) and the China Education and Research Network (CERNET), which enables open sharing of the EarthLab computing resources with users at home and abroad.