General information

Domain & resolution

First, decide on the final(!) domain and resolution you want to run (see "Cascading" below). When doing so, keep in mind that:

the more grid points a grid has => the more cores are needed
the higher the resolution => the shorter the model timestep => the longer the runtime of the simulation

For how to set up a model grid, see the following wiki: Set up a model grid

Topology (MPI & OpenMP)

Resolution → timestep

The relation between the horizontal resolution and the model timestep is approximately linear. Here are some examples:

Resolution [degree]	Resolution [km]	model timestep [sec]	For GY grid
0.009°	~1 km	30 s
0.0225°	~2.5 km	60 s
0.036°	~4 km	90 s
0.11°	~12 km	300 s
	~25 km	720 s	Grd_nj = 417
	~55 km	1800 s	Grd_nj = 171
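For resolutions that are not in the table, a rough first guess can be obtained by interpolating between the tabulated values. The following is a minimal sketch, assuming piecewise-linear interpolation between the approximate km values is good enough for a starting point. The name estimate_timestep is hypothetical and not part of GEM.

```python
# (resolution in km, model timestep in s) pairs copied from the table above.
TIMESTEP_TABLE = [(1.0, 30.0), (2.5, 60.0), (4.0, 90.0),
                  (12.0, 300.0), (25.0, 720.0), (55.0, 1800.0)]


def estimate_timestep(res_km):
    """Interpolate linearly between tabulated values (hypothetical helper)."""
    pts = TIMESTEP_TABLE
    if not pts[0][0] <= res_km <= pts[-1][0]:
        raise ValueError("resolution outside the tabulated range")
    # Walk through neighbouring table entries and interpolate in the
    # interval that contains the requested resolution.
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= res_km <= x1:
            return y0 + (y1 - y0) * (res_km - x0) / (x1 - x0)


print(estimate_timestep(8.0))
```

The value returned is only a first guess; the timestep actually used should still be checked against the stability of the run.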

Resolution → timestep → radiation timestep

Calculating the radiation-dependent increments on the different fields is rather costly. Therefore, the option exists to calculate the increments only every n seconds but still apply them at every timestep. The parameter to set is called 'KntRad_S' in the 'gem_settings.nml'. It should be set to 4 to 6 times the model timestep. The unit is seconds.
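As a small worked example of the rule above, the sketch below lists the values of 'KntRad_S' that are 4, 5 and 6 times a given model timestep. The function name kntrad_candidates is hypothetical; only the 4-to-6 rule comes from the text.

```python
def kntrad_candidates(timestep_s):
    """Return 4x, 5x and 6x the model timestep, in seconds."""
    return [n * timestep_s for n in (4, 5, 6)]


# Example: a 300 s model timestep (the 0.11 degree case in the table).
print(kntrad_candidates(300))
```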

Resolution → (non) hydrostatic

If the grid spacing is ~10 km or coarser, the model is hydrostatic and one should set 'Dynamics_hydro_L = .true.' in the 'gem_settings.nml'. For grid spacings of ~4 km or finer, one should set 'Dynamics_hydro_L = .false.'
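The rule can be written down as a small helper. This is only a sketch: the thresholds are the approximate values quoted above, the function name hydro_setting is hypothetical, and for grid spacings between the two thresholds the text gives no recommendation, so the helper returns None there.

```python
def hydro_setting(res_km):
    """Suggest a value for Dynamics_hydro_L from the grid spacing in km."""
    if res_km >= 10.0:
        return ".true."    # ~10 km or coarser: hydrostatic
    if res_km <= 4.0:
        return ".false."   # ~4 km or finer: non-hydrostatic
    return None            # in between: not covered by the rule above


for res in (25.0, 2.5):
    print("Dynamics_hydro_L = %s" % hydro_setting(res))
```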

Model levels

The model levels in GEM are terrain following at the surface and pressure levels at the model top. The model levels are generally closer to each other at the surface and further apart higher up. The model is very sensitive to the location of these model levels. They must be neither too far apart nor too close, and the transition from terrain following to pressure levels must be very smooth. There are ways to set up these model levels. Several sets of model levels have already been created and can be selected by setting 'CLIMAT_etaname' in the file configexp.cfg. The possible settings for this parameter can be found in the file:
~/modeles/GEMDM/*/bin/Climat_eta2preshyb

Vertical sponge & top piloting (LAM only)

Probably because waves get reflected back into the atmosphere from the model lid, a spurious wind jet forms at the equatorial stratopause. To remove this spurious wind jet, a so-called "vertical sponge" is used at the model top to absorb these waves so they do not get reflected. For more information and how to set up the vertical sponge (Vspng), have a look at the following wiki: Diffusion & Sponges

In LAM mode, one also has the option of nesting the top model layers instead, which is called "top piloting".

When top piloting is used, the vertical sponge should NOT be used!

Global grid → adjust global pressure

Due to the semi-Lagrangian advection scheme, which does not exactly conserve the total mass of the atmosphere, the GEM model is "losing air" with time. Therefore, the parameter 'Schm_psadj' in the gem_settings.nml needs to be set to one of the following:

!# * 0 -> No conservation of surface pressure (for LAM grids)
!# * 1 -> Conservation of total air mass pressure (for global grids)
!# * 2 -> Conservation of dry air mass pressure (for global grids)

Cascading

When the resolution of the driving data is too coarse relative to the model grid resolution, one or more simulations can be run at an intermediate resolution, each on a domain slightly larger than the next inner domain.

Spectral nudging

To constrain the large scales of a simulation, spectral nudging can be used. To learn more about this and which parameters to set, have a look at the following wiki page: Spectral nudging

Period

The start and end dates of a simulation need to be set in the file configexp.cfg with the parameters:
CLIMAT_startdate="YYYY MM DD hh"
CLIMAT_enddate="YYYY MM DD hh"
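To avoid formatting mistakes, the two lines can be generated from date objects. The sketch below assumes only the "YYYY MM DD hh" layout shown above; the helper name climat_date and the example dates are hypothetical.

```python
from datetime import datetime


def climat_date(dt):
    """Format a datetime as 'YYYY MM DD hh' (zero-padded, space-separated)."""
    return dt.strftime("%Y %m %d %H")


# Hypothetical example period.
start = datetime(1979, 1, 1, 0)
end = datetime(1989, 1, 1, 0)
print('CLIMAT_startdate="%s"' % climat_date(start))
print('CLIMAT_enddate="%s"' % climat_date(end))
```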

Greenhouse gases / Scenario

Depending on the period (past, current, future) and, for future simulations, on the scenario simulated, a table containing the annual mean concentrations of the five main greenhouse gases (CO2, N2O, CH4, CFC-11, CFC-12) needs to be assigned to the parameter 'CLIMAT_ghg_list' (in the file configexp.cfg). A selection of different lists can be found under: ${MODEL_DATA}/Greenhouse_gases

One can also create one's own list.

Initial conditions

At the beginning of every simulation, several fields (atmospheric, surface and soil) need to be initialized. Especially for short simulations, it is very important to use fields that best represent the starting state of the simulation. If such fields are not available, one can also choose to spin up the simulation. Whereas atmospheric fields reach equilibrium very quickly, surface fields like snow on the ground might take at least a season. Soil fields, like soil temperature and soil moisture, can take several years to reach equilibrium, depending on the soil depth.

For a list of the fields that need to be initialized, have a look at the following wiki: Initial conditions

Driving data (lower & lateral)

Lower boundary conditions (SST & SIC)

Since our(!!!) GEM version is not coupled to an ocean model, the sea surface temperature (SST) and the sea ice fraction (SIC) need to be prescribed. These two fields can be prescribed at any frequency from every model timestep to monthly. Most of the datasets we have are either daily or monthly. If the dataset contains monthly data, all timesteps can be in one file. If the data are at a higher frequency, one file per month should be used, with the filename ending in '_YYYYMM'. The parameter 'GEM_anclima' needs to be set to the location of the dataset to use. A selection of already available datasets can be found under: ${MODEL_DATA}/SST_SeaIce_degK. One can also create one's own dataset.

When only one file exists, 'GEM_anclima' needs to be set to the full name of the file.

If several monthly files exist, 'GEM_anclima' needs to be set to the full name of the file without the '_YYYYMM' at the end! The scripts will fill in the current year and month.
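The naming rule for monthly files can be illustrated as follows. This is a sketch only: the helper names and the example path are hypothetical, and the second function merely mimics the '_YYYYMM' suffix described above, not the actual GEM scripts.

```python
import re


def anclima_value(filename):
    """Strip a trailing '_YYYYMM' (if any) to get the GEM_anclima setting."""
    return re.sub(r"_\d{6}$", "", filename)


def monthly_filename(anclima, year, month):
    """Rebuild the name of the file for a given year and month."""
    return "%s_%04d%02d" % (anclima, year, month)


# Hypothetical dataset with one file per month.
first_file = "/path/to/my_sst_sic_daily_197901"
setting = anclima_value(first_file)
print("GEM_anclima=%s" % setting)
print(monthly_filename(setting, 1979, 2))
```

A single-file dataset has no '_YYYYMM' suffix, so anclima_value returns its name unchanged.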

Lateral boundary conditions (LAM only)

When GEM is run in LAM mode, a set of atmospheric data needs to be provided to drive the model at the lateral boundaries. These datasets need to include the following fields:

Temperature
Moisture (relative humidity, specific humidity, or dew point temperature)
Horizontal winds in u- and v-grid(!) direction
Geopotential (pressure levels: on all levels; model levels: only lowest level)
Surface pressure
Condensates (if available; they need to match the condensation scheme used!)

The 3-D fields can be either on model or on pressure levels.

Geophysical fields

Geophysical fields are time-invariant fields describing surface properties such as surface fractions (ocean, glacier, lakes, vegetation fractions), mountain height and several other fields.

For a list of the fields that need to be provided, have a look at the following wiki: Geophysical fields

Schemes

Land surface scheme (ISBA, CLASS, SVS)

Lake scheme (none, FLake, CSLM)

Urban scheme (none, TEB)

Radiation

Roughness length

Limit ice /snow

Gravity wave drag

Emissivity

Condensation

Convection

Tracers to advect

Precipitation (Bourgouin)

Boundary layer

Horizontal diffusion

Output fields

Instantaneous / averages / min / max

Initial condition fields

Pilot fields (2-D / 3-D)

Output frequency

Output levels

Size of output files (monthly/daily/hourly/...)

Submission

Account

Wall clock time

The clusters of the Alliance have different queues for jobs requesting different amounts of walltime and memory. To find out which queues exist on the cluster you want to run on, check the following wiki of the Alliance: Job_scheduling_policies

In the config file 'configexp.cfg', set 'BACKEND_time_mod' to the number of seconds you want to request for a single job, not for the whole simulation, which can consist of multiple jobs.

In general, the shorter the requested time, the shorter the time spent in the queue. Therefore, you want to request as little time as possible for a job. However, runtimes on the clusters of the Alliance can vary a lot, normally between -15 % and +20 % of the average runtime. Hence, one should request runtimes that are about 25 % longer than the average expected runtime.
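The 25 % rule translates into a value for 'BACKEND_time_mod' as in the following sketch. The function name backend_time_mod and the 10-hour example are hypothetical; only the margin and the unit (seconds) come from the text.

```python
import math


def backend_time_mod(avg_runtime_hours, margin=0.25):
    """Seconds to request for one job: average runtime plus a safety margin."""
    return math.ceil(avg_runtime_hours * 3600 * (1 + margin))


# Example: jobs that take 10 hours on average.
print("BACKEND_time_mod=%d" % backend_time_mod(10))
```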

If you only want to run one job, you can just request "enough" runtime. But when you are running a simulation with a sequence of jobs, you might want to try to fit the jobs at the higher end of a queue. For example, assume a cluster has the following queues:

3 hours or less,
12 hours or less,
24 hours (1 day) or less,
72 hours (3 days) or less,

And the simulation you want to run usually needs 14 hours per job.
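To see where such a job lands, the sketch below applies the 25 % margin suggested above and looks up the shortest queue that still fits the request. The queue limits are the example values listed above, and the helper name smallest_queue is hypothetical.

```python
# Example queue limits in hours, as listed above.
QUEUE_LIMITS_H = [3, 12, 24, 72]


def smallest_queue(request_h):
    """Return the limit (in hours) of the shortest queue that fits the request."""
    for limit in QUEUE_LIMITS_H:
        if request_h <= limit:
            return limit
    raise ValueError("request exceeds the longest queue")


request_h = 14 * 1.25                 # 14 h average runtime plus 25 % margin
queue_h = smallest_queue(request_h)
print("request: %.1f h" % request_h)
print("queue limit: %d h" % queue_h)
print("unused headroom: %.1f h" % (queue_h - request_h))
```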

Memory