How to find where/why a simulation crashed/stopped

1) Check whether the simulation stopped in the scripts job or in the model job

Go into your listings directory:

    cd ~/listings/${TRUE_HOST}
For example, on Beluga:
    cd ~/listings/Beluga
or on Narval:
    cd ~/listings/Narval

List all the listings of the month that failed:

    ls -lrt ${GEM_exp}_[MS]*

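Open the last listing in your editor or with 'less'. Because 'ls -lrt' sorts by modification time with the oldest first, the last entry is the most recent listing.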
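
The two steps above (list the listings, then open the newest one) can be combined in a small helper. This is only a sketch: the function name latest_listing is hypothetical and not part of the GEM scripts, and it assumes the listings are plain files named ${GEM_exp}_S* or ${GEM_exp}_M*.

```shell
# Hypothetical helper (not part of the GEM scripts): print the most recently
# modified scripts or model listing of an experiment.
#   $1 = listings directory, e.g. ~/listings/${TRUE_HOST}
#   $2 = experiment name, e.g. the value of ${GEM_exp}
latest_listing() {
    dir=$1
    exp=$2
    # 'ls -t' sorts newest first; stderr is hidden in case nothing matches
    ls -t "$dir"/"$exp"_[MS]* 2>/dev/null | head -n 1
}

# Example use (needs a real listings directory, so it is left commented out):
# less "$(latest_listing ~/listings/"${TRUE_HOST}" "${GEM_exp}")"
```

If no listing matches, the function prints nothing, so it is worth checking the result before passing it to 'less'.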

If the model stopped in the ...

a) Scripts listing ${GEM_exp}_S*

  • Jump to the end of the listing (in 'vi', 'vim' or 'less' you can jump to the end by pressing 'G')
  • Search upwards until you find the first error message
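
As an alternative to searching by hand, 'grep' can show the last lines of a listing that look like errors. This is a rough sketch: the helper name last_errors is hypothetical, and the search patterns are assumptions rather than a list of messages the scripts are known to print, so adjust them to what you actually see in your listings.

```shell
# Hypothetical helper: print the last few lines of a listing that look like
# error messages, each prefixed with its line number.
#   $1 = listing file
#   $2 = number of matches to show (default: 5)
last_errors() {
    file=$1
    count=${2:-5}
    # The patterns below are guesses, not an official list of messages
    grep -n -i -E 'error|abort|fatal|segmentation fault' "$file" | tail -n "$count"
}

# Example use (commented out, needs a real listing):
# last_errors "${GEM_exp}_S..." 10
```

The line numbers in the output can be used to jump directly to the message, for example with '42G' in 'vi', 'vim' or 'less'.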

b) Model listing ${GEM_exp}_M*

Jump to the end of the listing (in 'vi', 'vim' or 'less' you can jump to the end by pressing 'G').

Each model job consists of 3 main parts:

  • It starts with a shell part,
  • followed by the Fortran executable,
  • followed by another shell part.

If all goes well, the first shell part ends with:


    INFO: MPI launch after 4 second(s)
    INFO: START of listing processing : Mon Aug 30 12:33:56 EDT 2021
    ==============       start of parallel run       ==============
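
A quick way to see whether a model listing got this far is to search for the marker line shown above. The sketch below assumes the text 'start of parallel run' appears exactly as in that excerpt; the function name reached_parallel_run is hypothetical.

```shell
# Hypothetical helper: succeed (exit status 0) if the model listing contains
# the marker that ends the first shell part.
#   $1 = model listing file
reached_parallel_run() {
    # -q: print nothing, only report through the exit status
    grep -q 'start of parallel run' "$1"
}

# Example use (commented out, needs a real model listing):
# if reached_parallel_run "${GEM_exp}_M..."; then
#     echo "first shell part completed"
# else
#     echo "marker not found"
# fi
```

If the marker is missing, the job presumably stopped in the first shell part, before the parallel run began.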


2. Fortran part

Jump to the end of the listing (in 'vi', 'vim' or 'less' you can jump to the end by pressing 'G').

