Add docs for ARC4 compile+run with CASIM/SOCRATES #46
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,200 @@ | ||
| # | ||
|
|
||
| ## A field guide to MONC with CASIM and SOCRATES on ARC4 | ||
|
|
||
| Chris Symonds, Mark Richardson, Steven Boeing, Craig Poku and Leif Denby | ||
|
|
||
| 4/5/2021 | ||
|
|
||
| Aim: instructions on how to retrieve, compile and run MONC with CASIM and SOCRATES, and notes on internal workings of the coupling between MONC and SOCRATES/CASIM | ||
|
|
||
| # _Overview_ | ||
|
|
||
| **CASIM** provides a two-moment(?) microphysics parameterisation which is called on every timestep to predict the formation and removal of water condensate (ice and liquid). | ||
|
|
||
| **SOCRATES** provides functionality to compute column-wise longwave absorption and emission of radiation represented by cooling and heating at every grid point. | ||
|
|
||
| The guide is split into three steps: a) retrieving a copy of MONC, b) compiling and running MONC on ARC4 and c) compiling and running MONC with SOCRATES and CASIM. | ||
|
|
||
| # a. Retrieving MONC | ||
|
|
||
| To get a copy of MONC from the Leeds fork on GitHub, it is recommended that you first [create your own fork](https://github.com/Leeds-MONC/monc/fork) and then clone that fork locally onto the computer where you are working: | ||
|
|
||
| ``` | ||
| $> git clone https://github.com/<your-github-username>/monc/ | ||
| ``` | ||
|
|
||
| # b. Compiling and running MONC on ARC4 | ||
|
|
||
| To compile and run MONC on ARC4 you will need to ensure that the correct versions of required libraries are loaded and then compilation can take place. This should all be taken care of by the script in [utils/arc/monc_compile_arc.sh](../utils/arc/monc_compile_arc.sh) which can be run with | ||
|
|
||
| ```bash | ||
| $> bash utils/arc/monc_compile_arc.sh | ||
| ``` | ||
|
|
||
| from the root of the repository. For completeness, the steps in that script are detailed below, after which details on running MONC on ARC4 are given. | ||
|
|
||
| ## 1. Make sure the dependencies of MONC are available | ||
|
|
||
| Issue the following commands to load the correct modules with `module`: | ||
|
|
||
| ```bash | ||
| $> module purge | ||
| $> module load user | ||
| $> module switch intel gnu/native | ||
| $> module switch openmpi mvapich2 | ||
| $> module load netcdf hdf5 fftw fcm | ||
| ``` | ||
|
|
||
| At time of writing MONC was compiled with the following versions: gnu compiler `4.8.5` (called `gnu/native` on `ARC4`), mvapich2 `2.3.1`, fcm `2019.09.0`, hdf5 `1.8.21`, netcdf `4.6.3` and fftw `3.3.8` | ||
|
|
||
| Notes on versions: | ||
|
|
||
| - gnu version: At time of writing (i.e. the version `v0.9.0` of) MONC only works with gnu compilers with version `4.x.x` (e.g. `gnu/native` on ARC4 which is `4.8.5`), but changes for gnu versions `>= 7.x.x` are being worked on by Chris Symonds (@cemac-ccs) | ||
| - MPI implementation: `openmpi` or `mvapich2` doesn't allow for multithreading, but Rachel Sansom has found MONC to be more stable with `mvapich2` on ARC4 | ||
| - fftw: due to licensing issues, fftw is no longer the default Fourier transform library on the head of MOSRS `trunk`; ffte is used there instead (ffte is included with the MONC source code on MOSRS `trunk`). In principle there is no issue with using FFTW, and if your research can work with its licensing and it is installed, you might be happier using it. MOSRS was not clearly labelled before that change was made. | ||
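The compiler constraint in the notes above (MONC `v0.9.0` needing a gnu `4.x.x` compiler) can be checked before compiling. The following is only a minimal sketch: the function names are made up for illustration, and it assumes the compiler reports a dotted version string such as `4.8.5`.

```bash
#!/usr/bin/env bash
# Sketch: check that a GNU compiler version string has major version 4.
# Function names are illustrative and not part of MONC or its scripts.

# Print the major component of a dotted version string, e.g. "4.8.5" -> "4".
gnu_major_version() {
    local version="$1"
    echo "${version%%.*}"
}

# Return 0 if the given version is a 4.x.x release, 1 otherwise.
is_supported_gnu_version() {
    [ "$(gnu_major_version "$1")" = "4" ]
}

# Example use after loading the modules (assumes gfortran is on PATH):
#   if ! is_supported_gnu_version "$(gfortran -dumpversion)"; then
#       echo "warning: MONC v0.9.0 expects a 4.x.x gnu compiler" >&2
#   fi
```

Taking the version string as an argument, rather than calling the compiler inside the function, keeps the check usable on machines where the modules are not loaded.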
|
|
||
| ## 2. Compile your copy of MONC | ||
|
|
||
| `fcm` here wraps `make`, and the .cfg-files in `fcm-make/` set up the compilation environment: | ||
|
|
||
| ```bash | ||
| $> fcm make -j4 -f fcm-make/monc-arc4-gnu.cfg -N --ignore-lock | ||
| ``` | ||
|
|
||
| ## 3. Submitting a MONC run | ||
|
|
||
| A job submission script (an example is given in [utils/arc/submonc.sge](../utils/arc/submonc.sge)) should have the following important features: | ||
|
|
||
| 1. Job _walltime_ longer than _walltime_ in monc configuration | ||
| 2. Module loads (to make required external libraries available at runtime) | ||
| 3. MVAPICH variables: | ||
|
|
||
| ``` | ||
| MONC_THREAD_MULTIPLE=0 # to do with "thread-pooling" in MONC | ||
| MV2_ENABLE_AFFINITY=0 | ||
| MV2_SHOW_CPU_BINDING=1 | ||
| MV2_USE_THREAD_WARNING=0 | ||
| export MONC_THREAD_MULTIPLE MV2_ENABLE_AFFINITY \ | ||
| MV2_SHOW_CPU_BINDING MV2_USE_THREAD_WARNING | ||
| ``` | ||
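The first requirement in the list above (job walltime longer than the MONC walltime) is easy to get wrong when editing the two values separately. A minimal sketch of a comparison, assuming both walltimes are written as `HH:MM:SS`; the function names are illustrative and nothing here is taken from `submonc.sge`:

```bash
#!/usr/bin/env bash
# Sketch: compare a scheduler walltime with the walltime set in the MONC
# configuration. Both are assumed to be in HH:MM:SS form.

# Convert HH:MM:SS to seconds. The 10# prefix forces base 10 so that
# fields such as "08" are not parsed as (invalid) octal numbers.
walltime_to_seconds() {
    local h m s
    IFS=: read -r h m s <<< "$1"
    echo $(( 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

# Return 0 if the job walltime is strictly longer than MONC's walltime.
job_walltime_exceeds_monc() {
    local job="$1" monc="$2"
    [ "$(walltime_to_seconds "$job")" -gt "$(walltime_to_seconds "$monc")" ]
}

# Example use with hand-copied values:
#   job_walltime_exceeds_monc "01:00:00" "00:55:00" \
#       || echo "job walltime must be longer than the MONC walltime" >&2
```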
|
|
||
| Then you execute your job submission script | ||
|
|
||
| ```bash | ||
| $> qsub <your-job-script.sub> | ||
| ``` | ||
|
|
||
| NOTE: before submitting the job ensure that the "standard out" path is cleared, so that the job isn't restarted from a previous run. | ||
|
|
||
| ## 4. Check that job is queued/running | ||
|
|
||
| You can now check that your job is running (the "cluster" _group_ on ARC4 is called `c`): | ||
|
|
||
| ```bash | ||
| $> qstat -g c | ||
| ``` | ||
|
|
||
| If this run completed successfully you can now continue on to compiling and running MONC with SOCRATES and CASIM. | ||
|
|
||
| # c. Compiling MONC with SOCRATES and CASIM | ||
|
|
||
| Due to the licenses of SOCRATES and CASIM, the source code for both resides on MOSRS (the Met Office Science Repository Service), to which you will need access in order to use MONC with CASIM/SOCRATES. Once you have your credentials you can follow the steps below to set up `fcm` so that you can retrieve SOCRATES/CASIM with `fcm` and compile MONC with either or both components. | ||
|
|
||
| ## 1. Setup and check SVN connection | ||
|
|
||
| Add MOSRS to `svn` (subversion) list of servers by adding the following lines to `~/.subversion/servers` | ||
|
|
||
| ``` | ||
| [groups] | ||
| metofficesharedrepos = code*.metoffice.gov.uk | ||
|
|
||
| [metofficesharedrepos] | ||
| username = <your-username> | ||
| store-plaintext-passwords=no | ||
| ``` | ||
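Before testing the connection, it may help to confirm that the lines above actually ended up in the servers file. This is a hedged sketch only: the function name is hypothetical, and the username check is deliberately loose (it accepts a `username =` line in any section).

```bash
#!/usr/bin/env bash
# Sketch: check that an svn servers file contains the MOSRS group, its
# section header and a username line. Illustrative only.
# $1: path to the servers file (defaults to ~/.subversion/servers)
svn_servers_has_mosrs() {
    local servers_file="${1:-$HOME/.subversion/servers}"
    [ -f "$servers_file" ] || return 1
    # Group definition, section header, then any username assignment.
    grep -Eq '^[[:space:]]*metofficesharedrepos[[:space:]]*=' "$servers_file" &&
        grep -Fq '[metofficesharedrepos]' "$servers_file" &&
        grep -Eq '^[[:space:]]*username[[:space:]]*=' "$servers_file"
}

# Example use:
#   svn_servers_has_mosrs || echo "MOSRS is not configured for svn yet" >&2
```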
|
|
||
| [https://code.metoffice.gov.uk/trac/monc/wiki/MoncDoc/MoncUserguide/MosrsSetup](https://code.metoffice.gov.uk/trac/monc/wiki/MoncDoc/MoncUserguide/MosrsSetup) | ||
|
|
||
| Now check the SVN connection (you may be asked whether to cache the password; Craig found it best not to cache it encrypted) | ||
|
|
||
| ```bash | ||
| $> svn info https://code.metoffice.gov.uk/svn/test | ||
| ``` | ||
|
|
||
| Caching password is covered in above MOSRS link and here: [http://cms.ncas.ac.uk/wiki/MonsoonSshAgent](http://cms.ncas.ac.uk/wiki/MonsoonSshAgent). | ||
|
Collaborator
The page linked only really deals with setting up ssh keys between monsoon and puma. I believe the correct link is [https://code.metoffice.gov.uk/trac/home/wiki/AuthenticationCaching]. It should however be noted that those instructions are not correct for these purposes as they require rose to be installed to be used as a checker. For compilation on Archer2 I have added a call to cache the password, so I will include that in the compilation script for ARC4 once I have got the caching actually working
Collaborator
Update - I have spoken to richard and the issue with gpg password caching was due to the version of svn on arc4. The following needs to be added to allow password caching on arc4: I have added these lines to the compilation script.
Collaborator
Just tried giving this a go, because I was previously just mashing my password 4327 times.. it's coming up that the "mosrs-setup-gpg-agent" command is not found. Do I need to get that from somewhere else?
Collaborator
Sorry, I had the permissions wrong. Give it a try now
Collaborator
Correction - the command is
Collaborator
Hmm, it's not finding the file..
Collaborator
Author
Maybe it is best that I add a comment that password-caching doesn't currently work on ARC4? I'd like to get these instructions into
Collaborator
That might be wise in the short term Leif. Rachel, which file is it saying it can't find? Did you run all four lines? You should be able to confirm whether it worked or not by running
Collaborator
It's when I try the command that is
I ran all four lines and tried both variations of that command and when I run the test it fails. |
||
|
|
||
| ## 2. Check fcm keywords | ||
|
Collaborator
The compilation script in PR #48 includes keyword handling by using sed to modify the contents of the keyword.cfg file to point to the working folder, and sending the output of the sed command to ~/.metomi/fcm/keyword.cfg. |
||
|
|
||
| [https://code.metoffice.gov.uk/trac/monc/wiki/MoncDoc/MoncUserguide/FcmKeyWords](https://code.metoffice.gov.uk/trac/monc/wiki/MoncDoc/MoncUserguide/FcmKeyWords) | ||
|
|
||
| Check what keywords are there by doing following command: | ||
|
|
||
| ```bash | ||
| $> fcm keyword-print | ||
| ``` | ||
|
|
||
| If you don't see `socrates` or `casim` mentioned in the list of keywords you will need to give `fcm` the correct set of keywords by adding the following in a file at `~/.metomi/fcm/keyword.cfg` (you may need to create this file): | ||
|
|
||
| ``` | ||
| location{primary, type:svn}[monc.x] = https://code.metoffice.gov.uk/svn/monc/main | ||
| browser.loc-tmpl[monc.x] = https://code.metoffice.gov.uk/trac/{1}/intertrac/source:/{2}{3} | ||
| browser.comp-pat[monc.x] = (?msx-i:\A // [^/]+ /svn/ ([^/]+) /*(.*) \z) | ||
|
|
||
| location{primary}[casim.x] = https://code.metoffice.gov.uk/svn/monc/casim | ||
| location{primary}[monc-doc.x] = https://code.metoffice.gov.uk/svn/monc/doc | ||
| location{primary}[monc-postproc.x] = https://code.metoffice.gov.uk/svn/monc/postproc | ||
| location{primary}[monc-scripts.x] = https://code.metoffice.gov.uk/svn/monc/scripts | ||
| location{primary}[monc.x] = https://code.metoffice.gov.uk/svn/monc/main | ||
| location{primary}[socrates.x] = https://code.metoffice.gov.uk/svn/socrates/main | ||
| ``` | ||
|
|
||
| Then run `fcm keyword-print` again to check that the keywords now include `casim` and `socrates`. | ||
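The check for `casim` and `socrates` in the keyword list can also be scripted. A minimal sketch, assuming only that the output of `fcm keyword-print` mentions each keyword name somewhere in its text; the function name is hypothetical and the listing is read from standard input so that `fcm` itself is not needed to try it out:

```bash
#!/usr/bin/env bash
# Sketch: read a keyword listing on stdin and print every required keyword
# that does not appear in it. Returns 0 if nothing is missing, 1 otherwise.
missing_fcm_keywords() {
    local listing name missing=0
    listing="$(cat)"
    for name in casim socrates; do
        if ! grep -q "$name" <<< "$listing"; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Example use:
#   fcm keyword-print | missing_fcm_keywords \
#       || echo "add the keywords above to ~/.metomi/fcm/keyword.cfg" >&2
```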
|
|
||
| ## 3. Obtaining source-code for SOCRATES/CASIM and compiling MONC with these | ||
|
|
||
| In addition to the steps above, a copy of the source code for SOCRATES/CASIM is needed to compile MONC with CASIM/SOCRATES (both are treated identically in the comments that follow). The location of CASIM/SOCRATES is given through an fcm configuration file (.cfg-file), which is provided to fcm on the command line (with `-f`). This location can either be: a) a local filesystem path or b) a remote SVN path on MOSRS (with an optional revision number from that SVN path to use). If option b) is used, fcm will both check out the SOCRATES/CASIM code from MOSRS and compile it with MONC. | ||
|
|
||
| When calling `fcm` you need to include the `casim.cfg`, `socrates.cfg` or `casim_socrates.cfg` .cfg-file (for compiling with CASIM, SOCRATES or both, respectively). Here we're including both CASIM and SOCRATES: | ||
|
|
||
| ```bash | ||
| $> fcm make -j4 -f fcm-make/monc-arc4-gnu.cfg -f fcm-make/casim_socrates.cfg -N --ignore-lock | ||
| ``` | ||
|
|
||
| You need to place the `casim`/`socrates`/`casim_socrates` fcm make config file _after_ the MONC file. | ||
|
|
||
| You can change which versions of CASIM/SOCRATES are fetched from MOSRS by changing the `casim_revision` and `socrates_revision` variables in the `.cfg`-files. At time of writing the versions to use with MONC `0.9.0` are revision `um10.8` for SOCRATES and revision `6341` for CASIM. Later versions may require changes to MONC for compatibility. | ||
|
Collaborator
The keyword "um10.8" didn't work for me, so I just used the version Chris had been using, revision 658.
Collaborator
Author
Thanks! I'll put
Collaborator
Yes! I was compiling with fcm-make/casim_socrates.cfg and just changed: @cemac-ccs just checking that it was socrates 658 that you have been using?
Collaborator
Actually revision 358 (which is what get checked out when you use the um10.8 flag). I might have mispoken on our call last week. Looking at the casim_socrates_mirror.cfg file however, it seems that 593 is also a good revision to use.
Collaborator
Author
Ok, so is the consensus that revision we should go with is
Collaborator
I have confirmed that it compiles with casim 6431 and socrates 658. I am just confirming that it runs on arc4 with these. |
||
|
|
||
| The fcm configuration file does a number of things: | ||
|
|
||
| 1. Instruct fcm to _extract_ code for CASIM and SOCRATES (makes sure the source files are included) | ||
| 2. Include interfaces for CASIM and SOCRATES, instead of place-holder routines | ||
| 3. Define locations of where SOCRATES/CASIM is coming from | ||
| 4. Set environment variables needed for CASIM/SOCRATES when being compiled to run inside MONC | ||
|
|
||
| ## 4. Running MONC with CASIM and SOCRATES | ||
|
|
||
| To run MONC with CASIM and SOCRATES three things are needed in the model configuration (`.mcf`) file: | ||
|
|
||
| 1. Flags to enable CASIM/SOCRATES and disable the functionality they replace | ||
|
|
||
| ``` | ||
| # required flags for CASIM | ||
| simplecloud_enabled=.false. | ||
| casim_enabled=.true. | ||
| # required flags for SOCRATES: | ||
| socrates_couple_enabled=.true. | ||
| lwrad_exponential_enabled=.false. # turn off "bulk-calculation of radiation" | ||
| ``` | ||
|
|
||
| 2. Test-case specific parameters. These define which microphysics processes to include, as well as specific configuration parameters for CASIM. | ||
|
|
||
| ``` | ||
| # number of scalar fields allocated, this needs to match the | ||
| # total number of tracers _required_. water vapour, graupel, etc | ||
| number_q_fields=9 | ||
| ``` | ||
|
|
||
| 3. External files providing reference profiles for radiation calculations (SOCRATES) and microphysics (CASIM). Craig Poku noted that providing _relative paths_ for these files based on where CASIM/SOCRATES is checked out within the MONC source tree (when fcm is used to fetch CASIM/SOCRATES sources from MOSRS) doesn't work, and these configuration file parameters should point to where one has checked out the SOCRATES/CASIM source by hand. | ||
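The flags from item 1 of the list above can be checked in a `.mcf` file before submitting a run. This is a rough sketch under two assumptions that are not confirmed by the guide: that `casim_enabled=.true.` is what switches CASIM on, and that the last occurrence of a key in the file is the one that counts. The function names are illustrative.

```bash
#!/usr/bin/env bash
# Sketch: verify CASIM/SOCRATES flags in a MONC .mcf file.

# Return 0 if the file sets key to the expected value. Surrounding spaces
# and trailing "# ..." comments are ignored; the last occurrence is used
# (assumption, see the text above).
mcf_flag_is() {
    local file="$1" key="$2" expected="$3" value
    value="$(sed -n "s/^[[:space:]]*${key}[[:space:]]*=[[:space:]]*\([^#[:space:]]*\).*/\1/p" "$file" | tail -n 1)"
    [ "$value" = "$expected" ]
}

# Print one line per flag that does not have the expected value.
# Returns 0 only if all four flags match.
check_casim_socrates_flags() {
    local file="$1" status=0 pair key expected
    for pair in \
        "simplecloud_enabled=.false." \
        "casim_enabled=.true." \
        "socrates_couple_enabled=.true." \
        "lwrad_exponential_enabled=.false."; do
        key="${pair%%=*}"
        expected="${pair#*=}"
        if ! mcf_flag_is "$file" "$key" "$expected"; then
            echo "expected ${key}=${expected}"
            status=1
        fi
    done
    return "$status"
}

# Example use:
#   check_casim_socrates_flags path/to/your_case.mcf
```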
|
Collaborator
This last point might be quite vague, depending on the user, if they are not familiar with mosrs, checking out branches, and the specific files for socrates. Not a problem now, but just if the user base grows, perhaps!
Collaborator
Author
Yes, I agree. What do you think we should write instead? I didn't take more detailed notes when we typed up the field guid. But maybe we could mention the specific files required? Do you feel like writing a suggestion?
Collaborator
Yes, I'm happy to write a suggestion. Or what I did, at least! I'll email over or add as a comment.
Collaborator
I would write something like this:

Checking out SOCRATES/CASIM source code by hand

These steps are the same for both SOCRATES and CASIM by using the appropriate name and path. First, create a branch for SOCRATES/CASIM from the command line on ARC4:

`$> fcm bc <branch_name> fcm:<socrates/casim>.x_tr@<version_number>`

Navigate to the directory you wish to store the source code and check out your branch:

`$> fcm co <your_branch>`

Where `<your_branch>` is the full Met Office URL printed at the end of branch create: `[info] Created: https://code.metoffice.gov.uk/svn/...`

If you want to build MONC with these branches, you need to change the source-code location in the fcm configuration file (e.g. fcm-make/casim_socrates.cfg) to point to your branches on MOSRS. Change the SOCRATES/CASIM fcm-make .cfg file to point to your branches:

Build MONC with SOCRATES/CASIM using the fcm command and correct SOCRATES/CASIM config file, e.g.

`$> fcm make -j4 -f fcm-make/monc-arc4-gnu.cfg -f fcm-make/casim_socrates.cfg`

There are reference files for radiation calculations for SOCRATES that are hard-wired for use on MONSooN by default. The location of these needs to be updated in any MONC config (.mcf) files to point to your checked out branch. can be changed to your local branch of SOCRATES, e.g.:
Collaborator
This is great Rachel. A possible addition could be explicit instructions to run a fcm commit after making any changes to the casim or socrates code before compiling.
Collaborator
Author
Thanks for writing this @eers1! I'm a bit confused though @eers1 and @cemac-ccs: is it really necessary to create a new branch on MOSRS and to commit back to it? We're not making any changes to CASIM/SOCRATES here, we're only trying to get specific revisions of the code so they can be compiled into MONC, no?
Collaborator
That is true, but having instructions on how to compile with a local source would be useful, and it is most likely that those who have local sources will have made changes to the code themselves. A side note that I came across when running on Archer2 is that when defining the location of the socrates spectral files, if you use the ones in the monc source it may fail due to the absence of the sp_sw_ga7_k and sp_lw_ga7_k files in the same folder. I am not sure whether the fcm download of socrates can be set up to put these files in the correct place.
Collaborator
Just coming back to this now, but yes sorry I maybe got carried away with making and checking out new branches. So without doing that, do you just change the SOCRATES/CASIM fcm-make .cfg file revision number but keep it pointing to the trunk? Also, I'm not sure how to point to the SOCRATES files without having them locally. Like @cemac-ccs says, might be nice to keep the bit about new branches and have info about making changes to CASIM/SOCRATES, committing the changes and compiling with those branches. |
||
|
|
||
| ## 5. Looking at the output | ||
|
|
||
| The output can be inspected with ncview or similar. The output files are written to the folder `diagnostic_files`, and which outputs are produced is defined in the `.mcf` file. | ||
Would it be a good idea to create a restart script, similar to the one Craig was using (and the one that exists within the /misc/continuation.sh script)? I believe that all the code that we would need to add to the continuation script is in Craigs branch submission script.
A continuation script would be great! Richard Rigby did make one for ARC4 that he gave to me, I just hadn't got around to testing it since I worked through all my errors.
I think a continuation script could be a good idea, but I feel like they often get quite long an unwieldy and it's difficult to account for all the possible use cases. Are you thinking one that work with MONC used on any HPC system? And what changes would need to be made input files before running the restart file? Maybe you could open a new issue on this and we could hash out some ideas :) What do you think?
Yeah, I think moving that discussion to a new issue is wise.