#!/usr/bin/python
# -*- coding: iso-8859-15 -*-
"""
In the context of Climate Model Intercomparison Projects (CMIP) :
A few functions for processing
- a CMIP Data request and
- a set of settings related to a laboratory, and a model
- a set of settings related to an experiment (i.e. a set of numerical
simulations),
to generate a set of xml-syntax files used by XIOS (see
https://forge.ipsl.jussieu.fr/ioserver/) for outputting geophysical
variable fields
First version (0.8) : S.Senesi (CNRM) - sept 2016
Changes :
oct 2016 - Marie-Pierre Moine (CERFACS) - handle 'home' Data Request
in addition
dec 2016 - S.Senesi (CNRM) - improve robustness
jan 2017 - S.Senesi (CNRM) - handle split_freq; go single-var files;
adapt to new DRS ...
feb 2017 - S.Senesi (CNRM) - handle grids and remapping;
put some func in separate module
april-may 2017 - M-P Moine (CERFACS) : handle pressure axes ..
june 2017 - S.Senesi (CNRM) introduce horizontal remapping
july 2017 - S.Senesi (CNRM) - improve efficiency in remapping; allow for
sampling before vert. interpolation, for filters on table, reqLink..
Adapt filenames to CMIP6 conventions (including date offset).
Handle remapping for CFsites
Rather look at git log for identifying further changes and contributors....
"""
####################################
# Pre-requisites
####################################
# 1- CMIP6 Data Request package retrieved using
# svn co http://proj.badc.rl.ac.uk/svn/exarch/CMIP6dreq/tags/01.00.01
# (and must include 01.00.01/dreqPy in PYTHONPATH)
from scope import dreqQuery
import dreq
# 2- CMIP6 Controlled Vocabulary (available from
# https://github.com/WCRP-CMIP/CMIP6_CVs). You will provide its path
# as argument to functions defined here
# 3- XIOS release must be 1242 or above (to be fed with the outputs)
# see https://forge.ipsl.jussieu.fr/ioserver/wiki
####################################
# End of pre-requisites
####################################
version="pre-0.28"
print "* dr2xml version: ", version
conventions="CF-1.7 CMIP-6.2"
# The current code should comply with this version of spec doc at
# https://docs.google.com/document/d/1h0r8RZr_f3-8egBMMh7aqLwy3snpD6_MrDz1q8n5XUk/edit
CMIP6_conventions_version="v6.2.4"
print "CMIP6 conventions version: "+CMIP6_conventions_version
import json
import datetime
import re
import collections
import sys,os,glob
import xml.etree.ElementTree as ET
# mpmoine_merge_dev2_v0.12: posixpath.dirname does not work on my machine
#TBS# from os import path as os_path
#TBS# prog_path=os_path.abspath(os_path.split(__file__)[0])
# Local packages
from vars import simple_CMORvar, simple_Dim, process_homeVars, complement_svar_using_cmorvar, \
multi_plev_suffixes, single_plev_suffixes, get_simplevar
from grids import decide_for_grids, DRgrid2gridatts,\
split_frequency_for_variable, timesteps_per_freq_and_duration
from Xparse import init_context, id2grid, idHasExprWithAt
# Auxiliary tables
from table2freq import Cmip6Freq2XiosFreq, longest_possible_period
# CFsites handling has its own module
from cfsites import cfsites_domain_id, cfsites_grid_id, cfsites_input_filedef, add_cfsites_in_defs
print_DR_errors=True
print_multiple_grids=False
dq = dreq.loadDreq()
print "* CMIP6 Data Request version: ", dq.version
cell_method_warnings=[]
sn_issues=dict()
context_index=None
# global variable : the list of Request Links which apply for 'our' MIPS and which are not explicitly excluded using settings
# It is set in select_CMORvars_for_lab and used in endyear_for_CMORvar
global_rls=None
# Next variable is used to circumvent an Xios 1270 shortcoming. Xios
# should read that value in the datafile. Actually, it did, in some
# earlier version ...
""" An example/template of settings for a lab and a model"""
example_lab_and_model_settings={
'institution_id': "CNRM-CERFACS", # institution should be read in CMIP6_CV, if up-to-date
'path_to_parse': "./", # The path of the directory which contains the iodef.xml, field_def, etc files.
# We describe the "CMIP6 source type" (i.e. components assembly) which is the default
# for each model. This value can be changed on a per experiment basis, in experiment_settings file
# However, using a 'configuration' is finer (see below)
# CMIP6 component conventions are described at
# https://github.com/WCRP-CMIP/CMIP6_CVs/blob/master/CMIP6_source_type.json
'source_types' : { "CNRM-CM6-1" : "AOGCM AER", "CNRM-CM6-1-HR" : "AOGCM AER",
"CNRM-ESM2-1": "AOGCM BGC AER CHEM" , "CNRM-ESM2-1-HR": "AOGCM BGC AER" },
# Optional : 'configurations' are shortcuts for a triplet (model, source_type, unused_contexts)
'configurations' : {
"AGCM": ("CNRM-CM6-1" ,"AGCM" , ['nemo']),
"AESM": ("CNRM-ESM2-1" ,"AGCM BGC AER CHEM" , ['nemo']),
"AOGCM": ("CNRM-CM6-1" ,"AOGCM" , [] ),
"AOESM": ("CNRM-ESM2-1" ,"AOGCM BGC AER CHEM" , [] ),
"AGCMHR": ("CNRM-CM6-1-HR","AGCM" , ['nemo']),
"AESMHR": ("CNRM-ESM2-1" ,"AGCM BGC AER" , [] ),
"AOGCMHR":("CNRM-CM6-1-HR","AOGCM" , [] ),
"AOESMHR":("CNRM-ESM2-1" ,"AOGCM BGC AER" , [] ),
"LGCM": ("CNRM-CM6-1" ,"LAND" , ['nemo']),
"LESM": ("CNRM-ESM2-1" ,"LAND BGC" , ['nemo']),
"OGCM": ("CNRM-CM6-1" ,"OGCM" , ['surfex','trip']),
"OESM": ("CNRM-ESM2-1" ,"OGCM BGC" , ['surfex','trip']) },
#'source' : "CNRM-CM6-1", # Useful only if CMIP6_CV is not up to date
'references' : "A character string containing a list of published or web-based "+\
"references that describe the data or the methods used to produce it. "+\
"Typically, the user should provide references describing the model "+\
"formulation here",
'info_url' : "http://www.umr-cnrm.fr/cmip6/",
'contact' : 'contact.cmip@meteo.fr',
# We account for the list of MIPS in which the lab takes part.
# Note : a MIPs set limited to {'C4MIP'} leads to a number of tables and
# variables which is manageable for eye inspection
'mips_for_test': {'C4MIP', 'SIMIP', 'OMIP', 'CFMIP', 'RFMIP'} ,
'mips' : {
"LR" : {'AerChemMIP','C4MIP','CFMIP','DAMIP', 'FAFMIP' , 'GeoMIP','GMMIP','ISMIP6',\
'LS3MIP','LUMIP','OMIP','PMIP','RFMIP','ScenarioMIP','CORDEX','SIMIP','CMIP6', 'CMIP'},
"HR" : {'OMIP','ScenarioMIP','CORDEX','CMIP6', 'CMIP'},
},
# A character string containing additional information about the models. Will be complemented
# with the experiment's specific comment string
"comment" : "",
# Max variable priority level to be output (you may set 3 when creating ping_files while
# being more restrictive at run time); values in simulation_settings may override the one below
'max_priority' : 1,
'tierMax' : 1,
# The ping file defines variable names, which are constructed using CMIP6 "MIPvarnames"
# and a prefix which must be set here, and can be the empty string :
"ping_variables_prefix" : "CMIP6_",
# We account for a list of variables which the lab does not want to produce ,
# Names must match DR MIPvarnames (and **NOT** CMOR standard_names)
# excluded_vars_file="../../cnrm/non_published_variables"
"excluded_vars" : ['pfull', 'phalf', "zfull" ], # because we have a pressure based hybrid coordinate,
# and no fixed height levels
# Vars computed with a period which is not the basic timestep must be declared explicitly,
# with that period, in order that 'instant' sampling works correctly
# (the units for period should be different from the units of any instant output frequency
# for those variables - 'mi' looks fine, 'ts' may work)
"special_timestep_vars" : {
"60mi" : ['parasolRefl','clhcalipso','cltcalipso','cllcalipso','clmcalipso', \
'cfadLidarsr532','clcalipso','clcalipso2','cfadDbze94', \
'jpdftaureliqmodis','clisccp','jpdftaureicemodis','clmisr'],
},
# You can specifically exclude some pairs (vars,tables), in lab_settings or (higher priority)
# in experiment_settings
"excluded_pairs" : [ ('fbddtalk','Omon') ] ,
# For debugging purpose, if next list has members, this has precedence over
# 'excluded_vars' and over 'excluded_vars_per_config'
#"included_vars" : [ 'ccb' ],
# When atmospheric vertical coordinate implies putting psol in model-level output files, we
# must avoid creating such file_def entries if the model does not actually send the 3D fields
# (because this leads to files full of undefined values)
# We choose to describe such fields as a list of vars dependant on the model configuration
# because the DR is not in a good enough shape about realms for this purpose
"excluded_vars_per_config" : {
"AGCM": [ "ch4", "co2", "co", "concdust", "ec550aer", "h2o", "hcho", "hcl", \
"hno3", "mmrbc", "mmrdust", "mmroa", "mmrso4", "mmrss", \
"n2o", "no2", "no", "o3Clim", "o3loss", "o3prod", "oh", "so2" ],
"AOGCM": [ "ch4", "co2", "co", "concdust", "ec550aer", "h2o", "hcho", "hcl", \
"hno3", "mmrbc", "mmrdust", "mmroa", "mmrso4", "mmrss", \
"n2o", "no2", "no", "o3Clim", "o3loss", "o3prod", "oh", "so2" ],
},
#
"excluded_spshapes": ["XYA-na","XYG-na", # GreenLand and Antarctic grids we do not want to produce
"na-A", # RFMIP.OfflineRad : rld, rlu, rsd, rsu in table Efx ?????
"Y-P19","Y-P39", "Y-A","Y-na" # Not yet handled by dr2xml
],
"excluded_tables" : ["Oclim" , "E1hrClimMon" , "ImonAnt", "ImonGre" ] , # Clims are not handled by Xios yet
# For debugging purpose : if next list has members, only those tables will be processed
#"included_tables" : ["AMon" ] , # If not empty, has priority over excluded_tables
"excluded_request_links" : [
"RFMIP-AeroIrf" # 4 scattered days of historical, heavy output -> please rerun model for one day
# for each day of this request_link
],
# For debugging purpose : if next list has members, only those requestLinks will be processed
"included_request_links" : [ ],
# We account for a default list of variables which the lab wants to produce in most cases
# This can be changed at the experiment_settings level
#"listof_home_vars":"../../cnrm/listof_home_vars.txt",
# If we use extra tables, we can set it here (and supersede it in experiment settings)
#'path_extra_tables'=
# Each XIOS context addresses a number of realms
'realms_per_context' : {
'nemo': ['seaIce', 'ocean', 'ocean seaIce', 'ocnBgchem', 'seaIce ocean'] ,
'arpsfx' : ['atmos', 'atmos atmosChem', 'atmosChem', 'aerosol', 'atmos land', 'land',
'landIce land', 'aerosol','land landIce', 'landIce', ],
'trip' : [],
},
# Some variables, while belonging to a realm, may fall in another XIOS context than the
# context which handles that realm
'orphan_variables' : {
'trip' : ['dgw', 'drivw', 'fCLandToOcean', 'qgwr', 'rivi', 'rivo', 'waterDpth', 'wtd'],
},
'vars_OK' : dict(),
# A per-variable dict of comments valid for all simulations
'comments' : {
'rld' : 'nothing special about this variable'
},
#
'grid_choice' : { "CNRM-CM6-1" : "LR", "CNRM-CM6-1-HR" : "HR",
"CNRM-ESM2-1": "LR" , "CNRM-ESM2-1-HR": "HR" },
# Sizes for atm and oce grids (cf DR doc); Used for computing file split frequency
"sizes" : { "LR" : [292*362 , 75, 128*256, 91, 30, 14, 128],
"HR" : [1442*1021, 75, 720*360, 91, 30, 14, 128] },
#
# What is the maximum size of generated files, in number of float values
"max_file_size_in_floats" : 2000.*1.e+6 , # 2 Giga octets
# Required NetCDF compression level
"compression_level" : 0,
# Estimate of number of bytes per floating value, given the chosen compression level
"bytes_per_float" : 2.0,
# grid_policy among None, DR, native, native+DR, adhoc - see doc in grids.py
"grid_policy" : "adhoc",
# Grids : per model resolution and per context :
# - CMIP6 qualifier (i.e. 'gn' or 'gr') for the main grid chosen (because you
# may choose as main production grid a regular one, when the native grid is e.g. unstructured)
# - Xios id for the production grid (if it is not the native grid),
# - Xios id for the latitude axis used for zonal means (must match latitudes for grid above)
# - resolution of the production grid (using CMIP6 conventions),
# - grid description
"grids" : {
"LR" : {
"surfex" : [ "gr","complete" , "glat", "250 km", "data regridded to a T127 gaussian grid (128x256 latlon) from a native atmosphere T127l reduced gaussian grid"] ,
"trip" : [ "gn", "" , "" , "50 km" , "regular 1/2 deg lat-lon grid" ],
"nemo" : [ "gn", "" , "" , "100 km" , "native ocean tri-polar grid with 105 k ocean cells" ],},
"HR" : {
"surfex" : [ "gr","complete" , "glat", "50 km", "data regridded to a 359 gaussian grid (180x360 latlon) from a native atmosphere T359l reduced gaussian grid"] ,
"trip" : [ "gn", "" , "" , "50 km" , "regular 1/2 deg lat-lon grid" ],
"nemo" : [ "gn", "" , "" , "25 km" , "native ocean tri-polar grid with 1.47 M ocean cells" ],},
},
# "nb_longitudes_in_model": { "surfex" :"ndlon", "nemo": "" },
#
# Basic sampling timestep set in your field definition (used to feed metadata 'interval_operation')
"sampling_timestep" : {
"LR" : { "surfex":900., "nemo":1800. },
"HR" : { "surfex":900., "nemo":1800. },
},
# We create sampled time-variables for controlling the frequency of vertical interpolations
"vertical_interpolation_sample_freq" : "3h",
"vertical_interpolation_operation" : "instant", # LMD prefers 'average'
#--- Say if you want to use XIOS union/zoom axis to optimize vertical interpolation requested by the DR
"use_union_zoom" : False,
# The CMIP6 frequencies that are unreachable for a single model run. Datafiles will
# be labelled with dates consistent with content (but not with CMIP6 requirements).
# Allowed values are only 'dec' and 'yr'
"too_long_periods" : ["dec", "yr" ] ,
# Describe the branching scheme for experiments involved in some 'branchedYears type' tslice
# Just put the start year in child and the start years in parent for all members
"branching" : { "historical" : (1850, [ 2350, 2400, 2450 ]) },
# We can control the max output level set for all output files,
"output_level" : 10,
# For debug purpose, you may slim down xml files by setting next entry to False
"print_variables" : True ,
# Set that to True if you use a context named 'nemo' and the
# corresponding model unduly sets a general freq_op AT THE
# FIELD_DEFINITION GROUP LEVEL. Due to Xios rules for inheritance,
# that behavior prevents inheriting specific freq_ops by reference
# from dr2xml generated field_definitions
"nemo_sources_management_policy_master_of_the_world" : False,
}
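# Illustration only (not part of the original dr2xml code): a minimal sketch
# of how a 'configurations' entry, as in the template above, could be resolved
# into its (model, source_type, unused_contexts) triplet, falling back on
# 'source_id' and the lab-level 'source_types' when the simulation sets no
# configuration. The helper name and the fallback rules are assumptions made
# for this example; the logic actually used by dr2xml is the one of
# get_source_id_and_type, which is not shown here and may differ.
def _sketch_resolve_configuration(lab_settings, sim_settings):
    """Return (source_id, source_type, unused_contexts) for a simulation."""
    name = sim_settings.get('configuration')
    if name is not None:
        configurations = lab_settings.get('configurations', {})
        if name not in configurations:
            raise ValueError("Unknown configuration: %s" % name)
        source_id, source_type, unused_contexts = configurations[name]
        return source_id, source_type, list(unused_contexts)
    # No configuration : rely on explicit entries of the simulation settings
    source_id = sim_settings['source_id']
    if 'source_type' in sim_settings:
        source_type = sim_settings['source_type']
    else:
        source_type = lab_settings['source_types'][source_id]
    return source_id, source_type, list(sim_settings.get('unused_contexts', []))
# Possible use : _sketch_resolve_configuration(example_lab_and_model_settings,
#                                              {'configuration': 'AGCM'})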
""" An example/template of settings for a simulation """
example_simulation_settings={
# Dictionary describing the necessary attributes for a given simulation
# Warning : some lines are commented out in this example but should be
# un-commented in some cases. See comments
# DR experiment name to process. See http://clipc-services.ceda.ac.uk/dreq/index/experiment.html
"experiment_id" : "historical",
# Experiment label to use in file names and attribute, (default is experiment_id)
#"expid_in_filename" : "myexpe",
# If there is no configuration in lab_settings which matches your case, please rather
# use next or next two entries : source_id and, if needed, source_type
'configuration' : 'AOGCM',
#'source_id' : "CNRM-CM6-1",
#'source_type' : "OGCM" ,# If the default source-type value for your source (from lab settings)
# does not fit, you may change it here.
# "This should describe the model most directly responsible for the
# output. Sometimes it is appropriate to list two (or more) model types here, among
# AER, AGCM, AOGCM, BGC, CHEM, ISM, LAND, OGCM, RAD, SLAB "
# e.g. amip , run with CNRM-CM6-1, should quote "AGCM AER"
# Also see note 14 of https://docs.google.com/document/d/1h0r8RZr_f3-8egBMMh7aqLwy3snpD6_MrDz1q8n5XUk/edit
#"contact" : "", set it only if it is specific to the simulation
#"project" : "CMIP6", #CMIP6 is the default
#'max_priority' : 1, # a simulation may be run with a max_priority which overrides the one in lab_settings
#'tierMax' : 1, # a simulation may be run with a tierMax which overrides the one in lab_settings
# It is recommended that some description be included to help
# identify major differences among variants, but care should be
# taken to record correct information. dr2xml will add in all cases:
# 'Information provided by this attribute may in some cases be
# flawed. Users can find more comprehensive and up-to-date
# documentation via the further_info_url global attribute.'
"variant_info" : "Start date after 300 years of control run",
#
"realization_index" : 1, # Value may be omitted if = 1
"initialization_index" : 1, # Value may be omitted if = 1
"physics_index" : 1, # Value may be omitted if = 1
"forcing_index" : 3, # Value may be omitted if = 1
#
# All about the branching scheme from parent
"branch_method" : "standard", # default value='standard' meaning ~ "select a start date"
# (this is not necessarily the parent start date)
'parent_time_ref_year' : 1850, # MUST BE CONSISTENT WITH THE TIME UNITS OF YOUR MODEL(S) !!!
"branch_year_in_parent": 2150, # if your calendar is Gregorian, you can specify the branch year in parent directly
# This is an alternative to using "branch_time_in_parent"
#"branch_time_in_parent": "365.0D0", # a double precision value, in days, used if branch_year_in_parent is not applicable
# This is an alternative to using "branch_year_in_parent"
#'parent_time_units' : "" #in case it is not the same as child time units
"branch_year_in_child" : 1850, # if your calendar is Gregorian, you can specify the branch year in child directly
# This is an alternative to using "branch_time_in_child"
'child_time_ref_year' : 1850, # MUST BE CONSISTENT WITH THE TIME UNITS OF YOUR MODEL(S) !!!
# (this is not necessarily the parent start date)
# the ref_year for a scenario must be the same as for the historical
#"branch_time_in_child" : "0.0D0", # a double precision value in child time units (days),
# This is an alternative to using "branch_year_in_child"
#'parent_variant_label' :"" #Default to 'same variant as child'. Other cases should be exceptional
#"parent_mip_era" : 'CMIP5' # only in special cases (as e.g. PMIP warm
# start from CMIP5/PMIP3 experiment)
#'parent_source_id' : 'CNRM-CM5.1' # only in special cases, where parent model
# is not the same model
#
"sub_experiment_id" : "None", # Optional, default is 'none'; example : s1960.
"sub_experiment" : "None", # Optional, default is 'none'
"history" : "None", #Used when a simulation is re-run, an output file is modified ...
# A character string containing additional information about this simulation
"comment" : "",
# You can specifically exclude some pairs (vars,tables), in lab_settings or (higher priority)
# in experiment_settings (which means that an empty list in experiment settings supersedes
# a non-empty one in lab_settings)
# "excluded_pairs" : [ ('fbddtalk','Omon') ]
# A per-variable dict of comments which are specific to this simulation. It will replace
# the all-simulation comment
'comments' : {
'tas' : 'this is a dummy comment, placeholder for describing a special, simulation dependent, scheme for a given variable',
},
# We can supersede the default list of variables of lab_settings, which tells
# which additional variables/frequencies are to be produced
#"listof_home_vars":"../../cnrm/home_vars_historical.txt",
# If we use extra tables, we can here supersede the value set in lab settings
#'path_extra_tables'=
'unused_contexts' : [ ] # If you haven't set a 'configuration', you may fine-tune here
}
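# Illustration only (not part of the original dr2xml code): a minimal sketch
# of the precedence rule described in the comments of both templates, namely
# that an entry of the simulation settings (e.g. 'max_priority', 'tierMax',
# 'excluded_pairs') overrides the entry with the same name in the lab
# settings, even when its value is an empty list. The helper name is an
# assumption made for this example.
def _sketch_effective_setting(key, lab_settings, sim_settings=None, default=None):
    """Return the value of 'key', simulation settings taking precedence."""
    if sim_settings is not None and key in sim_settings:
        return sim_settings[key]
    return lab_settings.get(key, default)
# Possible use : _sketch_effective_setting('max_priority',
#                    example_lab_and_model_settings, example_simulation_settings)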
#def hasCMORVarName(hmvar):
# for cmvar in dq.coll['CMORvar'].items:
# if (cmvar.label==hmvar.label): return True
def RequestItem_applies_for_exp_and_year(ri,experiment,lset,sset,year=None,debug=False):
    """
    Returns a couple (relevant, endyear) telling whether requestItem 'ri'
    in data request 'dq' (global) is relevant for a given 'experiment'
    and 'year'. Toggling 'debug' allows some printouts
    """
    # RELEVANT is True if requestItem RI applies to EXPERIMENT and
    #   has a timeslice which includes YEAR, either implicitly or explicitly
    # ENDYEAR is meaningful if RELEVANT is True, and is the
    #   last year in the timeslice (or None if timeslice ==
    #   the whole experiment duration)

    # Access experiment or experiment group for the RequestItem
    #if (ri.label=='AerchemmipAermonthly3d') : debug=True
    if (debug) : print "In RIapplies.. Checking ","% 15s"%ri.title,
    item_exp=dq.inx.uid[ri.esid]
    ri_applies_to_experiment=False
    endyear=None
    # esid can link to an experiment, an experiment group or a MIP
    if item_exp._h.label== 'experiment' :
        if (debug) : print "%20s"%"Simple Expt case", item_exp.label,
        if item_exp.label==experiment :
            if (debug) : print " OK",
            ri_applies_to_experiment=True
    elif item_exp._h.label== 'exptgroup' :
        if (debug) : print "%20s"%"Expt Group case ",item_exp.label,
        exps_id=dq.inx.iref_by_sect[ri.esid].a['experiment']
        for e in [ dq.inx.uid[eid] for eid in exps_id ] :
            if e.label==experiment :
                # item_exp is the group (the former 'group_id' was never defined)
                if (debug) : print " OK for experiment based on group "+\
                   item_exp.label,
                ri_applies_to_experiment=True
    elif item_exp._h.label== 'mip' :
        # item_exp is the MIP (the former 'mip_id' was never defined)
        if (debug) : print "%20s"%"Mip case ",item_exp.label,
        exps_id=dq.inx.iref_by_sect[ri.esid].a['experiment']
        for e in [ dq.inx.uid[eid] for eid in exps_id ] :
            if (debug) : print e.label,",",
            if e.label==experiment :
                if (debug) : print " OK for experiment based on mip "+ item_exp.label,
                ri_applies_to_experiment=True
    else :
        if (debug) :
            print "Error on esid link for ri : %s uid=%s %s"%\
                ( ri.title, ri.uid, item_exp._h.label)
    #print "ri=%s"%ri.title,
    if ri_applies_to_experiment :
        if year is None :
            rep=True ; endyear=None
        else :
            rep,endyear=year_in_ri(ri,experiment,lset,sset,year,debug=debug)
            #if (ri.label=="AerchemmipAermonthly3d") :
            #    print "reqItem=%s,experiment=%s,year=%d,rep=%s,"%(ri.label,experiment,year,rep)
        #print " rep=",rep
        return rep,endyear
    else :
        #print
        return False,None
def year_in_ri(ri,experiment,lset,sset,year,debug=False):
    if 'tslice' in ri.__dict__ :
        rep,endyear=year_in_ri_tslice(ri,experiment,lset,year,debug=debug)
        return rep,endyear
    try :
        ny=int(ri.nymax)
        first_year=sset["branch_year_in_child"]
        if (ny > 0) : endyear=first_year+ny-1
        else :
            # assume that it means : whole experiment duration
            # TBD : year_in_ri : endyear is not meaningful for some cases
            endyear=first_year+10000
        applies=(year <= endyear)
        return applies,endyear
    except:
        print "Cannot tell if year %d applies to reqItem %s -> assumes yes"%(year,ri.title)
        return True,None
def year_in_ri_tslice(ri,experiment,lset,year,debug=False):
    # Returns a couple : relevant, endyear.
    # RELEVANT is True if requestItem RI applies to
    #   YEAR, either implicitly or explicitly (e.g. timeslice)
    # ENDYEAR, which is meaningful if RELEVANT is True, is the
    #   last year in the timeslice (or None if timeslice ==
    #   the whole experiment duration)
    if 'tslice' not in ri.__dict__ :
        if (debug) : print "No tslice for reqItem %s -> OK for any year"%ri.title
        return True, None
    if ri.tslice == '__unset__' :
        if (debug) : print "tslice is unset for reqItem %s "%ri.title
        return True, None
    #
    relevant=False
    endyear=None
    tslice=dq.inx.uid[ri.tslice]
    if (debug) :
        print "tslice label/type is %s/%s for reqItem %s "%(tslice.label,tslice.type,ri.title)
    if tslice.type=="simpleRange" : # e.g. _slice_DAMIP20
        relevant = (year >= tslice.start and year<=tslice.end)
        endyear=tslice.end
    elif tslice.type=="sliceList": # e.g. _slice_DAMIP40
        for start in range(tslice.start,int(tslice.end-tslice.sliceLen+2),int(tslice.step)) :
            if year >= start and year < start+tslice.sliceLen :
                relevant = True
                endyear=start+tslice.sliceLen-1
    elif tslice.type=="dayList": # e.g. _slice_RFMIP2
        # e.g. startList[i]: [1980, 1, 1, 1980, 4, 1, 1980, 7, 1, 1980, 10, 1, 1992, 1, 1, 1992, 4, 1]
        # (a flat list of year, month, day triplets : keep every third item)
        years= [ tslice.startList[3*i] for i in range(len(tslice.startList)//3)]
        if year in years :
            relevant=True
            endyear=year
    elif tslice.type=="startRange": # e.g. _slice_VolMIP3
        #start_year=experiment_start_year(experiment)
        # TBD : code experiment_start_year (used for VolMIP : _slice_VolMIP3)
        start_year=1850
        # The slice length is assumed to be tslice.nyears, as in the
        # branchedYears case (the former 'nyear' was never defined)
        relevant= (year >= start_year and year < start_year+tslice.nyears)
        endyear=start_year + tslice.nyears - 1
    elif tslice.type=="monthlyClimatology": # e.g. _slice_clim20
        relevant = (year >= tslice.start and year<=tslice.end)
        endyear=tslice.end
    elif tslice.type=="branchedYears" : # e.g. _slice_piControl020
        if tslice.child in lset["branching"] :
            endyear=False
            (refyear,starts)=lset["branching"][tslice.child]
            for start in starts :
                if ((year - start >= tslice.start - refyear) and \
                    (year - start < tslice.start - refyear + tslice.nyears )):
                    relevant=True
                    lastyear=start+tslice.nyears-1
                    if endyear is False : endyear=lastyear
                    else : endyear=max(endyear,lastyear)
        else : dr2xml_error("For tslice %s, child %s start year is not documented"%\
                            (tslice.title, tslice.child))
    else :
        dr2xml_error("type %s for time slice %s is not handled"%(tslice.type,tslice.title))
    if (debug) :
        print "for year %d and experiment %s, relevant is %s for tslice %s of type %s, endyear=%s"%\
            (year,experiment,repr(relevant),ri.title,tslice.type,repr(endyear))
    return relevant,endyear
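# Illustration only (not part of the original dr2xml code): a minimal,
# standalone sketch of the 'sliceList' arithmetic used in year_in_ri_tslice
# above, with plain numbers instead of a Data Request time slice object.
# The helper name and the sample values quoted below are assumptions made
# for this example.
def _sketch_year_in_slice_list(year, start, end, slice_len, step):
    """Return (relevant, endyear) for slices of 'slice_len' years which
    begin every 'step' years from 'start' and do not extend beyond 'end'."""
    relevant = False
    endyear = None
    for slice_start in range(int(start), int(end - slice_len + 2), int(step)):
        if year >= slice_start and year < slice_start + slice_len:
            relevant = True
            endyear = slice_start + slice_len - 1
    return relevant, endyear
# With start=1850, end=1864, slice_len=5 and step=10, the slices are
# 1850-1854 and 1860-1864 : year 1852 is relevant (endyear 1854), while
# year 1857 falls between two slices and is not.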
def select_CMORvars_for_lab(lset, sset=None, year=None,printout=False):
"""
A function to list CMOR variables relevant for a lab (and also,
optionnally for an experiment and a year)
Args:
lset (dict): laboratory settings; used to provide the list of MIPS,
the max Tier, and a list of excluded variable names
sset (dict): simulation settings, used for indicating source_type,
max priority (and for filtering on the simulation if
year is notNone)
if sset is None, use union of mips among all grid choices
year (int,optional) : simulation year - used to filter the request
for an experiment and a year
Returns:
A list of 'simplified CMOR variables'
"""
#
debug=False
# From MIPS set to Request links
global sc,global_rls,grid_choice
if sset and 'tierMax' in sset : tierMax=sset['tierMax']
else: tierMax=lset['tierMax']
sc = dreqQuery(dq=dq, tierMax=tierMax)
# Set sizes for lab settings, if available (or use CNRM-CM6-1 defaults)
mcfg = collections.namedtuple( 'mcfg', \
['nho','nlo','nha','nla','nlas','nls','nh1'] )
if sset :
source,source_type=get_source_id_and_type(sset,lset)
grid_choice=lset["grid_choice"][source]
mips_list=set(lset['mips'][grid_choice])
sizes=lset["sizes"][grid_choice] #sizes=lset.get("sizes",[259200,60,64800,40,20,5,100])
sc.mcfg = mcfg._make( sizes )._asdict()
else :
mips_list= set()
for grid in lset['mips'] : mips_list=mips_list.union(set(lset['mips'][grid]))
rls_for_mips=sc.getRequestLinkByMip(mips_list)
if printout :
print "Number of Request Links which apply to MIPS",
print mips_list," is: ", len(rls_for_mips)
#
excluded_rls=[]
for rl in rls_for_mips :
if rl.label in lset.get("excluded_request_links",[]) :
excluded_rls.append(rl)
for rl in excluded_rls : rls_for_mips.remove(rl)
#
excluded_rls=[]
inclinks=lset.get("included_request_links",[])
if len(inclinks) > 0 :
for rl in rls_for_mips :
if rl.label not in inclinks : excluded_rls.append(rl)
for rl in excluded_rls :
print "RequestLink %s is not included"%rl.label
rls_for_mips.remove(rl)
#
if sset and year :
experiment_id=sset['experiment_id']
#print "Request links before filter :"+`[ rl.label for rl in rls_for_mips ]`
filtered_rls=[]
for rl in rls_for_mips :
# Access all requesItems ids which refer to this RequestLink
ri_ids=dq.inx.iref_by_sect[rl.uid].a['requestItem']
for ri_id in ri_ids :
ri=dq.inx.uid[ri_id]
if debug : print "Checking requestItem ",ri.label,
applies,endyear= RequestItem_applies_for_exp_and_year(ri,
experiment_id, lset,sset,year,False)
if applies:
if debug : print " applies "
filtered_rls.append(rl)
else :
if debug : print " does not apply "
rls=filtered_rls
if printout :
print "Number of Request Links which apply to experiment ", \
experiment_id,"and MIPs", mips_list ," is: ",len(rls)
#print "Request links that apply :"+`[ rl.label for rl in filtered_rls ]`
else :
rls=rls_for_mips
global_rls=rls
    # From Request links to CMOR vars + grid
    #miprl_ids=[ rl.uid for rl in rls ]
    #miprl_vars=sc.varsByRql(miprl_ids, pmax=lset['max_priority'])
    # Simulation settings take precedence over lab settings for max_priority
    if sset and 'max_priority' in sset :
        pmax=sset['max_priority']
    else :
        pmax=lset['max_priority']
    miprl_vars_grids=[]
    for rl in rls :
        rl_vars=sc.varsByRql([rl.uid], pmax=pmax)
        for v in rl_vars :
            # The requested grid is given by the RequestLink except if spatial shape matches S-*
            gr=rl.grid
            cmvar=dq.inx.uid[v]
            st=dq.inx.uid[cmvar.stid]
            sp=dq.inx.uid[st.spid]
            if sp.label[0:2]=="S-" : gr='cfsites'
            if (v,gr) not in miprl_vars_grids :
                miprl_vars_grids.append((v,gr))
    if printout :
        print 'Number of (CMOR variable, grid) pairs for these requestLinks is :%s'%len(miprl_vars_grids)
    #
    inctab=lset.get("included_tables",[])
    exctab=lset.get("excluded_tables",[])
    incvars=lset.get('included_vars',[])
    # Work on a copy : the list may be extended below, which must not alter lset
    excvars=list(lset.get('excluded_vars',[]))
    if sset :
        config=sset['configuration']
        if ('excluded_vars_per_config' in lset) and \
           (config in lset['excluded_vars_per_config']):
            excvars.extend(lset['excluded_vars_per_config'][config])
    excpairs=sset.get('excluded_pairs',lset.get('excluded_pairs',[]))
    filtered_vars=[]
    for (v,g) in miprl_vars_grids :
        cmvar=dq.inx.uid[v]
        ttable=dq.inx.uid[cmvar.mtid]
        mipvar=dq.inx.uid[cmvar.vid]
        if ((len(incvars) == 0 and mipvar.label not in excvars) or\
            (len(incvars) > 0 and mipvar.label in incvars))\
           and \
           ((len(inctab)>0 and ttable.label in inctab) or \
            (len(inctab)==0 and ttable.label not in exctab))\
           and \
           ((mipvar.label,ttable.label) not in excpairs) :
            filtered_vars.append((v,g))
            #if ("clwvi" in mipvar.label) : print "adding var %s, ttable=%s, exctab="%(cmvar.label,ttable.label),exctab,excvars
        else:
            #if (ttable.label=="Ofx") : print "discarding var %s, ttable=%s, exctab="%(cmvar.label,ttable.label),exctab
            pass
    if printout :
        print 'Number once filtered by excluded/included vars and tables and spatial shapes is : %s'%len(filtered_vars)
    # Filter the list of grids requested for each variable based on lab policy
    d=dict()
    for (v,g) in filtered_vars :
        if v not in d : d[v]=set()
        d[v].add(g)
    if printout :
        print 'Number of distinct CMOR variables (whatever the grid) : %d'%len(d)
    multiple_grids=[]
    for v in d:
        d[v]=decide_for_grids(v,d[v],lset,dq)
        if printout and len(d[v]) > 1 :
            multiple_grids.append(dq.inx.uid[v].label)
            if print_multiple_grids :
                print "\tVariable %s will be processed with multiple grids : %s"%(dq.inx.uid[v].label,`d[v]`)
    if not print_multiple_grids :
        multiple_grids.sort()
        print "\tThese variables will be processed with multiple grids "+\
            "(rerun with print_multiple_grids set to True for details) :"+`multiple_grids`
    #
    # Print a count of distinct var labels
    if printout :
        varlabels=set()
        for v in d : varlabels.add(dq.inx.uid[v].label)
        print 'Number of distinct var labels is :',len(varlabels)
    # Translate CMORvars to a list of simplified CMORvar objects
    simplified_vars = []
    for v in d :
        svar = simple_CMORvar()
        cmvar = dq.inx.uid[v]
        #if cmvar.mipTable=="Ofx" : print "Got an Ofx var : ",cmvar.label
        complement_svar_using_cmorvar(svar,cmvar,dq,sn_issues)
        svar.Priority=analyze_priority(cmvar,mips_list)
        svar.grids=d[v]
        simplified_vars.append(svar)
    print '\nNumber of simplified vars is :',len(simplified_vars)
    print "Issues with standard names are :"
    for iss in sn_issues : print "\t"+iss+" vars : "+`sn_issues[iss]`
    return simplified_vars
def analyze_priority(cmvar,lmips):
    """
    Returns the highest priority (i.e. the smallest priority number) of the
    CMOR variable, among its default priority and the priorities of the
    request variables issued by a set of mips
    """
    prio=cmvar.defaultPriority
    rv_ids=dq.inx.iref_by_sect[cmvar.uid].a['requestVar']
    for rv_id in rv_ids :
        rv=dq.inx.uid[rv_id]
        vg=dq.inx.uid[rv.vgid]
        if vg.mip in lmips :
            if rv.priority < prio : prio=rv.priority
    return prio
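
# Illustrative sketch, not part of the original module: the rule applied by
# analyze_priority() above, restated on plain data so that it can be run
# without a Data Request. The name _example_strongest_priority and the
# (mip, priority) pairs are assumptions made for this example only.
def _example_strongest_priority(default_priority, request_vars, lmips):
    """Return the smallest priority number among the default and the selected MIPs' requests."""
    prio = default_priority
    for mip, priority in request_vars:
        # Only requests issued by one of the selected MIPs can strengthen the priority
        if mip in lmips and priority < prio:
            prio = priority
    return prio
# Possible use : _example_strongest_priority(3, [("CMIP", 1), ("OMIP", 2)], ["OMIP"])
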
def wr(out,key,dic_or_val=None,num_type="string",default=None) :
    """
    Shortcut for a repetitive pattern : writing in 'out'
    a string variable name and value

    If dic_or_val is a dict :
      - if key is in dic_or_val, the value is dic_or_val[key],
      - otherwise the value is default, except if default is False
        (nothing is written) or None (an error is printed)
    Otherwise, dic_or_val itself is used as the value if it is not None ;
    if it is None, an error is printed and nothing is written

    Nothing is written at all when the global print_wrv is false
    """
    global print_wrv
    if not print_wrv : return
    val=None
    if type(dic_or_val)==type({}) :
        if key in dic_or_val : val=dic_or_val[key]
        else :
            if default is not None :
                if default is not False : val=default
            else :
                print 'error : %s not in dic and default is None'%key
    else :
        if dic_or_val is not None : val=dic_or_val
        else :
            print 'error in wr, no value provided for %s'%key
    if val :
        if num_type == "string" :
            #val=val.replace(">","&gt;").replace("<","&lt;").replace("&","&amp;").replace("'","&apos;").replace('"',"&quot;").strip()
            # Escape the characters which would otherwise break the XML markup
            val=val.replace(">","&gt;").replace("<","&lt;").strip()
            #CMIP6 spec : no more than 1024 char
            val=val[0:1024]
        if num_type != "string" or len(val) > 0 :
            out.write(' <variable name="%s" type="%s" > %s '%(key,num_type,val))
            out.write(' </variable>\n')
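
# Illustrative sketch, not part of the original module: the string clean-up
# that wr() applies before writing a <variable> element, isolated so that it
# can be tried on its own. The name _example_clean_string is an assumption;
# the 1024-character limit is the one quoted in wr() for the CMIP6 spec.
def _example_clean_string(val, max_len=1024):
    # Escape the two characters replaced in wr(), strip blanks, then truncate
    val = val.replace(">", "&gt;").replace("<", "&lt;").strip()
    return val[0:max_len]
# Possible use : _example_clean_string(" a<b ")
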
def freq2datefmt(in_freq,operation,lset):
    """
    Returns the tuple (datefmt, offset, offset_end) for a given frequency
    and time operation ; datefmt stays False when no format applies
    (e.g. for 'fx')
    """
    # WIP doc v6.2.3 - Apr. 2017: <time_range> format is frequency-dependent
    datefmt=False
    offset=None
    freq=in_freq
    if freq == "dec" or freq == "10y":
        if not any( "dec" in f for f in lset.get("too_long_periods",[])) :
            datefmt="%y"
            if operation in ["average","minimum","maximum"] : offset="5y"
            else : offset="10y"
        else : freq="yr" #Ensure dates in filenames are consistent with content, even if not as required
    if freq == "yr" or freq == "yrPt" or freq == "1y":
        if not any( "yr" in f for f in lset.get("too_long_periods",[])) :
            datefmt="%y"
            if operation in ["average","minimum","maximum"] : offset=False
            else : offset="1y"
        else : freq="mon" #Ensure dates in filenames are consistent with content, even if not as required
    if freq in ["mon","monC","monPt", "1mo"]:
        datefmt="%y%mo"
        if operation in ["average","minimum","maximum"] : offset=False
        else : offset="1mo"
    elif freq=="day" or freq=="1d":
        datefmt="%y%mo%d"
        if operation in ["average","minimum","maximum"] : offset="12h"
        else : offset="1d"
    elif freq=="10day" or freq=="10d":
        datefmt="%y%mo%d"
        if operation in ["average","minimum","maximum"] : offset="30h"
        else : offset="2.5d"
    elif freq=="5day" or freq=="5d":
        datefmt="%y%mo%d"
        if operation in ["average","minimum","maximum"] : offset="60h"
        else : offset="5d"
    elif freq in ["6hr","6hrPt","3hr","3hrPt","3hrClim","1hr","1hrPt","hr","6h", "3h", "1h"]:
        datefmt="%y%mo%d%h%mi"
        if freq=="6hr" or freq=="6hrPt" or freq=="6h":
            if operation in ["average","minimum","maximum"] : offset="3h"
            else : offset="6h"
        elif freq in [ "3hr", "3hrPt", "3hrClim","3h"] :
            if operation in ["average","minimum","maximum"] : offset="90mi"
            else : offset="3h"
        elif freq in ["1hr","1h", "hr", "1hrPt"]:
            if operation in ["average","minimum","maximum"] : offset="30mi"
            else : offset="1h"
    elif freq in ["1hrClimMon" , "1hrCM" ]:
        # No offset at all for this frequency
        return "%y%mo%d%h%mi","0s","0s"
    elif freq=="subhr" or freq=="subhrPt" or freq=="1ts":
        datefmt="%y%mo%d%h%mi%s"
        # assume that 'subhr' means every timestep
        if operation in ["average","minimum","maximum"] :
            # Does it make sense ??
            offset="0.5ts"
        else : offset="1ts"
    elif "fx" in freq :
        pass ## WIP doc v6.2.3 - Apr. 2017: if frequency="fx", [_<time_range>] is omitted
    if offset is not None:
        if operation in ["average","minimum","maximum"] :
            if offset is not False : offset_end="-"+offset
            else: offset_end=False
        else : offset_end="0s"
    else:
        offset="0s"; offset_end="0s"
        if not "fx" in freq :
            raise dr2xml_error("Cannot compute offsets for freq=%s and operation=%s"%(freq,operation))
    return datefmt,offset,offset_end
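
# Illustrative sketch, not part of the original module: how freq2datefmt()
# pairs 'offset' and 'offset_end' for a few daily and sub-daily frequencies.
# The table copies the values used above; the names _EXAMPLE_OFFSETS and
# _example_offsets are assumptions made for this example only.
_EXAMPLE_OFFSETS = {
    # freq : (offset for time-processed fields, offset for instantaneous fields)
    "day": ("12h", "1d"),
    "6hr": ("3h", "6h"),
    "3hr": ("90mi", "3h"),
    "1hr": ("30mi", "1h"),
}
def _example_offsets(freq, operation):
    processed, instant = _EXAMPLE_OFFSETS[freq]
    if operation in ["average", "minimum", "maximum"]:
        # The end of the time range is shifted back by the same amount
        return processed, "-" + processed
    return instant, "0s"
# Possible use : _example_offsets("day", "average")
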
def write_xios_file_def(sv,year,table,lset,sset,out,cvspath,
                        field_defs,axis_defs,grid_defs,domain_defs,
                        dummies,skipped_vars_per_table,actually_written_vars,
                        prefix,context,grid,pingvars=None,enddate=None,
                        attributes=[],debug=[]) :
    """
    Generate an XIOS file_def entry in out for :
      - a dict for laboratory settings
      - a dict of simulation settings
      - a 'simplified CMORvar' sv
      - which all belong to given table
      - a path 'cvspath' for Controlled Vocabulary

    Lengthy code, but not longer than the corresponding specification document

    1- After a prologue, attributes valid for all variables are
       written as file-level metadata, in the same order as in
       WIP document;
    2- Next, field-level metadata are written
    3- For 3D variables in model levels or half-levels, also write the auxiliary
       variables requested by CF convention (e.g. for hybrid coordinate, surface_pressure field
       plus AP and B arrays and their bounds, and lev + lev_bnds with formula attribute)
    """
    #
    global sc #,nlonz
    # Handling of attributes for which empty strings were retrieved
    # (an empty string is False, but would be written as " ")
    #--------------------------------------------------------------------
    # Put a warning for field attributes that shouldn't be empty strings
    #--------------------------------------------------------------------
    if not sv.stdname : sv.stdname = "missing" #"empty in DR "+dq.version
    if not sv.long_name : sv.long_name = "empty in DR "+dq.version
    #if not sv.cell_methods : sv.cell_methods = "empty in DR "+dq.version
    #if not sv.cell_measures : sv.cell_measures = "cell measure is not specified in DR "+dq.version
    if not sv.stdunits : sv.stdunits = "empty in DR "+dq.version
    #--------------------------------------------------------------------
    # Define alias for field_ref in file-def file
    # - may be replaced by alias1 later
    # - this is not necessarily the alias used in ping file because of
    #   intermediate field id(s) due to union/zoom
    #--------------------------------------------------------------------
    # We use a simple convention for variable names in ping files :
    if sv.type=='perso' : alias=sv.label
    else:
        # MPM : if a non-ambiguous label has been defined, use it as the alias (i.e. the field_ref)
        # and for the alias only (the variable name in the file name remains svar.label)
        if sv.label_non_ambiguous: alias=lset["ping_variables_prefix"]+sv.label_non_ambiguous
        else:
            # 'tau' is ambiguous in DR 01.00.18 : either a variable name (stress)
            # or a dimension name (optical thickness). We choose to rename the stress
            if sv.label != "tau" :
                alias=lset["ping_variables_prefix"]+sv.label
            else:
                alias=lset["ping_variables_prefix"]+"tau_stress"
        if (sv.label in debug) : print "write_xios_file_def ... processing %s, alias=%s"%(sv.label,alias)
        # Remove the "Clim" endings from the alias : they only concern the cases
        # without inter-annual variation of GHG. Maybe a problem for IPSL ?
        # As a result, simulations with constant GHG (picontrol) are handled like those with variation
        split_alias=alias.split("Clim")
        alias=split_alias[0]
        if pingvars is not None :
            # Get alias without pressure_suffix but possibly with area_suffix
            alias_ping=ping_alias(sv,lset,pingvars)
            if not alias_ping in pingvars:
                table=sv.mipTable
                if table not in skipped_vars_per_table: skipped_vars_per_table[table]=[]
                skipped_vars_per_table[table].append(sv.label+"("+str(sv.Priority)+")")
                return
    #
    #--------------------------------------------------------------------
    # Set global CMOR file attributes
    #--------------------------------------------------------------------
    #
    project=sset.get('project',"CMIP6")
    source_id,source_type=get_source_id_and_type(sset,lset)
    experiment_id=sset['experiment_id']
    institution_id=lset['institution_id']
    #
    contact=sset.get('contact',lset.get('contact',None))
    #
    # Variant matters
    realization_index=sset.get('realization_index',1)
    initialization_index=sset.get('initialization_index',1)
    physics_index=sset.get('physics_index',1)
    forcing_index=sset.get('forcing_index',1)
    variant_label="r%di%dp%df%d"%(realization_index,initialization_index,\
                                  physics_index,forcing_index)
    variant_info_warning=". Information provided by this attribute may in some cases be flawed. "+\
        "Users can find more comprehensive and up-to-date documentation via the further_info_url global attribute."
    #
    # WIP Draft 14 july 2016
    mip_era=sv.mip_era
    #
    # WIP doc v 6.2.0 - dec 2016
    # <variable_id>_<table_id>_<source_id>_<experiment_id>_<member_id>_<grid_label>[_<time_range>].nc
    member_id=variant_label
    sub_experiment_id=sset.get('sub_experiment_id','none')
    if sub_experiment_id != 'none': member_id = sub_experiment_id+"-"+member_id
#
    #--------------------------------------------------------------------
    # Set grid info
    #--------------------------------------------------------------------
    if grid == "" :
        # either native or close-to-native
        grid_choice=lset['grid_choice'][source_id]
        grid_label,target_hgrid_id,zgrid_id,grid_resolution,grid_description=\
            lset['grids'][grid_choice][context]
    else:
        if grid == 'cfsites' :
            target_hgrid_id=cfsites_domain_id
            zgrid_id=None
        else:
            target_hgrid_id=lset["ping_variables_prefix"]+grid
            zgrid_id="TBD : Should create zonal grid for CMIP6 standard grid %s"%grid
        grid_label,grid_resolution,grid_description=DRgrid2gridatts(grid)
    if table[-1:] == "Z" : # e.g. 'AERmonZ','EmonZ', 'EdayZ'
        grid_label+="z"
        # Below : when reduction was done through a two-step sum, we needed to divide afterwards
        # by the number of longitudes
        #
        # if lset.has_key("nb_longitudes_in_model") and lset["nb_longitudes_in_model"][context]:
        #     # Get from settings the name of Xios variable holding number of longitudes and set by model
        #     nlonz=lset["nb_longitudes_in_model"][context] # e.g.: nlonz="ndlon"
        # elif context_index.has_key(target_hgrid_id):
        #     # Get the number of longitudes from xml context_index
        #     # an integer if attribute of the target horizontal grid, declared in XMLs: nlonz=256
        #     nlonz=context_index[target_hgrid_id].attrib['ni_glo']
        # else:
        #     raise(dr2xml_error("Fatal: Cannot access the number of longitudes (ni_glo) for %s\
        #                 grid required for zonal means computation "%target_hgrid_id))
        # print ">>> DBG >>> nlonz=", nlonz
    if "Ant" in table : grid_label+="a"