-
Notifications
You must be signed in to change notification settings - Fork 1
Expand file tree
/
Copy pathmelixir.tex
More file actions
1027 lines (862 loc) · 95.5 KB
/
melixir.tex
File metadata and controls
1027 lines (862 loc) · 95.5 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
696
697
698
699
700
701
702
703
704
705
706
707
708
709
710
711
712
713
714
715
716
717
718
719
720
721
722
723
724
725
726
727
728
729
730
731
732
733
734
735
736
737
738
739
740
741
742
743
744
745
746
747
748
749
750
751
752
753
754
755
756
757
758
759
760
761
762
763
764
765
766
767
768
769
770
771
772
773
774
775
776
777
778
779
780
781
782
783
784
785
786
787
788
789
790
791
792
793
794
795
796
797
798
799
800
801
802
803
804
805
806
807
808
809
810
811
812
813
814
815
816
817
818
819
820
821
822
823
824
825
826
827
828
829
830
831
832
833
834
835
836
837
838
839
840
841
842
843
844
845
846
847
848
849
850
851
852
853
854
855
856
857
858
859
860
861
862
863
864
865
866
867
868
869
870
871
872
873
874
875
876
877
878
879
880
881
882
883
884
885
886
887
888
889
890
891
892
893
894
895
896
897
898
899
900
901
902
903
904
905
906
907
908
909
910
911
912
913
914
915
916
917
918
919
920
921
922
923
924
925
926
927
928
929
930
931
932
933
934
935
936
937
938
939
940
941
942
943
944
945
946
947
948
949
950
951
952
953
954
955
956
957
958
959
960
961
962
963
964
965
966
967
968
969
970
971
972
973
974
975
976
977
978
979
980
981
982
983
984
985
986
987
988
989
990
991
992
993
994
995
996
997
998
999
1000
\documentclass[a4paper,10pt]{article}
\usepackage[margin=2.5cm]{geometry}
\usepackage[table,dvipsnames]{xcolor}
\usepackage{setup}
\usepackage[T1]{fontenc}
\usepackage{mathpartir}
\usepackage{amsmath}
\usepackage{amssymb}
\usepackage{stmaryrd}
\usepackage{inconsolata}
%\usepackage{libertine}
\usepackage{hyperref}
\usepackage{bbold}
% Add biblatex
\usepackage[style=alphabetic,backend=biber]{biblatex}
\addbibresource{biblio.bib}
\usepackage{calc}
\author{Aghilas Y. Boussaa and Giuseppe Castagna}
\sloppy
\begin{document}
\section{Introduction}
\beppe{Cut and paste from the Programming paper Appendix}
GenServer is a behaviour of Elixir standard library that abstracts the
common client-server interaction. It provides the boilerplate to
supervise sync and async calls. The documentation of the latest
version (v1.15.6) can be found at \url{https://hexdocs.pm/elixir/1.15.6/GenServer.html}.
A schematic description of its definition (written using Typespec)
is given next, where we use colors to highlight the
interesting parts
%\pagebreak
\begin{minted}[escapeinside=!!,fontsize=\footnotesize,baselinestretch=.9]{elixir}
defmodule GenServer do
# Types
@type !\color{typeAqua}option()! :: {:debug, debug()} | {:name, name()} | ... !\color{typeAqua}//transparent\label{001}!
@type !\color{typeAqua}result()! :: {:reply, reply(), state()} | ... !\color{typeAqua}//transparent\label{002}!
@type !\color{typeBlue}state()! :: term() !\color{typeBlue}//opaque\label{003}!
@type !\color{bordeaux}request()! :: term() !\color{bordeaux}//parameter\label{004}!
@type !\color{bordeaux}reply()! :: term() !\color{bordeaux}//parameter\label{005}!
!\etc!
# Callbacks
@callback init(init_arg :: term()) :: {:ok, state()} | :error
@optional_callback handle_call(request(),pid(),state()) :: result()
@optional_callback handle_cast(request(),state()) :: result()
!\etc!
# Functions
@spec start(!{\color{red}module()}!, any(), options()) :: on_start() !\color{red}//higher-order\negspace\label{009}!
def start(Mod, arg, opt) do
!\etc!
end
\end{minted}
Let us make some observations for each section:
\begin{itemize}
\item \textbf{Types.} The types \elix{option()} and \elix{result()} are completely defined and shared by all implementations of the GenServer behaviour, thus they must be transparently exported. The type \elix{state()} is the type of internal state of a server and can be manipulated only by the functions of each implementation, thus it must be opaquely exported. The types \elix{request()} and \elix{reply()} describe the messages sent for requests and replies; as for \elix{state()}, they are specific to each implementation, but they must be public so that processes can perform correct requests; therefore they are parameter of the implementations.
\item \textbf{Callbacks.} Callbacks are either mandatory or optional. From a typing point of view this is akin to the optional and mandatory fields for map types.
\item \textbf{Functions.} The first argument of the \elix{start} function must be a module, but in practice it cannot be any module: it must be the same module that implements the behaviour or, at least (to type-check), a module implementing exactly the same behaviour.
\end{itemize}
What do these observations imply on the definition of types? If
behaviours are to be promoted to (module) types, then we expect the
GenServer behaviour to be defined as follows (we use a mock-up syntax
combined with our types):
\begin{minted}[escapeinside=!!,fontsize=\footnotesize,baselinestretch=.9]{elixir}
defmodule !\negspace\textbf{\color{darkgreen}type}! GenServer(!{\color{bordeaux}request, reply}!) do
# Types
!\textcolor{typeGreen}{type}! option() = {:debug, debug()} | {:name, name()} | ... !\color{typeAqua}//transparent!
!\textcolor{typeGreen}{type}! result() = {:reply, !{\color{bordeaux}reply}!, state()} | ... !\color{typeAqua}//transparent!
!\textcolor{typeGreen}{type}! state() !\color{typeBlue}//opaque!
!{\etc}!
# Callbacks
!\textcolor{typeGreen}{callback}! init :: init_arg() -> {:ok, state()} | :error
!\textcolor{typeGreen}{callback}! optional(handle_call) :: !{\color{bordeaux}request}!, pid(), state() -> result
!\textcolor{typeGreen}{callback}! optional(handle_cast) :: !{\color{bordeaux}request}!, state() -> result
!\etc!
# Functions
!\textcolor{typeGreen}{spec}! start :: GenServer(!{\color{bordeaux}request, reply}!), init_arg(), options() -> on_start()!\negspace!
!\etc!
end
\end{minted}
The types \elix{request} and \elix{reply} are now parameters of the (module) type GenServer; the types \elix{request()} and \elix{reply()} are transparently exported, while \elix{state()} is opaque. optional callbacks are explicitly declared with a syntax reminiscent of map types. Finally, the type of the first \elix{start} function explicitly declares that the first argument of the function must be a module of the same type as the one implemented. Notice that the second argument of \elix{start} is of type \elix{init_arg()} since this argument is then passed to the function \elix{init}. We left the definition of \elix{init_arg()} unspecified, but it should probably be yet another parameter of the GenServer behaviour, exactly as \elix{request()} and \elix{reply()}.
\subsection{The PoETS Proposal}
In the paper "Principles of Elixir Type System"~\cite{CDV24}, we proposed to type the Genserver example as follows
\begin{minted}[escapeinside=!!,fontsize=\footnotesize]{elixir}
defmodule!\textbf{\color{red}type}! GenServer do !\elabel{genservertypebegin} !
# Types
!\textcolor{bordeaux}{\$param}! request
!\textcolor{bordeaux}{\$param}! reply
!\textcolor{typeGreen}{\$type}! option = {:debug, debug()} | {:name, name()} | ... !\color{typeAqua}//transparent!
!\textcolor{typeGreen}{\$type}! result = {:reply, !{\color{bordeaux}reply}!, state()} | ... !\color{typeAqua}//transparent!
!\textcolor{typeGreen}{\$opaque}! state !\color{typeBlue}//opaque!
!{\etc}!
# Callbacks
!\textcolor{typeGreen}{\$callback}! init : init_arg() -> {:ok, state} | :error
!\textcolor{typeGreen}{\$callback}! optional(handle_call) : !{\color{bordeaux}request}!, pid(), state -> result
!\textcolor{typeGreen}{\$callback}! optional(handle_cast) : !{\color{bordeaux}request}!, state -> result
!{\etc}!
end !\elabel{genservertypeend} !
defmodule GenServer do !\elabel{genserverbegin} !
# Functions
!\textcolor{typeGreen}{\$}! (Mod : GenServer[!{\color{bordeaux}request=a, reply=b}!], arg : init_arg(), opt : Mod.options) -> on_start() !\elabel{starttype}!
when a: term(), b: term()
def start(Mod, arg, opt) do
!\etc!
end !\elabel{genserverend}!
\end{minted}
Our current proposal that we describe in the rest of this document is very similar to the above, but much more detailed and formalized.
\section{The new Elixir module system}
An Elixir program is a sequence of module definitions. When a module definition contain a \elix{@callback} declaration, this module is considered a \emph{behaviour}. Behaviours are a kind of hybrid definitions since they define both a module type and, whenever they export functions, a module value. This ambiguity is resolved by the Elixir compiler, which will generate a module \emph{type} definition for the behaviour, and a module \emph{value} definition for the implementations exported by the behaviour.
So for example the definition of the \elix{GenServer} behaviour above will generate both a module type definition for the module type \elix{GenServer} (lines \ref{genservertypebegin}--\ref{genservertypeend}), and a module value definition for the module \elix{GenServer} exporting the functions defined in the behaviour (e.g., \elix{start/3}) (lines \ref{genserverbegin}--\ref{genserverend}).
For the sake of simplicity, we will suppose that this distinction is already done in the program. In the rest of the document, then, we will suppose that a program is a sequence of module type definitions, followed by the \texttt{Main} module definition, containing just a list of bindings.
\[\overline{\kwdefmt \BX \kwdo \overline{P}\, \overline{D} \kwend}\kwdefm \textit{Main} \kwdo \overline{B} \kwend\]
\beppe{Guillaume suggests: $\overline{\kwdefmt \BX \kwdo \overline{P}\, \overline{D} \kwend} \overline{B}$}
Intuitively, the list of bindings $\overline{B}$ exported by the module \textit{Main} will be the list of the toplevel modules definitions of the program, whose general form is as follows:
\begin{equation}\label{eq:module}
\kwdefm X \kwdo \overline{P}\, \overline{S}\, \overline{B} \kwend
\end{equation}
With this convention, every module defined in the program is then denoted by a unique path starting with \texttt{Main} module name.
The full syntax of the surface language is given in Figure \ref{fig:surfacesyntax}. In Elixir there is a unique syntactic category for names, although developers are invited to follow the naming convention of using capitalized names for modules and (thus) behaviours. To enhance clarity, we use $x$ to range over expression variables, $X$ for module names, $\tx$ for type names, and $\BX$ for behaviour names.
\begin{figure}
\begin{tabular}{r c ll}
$\mathcal B$ & ::= &$\kwdefmt \BX\kwdo \overline{P}\, \overline{D} \kwend$ \\
$N$ & ::= & $x$ \\
& | & $X$ \\
$S$ & ::= & $\kwbvr \BX[\,\overline{\tx=t}\,]$ \\
$P$ &::= & \textcolor{Cerulean}{$\kwprm \tx$} \\
$B$ &::= & $\kwdefm X \kwdo \textcolor{Cerulean}{\overline{P}}\, \overline{S}\, \overline{B} \kwend$ \\
& | & $x = v$\\
& | & $\kwpr x = v$ \\
& | & $\kwtp \tx = t$ \\
& | & \textcolor{Cerulean}{$\kwopq \tx = t$} \\
$M$ & ::= & $X\textcolor{Cerulean}{[\,\overline{\tx=t}\,]}$ \\
& | & $X.M$ & \text{\small José version: only the last module of the path can be instantiated}\\
& | & $\textcolor{Cerulean}{M}.M$ & \text{\small More general but not considered: every module of the path can be instantiated}\\
$E$ &::= & $v$ \\
& | & $x$ \\
& | & $\kwlet N = E\kwin E$ \\
& | & $E(\overline{E})$ \\
& | & $\texttt{\%}\!\left\{\overline{\ell=E}\right\}$ \\
& | & $E.\ell$ \\
& | & $(E\in t)?E:E$ \\
& | & $\textcolor{OrangeRed}{M.x}$ \\
& | & $\textcolor{OrangeRed}{M}$ \\
& | & $\textcolor{OrangeRed}{\self}$ \\
$v$ & ::= & $c$ \\
& | & $\texttt{\%}\!\left\{\overline{\ell=v}\right\}$ \\
& | & $\$\bigwedge \overline{(\,\overline{\,t\,}\,)\rightarrow t} \kwfn \overline{x} \rightarrow E$ \\
& | & \textcolor{OrangeRed}{$\$\bigcap \overline{\left(\overline{N:T}\right)\rightarrow T} \kwfn \overline{N}\rightarrow E$} \\
$T$ & ::= & $t$ \\
& | & $\left(\overline{N:T}\right)\rightarrow T$ \\
% & | & $[\overline{\tx:\star}]\to T$ & do we want this syntax or remove it and have $\left\{\overline P;\overline{S};\overline{D}\right\}$?\\
& | & $\left\{\textcolor{Cerulean}{\overline{P}};\overline{S};\overline{D}\right\}$ & the programmer should be able to denote this type only via \textsf{like($M$)}\\
$t$ & ::= & \texttt{:}k & singleton atom type\\
& | & \texttt{atom()} & top atom type \\
& | & $\texttt{\textcolor{OrangeRed}{module()}}$ & top module type \\
& | & $(\overline{\,t\,})\rightarrow t$\\
& | & $\texttt{\%}\!\left\{\texttt{..., }\overline{f}\right\}$ & open record type\\
& | & $\texttt{\%}\!\left\{\overline{f}\right\}$ & closed record type \\
& | & $t\vee t$ \\
& | & $t\wedge t$ \\
& | & $\neg t$\\
& | & $\alpha$ \\
& | & $\mathbb{O}$ \\
& | & $\textcolor{OrangeRed}{M.\tx}$ \\
& | & $\textcolor{OrangeRed}{\tx}$ & local type declaration or type parameter\\
& | & $\textcolor{OrangeRed}{\BX[\,\overline{\tx=t}\,]}$ \\ %& shall we move this into $T$? Consequence no $\kwtp x = \elix{Stack}$ [which seems ok]
& | & \textcolor{OrangeRed}{$\langle M\rangle$} & singleton type of $M$ (modulo path equivalence) concrete syntax: just $M$
\\
$f$ & ::= & $\ell : t$ & mandatory field \\%{\color{bordeaux}[should we use $T$ instead of $t$? Or $\BX[\overline{\tx=t}]$? Or $\{\overline{D}\}_{\overline{\BX}}$?]}\\
& | & $\ell \;\texttt{=>}\; t$ & optional field\\ %{\color{bordeaux}[same questions as above]}\\
$D$ & ::= & $\kwmdl X : T$ \\
& | & $\kwclbk x : \bigcap \overline{T}$ \\
& | & $\kwopq \tx$ \\
& | & $\kwtp \tx = t$\\
\end{tabular}
\bigskip
In \textcolor{Cerulean}{blue} the modifications of the current Elixir syntax.
\caption{Syntax of the surface language }\label{fig:surfacesyntax}
\end{figure}
\subsection{Module Types}
Let us first examine the syntax of the module types declarations, which is given by the syntax, \[\kwdefmt \BX\kwdo \overline{P}\, \overline{D} \kwend\] and which correspond to the definition of a behaviour (ie., a module with callback declarations), stripped of its function definitions (i.e., no \elix{def} or \elix{defp} definitions). The module type $\BX$ is defined by a list of type parameters $\overline{P}$, and a list of (type or module) export declarations and callback declarations $\overline{D}$. Every module implementing the behaviour $\BX$ must \emph{at least} provide:
\begin{itemize}
\item a transparent type definition of the form $\kwtp \tx = t$ for every $\kwprm \tx$ declared in $\overline{P}$,
\item an opaque type definition of the form $\kwopq \tx = t$ for every $\kwopq \tx$ declared in $\overline{D}$,
\item repeat the transparent type definition of the form $\kwtp \tx = t$ for every $\kwtp \tx = t$ declared in $\overline{D}$,
\item a definition of the form $x = v$ with $v$ of type $\bigcap\overline{T}$ for every callback $\kwclbk x : \bigcap \overline{T}$ declared in $\overline{D}$.
\end{itemize}
For instance, a \elix{StackB} behaviour for modules wanting to implement a stack parametric in the type of its elements can be defined as follows:
\begin{minted}{elixir}
!\color{blue}defmodtype! StackB do
!\tp!param elem
!\tp!opaque stack
!\tp!callback new : () -> stack
!\tp!callback push : (stack, elem) -> stack
!\tp!callback pop : stack -> {elem, stack}
end
\end{minted}
\subsection{Expressions}
In Figure~\ref{fig:surfacesyntax} we specified a minimal set of expressions---ranged over by $E$---to represent Elixir. The syntax is kept minimal since we want to focus on the module system. So our expressions form a simple functional language with records (in Elixir parlance ``maps'') and a type-case expression (in lieu of more general case expressions). Elixir uses the notation $\texttt{\%}\!\left\{\overline{\ell=E}\right\}$ for records (maps) with fields $\ell$ initialized to $E$, and we reproduce it here faithfully. For type-cases we use instead the expression $(E\in t)?E_1:E_2$ which evaluates to $E_1$ or to $E_2$ according to whether $E$ returns a value of type $t$ or not; this is not valid Elixir syntax, but it serves a similar purpose. The syntax is essentially the one by~\cite{CDV24} with three notable additions (in red in the figure) to extend expressions to modules: $M$, denoting the module (reached by the path) $M$, \self{} which returns the name of the (innermost) current module, and $M.x$ denoting the function or constant $x$ exported by the module $M$. The other expressions are standard, though their syntax may not be familiar to a reader not acquainted with Elixir. In particular, functions in Elixir are prefixed by their type declaration which, in general, is an intersection of (multi-arity) arrow types. Thus, we use $\$\bigwedge_{i\in I} \overline{(\,\overline{\,t_i\,}\,)\rightarrow t_i} \kwfn \overline{x} \rightarrow E$ to denote the anonymous function of parameters $\overline{x}$, body $E$ and annotated with the intersection type $\bigwedge_{i\in I} \overline{(\,\overline{\,t_i\,}\,)\rightarrow t_i}$ meaning that when the function is applied to arguments of type $\overline{t_i}$, we expect a result of type $t_i$. Anonymous functions are values, and we notice that besides the form $\$\bigwedge_{i\in I} \overline{(\,\overline{\,t_i\,}\,)\rightarrow t_i} \kwfn \overline{x} \rightarrow E$ above, which is the one introduced in~\cite{CDV24}, we added in red a second form $\$\bigcap \overline{\left(\overline{N:T}\right)\rightarrow T} \kwfn \overline{N}\rightarrow E$ that differs from the previous one by the annotation that precedes it. These annotations are the new ones we introduce for the module system, which have the particularity that the types in the domain may be named, and that the arrows are on module types; in particular these can use the names specified for the domains (e.g., the code in line~\ref{starttype} declares that the type of the third argument of \elix{start()}, is \elix{Mod.option}, that is the \elix{option} type exported by the first argument of the function \elix{start()}, whatever that will be). We will give more details in Section~\ref{sec:types}.
\subsubsection{Compiled expressions}
It is important to notice that the syntax of expressions given in Figure~\ref{fig:surfacesyntax} is an abstraction not of the actual syntax of Elixir, but rather of the syntax of the expressions that are generated by the Elixir compiler and passed to the type-checker. Elixir has a powerful meta-programming system for which the compiler executes everything that is not included inside a function before producing the abstract syntax tree that is passed to the type-checker and then to the BEAM machine. One of the fundamental assumptions of our proposal is that all the module definitions are known at type-checking time and that the only way for an expression to access a module is via its name (that is, via a path of module identifiers). This may come as a surprise to an Elixir developer since it is possible to write Elixir code that generates modules at runtime, with names that are not known at compile time, as in the following example:
\begin{minted}{elixir}
defmodule M do
def create_module(name) do
defmodule name do
def hello(), do: "Hello, world!"
end
end
end
\end{minted}
When compiling the code above, there are two possibilities. Either the code to compile that includes the module above has somewhere a toplevel expression that calls the function \elix{create_module} with some argument, as in \elix{M.create_module(:MyModule)}, in which case the compiler will execute this expression and generate a module \elix{MyModule} that is then available at type-checking time. Or the code to compile does not have any such a toplevel expression, in which case no module is generated at compile time and thus no module is available at type-checking time. In both cases the syntax generated by the compiler will replace the dynamic module generation by a call to \elix{Module.create/3}, as in
\begin{minted}{elixir}
defmodule M do
def create_module(name) do
Module.create(name, quote do
def hello(), do: "Hello, world!"
end, Macro.Env.location(__ENV__))
end
end
\end{minted}
which is thus conforms to the syntax of Figure~\ref{fig:surfacesyntax}, and the type-checker will check the call to \elix{Module.create/3} as any other call, and in particular it will check that the first argument is of type \elix{atom()} (since module names are atoms), that the second argument is of type \elix{Macro.t()} (since the second argument of \elix{Module.create/3} must be a quoted expression), and that the third argument is of type \elix{Macro.Env.t()} (which is the type of the options passed as third argument).
\subsubsection{Module definitions}
Let us examine in detail the definition of a module, as given in \eqref{eq:module}:
%
\[\kwdefm X \kwdo \overline{P}\, \overline{S}\, \overline{B} \kwend\]
%
The definition of a module $X$ is a sequence of three blocks: the list $\overline P$ of type parameters of the form $\kwprm \tx$, the list $\overline S$ of behaviours declarations of the form $\kwbvr \BX$, and the definition of the module itself, which is a sequence of bindings $\overline{B}$. In Elixir, these three blocks may be interleaved, but for the sake of simplicity we will suppose that they are grouped together in the order given above. The module type parameters are used to define the type of the module and are instantiated when the module is used; the list of behaviours specifies the behaviours that this module implements.
The bindings $\overline{B}$ are used to define the actual implementation of the module, which may include definitions of types, values, and modules.
In a \emph{type binding}, a type $t$ is bound to a (type) name $\tx$, and this binding may be transparent or opaque, as defined by the syntax $\kwtp \tx = t$ and $\kwopq \tx =t$, respectively. An opaque type definition makes the binding $\tx= t$ visible only inside the module, while a transparent one makes this binding visible also outside the module and it is also used to instantiate the parameters of the behaviours the module implements. Thus, for instance if the behaviour \elix{StackB} is defined as the above, then a module \elix{StackInt} implementing the behaviour for integers will be defined as
\begin{minted}{elixir}
defmodule StackInt do
!\tp!behaviour StackB[elem = integer()] # declare that it implements StackB for integers
!\tp!opaque stack = list(integer()) # define the opaque type of the stack
# callbacks implementations
!\tp! () -> stack # note that elem and stack are type identifiers which
def new(), do: [] # which is why I wrote them as stack, elem rather than
# stack() or elem() ... but this can be discussed
!\tp! (stack, elem) -> stack
def push(s, e), do: [e|s]
!\tp! (stack) -> {integer(), stack}
def pop(s), do: {hd(s), tl(s)}
end
\end{minted}
Although the current Elixir syntax does not allow it, in the system of Figure~\ref{fig:surfacesyntax} we can also define a module to be parametric in some types. For instance, we can define the module stack to be parametric in the type of
its elements, and then pass this parameter as argument to the behaviour, as in
the following example, in which we add a \elix{top} function and a
higher-order function \elix{move}
\begin{minted}{elixir}
defmodule StackModule do
!\tp!param elem
!\tp!behaviour StackB[elem = elem] # declare that it implements StackB for elem
!\tp!opaque stack = list(elem)
# callbacks implementations
!\tp! () -> stack
def new(), do: []
!\tp! (stack, elem) -> stack
def push(s, e), do: [e|s]
!\tp! (stack) -> {elem, stack}
def pop(s), do: {hd(s), tl(s)}
# extra functions
!\tp! (stack) -> elem
def top(s), do: hd(s)
!\tp! (X: StackB[elem=elem], sx: X.stack, Y: StackB[elem=elem], sy: Y.stack) -> Y.stack
def move(X, sx, Y, sy), do: Y.push(sy, X.top(sx))
\end{minted}
Now the module \elix{StackModule} is parametric in the type of its elements, and to use it we must instantiate as in \elix{StackModule[elem=integer()].new()} which returns a value of (opaque) type \elix{StackModule[elem=integer()].elem}.
\begin{remark} In order to reduce verbosity one can imagine using the aliasing mechanism of Elixir to define a shorter name for the module type \elix{StackB[elem=integer()]} as in
%\begin{minted}{elixir}
\elix{alias StackModule[elem=integer()], as: StackMInt}
%\end{minted}
\noindent
and then use \elix{StackMInt.new()} to create a new stack of integers.\hfill$\square$
\end{remark}
In a \emph{value binding}, a value $v$ is bound to a name $x$ which may be exported or private, as defined by the syntax $x = v$ and $\kwpr x = v$ respectively. In general, a value $v$ is either a constant of some basic type, or a record of constants, or a function. In the latter case, the function is defined in Elixir by the syntax \elix{def} and \elix{defp} for exported and private functions respectively, and prefixed by their type annotations as in the following example:
\begin{minted}{elixir}
defmodule X do
!\tp! (!$t_1$! -> !$t_2$!) and (!$t_3$! -> !$t_4$!)
def f(x,y), do: !$E_1$!
!\tp! (X : StackB[elem=integer()], s: X.stack) -> integer()
defp g(X,s), do: !$E_2$!
end
\end{minted}
The above examples are considered as syntactic sugar and in the syntax of Figure \ref{fig:surfacesyntax} are rendered by the following definitions
\begin{minted}{elixir}
defmodule X do
f = !\tp$(t_1\to t_2) \wedge (t_3 \to t_4)$! fn x, y -> !$E_1$!
g = !\tp!private !\tp!(X : StackB[elem=int], s: X.stack) !$\to$! int fn X, s -> !$E_2$!
end
\end{minted}
Finally, a \emph{module binding} is a simple definition of the form $\kwdefm X \kwdo \overline{P}\, \overline{S}\, \overline{B} \kwend$ as in \eqref{eq:module}, and binds the module name $X$ with its definitions.
\subsection{The consequences of parametric modules}
While having parametric behaviours is quite straightforward, the addition of parametric modules has several consequences on the syntax and semantics of the module system.
\paragraph{Syntax.} The first important consequence is that this requires an important modification of the current Elixir syntax. If we just allow parametric behaviours and not also parametric modules, then the modification of the syntax of the expressions is really minimal: while currently a module definition just defines transparent types (e.g., \elix{@type state() :: term()}) and opaque ones (\elix{@opaque state() :: term()}), the new type definitions just change this syntax to align to the Elixir new syntax of types, which specify if a type is exported transparently (i.e., both name and definition: $\kwtp \tx = t$) or opaquely (i.e., just its name: $\kwopq \tx = t$), and nothing else for what concerns the expression syntax (in particular modules with this choice are denoted by paths of module identifiers: $X_1.X_2.\cdots{.}X_n$, just like in current Elixir).
If we also allow parametric modules, then we must extend the syntax of module expressions to declare the parameters $\overline P$ in the form $\kwprm \tx$ (easy), but \textbf{above all} we have to change the syntax of paths to allow instantiations of the parameters of modules. This is done by the syntax $X[\overline{\tx=t}]$ denoting the module $X$ instantiated with the types $\overline{t}$ for its parameters $\overline{\tx}$.
There are two possible syntaxes for paths with instantiations.
\begin{enumerate}
\item The instantiation may occur at any level of a path, as in $X_1[\overline{\tx=t}].X_2[\overline{\tx=t}].X_3[\overline{\tx=t}]$, or
\item The instantiation can only occur at the end of the path, as in $X_1.X_2.X_3[\overline{\tx=t}]$, that is for the innermost module only.
\end{enumerate}
The former is more general (and it can be easily encoded in 1ML), but can be quite demanding to the programmer, especially if the system requires all parametric modules in a path to be explicitly instantiated.
The latter is simpler, it reflects the current implementation of Elixir, where the nesting of modules is just a syntactic facility since all modules are flat and the nesting is handled by the compiler by naming each nested module with the full and unique path to it. \beppe{Explain that this is important for refactoring} This second choice however imposes to carefully define the scoping rules of types in a module. In particular a set of sufficient scoping rules is the following:
\begin{itemize}
\item the parameters of a module $X$ are not visible in the modules nested in $X$;
\item a type $\tx$ defined in a module $X$ is visible in all modules nested in $X$ only if all the types used in the definition of $\tx$ are visible in the nested modules.
\item a value $x$ defined in a module $X$ is visible in all modules nested in $X$ only if all the types used in the type of $x$ are visible in the nested modules.
\end{itemize}
This implies that to use a type or a value non-visible in a nested module, we must use a full and explicitly (and terminally) instantiated path to denote it.
Actually this set of scoping rules is completely in line with the current Elixir implementation, which uses more restrictive rules: \textbf{nested modules do not have access to the types and values defined in their enclosing module(s) (nor in their submodules? question for José), unless these are explicitly denoted by their full path.}
The reader may have noticed that we used a different color in three terms of the syntax: the parameters $\kwdefm X \kwdo \textcolor{Cerulean}{\overline{P}}\, \overline{S}\, \overline{B} \kwend$ in module definitions; the type of modules $\left\{\textcolor{Cerulean}{\overline{P}};\overline{S};\overline{D}\right\}$; the instantiation $X\textcolor{Cerulean}{[\,\overline{\tx=t}\,]}$ in module denotations. The colored syntax denotes all the modifications to the syntax of Elixir \textbf{expressions} that are needed to implement parametric modules. In the current Elixir expression syntax, a module is denoted only by its name since there are no parametric modules, therefore no parameter instantiation: paths are only of the form \elix{Mod1.Mod2.Mod3} rather than, say, \elix{Mod1[x=integer()].Mod2.Mod2[z=proc()]} (in the variant that corresponds to $M::= M.M \mid X[\overline{\tx=t}]$), or its terminal variant \elix{Mod1.Mod2.Mod2[z=proc()]} (corresponding to $M::= X.M\mid X[\overline{\tx=t}]$)
If we do not want to extend the current Elixir syntax for expressions to allow parametrized modules such as \elix{StackModule}, it suffices to remove the syntax colored in blue from Figure~\ref{fig:surfacesyntax}, that is, instantiations in module, and the type parameters in module types and definitions (the two definitions of paths in that case collapse).\footnote{Of course, we will still replace Typespec specifications with our new syntax for types.} Then all the modifications to type the Elixir module system will be confined to types and type-related definitions, and the expressions will remain unchanged. In particular, type parameters will be confined to behaviours.
\paragraph{Inference.} A second important consequence of parametric modules is that we have to decide how much inference we want to (and can) do for the type arguments of a parametrized module. This is more a matter of programming style than of formalization, though the choice may have some consequences on the formalization.
In the example of \elix{StackModule} above, for instance, we have to decide whether we want the programmer always to specify the type parameters of a module when this is used, as in \elix{StackModule[elem=integer()].new()}, or whether we want the system to be able to infer the type parameters from the context, as in \elix{StackModule.push(StackModule.new(),5)}. The latter is more convenient for the programmer, but requires a more complex type system. In particular, it requires a form of local type inference, since the type of \elix{StackModule.new()} must be inferred to be \elix{StackModule[elem=integer()].stack} from the context in which it is used (the first argument of \elix{push}). This also poses some problems of deciding how precisely the type parameters must be inferred. For instance, in \elix{StackModule.push(StackModule.new(),:a)}, should the system infer \elix{elem=atom()} or \elix{elem=:a}? The former is more general (and probably what a programmer would expect), but the latter is more precise.
The easy way is to always require the programmer to specify the type parameters of a module when this is used and extend the aliasing mechanism of Elixir to allow the programmer to define local aliases for instantiated modules, as in
\begin{minted}{elixir}
defmodule MyApp do
alias StackModule[elem=integer()], as: IntStack
s = IntStack.new()
s1 = IntStack.push(s,5)
end
\end{minted}
This is simpler to formalize, but may be a bit too demanding for the programmer.
\subsection{Types}\label{sec:types}
\beppe{This section must be rewritten on the lines discussed in Section~\ref{sec:stratification}.}
The types for expressions are strictly stratified. There are the simple types, ranged over by $t$ which are essentially those of \cite{CDV24} with a single syntactic extension, $M.\tx$ denoting the simple type exported by the module $M$ as $\tx$.
%
%
\beppe{If we want this strict stratification we have to
\begin{itemize} \item Move behaviours into $T$. The consequence of this is that we cannot have a type binding of the form $\kwtp \tx ={\BX}[\,\overline{\tx= t}\,]$, that is, a module cannot export a module type. I do not see any other consequence.
\item this also means that we cannot have maps (and products if included) containing modules. If we want to have it we can modify the definition of field types $f$ so that the map labels to $T$, rather than $t$. The problem however is how to define union and intersections of map types with $T$ fields, since unions and intersections are not defined on $T$. As long as we do not have a semantic interpretation of $T$ we should mix syntactic and semantic subtyping.
\item A possible intermediate solution could be to have map types to specify intersections of instantiated behaviours, $\bigcap\overline{\BX[\,\overline{\tx= t}\,]}$ as these can be considered as new basic types (modulo the ordering of the parameter instatiations) for which it is easy to define semantic subtyping. \end{itemize}}
%
On the top of the simple types there are the module system types, ranged over by $T$, which are either (fully instantiated) behaviours $\BX[\,\overline{\tx= t}\,]$, or (anonymous) module types of the form $\left\{{\overline{P}};\overline{S};\overline{D}\right\}$,\footnote{
Programmer will not have the possibility to write this type. We are rather oriented to allow the programmer only to specify in types either some behaviour names (e.g., \felix{StackB[elem=integer()]}), or use the \texttt{Like$\;M$} syntax (e.g., \felix{Like StackInt} to denote the anonymous module type of the \felix{StackInt} module.} or the types of functions working on modules, of the form $(N_1{:} T_1,\cdots, N_n{:}T_n) \to T_{n+1}$, where $N_i$ is a parameter name, and $T_i$ is a module system type in which the names $N_j$ for $j<i$ may occur (e.g. \elix{(X : StackB[elem=integer()], s : X.stack) -> X.elem}).
These module system types have the following intended meaning:
\begin{itemize}
\item A module $X$ has a behaviour type $\BX[\,\overline{\tx= t}\,]$ if it explicitly implements the behaviour $\BX$ instantiated with the types $\overline{t}$ for its parameters $\overline{\tx}$. This means that that the definition of the module $X$ contains the declaration $\kwbvr \BX$, a transparent type definition $\kwtp \tx_i=t_i$ for each $\tx_i, t_i$, in $\overline{\tx}, \overline t$, and provides a definition for all the values, modules, and callbacks declared in $\BX$, with the types declared in $\BX$.
\item The type of a module is defined by the syntax $\left\{\overline{P};\overline{S};\overline{D}\right\}$, where $\overline{P}$ are the type parameters of the module, $\overline{S}$ is the list of behaviours implemented by the module, and $\overline{D}$ is a type environment, that is, the list of the typings of the definitions exported by the module. These typings are of the form $\kwmdl X : T$ for modules, $\kwclbk x : \bigcap \overline{T}$ for value definitions (constant, record values, and functions/callbacks), $\kwopq \tx$ for opaque types, and $\kwtp \tx = t$ for transparent types (of course, no type for private definitions). This list must contain at least the typings for all the type parameters and the callbacks of the behaviours $\overline{S}$ implemented by the module. The usual restrictions of Elixir apply: a module cannot implement two behaviours with a common callback.
\item The type $(N_1{:} T_1,\cdots, N_n{:}T_n) \to T_{n+1}$ is used to type functions working on modules. The parameters $N_i$ are names that can be used in the types $T_j$ for $j>i$ to denote types exported by the module passed as $i$-th argument (e.g., the \elix{move} function of the \elix{StackModule} example above, which has type \elix{(X : StackB[elem=elem], sx : X.stack, Y : StackB[elem=elem], sy : Y.stack) -> Y.stack}.)
\end{itemize}
One of the consequences of this stratification---actually, its main motivation---is that set-theoretic type connectives such as $\vee$, $\wedge$, and $\neg$ are not allowed in module system types, but only in simple types. The reason is that we do not have a semantic interpretation of the $T$ types and therefore we cannot define the set-theoretic connectives on them.
In practice, however we need intersection types on module system types in at least two cases: when we want to define a module $M$ that implements several behaviours (since the type of the module is its anonymous type \texttt{like($M$)} intersected with its properly instantiated behaviours), and when we want to define a multiclause function in which some clauses are defined for module system types (we expect the function to have an intersection of $(\overline{N:T})\to T$ arrows). Both cases are represented in the syntax of Figure~\ref{fig:surfacesyntax} by handling them syntactically. A module system type $\left\{\overline{S};\overline{D}\right\}$ represent the intersection of the anonymous type $\left\{\overline{D}\right\}$ with the behaviours $\overline{S}$, instantiated as indicated in $\overline{D}$. This will be clearer in the formalization where module types have the form $\{\overline{D}\}_{\overline{\BX[\,\overline{\tx=t}\,]}}$ to be interpreted as the syntactic intersection $\{\overline D\}\cap{\BX_1[\,\overline{\tx_1=t_1}\,]}\cap...\cap{\BX_n[\,\overline{\tx_n=t_n}\,]}$. For the second case, we use the syntax $\bigcap \overline{T}$ to declare callback types and annotate exported functions. We use $\bigcap$, rather than $\bigwedge$ to stress that this is an intersection different from the one on simple types, the former being defined syntactically (via some explicit subtyping rules), the latter semantically (via a set theoretic interpretation of types).
\section{Formalization of the module system}
\subsection{Syntax changes}
For our formalization we will use a slightly different syntax, which is more convenient for the type system we are going to define. This syntax is given in Figure \ref{fig:modulelangsyntax}.
\begin{figure}
\begin{tabular}{r c ll}
$\mathcal B$ &::= & $\!\!\BX: \Lambda(\overline{\tx:\star}).\{\overline{D}\}$ &\color{Gray} behaviour $\BX$ with parameters $\overline \tx$\\
$B$ &::= & $\!\!X = \textcolor{Cerulean}{\Lambda(\overline{\tx:\star}).}\{\overline{B}\}_{\overline{\mathcal{X}[\,\overline{\tx=t}\,]}}$ &\color{Gray} module $X$ with parameters $\overline \tx$ implementing $\overline{\BX[\,\overline{\tx=t}\,]}$\\
& | & $x = v$ &\color{Gray} exported definition \\
& | & $x = \textsf{private }v$ &\color{Gray} private definition\\
& | & $\tx = \textsf{type }t$ &\color{Gray} transparent type or behaviour argument\\
& | & $\tx = \textsf{opaque }t$ &\color{Gray} opaque type\\[3mm]
$M$ & ::= & $X\textcolor{Cerulean}{[\,\overline{\tx=t}\,]}$ &\color{Gray} local module\\
% & | & $M.M$ &\color{Gray} imported module\\
& | & $X.M$ &\color{Gray} imported module (flat hierarchy): later we can study $M::=M.M$\\[3mm]
$N$ & ::= & $x$ \\
& | & $X$ \\[3mm]
$E$ &::= & $v$ \\
& | & $x$ \\
& | & $\kwlet N = E\kwin E$ \\
& | & $E(\overline{N})$ \\
& | & $\texttt{\%}\!\left\{\overline{k=E}\right\}$ & $k$ ranges over keys (for starting, just atoms)\\
& | & $E.k$ \\
& | & $(E\in t)?E:E$ \\
& | & {\color{OrangeRed}\text{$M.x$}} \\
& | & {\color{OrangeRed}\text{$M$}} \\
$v$ & ::= & $k$ \\
& | & $\texttt{\%}\!\left\{\overline{k=v}\right\}$ \\
& | & $\$\bigwedge \overline{(\,\overline{t}\,)\rightarrow t} \kwfn \overline{x} \rightarrow E$ \\
& | & {\color{OrangeRed} \text{$\$\bigcap \overline{\left(\overline{N:T}\right)\rightarrow T} \kwfn \overline{N}\rightarrow E$}} &\color{OliveGreen} syntactic intersection\\[3mm]
$T$ & ::= & $t$ \\
& | & $\left(\overline{N:T}\right)\rightarrow T$ \\
& | & $\left\{\overline{D}\right\}_{\overline{\BX[\,\overline{\tx=t}\,]}}$ &{\color{OliveGreen} syntactic intersection: $\{\overline{D}\}\cap{\BX_1[\,\overline{\tx_1=t_1}\,]}\cap...\cap{\BX_n[\,\overline{\tx_n=t_n}\,]}$}: ($\BX$'s instantiations given in $\overline{D}$)\\
& | & $\textcolor{Cerulean}{(\overline{\tx:\star})\to T}$ & \text{first class parametric modules. Maybe $\textcolor{Gray}{(\overline{\tx:\star})\to \left\{\overline{D}\right\}_{\overline{\BX[\,\overline{\tx=t}\,]}}}$ might suffice} \\[3mm]
% & | & $\star$ &\text{\small I would remove this production: we do not want to pass around modules unless}\\
% &&&\text{\small they are fully instantiated} $f(X:(\tx:\star)\to ...)\to ...X[\tx=\textsf{int}] ...$\\
$t$ & ::= & $\texttt{atom}$ \\
& | & $\texttt{mod}$ & top module type \\
& | & $\texttt{fun}$ & top function type \\
& | & $(\overline{\,t\,})\rightarrow t$\\
& | & $\texttt{\%}\!\left\{\overline{f}\right\}$ & if we want optional fields then $f::= k:t\mid k:t?$\\
& | & $t\vee t$ \\
% & | & $t\wedge t$ & \aghilas{we can remove this one}\\
& | & $\neg t$\\
& | & $\alpha$ \\
& | & $\mathbb{O}$ \\
& | & \textcolor{OrangeRed}{\text{$\tx$}} &\color{Gray} a type locally declared or a type parameter\\
& | & \textcolor{OrangeRed}{\text{$M.\tx$}} &\color{Gray} the type $\tx$ imported from module $M$\\
& | & \textcolor{OrangeRed}{\text{$\mathcal{X}[\overline{\tx=t}]$}} &\color{Gray} a behaviour (essentially a new constant of the basic type module())\\
% &&&\color{bordeaux} Important: this allows maps and tuples to contain modules!\\
& | & \textcolor{OrangeRed}{\text{$\BH M$}} &\color{Gray} the singleton type of $M$ (modulo path equivalence)\\[3mm]
$D$ & ::= & $X : T$ &\color{Gray} exports a module $X$ of type $T$\\
& | & $x : \bigcap \overline{T}$ &\color{Gray} exports a value $x$ of type $\bigcap\overline T$ \\
& | & $\tx : \star$ &\color{Gray} exports an opaque type $x$\\
& | & $\tx : [=t]$ &\color{Gray} exports a transparent type $x = t$ \\[3mm]
$\Gamma$ & ::= & $\overline{D}$ &\color{Gray} type environment\\
$\Sigma$ & ::= & $\overline{\mathcal B}$ &\color{Gray} behaviour environment\\[4mm]
\end{tabular}
where $\BX$ ranges over behaviour names, $X$ over module names, $x$ over variable names, $\tx$ over type names, and $k$ over atoms.
\caption{Module language syntax}\label{fig:modulelangsyntax}
\end{figure}
The main differences with respect to the Elixir syntax are the following:
\begin{enumerate}
\item The module type $\kwdefmt \BX\kwdo \overline{\kwprm \tx}\, \overline{D} \kwend$ is written as $\BX: \Lambda(\overline{\tx:\star}).\{\overline{D}\}$.
\item The module definition $\kwdefm X \kwdo \overline{\kwprm \tx}\; \overline{\kwbvr \mathcal{X}[\,\overline{\tx=t}\,]}\; \overline{B} \kwend$ is written as $X = \Lambda(\overline{\tx:\star}).\{\overline{B}\}_{\overline{\mathcal{X}[\,\overline{\tx=t}\,]}}$
\item We changed the syntax of module binding so that all the bindings are of the form (\emph{name $=$ definition}): this is the main motivation for the change of syntax in point 2 above.
\item Likewise, the types of module bindings has been modified so that all the binding are of the form (\emph{name $:$ definition-type}). This is the main motivation for the change of syntax in point 1 above.
\item Expression can be applied only to a list of names, and not to arbitrary expressions. It is however easy to encode the expression $E_0(E_1,...,E_n)$ as $\kwlet N_1 = E'_1 \kwin ... \kwlet N_n=E'_n\kwin E'_0(N_1,\dots,N_n)$, where $E'_i$ is the encoding of expression $E_i$, and $N_i$ are fresh names.
\end{enumerate}
There is a clear mapping from the terms of Figure~\ref{fig:surfacesyntax} to those of Figure~\ref{fig:modulelangsyntax}, therefore we will not discuss this further.
\bigskip
\beppe{Open question: what about recursion in module types? Do we want to allow it? If yes, how? Few gut feelings:
\begin{itemize}
\item recursion must not traverse instantiations: that is we cannot have $\mu \alpha .\BX[\tx=\alpha[\tx=t]]$ (whatever this means);
\item we definitively want to allow mutually recursive modules, and therefore we want to allow mutually recursive module types; but this must be done in a controlled way. For instance, we do not want $M.t = M'.u$ and $M'.u = M.t$ as this creates an infinite loop when trying to expand the types; the question is does normal contractivity suffice here (infinite unfolding must traverse infinitely many arrows or map constructors)?
\end{itemize}
Shall we postpone the problem to future work? \textbf{In any case the current formalization does not handle mutually recursive modules since the well-formation rules require modules to be defined before being used. Which means that this is an orthogonal problem, the system we define here assumes modules comes as defined.}}
\subsection{Types}
Types are either the simple types $t$, or the module system types $T$. We have already discussed the syntax and intended meaning of these types before, therefore we will not repeat it here. This stratification is important to define type containment. We will use two different techniques to define the subtyping relation. For simple types we will define semantic subtyping, by interpreting them into sets of elements of a domain and defining subtyping as the containment of the interpretations. For module system types $T$ we will define a syntactic subtyping relation via a deduction system. The stratification of types allows us to define subtyping on modules types $T$ by structural induction on their syntax, using the semantic subtyping on simple types $t$ when needed. Next we are going to discuss the consequences of this stratification and then we will give the formal definitions of types and their interpretation.
\subsubsection{Stratification}\label{sec:stratification}
The important point of our type system is that types are strictly stratified: module system types $T$ can contain simple types $t$, but not vice versa. This stratification has several important consequences. On the positive side, it allows us to define a semantic interpretation of simple types $t$ as sets of values, and therefore a semantic subtyping relation on simple types $t$, and a syntactic subtyping relation on module system types $T$. For this reason hereafter we will also call simple types as \emph{semantic types} and $T$ types as syntactic types. On the negative side, the only proper subtypes of \texttt{module()} that can occur in simple types $t$ are (Boolean combinations of) fully instantiated behaviours $\BX[\overline{\tx=t}]$. This means that in a tuple or a map or a list of Elixir, only modules that implement some behaviour can occur in them. To palliate this limitation, we added to simple types the singleton types of fully instantiated modules.
\begin{remark}
In practice, one can envisage to provide the programmer some syntax to denote structural module types. A possibility is to add to the Elixir language of types the \texttt{like($M$)} syntax to denote the anonymous structural type of a (fully instantiated) module $M$ as well as a more programming friendly synatax (e.g., \texttt{behaviour\_of($M$)} or probably just $M$) to denote the singleton type $\langle M\rangle$. Of course while \texttt{behaviour\_of($M$)} or $M$ can be used in simple types $t$, the type \texttt{like($M$)} can be used only in module system types $T$. Such additions do not change the underlying type theory we present next.
\end{remark}
Next we are going to discuss the semantic interpretation of simple types $t$. As we will see, we need to introduce a further stratification that will separate simple types $t$ into the current Elixir types and the new types we added to handle modules.
\subsubsection{Semantic subtyping for simple types}
Simple types, ranged over by $t$ are essentially the types of Elixir defined in \cite{CDV24,CD25}, with the addition of the four new forms: $\tx$, $M.\tx$, $\BH M$, and $\BX[\overline{\tx=t}]$. The first is used in a module and denotes either a type locally defined or a type parameter; the second denotes the type $\tx$ exported by the module $M$; the third denotes the singleton type of the (fully instantiated) module $M$, while the fourth denotes the behaviour $\BX$ instantiated with the types $\overline{t}$ for its parameters $\overline{\tx}$.
These four additional forms are essentially different in nature from types that are currently already present in Elixir. The Elixir types have a clear semantics, that is independent of the program they are used in. Furthermore, they are conductively defined in order to account for recursive types. The four new forms instead have a semantics that is strictly related to the program they are used in: in particular, the semantics of $\tx$ depends on the module it is used in, the semantics of $M.\tx$ and $\BH M$ depends on the definition of the module $M$, while the semantics of $\BX[\overline{\tx=t}]$ depends on the definition of the behaviour $\BX$. Furthermore, we do not want to allow recursion to traverse these new forms, and this for two reasons. First, we do not want a non-recursive module or behaviour to become recursive by using instantiation: think of a recursive type definition like $\mu \alpha .\BH{X[t=\alpha]}$ where $X$ is a non-recursive module, or $\mu \alpha .\BX[\tx=\alpha]$ where $\BX$ is a non-recursive behaviour; these definitions make the module/behaviour become recursive. Second, the presence of such a recursion would complicate the semantic definitions, by introducing a new circularity: types such as $\BH{X[\overline{\tx=t}]}$ and $\BX[\overline{\tx=t}]$, must be interpreted as singletons modulo the interpretation of the types $\overline t$ used in their instantiations---equivalent types require equivalent instantiations---; so to interpret these types we must determine the equivalence of the types of their instantiations, for which we need the interpretation of all the types, them included.
For these reasons we will further stratify simple types $t$ into the current Elixir types---that we call \emph{core types}---, which are \emph{coinductively} defined by the following grammar:
$$\tau ::= k\mid\texttt{atom} \mid \texttt{mod} \mid \texttt{fun} \mid (\overline{\tau})\to\tau \mid \texttt{\%}\!\left\{\overline{k:\tau}\right\} \mid
\tau \lor \tau \mid \lnot \tau \mid \Empty\mid \alpha
$$
The simple types $t$ are then defined on the top of the core types $\tau$ \emph{inductively} by the following grammar:
$$t ::= \tau \mid (\overline{t})\to t \mid \texttt{\%}\!\left\{\overline{k:t}\right\} \mid
t \lor t \mid \lnot t \mid \tx\mid M.\tx \mid \BX[\overline{\tx=t}] \mid \BH M
$$
This stratification allows us to define the semantic interpretation of core types as sets of values, and the semantic of the simple types by giving an encoding of the new forms $\tx$, $M.\tx$, $\BH M$, and $\BX[\overline{\tx=t}]$ into core types. This encoding will be parametric in the definitions of the modules and behaviours of the program, since the semantics of these new forms depends on the definitions of the modules and behaviours they refer to. In this way instead of giving an interpretation of types parametric in the module and behaviour definitions, we just give an encoding of the new forms of types parametric into these definitions.
Formally, we have the following definitions.
\begin{definition}[Core Types]\label{def:coretypes}
Let $k$ range over atoms and $\alpha$ over type variables.
The set $ \Coretypes $ of \emph{core types} is the set of terms $ \tau $
that are \underline{coinductively} produced by the following grammar
\[
\tau ::= k\mid \texttt{atom} \mid \texttt{mod} \mid \texttt{fun} \mid (\overline{\tau}) \to \tau \mid \texttt{\%}\!\left\{\overline{k:\tau}\right\} \mid
\tau \lor \tau \mid \lnot \tau \mid \Empty\mid \alpha
\]
and that satisfy two additional constraints: $(1)$ \emph{regularity}:
the term must have a finite number of different sub-terms; $(2)$ \emph{contractivity}:
every infinite branch must contain an infinite number
of occurrences of the product or arrow type constructors.
\end{definition}
Core types are a relevant subset the polymorphic types of Elixir as defined in \cite{CDV24,CD25}. They include a singleton type $k$ for each atom $k$, the basic types \texttt{atom} of all atoms, \texttt{mod} of all modules, and \texttt{fun} of all functions, the arrow type $(\overline{\tau}) \to \tau$ for multi-arity functions, the open map type $\texttt{\%}\!\left\{\overline{k:\tau}\right\}$ mapping atom keys to core types, the set-theoretic connectives union $t_1 \lor t_2$, negation $\lnot t$, and the empty type $\Empty$, as well as type variables, ranged over by $\alpha$. The types defined in \cite{CDV24,CD25} also include tuples, lists, possibly closed maps with non-atom keys and optional fields, as well as a larger range of basic types (e.g., integers, floats, processes, ports, strings, ...) but we do not need them for our formalization.
Coinduction accounts for recursive types, and it is coupled with a
contractivity condition which excludes infinite terms
that do not have a meaningful interpretation as types or sets of values:
for instance, the trees satisfying the equation
$ \tau = \tau \lor \tau $
(which gives no information on which values are in it)
or $ \tau = \lnot \tau $ (which cannot represent any set of values) do not
satisfy the contractivity condition.
Contractivity also gives an induction principle on the set of core types that allows us to apply the induction hypothesis
below type connectives (union and negation),
but not below type constructors (records and arrows).
The subtyping relation for core types is defined semantically by interpreting them as sets of values, and defining subtyping as the containment of the interpretations. The interpretation of functional core types is standard and can be found in \cite[Sections 3.1 and 4.2]{DubocThesis26} (see~\cite[Section 3.2]{Cas24} for a simpler presentation), and so is the interpretation of map types, which are interpreted as quasi-constant functions from labels to sets of values as in \cite{Cas23}. The only difference with the cited work is that in our case the set of labels contains not only all the atoms, but also some further labels that we use to encode simple types into core types, namely, the set of all type names $\tx$, as well as three distinguished labels \texttt{@p}, \texttt{@o}, and \texttt{@n}. For the reader convenience we resume the interpretation of our core types in Appendix~\ref{app:coretypes}.
\begin{definition}[Pretypes]\label{def:types}
Let $\tau$ range over core types, $X$ over module names, $\BX$ over behaviour names, $\tx$ over type names, and $M$ over module paths.
The set $ \Types $ of \emph{pretypes} is the set of terms $ t $
that are \underline{inductively} produced by the following grammar
\[
t ::= \tau \mid (\overline{t}) \to t \mid \texttt{\%}\!\left\{\overline{k:t}\right\} \mid
t \lor t \mid \lnot t \mid \tx \mid M.\tx \mid \BX[\overline{\tx=t}] \mid \BH M
\]
\end{definition}
%
Pretypes include all core types along with their type constructors and connectives. Additionally, they include four new module-related types: $\tx$ (the name of a type parameter or of a locally declared type), $M.\tx$ (the type $\tx$ exported by module $M$), $\BX[\overline{\tx=t}]$ (the behaviour $\BX$ instantiated with types $\overline{t}$ for its parameters $\overline{\tx}$), and $\BH M$ (the singleton module type). We call the types of Definition \ref{def:types} ``pretypes'' because we need to identify which ones are well-formed for a given set of module and behaviour definitions as we show below (e.g., to ensure that all paths point to valid definitions).
We use the abbreviations
$
t_1 \land t_2 \eqdef \lnot (\lnot t_1 \lor \lnot t_2)
$, $
t_1 \setminus t_2 \eqdef t_1 \land (\lnot t_2)
$, and $
\Any \eqdef \lnot \Empty
$ (in particular, $\Any$ is the supertype of all types, denoted in Elixir by \elix{term()}).
We refer to \texttt{atom}, \texttt{mod}, \texttt{fun}, $\texttt{\%}\!\left\{\right\}$, and $ \to $ as \emph{type
constructors}, and to $ \lor $, $ \lnot $, $ \land $, and $ \setminus
$ as \emph{type connectives.} As customary, connectives have priority
over constructors and negation has the highest priority of all---e.g., $\neg
s{\vee}t\to u{\wedge}v$ denotes $((\neg s){\vee}t)\to (u{\wedge}v)$.
\subsubsection{Encoding of simple types into core types}\label{sec:encoding}
To proceed, we need to identify which pretypes are well-formed for a given set of module and behaviour definitions. Once we have identified the well-formed pretypes, we can encode them into core types. Both of these tasks require us to first formally associate to each behaviour its definition and to each module its type. These are given by the behaviour environments $\Sigma$ and the type environments $\Gamma$, defined in Figure~\ref{fig:modulelangsyntax}. A behaviour environment $\Sigma$ associates to each behaviour name $\BX$ its definition $\Lambda(\overline{\tx:\star})\{\overline D\}$, that is, the list of its parameters $\overline\tx$ and of its declarations $\overline D$. A type environment $\Gamma$ associates to each module name $X$ its type $(\overline{\tx:\star})\to\{\overline{D}\}_{\overline{\BX[\,\overline{\tx=t}\,]}}$, that is, the list of its parameters $\overline{\tx:\star}$, of the behaviours it implements $\overline{\BX[\,\overline{\tx=t}\,]}$, and of its declarations $\overline D$. A type environment $\Gamma$ also contains associations of the form $\tx:\star$, stating that the type name $\tx$ is locally available.
Given a behaviour environment $\Sigma$ and a type environment $\Gamma$, we can define the well-formed types inductively as those pretypes that satisfy the following conditions:
\begin{itemize}
\item a core type is always well-formed;
\item a type name $\tx$ is well-formed if $\Gamma$ contains the association $\tx:\star$;
\item every type occurring in the pre-type is well-formed;
\item every behaviour $\BX$ occurring in the pre-type is fully instantiated (i.e., all its parameters are given a type);
\item every path $M$ occurring in the pre-type is well-formed, that is, it effectively points to a module and it fully instantiates it.
\end{itemize}
Notice that we do not require $\Sigma$ and $\Gamma$ to be well-formed. In particular, we do not require that they contain only well-formed module system types. This will be required for typing, but for defining the encoding of well-formed types into core types we just need to check that paths point somewhere and that all parameters declared for a module or a behaviour are instantiated with well-formed simple types: module system types $T$ do not play any role in the encoding. In practice however all environments that we will use will be well-formed, since they will be all generated by the typing rules, which enforce well-formedness for environments.
We can now give the intuition of the encoding of well-formed types into core types. For core types, the encoding is just the identity. The simplest encoding is the one for behaviours. In Elixir behaviours use name subtyping: the only proper subtype of a (fully instantiated) behaviour is the same behaviour instantiated with equivalent types.\footnote{Two type are equivalent if they have the same interpretation, that is if they are one subtype of each other.} If we denote by \atom{\BX} the atom of the name of the behaviour $\BX$,\footnote{As a curiosity, Elixir provides syntax for it, namely \texttt{:"$\BX$"} since, say, \felix{Atom == :"Elixir.Atom"} returns \felix{true}.} then the idea is to encode $\BX[\overline{\tx=t}]$ by a record with a special ``name'' field \texttt{@n} containing the \emph{negation} of the atom \atom{\BX} (the reason of the negation will become clear when encoding singleton module types), together with a field $\tx$ for each parameter $\tx$ of $\BX$, containing the encoding of the type $t$ used to instantiate $\tx$. While the record type abstracts over the order of the instantiations, there are still two caveats: first the type used to for the parameter fields must be invariant; second, the encoding must not be empty even if an instantiation is empty. To this end given a type $t$, we define: \( \texttt{eq}(t) \eqdef (t\cup\{\mho\})\times (\neg t\cup\{\mho\})\), where $\mho$ is a distinguished constant of the interpretation domain. The construction $\texttt{eq}(t)$ is invariant on the type $t$ and it is never empty (since it contains $(\mho,\mho)$).
\begin{lemma}
Let $t$ and $t'$ be two types, then $\texttt{eq}(t)\leq\texttt{eq}(t')$ if and only if $t\simeq t'$.
\end{lemma}
Then the encoding of $\BX[\overline{\tx=t}]$ is the record type $\texttt{\%}\!\left\{\texttt{@n: $\neg\atom{\BX}$},\;\overline{\tx:\texttt{eq}(t')}\right\}$, where $t'$ is the encoding of $t$. As a concrete example consider the behaviour \texttt{StackB} with a single parameter \texttt{elem}, then the encoding of \texttt{StackB[elem=integer()]} is the record type $\texttt{\%}\!\left\{\texttt{@n: $\neg$:StackB},\;\texttt{elem}:\texttt{eq}(\texttt{integer()})\right\}$.
The encoding of the singleton type $\BH M$ of a fully instantiated module $M$ is similar to the one of behaviours, but instead of using a special field \texttt{@n} for the behaviour name, it uses the special ``path'' field \texttt{@p} to encode the path of the module $M$. The encoding of $\BH{X.Y[\tx=t]}$ is then a record type with at least the fields $\texttt{@p}: \atom{X.Y}$ and $\tx:\texttt{eq}(t')$, where $t'$ is the encoding of $t$. But besides the fields for the path and the instantiation, there may be further fields in the case $M$ implements some behaviour $\BX$, since we want the encoding of the singleton type $\BH M$ to be a subtype of all (the encodings of) the behaviours it implements. So if $\Gamma$ states that $M$ implements $\overline{\BX[\overline{\tx=t}]}$, then the encoding of $\BH M$ will also contain the field (\texttt{@n: $\bigwedge_{\BX\in\overline{\BX}} \neg \atom{\BX}$}) to account for the name of the behaviours implemented by $M$, as well as a field $\tx:\texttt{eq}(t')$ for each parameter $\tx$ of each behaviour $\BX$ implemented by $M$, where $t'$ is the encoding of the type $t$ used to instantiate $\tx$ in the implementation of $\BX$ by $M$. We now see the reason of the use of a negation in the encoding of behaviours: by specifying in the field \texttt{@n} of the encoding of $\BH M$ the intersection of the negations of the atoms of the names of the behaviours implemented by $M$, we ensure that the encoding of $\BH M$ is a subtype of the encoding of all the behaviours implemented by $M$ (without the negation such an intersection would be empty).
The encoding of $M.\tx$ is slightly more complicated and it depends on whether the type exported by $M.\tx$ is opaque or transparent. If the type exported by $M.\tx$ is opaque, then we can use exactly the same technique we used for the previous two forms, this time using the \texttt{@o} ``opaque'' label to store the full path: for example the encoding of $X.Y[\tx=t].\tz$ will be $\texttt{\%}\!\left\{\texttt{@o}: \atom{X.Y.\tz},\tx:\texttt{eq}(t')\right\}$, where $t'$ is the encoding of $t$. Intuitively, this corresponds to introduce a new fresh type constant to represent the opaque type, which to all effects will behave exactly like a fresh monomorphic type variable. If the type exported by $M.\tx$ is transparent, then we just unfold the path and encode it as the encoding of the type $t$ that $M.\tx$ is transparently equal to. The definition is straightforward since there are two possible cases: either the type exported by $M.\tx$ is of the form $[=\tau]$, then we encode $M.\tx$ by $\tau$, or it is of the form $M'.\ty$, in which case we encode $M.\tx$ by the encoding of $M'.\ty$. In general such an unfolding may not always be possible: even if pretypes are inductively defined, the $\Gamma$ environment can not be well-formed, and therefore have circular or dangling definitions. However, when typing we will always use well-formed environments that will ensure that such an unfolding always succeeds.
Finally, we encode a type names $\tx$ by the encoding it as $M.\tx$ where $M$ is the absolute path of the current module, instantiated with $[\overline{\tx=\tx}]$, and where type parameters are treated as opaque type declarations. This is just a technical trick to be able to identify inside a module $M$, the types $\tx$ and $M.\tx$ as being the same type.
As the very last thing we have to define the encoding of the top module type \texttt{mod} and of the top record type. Since we want \texttt{mod} to be a supertype both of all behaviors and of all singleton module types (but not of the opaque variable), then it is encoded as the open record type $\texttt{\%}\!\left\{\texttt{@n:} \neg\texttt{atom},\texttt{@o}:\bot\right\}$. For the top record type $\texttt{\%}\!\left\{\right\}$ we want it to be a supertype of all records in which the special fields are undefined: therefore it will be encoded as the is the open record type $\texttt{\%}\!\left\{\texttt{@n}:\bot,\texttt{@p}:\bot, \texttt{@o}:\bot\right\}$.
% \begin{itemize}
% \item encoding of well-formed types into core types, parametric in $\Sigma$ and $\Gamma$
% \end{itemize}
% \aghilas{ there is still a question for the encoding to core types: if $M.\tx$
% is transparent equal to $t$ how to encode $t$ to a core type as it is not a
% subtype of $M.\tx$ and that it is not defined in the same $\Gamma$. Modifying
% the syntax of declaration types to have: $\tx:[=\tau]$ would fix the issue. If
% we note $\Sigma;\Gamma\vdash \wr t\wr =\tau$ the encoding. Then we express this
% idea as the following rules.
% \begin{mathpar}
% \inferrule[Bind-Type]
% {\Sigma;\Gamma\vdash t:\star\\
% \Sigma;\Gamma\vdash \wr t\wr=\tau\\
% \Sigma;\Gamma, x:[=\tau]\vdash\overline{B}:\overline{D}}
% {\Sigma;\Gamma\vdash\left(x=\textsf{type }t\right)\overline{B}: \left(x:[=\tau]\right)\overline{D}}
% \\
% \inferrule[Bind-Opaque]
% {\Sigma;\Gamma\vdash t:\star\\
% \Sigma;\Gamma\vdash \wr t\wr=\tau\\
% \Sigma;\Gamma, x:[=\tau]\vdash\overline{B}:\overline{D}}
% {\Sigma;\Gamma\vdash\left(x =\textsf{opaque }t\right)\overline{B}: \left(x:\star\right)\overline{D}}
% \end{mathpar}
% }
% Let us start with the interpretation of $\BX[\overline{\tx=t}]$. Notice that we can deal with the type \texttt{module}, the top module type, in the same way as we did for \texttt{atom} in \cite{CD25}, as it can be considered as the infinite union of all the singleton types that form it. In the case of \texttt{atom} the singleton types are the single atoms. In the case of \texttt{module} the singleton types are all the fully instantiated behaviours, modulo the equivalence of their instantiation. Then every type contained in \texttt{module} is either a finite or a cofinite union of fully instantiated behaviours.
% Therefore, we can define the interpretation of $\BX[\overline{\tx=t}]$ as the set containing all the equivalent instantiations of the behaviour $\BX$, where $\BX[\overline{\tx=t}]\simeq\BX[\overline{\tx=t'}]$ iff for every $\tx_i=t_i$, we have $t_i\simeq t'_i$ (the order of the parameters does not matter, just their names and instantiations). Adding the \texttt{behaviour\_of($M$)} singleton types as suggested in Section \ref{sec:stratification} does not change the theory as these behaviours are singletons, modulo the equivalence on $M$ (see Figure~\ref{fig:pathequivalence}). Bottom line, behaviours are a new basic type handled exactly as \elix{atom()}.
% For what concerns the interpretation of $M.\tx$, we can always use the type of the module names that form $M$ to deduce the simple type $M.\tx$ denotes. This is either a transparent type $t$, in which case we can define the interpretation of $M.\tx$ as the interpretation of the type $t$, or it is an opaque type, in which case we consider $M.\tx$ as a type variable (modulo the equivalence of paths defined in Figure~\ref{fig:pathequivalence}). It is important to notice that $M.\tx$ must be well-formed, that is the last module in the path $M$ (or the also all the intermediates modules of the path if we use the more general syntax for paths) must be completely instantiated, so that we can always deduce the type bound to $\tx$.
% The types are essentially the same as in \cite{CDV24}, with the addition of the three new forms for modules: $\BX[\overline{\tx=t}]$, $\BH M$, and $M.\tx$. These additions require the definition of type well-formation: $\BX[\overline{\tx=t}]$ requires the full instantiation of $\BX$ (i.e., $\BX$ is a behaviour with parameters $\overline\tx$ instantiated by well-formed types $\overline{t}$); $\BH M$ requires the path $M$ to be well-formed, namely that it effectively points to a fully instantiated module
\beppe{In future we may study whether fully instantiation is necessary/desirable, at least in singleton modules but even in paths};
%$M.\tx$ requires the path $M$ to be well-formed and the type $\tx$ to be defined in the module $M$. All the other types are well-formed if all their sub-terms are well-formed. Clearly well-formedness depends on definition of the modules and of the behaviours. Therefore, type good formation is given by judgment of the form $\Sigma,\Gamma\vdash t:\star$, where $\Sigma$ is a behaviour environment mapping behaviour names $\BX$ to their definition (of the form $\Lambda(\overline{\tx:\star}).\{\overline D\}$) and $\Gamma$ is a typing environment mapping expressions names (i.e., module names and expression variables) into module types (in particular it maps module names into types either of the form $\left\{\overline{D}\right\}_{\overline{\BX}}$ or of the form $(\overline{\tx:\star})\to \left\{\overline{D}\right\}_{\overline{\BX}}$. These environments must also be well-formed, meaning that all the types occurring in them are well-formed and fully instantiated where needed.
\paragraph{Module scoping rules.} A module has access to all its definitions, including the private ones, and to all the definitions of the modules in its scope. Order is important thus at a given point of the module only the preceding definitions are available. A module can access to the definitions of outer modules only by using an absolute path starting from a top-level module name; in that case however it has access to all (public) definitions of the outer module, even those defined after the inner module.
This may come to a surprise to an Elixir programmer, since in a Elixir program it is possible to refer to the definition of an outer module without using an absolute path. For instance, in the following code snippet the inner module
\begin{minted}{elixir}
defmodule Outer do
def module A do
def f() do :ok end
end
def module B do
def g() do A.f() end
end
end
\end{minted}
can refer to the definition of the module \elix{A} without using the absolute path\elix{Outer.A}. However, this just syntactic sugar that the Elixir compiler replaces with the full path, \emph{before} performing type checking. Therefore, from the point of view of the type system all paths are either absolute or are relative to the current module.
Finally, public local definitions of a module can be accessed within the module by using either by just their name or by their absolute path. Our system allows both and makes no distinction between the two, even in practice the paths may point to different definitions in case of hot code reloading, but we do not want to complicate the theory with this additional case, for now.
\aghilas{\paragraph{Applicative vs. Generative} Type application should be
applicative. We might want general application (i.e. when a function returns a
module) to be generative by default. We could add an option to let users
declare that a function should be applicative (how to decide equality ?). }
\subsubsection{Syntactic subtyping for module system types}
The subtyping relation on module system types $T$ is defined syntactically,
since we do not have a semantic interpretation of these types. This is possible
since module system types are stratified and therefore we can define subtyping
on the module types $T$ by structural induction on their syntax, using the
semantic subtyping on simple types $t$ when needed.
We have two different environments: $(1)$ the behaviour environment
(noted $\Sigma$) maps behaviour names ($\BX,\BY,\ldots$) to behaviours:
$\mathcal{B}$; $(2)$ the value environment (noted $\Gamma$) which contains
four types of mappings: $D$.\beppe{I would prefer "type environment" instead of "value environment" since it declares the types of the names (not the values)}
A value environment is, in fact, a list of declarations, the good formation
rules for both are thus the same.
We define the following good formation judgments:
\begin{itemize}
\item Good formation of behaviour environments: $\Sigma\vdash\Sigma'$ (Figure~\ref{wf-behenv});
\item Good formation of value environments: $\Sigma;\Gamma\vdash\Gamma'$ (Figure~\ref{wf-valenv});
\item Good formation of module system types: $\Sigma;\Gamma\vdash T$ (Figure~\ref{wf-modtype});
\item Good formation of semantic types: $\Sigma;\Gamma\vdash t:\star$, we will
replace $t$ by the inductive layer of our stratification, the rules will
be the same as in Figure~\ref{wf-type}; \beppe{Je ne comprends pas ce que tu veux dire}
\item Good formation and typing of paths: $\Sigma;\Gamma\vdash M:\{\overline{D}_{\mathcal{X}}\}$.
\end{itemize}
\paragraph{Good formation of environments and types}
The rules in Figure~\ref{wf-behenv} define when a behaviour environment is well-formed. Rule \Rule{WFBehEnv-Empty} states that the empty behaviour environment is well-formed. Rule \Rule{WFBehEnv-Beh} states that if from the assumption that the type names parameters of the behaviour {\BX} are well-formed types, we can deduce the good-formation of the list of declarations $\overline{D}$, then adding $\BX:\Lambda(\overline{\tx:\star}).\{\overline{D}\}$ to the current well-formed behaviour environment yields a well-formed environment. Notice that no behaviour instantiation is given for $\{\overline{D}\}$ (i.e., we used $\{\overline{D}\}$ rather than $\{\overline{D}\}_{\overline{\BX[\overline{\tx=t}]}}$), since in Elixir behaviours cannot implement other behaviours. Also, for the sake of simplicity we examine only the case in which a behaviour definition depends only on the behaviour definitions that precede it, but not on any module definition, even though in Elixir it is possible to have a behaviour definition that depends on a module definition (e.g., in Elixir the \elix{Logger.Backend} behaviour declares callbacks that use the type \elix{Logger.Event.t()} exported by the \elix{Logger.Event} module). We can easily extend the system to allow for such dependencies, but we do not want to complicate the presentation with this additional case, which would require the interleaving of behaviour and module definitions: with this presentation we can just require that the behaviour definitions are all given at the beginning of the program, and then the module definitions can follow in any order.
\begin{figure}[ht]
\begin{mathpar}
\inferrule[WFBehEnv-Empty]{\\}{\vdash\epsilon}\qquad\qquad
\inferrule[WFBehEnv-Beh]{\vdash\Sigma \\ \Sigma;\overline{\tx:\star}\vdash\overline{D}}{\vdash\Sigma,\BX:\Lambda(\overline{\tx:\star}).\{\overline{D}\}}
\end{mathpar}
\caption{Good formation of behaviour environments}\label{wf-behenv}
\end{figure}
\paragraph{Good formation for value environments}
Figure~\ref{wf-valenv} defines well-formed value environments. Rule \Rule{WFVEnv-Empty} states that the empty value environment is well-formed for any well-formed behaviour environment. Rule \Rule{WFVEnv-Opaque} states that if $\Gamma$ is well-formed, then adding an opaque type declaration $\tx:\star$ to $\Gamma$ yields a well-formed environment. Rule \Rule{WFVEnv-Trans} requires the good formation of the transparent type $t$. Likewise, rules \Rule{WFVEnv-Module} and \Rule{WFEnv-Value} require the good formation of the type(s) $T$ of the module $X$ and value $x$ being declared. Again, for the sake of simplicity we examine only the case in which a module definition depends only on the module definitions that precede it, though in Elixir is possible to have mutually recursive module definitions.
% \aghilas{
% We miss a rule for values $x$, we may rename \Rule{WFVEnv-Value} to \Rule{WFVEnv-Module} and add the following rule.
% \begin{mathpar}
% \inferrule[WFVEnv-Value]{\overline{\Sigma;\Gamma\vdash T}}
% {\Sigma;\Gamma\vdash\Gamma, x:\bigcap \overline{T}}
% \end{mathpar}
% In prose, it could be something like: Likewise, rules \Rule{WFVEnv-Module}
% and \Rule{WFVEnv-Value} require the good formation of the types $T$ of the
% value $X$ or $x$ being declared.
% Also, is the judgement really of the form $\Sigma;\Gamma\vdash\Gamma'$? Isn't
% $\Sigma\vdash \Gamma$ a better formulation? For behaviour environments, the
% judgement has type $\vdash\Sigma$, not $\Sigma\vdash\Sigma'$. In all cases, the
% \Rule{WFVEnv-Opaque} doesn't follow the same convention as the others in its
% hypothesis.
% Just replace the figure with the following one to take into account the
% previous comments.
\begin{figure}[h]
\begin{mathpar}
\inferrule[WFVEnv-Empty]{\vdash\Sigma}{\Sigma\vdash\epsilon}\hfill
\inferrule[WFVEnv-Opaq]{\Sigma\vdash\Gamma}{\Sigma\vdash\Gamma,\tx:\star}\hfill
\inferrule[WFVEnv-Trans]{\Sigma\vdash\Gamma\\\Sigma;\Gamma\vdash t:\star}{\Sigma\vdash\Gamma,\tx:[= t]}\hfill
\inferrule[WFVEnv-Module]{\Sigma\vdash\Gamma\\\Sigma;\Gamma\vdash T}{\Sigma\vdash\Gamma,X:T}\hfill
\inferrule[WFVEnv-Value]{\Sigma\vdash\Gamma\\\overline{\Sigma;\Gamma\vdash T}}
{\Sigma\vdash\Gamma, x:\bigcap \overline{T}}
\end{mathpar}
\beppe{Problem: a module can access to the definitions of outer modules only by using an absolute path starting from a top-level module name; now the rule \Rule{WFVEnv-Module} allows to check the type $T$ of a module $X$ by using the current environment $\Gamma$, which may contain definitions of the current module, which thus are not accessible by $X$. Probably we need to change the rule for good formation of $\overline{D}$ (Figure~\ref{wf-modtype})with a special case for $X:T$ \begin{mathpar}
\inferrule[WFModType-DeclList]
{\Sigma;\Gamma\vdash \overline{D} \\
\Sigma;\Gamma\vdash X:T}
{\Sigma;\Gamma\vdash \overline{D},X:T}\qquad\qquad
\inferrule[WFOtherType-DeclList]
{\Sigma;\Gamma\vdash \overline{D} \\
\Sigma;\Gamma,\overline{D}\vdash D'}
{\Sigma;\Gamma\vdash \overline{D},D'}{D'\neq X:T}
\end{mathpar}
in words, when we check the good formation of $X:T$ we can use the current environment $\Gamma$, but not the definitions that pertain the current module. }
\caption{Good formation of value environments}\label{wf-valenv}
\end{figure}
% }
% \begin{figure}[h]
% \begin{mathpar}
% \inferrule[WFVEnv-Empty]{\vdash\Sigma}{\Sigma;\epsilon\vdash\epsilon}\hfill
% \inferrule[WFVEnv-Opaque]{\Sigma\vdash\Gamma}{\Sigma;\Gamma\vdash\Gamma,\tx:\star}\hfill
% \inferrule[WFVEnv-Trans]{\Sigma;\Gamma\vdash t:\star}{\Sigma;\Gamma\vdash\Gamma,\tx:[= t]}\hfill
% \inferrule[WFVEnv-Value]{\Sigma;\Gamma\vdash T}{\Sigma;\Gamma\vdash\Gamma,X:T}
% \end{mathpar}
% \caption{Good formation of value environments}\label{wf-valenv}
% \end{figure}
\paragraph{Good formation for module system types}
A module-system type $T$ is well formed when it satisfies the rules in Figure~\ref{wf-modtype}. These rules enforce two requirements: $(1)$ in a function type $(\overline{N:T}) \to T'$, each name $N_i$ may appear only to its right, namely in $T'$ and in the types $T_j$ with $j>i$; $(2)$ in a module type $\{\overline{D}\}_{\overline{\BX[\overline{\tx=t}]}}$, each behaviour in $\overline{\BX[\overline{\tx=t}]}$ must be fully instantiated, and $\overline{D}$ must include at least the declarations of all callbacks required by those behaviours.
The first requirement is enforced by rule \Rule{WFModType-Arrow}: for each $j$, it checks that $T_{j+1}$ is well formed in the environment extended with $N_i:T_i$ for all $i\leq j$.
\begin{figure}[h]
\begin{mathpar}
\inferrule[WFModType-Arrow]{\text{for } j=0..n\quad\Sigma;\Gamma,\overline{N_i:T_i}^{i=1..j}\vdash T_{j+1}}
{\Sigma;\Gamma\vdash \left(\overline{N_i:T_i}^{i=1,\ldots,n}\right)\to T_{n+1}}
\qquad
\inferrule[WFModType-TArrow]{\Sigma;\Gamma,\overline{\tx:\star}\vdash T}{\Sigma;\Gamma\vdash(\overline{\tx:\star})\to T}
\qquad
\inferrule[WFModType-DeclList]
{\Sigma;\Gamma\vdash D \\
\Sigma;\Gamma,D\vdash \overline{D}}
{\Sigma;\Gamma\vdash D,\overline{D}}
\\
\inferrule[WFModType-Module]
{ \Sigma;\Gamma\vdash\overline{D}\\
(\text{for }i=1..n\quad \overline{\Sigma;\Gamma\vdash{t:\star}}^i\\\BX_i:\Lambda(\overline{\tx:\star}^i).\{\overline{D}^i\}\in\Sigma)\\
\forall i\neq j.
(\dom(\overline{D}^i)\#\dom(\overline{D}^j)\text{ and } \overline{x}^i\#\overline{x}^j)\\
\Sigma;\Gamma\vdash\left\{\overline{D}\right\} \preccurlyeq
\left\{\overline{\overline{D}^i[\overline{\tx:=t}^i]}^{i=1..n}\right\}\\
}
{\Sigma;\Gamma\vdash\{\overline{D}\}_{\overline{\BX_i[\;\overline{\tx=t}^i\,]}^{i=1..n}}}
\end{mathpar}
\caption{Good formation of module system types}\label{wf-modtype}
\end{figure}
The second condition is checked by rule \Rule{WFModType-Module}, by far the most complex one. This rule validates the type of a module that implements behaviours $\BX_i$ with type instantiations $\overline{\tx=t}^i$.
The rule performs the following checks in sequence.
First, it verifies that the module's declarations $\overline{D}$ are well-formed using rule \Rule{WFModType-DeclList}, and that all type instantiations $\overline{t}^i$ are well-formed types.
Second, for each implemented behaviour $\BX_i$, it confirms that a corresponding behaviour definition exists in $\Sigma$ with the expected type parameters $\overline{\tx}^i$ (i.e., $\BX_i:\Lambda(\overline{\tx:\star}^i).\{\overline{D}^i\}\in\Sigma$).
Third, it performs a sanity check to ensure that different implemented behaviours do not declare conflicting names: the declarations of behaviour $i$ must be disjoint from those of behaviour $j$ when $i\neq j$ (i.e., $\dom(\overline{D}^i)\#\dom(\overline{D}^j)$) \beppe{\textbf{Not sure that the following is needed:} and that the parameters of different implemented behaviours do not have the same name (i.e., $\overline{x}^i\#\overline{x}^j$ for $i\neq j$)}. Finally, the rule constructs the module type "expected" by the behaviours, by merging all behaviour declarations $\overline{D}^i$, substituting each type parameter $\tx$ with its instantiation $t$ from $\overline{\tx=t}^i$. It then verifies that the module's actual type $\{\overline{D}\}$ is a subtype of this expected type, ensuring that the module provides all required declarations with compatible types. Note that instantiation is applied to behaviour declarations, not to the module type itself, since the module cannot reference behaviour parameters.
We also have the rule \Rule{WFModType-TArrow} which is straightforward since just checks the good formation of the body of a parametric module type, under the assumption that the parameters are well-formed types.
\paragraph{Good formation of semantic types} To complete the good formation of types, we need to check that the pretypes are well-formed, that is they satisfy the conditions listed at the beginning of this section. This is done by the rules in Figure~\ref{wf-type}.
Core types are always well-formed (provided that they are checked under well-formed environments) and the rules for type connectives and type constructors just check the good formation of their subterms, which is possible since pretypes are inductively defined (these checks are performed by the first five rules in Figure~\ref{wf-type}).
A type name $\tx$ is well-formed if $\Gamma$ contains an association for it (rule \Rule{WFType-Name}).
The behaviour $\BX[\overline{\tx=t}]$ is well-formed if $\Sigma$ contains the definition of $\BX$ with parameters $\overline{\tx}$ and these are instantiated by well-formed types (rule \Rule{WFType-Behaviour}, which also checks the good formation of the environments---$\Sigma\vdash\Gamma$---in case there were no instantiation).
\begin{figure}[ht]
\begin{mathpar}
\inferrule[WFType-Core]{\Sigma\vdash\Gamma}
{\Sigma;\Gamma\vdash\tau:\star}\hfill
\inferrule[WFType-Union]
{\Sigma;\Gamma\vdash t:\star\\\Sigma;\Gamma\vdash t':\star}
{\Sigma;\Gamma\vdash t\vee t':\star}\hfill
\inferrule[WFType-Times]
{\Sigma;\Gamma\vdash t:\star\\\Sigma;\Gamma\vdash t':\star}
{\Sigma;\Gamma\vdash t\times t':\star}\hfill
\inferrule[WFType-Function]
{\overline{\Sigma;\Gamma\vdash t:\star}\\\Sigma;\Gamma\vdash t':\star}
{\Sigma;\Gamma\vdash(\overline{t})\to t':\star}
\\
\inferrule[WFType-Record]
{\overline{\Sigma;\Gamma\vdash t:\star}}
{\Sigma;\Gamma\vdash \texttt{\%}\!\left\{\overline{k:t}\right\}:\star}
\hfill
\inferrule[WFType-Name]
{\Sigma\vdash\Gamma\\\tx\in\dom(\Gamma)}
{\Sigma;\Gamma\vdash \tx:\star}
\hfill
\inferrule[WFType-Behaviour]
{\Sigma\vdash\Gamma\\\BX:\Lambda(\overline{\tx:\star}).\{\overline{D}\}\in\Sigma\\
\overline{\Sigma;\Gamma\vdash t:\star}}
{\Sigma;\Gamma\vdash\BX[\overline{x=t}]:\star}\\
\inferrule[WFType-Singleton]
{\Sigma;\Gamma\vdash M:\texttt{mod}}
{\Sigma;\Gamma\vdash\langle M\rangle:\star}\\
\inferrule[WFType-Projection]
{\\}{\\}{\text{\aghilas{We need to choose a way to consider projection in path.}}}
\end{mathpar}
\caption{Good formation of types}\label{wf-type}
\end{figure}
\aghilas{
\paragraph{Good formation for paths}
We may want to change the name $M$ to $P$ in the syntax, as all not paths are
valid modules as they may lack some instantiations, and there is no ``empty
module'', even though we would like to have an empty path. To have more natural
definitions, I redefine paths as: $M::= \epsilon| M.X | M.X[\overline{x=t}]$.
The fact that there is no instantiation except for the last module has to be
added as a side condition, however the rules I give later do not rely on that
fact.
The rule has the type $\Sigma;\Gamma\vdash M:\overline{D}$, meaning that $M$ is
well-formed in environments $\Sigma$ and $\Gamma$ and exports $\overline{D}$.
The projection $M.\tx$ is then well-formed if and only if $\Sigma;\Gamma\vdash
M:\overline{D}$ and $\tx\in \dom(\overline{D})$ (notice that since $\tx$ ranges over type names, then in $\overline{D}$ can only be declared as $\tx:\star$ or $\tx:[=t]$). The rule \Rule{Path-Empty} for
the empty path states that the empty path is always well-formed and exports what
is in the current value environment, so that $\tx$ is a valid type in
$\Sigma;\Gamma$ if and only if it is declared in $\Gamma$.
\begin{mathpar}
\inferrule[Path-Empty]{\vdash\Sigma\\\Sigma\vdash\Gamma}
{\Sigma;\Gamma\vdash \epsilon:\Gamma}
\end{mathpar}
There are three cases of projections: $(1)$, if $M$ exports $X$ as a module of
type ${\{\overline{D'}\}}_{\_}$ without any parameters, then $M.X$ is well
formed and exports $\overline{D'}$ (rule \Rule{Path-NoParam}); if $M$ exports
$X$ as a module with parameters of type $(\overline{\tx:\star})\to
{\{\overline{D'}\}}_{\_}$ there are two cases: $(2)$, $M.X$ exports only the
modules present in $\overline{D'}$ using a function from value environments to
value environments called \textsf{restrict} (rule \Rule{Path-NoInstantiation}),
and $(3)$ $M.X[\overline{\tx = t}]$ exports all $\overline{D'}$ if the
paramaters are well-formed (rule \Rule{Path-Instantiation}).
\begin{mathpar}
\inferrule[Path-NoParam]{\Sigma;\Gamma\vdash M:\overline{D}\\
\overline{D}(X)={\{\overline{D'}\}}_{\_}}
{\Sigma;\Gamma\vdash M.X:\overline{D'}}\qquad
\inferrule[Path-NoInstantiation]{\Sigma;\Gamma\vdash M:\overline{D}\\
\overline{D}(X)=(\overline{\tx:\star})\to{\{\overline{D'}\}}_{\_}}
{\Sigma;\Gamma\vdash M.X:\textsf{restrict}(\overline{D'})}\\
\inferrule[Path-Instantiation]{\Sigma;\Gamma\vdash M:\overline{D}\\
\overline{D}(X)=(\overline{\tx:\star})\to{\{\overline{D'}\}}_{\_}\\
\overline{\Sigma;\Gamma\vdash t:\star}}
{\Sigma;\Gamma\vdash M.X[\overline{x=t}]:\overline{x:[=t]},\overline{D'}}
\end{mathpar}
We define \textsf{restrict} as follows.
\begin{align*}
\textsf{restrict}(\epsilon) &:= \epsilon\\
\textsf{restrict}(X:t,\overline{D}) &:= X:t,\textsf{restrict}(\overline{D})\\
\textsf{restrict}(\_,\overline{D}) &:= \textsf{restrict}(\overline{D})\\
\end{align*}
The case $(2)$ could change to export more declarations by changing the
\textsf{restrict} function, I gave the simplest one that is still sound and not
useless. The only limitations on the choice of \textsf{restrict} is that it
should respect the following lemma, and that
$\textsf{restrict}(\overline{D})\subseteq \overline{D}$ for all $D$. For more
involved rules for projection we might want to allow \textsf{restrict} to depend
on $\Sigma$ and $\Gamma$.
\begin{lemma}
For all $\Sigma,\Gamma, M$ and $\overline{D}$, if $\Sigma;\Gamma\vdash M:D$,
then $\Sigma\vdash\Gamma,\overline{D}$.
\end{lemma}}
\begin{figure}
\begin{mathpar}
\inferrule[Bind-DefModule]
{\Sigma; \Gamma,\overline{\tx:\star}\vdash\overline{B}:\overline{D}\\
\Sigma; \Gamma,\left(X:(\overline{\tx:\star})\rightarrow
\{\overline{D}\}_{\overline{\BX[\,\overline{\tx=t}\,]}}\right)\vdash
\overline{B_0}:\overline{D_0}}
{\Sigma; \Gamma\vdash(X=\Lambda(\overline{\tx:\star}).\{\overline{B}\}_{\overline{\BX[\,\overline{\tx=t}\,]}}) \overline{B_0}:\left(X:(\overline{\tx:\star})\rightarrow
\{\overline{D}\}_{\overline{\BX[\,\overline{\tx=t}\,]}}\right)\overline{D_0}}\\
\inferrule[Bind-Type]
{\Sigma;\Gamma\vdash t:\star\\
\Sigma;\Gamma, x:[=t]\vdash\overline{B}:\overline{D}}
{\Sigma;\Gamma\vdash\left(x=\textsf{type }t\right)\overline{B}: \left(x :[=t]\right)\overline{D}}\hfill
\inferrule[Bind-Opaque]
{\Sigma;\Gamma\vdash t:\star\\
\Sigma;\Gamma, x:[=t]\vdash\overline{B}:\overline{D}}
{\Sigma;\Gamma\vdash\left(x =\textsf{opaque }t\right)\overline{B}: \left(x:\star\right)\overline{D}}\\
\inferrule[Bind-Empty]
{\\}{\Sigma;\Gamma\vdash\epsilon:\epsilon}\hfill
\inferrule[Bind-Value]
{\Sigma;\Gamma\vdash v:\cap\overline{T}\\
\Sigma;\Gamma,x:\cap\overline{T}\vdash\overline{B}:\overline{D}}
{\Sigma;\Gamma\vdash\left(x=v\right)\overline{B}:
\left(x:\cap\overline{T}\right)\overline{D}}\hfill
\inferrule[Bind-Private]
{\Sigma;\Gamma\vdash v:\cap\overline{T}\\
\Sigma;\Gamma,x:\cap\overline{T}\vdash\overline{B}:\overline{D}}
{\Sigma;\Gamma\vdash\left(x=\textsf{private }v\right)\overline{B}:\overline{D}}
\end{mathpar}
\caption{Typing rules for bindings \fbox{$\Sigma;\Gamma\vdash\overline{B}:\overline{D}$}}
\end{figure}
\begin{figure}
\begin{mathpar}
\inferrule[Sub-BehMod]{\\}
{\Sigma;\Gamma\vdash \BX\left[\overline{x=t}\right]\preccurlyeq \Sigma(\BX)\left(\overline{t}\right)}\hfill
\inferrule[Sub-ModBeh]{\\}
{\Sigma;\Gamma\vdash\left\{\overline{D}\right\}_{\overline{\BX[\,\overline{\tx=t}\,]}}\preccurlyeq\BX[\,\overline{\tx=t}\,]}\\
\inferrule[Sub-ModMod]{\forall i.\exists j. \BX_i = \BY_j \wedge \overline{\tx=t}^i \simeq \overline{\ty=t'}^j}
{\Sigma;\Gamma\vdash\left\{\overline{D}\right\}_{\overline{\BX_i[\,\overline{\tx=t}^i\,]}}\preccurlyeq \left\{\overline{D'}\right\}_{\overline{\BY_j[\,\overline{\ty=t'}^j\,]}}}\hfill
\inferrule[Sub-Elixir]{t\preccurlyeq t'}{\Sigma;\Gamma\vdash t\preccurlyeq t'}\hfill
\inferrule[Sub-Intersection]{\exists i\in I, T_i\preccurlyeq T}{\Sigma;\Gamma\vdash\cap_I\overline{T_i}\preccurlyeq T}\\
\inferrule[Sub-BigFunction]{\forall i.\Sigma;\Gamma,X_1:R_1,\ldots, X_{i-1}:R_{i-1}\vdash T_{i}\succcurlyeq R_i\\
\Sigma;\Gamma,\overline{X_i:R_i}\vdash T'\preccurlyeq R'}{\Sigma;\Gamma\vdash\left(X_i:T_i\right)\to T'\preccurlyeq\left(X_i:R_i\right)\to R'}
\end{mathpar}
\caption{Subtyping rules}
\end{figure}
The equivalence of two lists $\overline{\tx=t}$ and $\overline{\ty=t'}$, noted
$\overline{\tx=t} \simeq \overline{\ty=t'}$ means that both lists have the same
set of labels and that the types associated to a same label are equivalent.
\subsubsection{Interpreting the new types}
\paragraph{Enconding equality of types} Let $t$ be a type, we define:
\[
\texttt{eq}(t) := (t\cup\{\mho\})\times (\neg t\cup\{\mho\}).
\]
The type $\texttt{eq}(t)$ is never empty as it contains $(\mho,\mho)$.
\begin{lemma}
Let $t$ and $t'$ be two types, then $\texttt{eq}(t)\leq\texttt{eq}(t')$ if
and only if $t\simeq t'$.
\end{lemma}
\subsection{Operational semantics or encoding? That's the question}
There are four possible ways to proceed from here, listed here in the order of complexity (from the simplest to the most complex):
\begin{itemize}
\item Define an encoding of the module system into 1ML. This would be the simplest solution. It would give both the typing and the operational semantics for free. However, it would not give a direct understanding of the module system, and it would be difficult to reason directly on it (for instance to give meaningful typing error messages).
\item Define a type system for the module system and an encoding into 1ML. Then prove that well typed programs are encoded into well-typed 1ML expressions This would give a direct understanding of the module system type system, and it would be possible to reason directly on it, but the operational semantics would still be given by the encoding.
\item Define a type system and an operational semantics for the module system and an encoding into 1ML. Then prove that well typed programs are encoded into well-typed 1ML expressions and that the operational semantics is simulated by the encoding. This would give a direct understanding of both the type system and the operational semantics of the module system, but it would be more complex to define and to prove. (My preference would be for this solution.)
\item Define a type system and an operational semantics for the module system and prove type soundness directly on the module system. This would be the most complex solution, but it would give a direct understanding of both the type system and the operational semantics of the module system. It would also lack the connection with 1ML.
\end{itemize}
\begin{figure}
\begin{tabular}{r c ll}
$\tau$ & ::= & $t$ &\color{Gray} all simple types (including fully instantiated behaviours)\\
& | & $\left(\overline{N:\tau}\right)\rightarrow \tau$ &\color{Gray} types of functions on modules\\
& | & $\kwlike(M)$ &\color{Gray}anonymous module type\\
& | & $\kwbeha(M)$ &\color{Gray}nominal behaviour type\\
& | & $\cap\overline{\tau}$ &\color{Gray} syntactic intersection (programmer should no see any difference with the semantic one) \\
% & | & $X\left[\overline{\tx=t}\right]$\\