diff --git a/book/FPLean/GettingToKnow/DatatypesPatterns.lean b/book/FPLean/GettingToKnow/DatatypesPatterns.lean index 007a812..dda7a29 100644 --- a/book/FPLean/GettingToKnow/DatatypesPatterns.lean +++ b/book/FPLean/GettingToKnow/DatatypesPatterns.lean @@ -44,9 +44,11 @@ inductive Bool where This definition has two main parts. The first line provides the name of the new type ({anchorName Bool}`Bool`), while the remaining lines each describe a constructor. -As with constructors of structures, constructors of inductive datatypes are mere inert receivers of and containers for other data, rather than places to insert arbitrary initialization and validation code. -Unlike structures, inductive datatypes may have multiple constructors. -Here, there are two constructors, {anchorName Bool}`true` and {anchorName Bool}`false`, and neither takes any arguments. +Inductive datatypes have multiple constructors, unlike structures that contain only one, for their creation. +Constructors for Inductive datatypes are mere inert receivers of and containers for other data. +Here, there are two constructors, {anchorName Bool}`true` and {anchorName Bool}`false`, and neither takes any arguments, neither hold any data. +The two constructors have their type explicitly declared for clarity but this is facultative here, as they must construct the Type under which they are defined. + Just as a structure declaration places its names in a namespace named after the declared type, an inductive datatype places the names of its constructors in a namespace. In the Lean standard library, {anchorName BoolNames}`true` and {anchorName BoolNames}`false` are re-exported from this namespace so that they can be written alone, rather than as {anchorName BoolNames}`Bool.true` and {anchorName BoolNames}`Bool.false`, respectively. ::: @@ -81,6 +83,7 @@ The names {anchorName NatNames}`zero` and {anchorName NatNames}`succ` are in a n Argument names, such as {anchorName Nat}`n`, may occur in Lean's error messages and in feedback provided when writing mathematical proofs. Lean also has an optional syntax for providing arguments by name. Generally, however, the choice of argument name is less important than the choice of a structure field name, as it does not form as large a part of the API. +having {anchorName newNat}`succ : Nat -> Nat` would be faily similar, since {anchorName newNat}`succ` don't use the variable itself. In C# or Java, {CSharp}`Nat` could be defined as follows: ```CSharp @@ -112,6 +115,8 @@ type Nat = Zero | Succ; ``` Just like C# and Java, this encoding ends up with more types than in Lean, because {typescript}`Zero` and {typescript}`Succ` are each a type on their own. It also illustrates that Lean constructors correspond to objects in JavaScript or TypeScript that include a tag that identifies the contents. +This identification is crucial, {anchorName Bool}`true` and {anchorName Bool}`false` are not _equal_ to {anchorName Bool}`Bool`, they are _of type_ {anchorName Bool}`Bool`. +Pattern matching check for exhaustiveness, matching against a value of type {anchorName Bool}`Bool` means branching all the possible ways that value could have been constructed # Pattern Matching @@ -412,3 +417,5 @@ h✝ : ¬n < k This message means that {anchorName div}`div` requires a manual proof of termination. This topic is explored in {ref "division-as-iterated-subtraction"}[the final chapter]. ::: + +You may be interested in reading about how numbers [are treated at runtime](https://lean-lang.org/doc/reference/latest//Basic-Types/Natural-Numbers/#nat-runtime) for efficiency. \ No newline at end of file diff --git a/book/FPLean/GettingToKnow/FunctionsDefinitions.lean b/book/FPLean/GettingToKnow/FunctionsDefinitions.lean index f16eecd..80bc8c2 100644 --- a/book/FPLean/GettingToKnow/FunctionsDefinitions.lean +++ b/book/FPLean/GettingToKnow/FunctionsDefinitions.lean @@ -109,19 +109,30 @@ if 13 < 14 then 14 else 13 Expressions that evaluate to natural numbers, integers, and strings have types that say this ({anchorName Nat}`Nat`, {anchorName Positivity}`Int`, and {anchorName Book}`String`, respectively). This is also true of functions. -A function that accepts a {anchorName Nat}`Nat` and returns a {anchorName Bool}`Bool` has type {anchorTerm evenFancy}`Nat → Bool`, and a function that accepts two {anchorName Nat}`Nat`s and returns a {anchorName Nat}`Nat` has type {anchorTerm currying}`Nat → Nat → Nat`. +A function that accepts a {anchorName Nat}`Nat` and returns a {anchorName Bool}`Bool` has type {anchorTerm evenFancy}`Nat → Bool`, as it transform a {lit}`Nat` into a {lit}`Bool`. +A function that accepts two {anchorName Nat}`Nat`s and returns a {anchorName Nat}`Nat` has type {anchorTerm currying}`Nat → Nat → Nat`, because : + +behind the scenes, all functions actually expect precisely one argument. +Functions like {anchorName maximum3Type}`maximum` that appear to take more than one argument are in fact functions that take one argument and then return a new function. +This new function takes the next argument, and the process continues until no more arguments are expected. +This can be seen by providing one argument to a multiple-argument function: {anchorTerm maximum3Type}`#check maximum 3` yields {anchorInfo maximum3Type}`maximum 3 : Nat → Nat`, giving it another Nat yields {anchorInfo maximum3Type}`maximum 3 4 : Nat`. {anchorTerm stringAppendHelloType}`#check spaceBetween "Hello "` yields {anchorInfo stringAppendHelloType}`spaceBetween "Hello" : String → String`, giving it another String yields {anchorInfo stringAppendHelloType}`spaceBetween "Hello" "world" : String`. +Using a function that returns a function to implement multiple-argument functions is called _currying_ after the mathematician Haskell Curry, and Lean use currying by default as a core design choice related to its type system, a concept that will be explained later (Is it ?) +You can read Nat -> Nat -> Nat incorrectly as "takes two Nats and returns a Nat", but what it's really saying is "takes a Nat and returns something of the type Nat -> Nat", that is, it returns a function that takes a Nat and returns a Nat. {margin}[Adapted from [Haskell's wiki](https://wiki.haskell.org/Currying)] +Function arrows associate to the right, which means that {anchorTerm currying}`Nat → Nat → Nat` should be parenthesized {anchorTerm currying}`Nat → (Nat → Nat)`. As a special case, Lean returns a function's signature when its name is used directly with {kw}`#check`. Entering {anchorTerm add1sig}`#check add1` yields {anchorInfo add1sig}`add1 (n : Nat) : Nat`. However, Lean can be “tricked” into showing the function's type by writing the function's name in parentheses, which causes the function to be treated as an ordinary expression, so {anchorTerm add1type}`#check (add1)` yields {anchorInfo add1type}`add1 : Nat → Nat` and {anchorTerm maximumType}`#check (maximum)` yields {anchorInfo maximumType}`maximum : Nat → Nat → Nat`. -This arrow can also be written with an ASCII alternative arrow {anchorTerm add1typeASCII}`->`, so the preceding function types can be written {anchorTerm add1typeASCII}`example : Nat -> Nat := add1` and {anchorTerm maximumTypeASCII}`example : Nat -> Nat -> Nat := maximum`, respectively. - -Behind the scenes, all functions actually expect precisely one argument. -Functions like {anchorName maximum3Type}`maximum` that seem to take more than one argument are in fact functions that take one argument and then return a new function. -This new function takes the next argument, and the process continues until no more arguments are expected. -This can be seen by providing one argument to a multiple-argument function: {anchorTerm maximum3Type}`#check maximum 3` yields {anchorInfo maximum3Type}`maximum 3 : Nat → Nat` and {anchorTerm stringAppendHelloType}`#check spaceBetween "Hello "` yields {anchorInfo stringAppendHelloType}`spaceBetween "Hello " : String → String`. -Using a function that returns a function to implement multiple-argument functions is called _currying_ after the mathematician Haskell Curry. -Function arrows associate to the right, which means that {anchorTerm currying}`Nat → Nat → Nat` should be parenthesized {anchorTerm currying}`Nat → (Nat → Nat)`. +This arrow can also be written with an ASCII alternative arrow {anchorTerm add1typeASCII}`->` directly when writing a type signature. +Moving arguments into the signature just as how lean does when putting a function in parentheses, means that the values passed are not bound to a local variable for use in the function. +Rather, the function should return a function whose type match its type signature as it can't treat by itself its argument as it is having none : +```anchor myAppend +def originalAppend : String -> String -> String := String.append +def myAppend (a : String) (b : String) : String := a++b +#check (originalAppend) +#check (myAppend) +``` +Both functions have the same type signature, but only {anchorTerm myAppend}`String.append` and {anchorTerm myAppend}`myAppend` are equipped to deal with the values passed to them. ## Exercises diff --git a/book/FPLean/GettingToKnow/Polymorphism.lean b/book/FPLean/GettingToKnow/Polymorphism.lean index b55dd83..30ba880 100644 --- a/book/FPLean/GettingToKnow/Polymorphism.lean +++ b/book/FPLean/GettingToKnow/Polymorphism.lean @@ -183,8 +183,8 @@ Evaluation can occur both in the expression and its type: # Linked Lists :::paragraph -Lean's standard library includes a canonical linked list datatype, called {anchorName fragments}`List`, and special syntax that makes it more convenient to use. -Lists are written in square brackets. +Lean's standard library includes a canonical linked list datatype, called {anchorName fragments}`List`, and special syntaxes that makes it more convenient to use. +Lists can be written in square brackets. For instance, a list that contains the prime numbers less than 10 can be written: ```anchor primesUnder10 @@ -202,14 +202,14 @@ inductive List (α : Type) where | cons : α → List α → List α ``` +Theses names are abbreviations for the Latin word "Nihil", meaning _nothing_, and "construct". The actual definition in the standard library is slightly different, because it uses features that have not yet been presented, but it is substantially similar. This definition says that {anchorName List}`List` takes a single type as its argument, just as {anchorName PPoint}`PPoint` did. This type is the type of the entries stored in the list. According to the constructors, a {anchorTerm List}`List α` can be built with either {anchorName List}`nil` or {anchorName List}`cons`. The constructor {anchorName List}`nil` represents empty lists and the constructor {anchorName List}`cons` is used for non-empty lists. -The first argument to {anchorName List}`cons` is the head of the list, and the second argument is its tail. -A list that contains $`n` entries contains $`n` {anchorName List}`cons` constructors, the last of which has {anchorName List}`nil` as its tail. - +Because it is defined in that order, that is, the first argument to {anchorName List}`cons` being one {lit}`α`, and the second argument being a list of other {lit}`α`, a {anchorTerm List}`List` constructed through {anchorName List}`cons` must have been against its signature, it is therefore true to say that the first argument to {anchorName List}`cons` is the head of the list, the first one, and the second argument is its tail, the rest. +A list that contains $`n` entries contains $`n` {anchorName List}`cons` constructors, each fitting into the second argument of its {anchorName List}`cons` predecessor with the help of parentheses, the last of which has {anchorName List}`nil` as its tail, or its second argument. ::: :::paragraph @@ -220,7 +220,20 @@ def explicitPrimesUnder10 : List Nat := List.cons 2 (List.cons 3 (List.cons 5 (List.cons 7 List.nil))) ``` -These two definitions are completely equivalent, but {anchorName primesUnder10}`primesUnder10` is much easier to read than {anchorName explicitPrimesUnder10}`explicitPrimesUnder10`. +These two definitions are completely equivalent, behind the scene the notation {lit}`[a,b,c]` used by {anchorName primesUnder10}`primesUnder10` is a _macro_ that output to {anchorName explicitPrimesUnder10}`explicitPrimesUnder10`. +Macro are explored later in {ref "Ref to macro"}[Typed Queries], simply remember that they allow Lean to be extended by translating new syntax into existing syntax and can therefore be used in place where the underlying syntax must be used, with greater comfort. +For example, each group of {lit}`#eval` expresion give similar values: +```anchor comfy +#eval List.cons 3 List.nil +#eval List.singleton 3 +#eval [3] + +#eval [1,2,3,4,5] +#eval 1::2::3::4::5::List.nil +#eval 1::2::3::[4,5] +#eval %[1,2,3|[4,5]] +``` +Substituing (or not) numbers for variables in pattern matching works, as they appear as original {lit}`List`'s constructors, which pattern matching look for when given a value of type {anchorName List}`List` to match for. ::: :::paragraph @@ -479,15 +492,15 @@ tag := "prod" %%% The {anchorName Prod}`Prod` structure, short for “Product”, is a generic way of joining two values together. -For instance, a {anchorTerm fragments}`Prod Nat String` contains a {anchorName fragments}`Nat` and a {anchorName fragments}`String`. -In other words, {anchorTerm natPoint}`PPoint Nat` could be replaced by {anchorTerm fragments}`Prod Nat Nat`. +For instance, a {anchorTerm fragments}`Prod Nat String` contains a {anchorName fragments}`Nat` and a {anchorName fragments}`String` in a {anchorName Prod}`Prod` structure. +Using {anchorTerm fragments}`Prod Nat Nat` would computally be the same as using our {anchorTerm natPoint}`PPoint Nat`. +However, many applications are best served by defining their own structures, even for simple cases like {anchorName Point}`Point`, because using domain terminology can make it easier to read the code. +Additionally, defining structure types helps catch more errors by assigning different types to different domain concepts, even if they are of a similar body, preventing them from being mixed up. {anchorName fragments}`Prod` is very much like C#'s tuples, the {Kotlin}`Pair` and {Kotlin}`Triple` types in Kotlin, and {cpp}`tuple` in C++. -Many applications are best served by defining their own structures, even for simple cases like {anchorName Point}`Point`, because using domain terminology can make it easier to read the code. -Additionally, defining structure types helps catch more errors by assigning different types to different domain concepts, preventing them from being mixed up. On the other hand, there are some cases where it is not worth the overhead of defining a new type. -Additionally, some libraries are sufficiently generic that there is no more specific concept than “pair”. -Finally, the standard library contains a variety of convenience functions that make it easier to work with the built-in pair type. +Additionally, a standard library can contains a variety of convenience functions that make it easier to work with the built-in pair type of the language. +Finally, some libraries are sufficiently generic that there is no more specific concept than “pair”. :::paragraph The structure {anchorName Prod}`Prod` is defined with two type arguments: @@ -501,11 +514,12 @@ structure Prod (α : Type) (β : Type) : Type where ::: :::paragraph -Lists are used so frequently that there is special syntax to make them more readable. +These names are abbreviations for "first" and "second". +Lists are used so frequently that there is special syntax to make them more readable ({lit}`[a, b, c]`, {lit}`::`, etc). For the same reason, both the product type and its constructor have special syntax. -The type {anchorTerm ProdSugar}`Prod α β` is typically written {anchorTerm ProdSugar}`α × β`, mirroring the usual notation for a Cartesian product of sets. -Similarly, the usual mathematical notation for pairs is available for {anchorName ProdSugar}`Prod`. -In other words, instead of writing: +The type {anchorTerm ProdSugar}`Prod α β` is typically written {anchorTerm ProdSugar}`α × β`, mirroring the usual notation for a [Cartesian product](https://en.wikipedia.org/wiki/Cartesian_product) of sets. +Similarly, the usual [mathematical notation](https://en.wikipedia.org/wiki/Ordered_pair) for pairs is available for {anchorName ProdSugar}`Prod`. +Meaning, instead of writing: ```anchor fivesStruct def fives : String × Int := { fst := "five", snd := 5 } @@ -592,6 +606,7 @@ def howManyDogs (pets : List PetName) : Nat := Function calls are evaluated before infix operators, so {anchorTerm howManyDogsAdd}`howManyDogs morePets + 1` is the same as {anchorTerm howManyDogsAdd}`(howManyDogs morePets) + 1`. As expected, {anchor dogCount}`#eval howManyDogs animals` yields {anchorInfo dogCount}`3`. +Infix operators and precedences values will be explained in {ref "Refer to modad -> Infix Operators"}[Infix operator] ::: ## {anchorName Unit}`Unit` @@ -621,9 +636,19 @@ inductive ArithExpr (ann : Type) : Type where | times : ann → ArithExpr ann → ArithExpr ann → ArithExpr ann ``` -The type argument {anchorName ArithExpr}`ann` stands for annotations, and each constructor is annotated. -Expressions coming from a parser might be annotated with source locations, so a return type of {anchorTerm ArithExprEx}`ArithExpr SourcePos` ensures that the parser put a {anchorName ArithExprEx}`SourcePos` at each subexpression. -Expressions that don't come from the parser, however, will not have source locations, so their type can be {anchorTerm ArithExprEx}`ArithExpr Unit`. +The type argument {anchorName ArithExpr}`ann` stands for annotations, and each constructor having it is therefore _annotated_. +An anotation can be anything useful, expressions coming from a parser might be annotated with source locations (a line and column number from a source code file), so a return type of {anchorTerm ArithExprEx}`ArithExpr SourcePos` ensures that the parser put a {anchorName ArithExprEx}`SourcePos` at each subexpression, allowing access to the location of each in a hypothetical source code for different reasons. + +Expressions that don't come from the parser, however, may not have source locations, so their type can be {anchorTerm ArithExprEx}`ArithExpr Unit`: + +```anchor ArithExprVal +def toEval := (ArithExpr.times () (ArithExpr.int () 2) (ArithExpr.int () 3)) +#check toEval +``` +```anchorInfo ArithExprVal +toEval : ArithExpr Unit +``` +Unit's constructor can be written as empty parentheses: {anchorTerm unitParens}`() : Unit`. ::: @@ -631,7 +656,6 @@ Additionally, because all Lean functions have arguments, zero-argument functions In a return position, the {anchorName ArithExprEx}`Unit` type is similar to {CSharp}`void` in languages derived from C. In the C family, a function that returns {CSharp}`void` will return control to its caller, but it will not return any interesting value. By being an intentionally uninteresting value, {anchorName ArithExprEx}`Unit` allows this to be expressed without requiring a special-purpose {CSharp}`void` feature in the type system. -Unit's constructor can be written as empty parentheses: {anchorTerm unitParens}`() : Unit`. ## {lit}`Empty` @@ -652,7 +676,20 @@ These terms are related to sums and products used in ordinary arithmetic. The relationship is easiest to see when the types involved contain a finite number of values. If {anchorName SumProd}`α` and {anchorName SumProd}`β` are types that contain $`n` and $`k` distinct values, respectively, then {anchorTerm SumProd}`α ⊕ β` contains $`n + k` distinct values and {anchorTerm SumProd}`α × β` contains $`n \times k` distinct values. For instance, {anchorName fragments}`Bool` has two values: {anchorName BoolNames}`true` and {anchorName BoolNames}`false`, and {anchorName Unit}`Unit` has one value: {anchorName BooloUnit}`Unit.unit`. -The product {anchorTerm fragments}`Bool × Unit` has the two values {anchorTerm BoolxUnit}`(true, Unit.unit)` and {anchorTerm BoolxUnit}`(false, Unit.unit)`, and the sum {anchorTerm fragments}`Bool ⊕ Unit` has the three values {anchorTerm BooloUnit}`Sum.inl true`, {anchorTerm BooloUnit}`Sum.inl false`, and {anchorTerm BooloUnit}`Sum.inr Unit.unit`. +The product {anchorTerm fragments}`Bool × Unit` can have the two values {anchorTerm BoolxUnit}`(true, Unit.unit)` and {anchorTerm BoolxUnit}`(false, Unit.unit)`, and the sum {anchorTerm fragments}`Bool ⊕ Unit` can have the three values {anchorTerm BooloUnit}`Sum.inl true`, {anchorTerm BooloUnit}`Sum.inl false`, and {anchorTerm BooloUnit}`Sum.inr Unit.unit`: + +```anchor SumType +def SumType := Bool ⊕ Unit +def one : SumType := Sum.inr Unit.unit +def two : SumType := Sum.inl true +def three: SumType:= Sum.inl false +``` + +```anchor ProdType +def ProdType := Bool × Unit +def one : ProdType:= (false, Unit.unit) +def two : ProdType:= (true, Unit.unit) +``` Similarly, $`2 \times 1 = 2`, and $`2 + 1 = 3`. # Messages You May Meet diff --git a/book/FPLean/GettingToKnow/Types.lean b/book/FPLean/GettingToKnow/Types.lean index f38643a..da41bf9 100644 --- a/book/FPLean/GettingToKnow/Types.lean +++ b/book/FPLean/GettingToKnow/Types.lean @@ -44,11 +44,11 @@ operator inside parentheses: ``` -Here, {anchorName onePlusTwoEval}`Nat` is the type of _natural numbers_, which are arbitrary-precision unsigned integers. +Here, {anchorName onePlusTwoEval}`Nat` is the type of _natural numbers_, which are [arbitrary-precision unsigned integers](https://en.wikipedia.org/wiki/Arbitrary-precision_arithmetic). In Lean, {anchorName onePlusTwoEval}`Nat` is the default type for non-negative integer literals. -This default type is not always the best choice. +This default type is not always the best choice, and Lean support [fixed-width integer](https://lean-lang.org/doc/reference/latest/Basic-Types/Fixed-Precision-Integers/#fixed-ints) defined using [modular arithmetic](en.wikipedia.org/wiki/Modular_arithmetic). In C, unsigned integers underflow to the largest representable numbers when subtraction would otherwise yield a result less than zero. -{anchorName onePlusTwoEval}`Nat`, however, can represent arbitrarily-large unsigned numbers, so there is no largest number to underflow to. +{anchorName onePlusTwoEval}`Nat`, can represent arbitrarily-large unsigned numbers, so there is no largest number to underflow to. Thus, subtraction on {anchorName onePlusTwoEval}`Nat` returns {anchorName Nat}`zero` when the answer would have otherwise been negative. For instance,