Conversation
- The `const` keyword is used for subject which value cannot change.
```cpp
const name = "John"
name = "Doe" // Invalid!
```
There was a problem hiding this comment.
I don't support using `const` for constants because it is verbose and would be tiresome to keep typing, since it is going to be used a lot in code. `let` as a constant is not a new syntax on the block. It has been popularised by Swift and Rust. We could also use `val`, as popularised by Kotlin and Scala, but I think `let` is more expressive.
In modern languages, constants tend to be used quite a lot because of the notion of immutability by default, so these languages try to keep the keyword as short as possible, usually a three-letter word. `val` and `let` are the notable ones.
If you do a quick grep on any repo with code written in modern languages, you will notice constants are declared far more often than variables.
There was a problem hiding this comment.
I think the case you present for `let` or `val` is quite subjective, maybe because you prefer or are more used to `let` than `const`. Nonetheless, `const` (I believe) is more descriptive of the use. However, whatever we choose for Ratio will have to be debated on concrete and objective reasons rather than subjective preference.
There was a problem hiding this comment.
"let is used by several modern languages" is a good enough objective reason to choose it. It is true that I prefer let over const but if const is what the majority wants, then we go with it.
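For reference, here is roughly how the immutable-by-default `let` convention mentioned above looks in Rust. This is only a sketch to make the discussion concrete; the function names are made up for illustration, and it implies nothing about what Ratio's syntax should be.

```rust
/// Builds a greeting from an immutable binding.
fn greeting() -> String {
    // Plain `let` creates an immutable binding: the short form is the common case.
    let name = "John";
    // name = "Doe"; // would not compile: cannot assign twice to an immutable variable
    format!("Hello, {}", name)
}

/// Counts down from `start`, using bindings that opt into mutability.
fn countdown(start: u32) -> u32 {
    // Mutability needs the longer `let mut`, so the rarer case costs more typing.
    let mut steps = 0;
    let mut n = start;
    while n != 0 {
        n -= 1;
        steps += 1;
    }
    steps
}
```

Calling `greeting()` yields `"Hello, John"`, and `countdown(5)` takes five steps.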
* int > a sequence of symbols from the standard number set (e.g. 2489)
* long > two or more `int` type(s) put together as a sequence (e.g. 1267383994747489948947333344747)
* float >
* char > a single symbol from the standard character set ('$')
* str > two or more `char` type(s) put together ('$rw32')
* bool > true, false
* byte > |01011010, |10010011
There was a problem hiding this comment.
I advocate consistency. I suggest that builtin types and user-defined types look the same. In short, all types ought to start with uppercase letters. There shouldn't be a distinction.
Part of this principle is reinforced by having operator overloading to blur the difference between builtin types and user-defined types. Whatever you can do with a builtin type, you should be able to do with a user-defined one. Builtin types shouldn't have special properties or names.
For example, why should `bool` be treated any differently from a user-defined type named `Switch`?

    type Switch {
        value: byte,
    }

They are not really different. They occupy the same amount of space in memory.
Another point is that the primitive types defined above are not enough for a statically-typed language. Statically-typed languages usually provide signed and unsigned integer variants at several bit-widths for low-level flexibility.
I also have a case against the str data type, but that will be discussed below.
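To make the "no special properties" argument concrete, here is a small Rust sketch of a user-defined `Switch` that supports the same `!` operator as `bool` and has the same size. Note that Rust itself keeps lowercase names for its primitives, so this illustrates only the behavioural parity, not the naming proposal; the `ON`/`OFF` constants and the `sizes` helper are assumptions added for the example.

```rust
use std::mem::size_of;
use std::ops::Not;

/// A user-defined type wrapping a single byte, like the `Switch` example above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Switch {
    value: u8,
}

impl Switch {
    const ON: Switch = Switch { value: 1 };
    const OFF: Switch = Switch { value: 0 };
}

/// Operator overloading: `!switch` works the same way `!true` does for `bool`.
impl Not for Switch {
    type Output = Switch;

    fn not(self) -> Switch {
        if self.value == 0 { Switch::ON } else { Switch::OFF }
    }
}

/// Sizes in bytes of the builtin `bool` and the user-defined `Switch`.
fn sizes() -> (usize, usize) {
    (size_of::<bool>(), size_of::<Switch>())
}
```

In Rust both sizes are one byte, and `!Switch::ON == Switch::OFF` holds just as `!true == false` does.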
There was a problem hiding this comment.
A byte is 8 bits (10010110) and a bool (1 or 0 | true or false) is a flag, i.e. 1 bit. I need more clarification on how you arrived at the conclusion that `byte` and `bool` occupy the same memory space.
Yes, I intentionally left that out, as the focus was syntax and not semantics or compiler specifics. I would be happy to add info about signed and unsigned integers.
About `str`: I actually restrained myself a lot from including it. I knew that an array of `char` could do the job, as in C. However, I need many more ideas and more discussion around this.
There was a problem hiding this comment.
`bool` is in fact a byte. Practically all modern CPUs are byte-addressable.
The smallest unit you can load from or store to memory is 8 bits, so bools are stored as bytes.
https://stackoverflow.com/questions/4626815/why-is-a-boolean-1-byte-and-not-1-bit-of-size
There was a problem hiding this comment.
Oh... well, yes, you are right (in terms of addressable memory, not actual storage). You do know that the remaining seven bits are wasted, especially when we could make use of bit fields as in C.
Can't we?
There was a problem hiding this comment.
I mean actual storage. bools are stored as bytes. You need bitwise operations to access a single bit in a byte, and that extra shifting and masking has a cost. It is faster to just access a byte, so bools are stored as bytes, and this is what most languages do.
Bitfields are an opt-in feature available to the user through bitwise ops.
This is how C++ does it. The same goes for practically every other statically-typed language. Some languages even store it in spaces larger than a byte. In C's `stdbool.h`, bool is represented by an int, which is larger than a byte. https://sites.uclouvain.be/SystInfo/usr/include/stdbool.h.html
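The trade-off discussed in this thread can be sketched in Rust, where `bool` occupies one byte. The `PackedFlags` type below is a hypothetical opt-in bitfield written for this example: it stores eight flags in one byte, but every read or write needs a shift and a mask.

```rust
use std::mem::size_of;

/// Eight flags stored one per byte: plain indexing, 8 bytes of storage.
fn unpacked_size() -> usize {
    size_of::<[bool; 8]>()
}

/// Eight flags packed into a single byte (hypothetical opt-in bitfield).
#[derive(Clone, Copy, Default)]
struct PackedFlags(u8);

impl PackedFlags {
    /// Reads flag `i` (valid for 0..8) with a shift and a mask.
    fn get(self, i: u8) -> bool {
        (self.0 >> i) & 1 == 1
    }

    /// Returns a copy with flag `i` (valid for 0..8) set to `on`.
    fn with(self, i: u8, on: bool) -> PackedFlags {
        if on {
            PackedFlags(self.0 | (1u8 << i))
        } else {
            PackedFlags(self.0 & !(1u8 << i))
        }
    }
}
```

For example, `PackedFlags::default().with(3, true).get(3)` is `true`, and the whole struct is one byte, against eight bytes for `[bool; 8]`.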
- Subjects can be explicitly type-annotated.
```js
var int identification = 60
```
There was a problem hiding this comment.
I don't support this annotation method because it is confusing.
`var x = 23`: `x` is an identifier here.
`var x y = 23`: it is easy to mistake `x` for the identifier here as well.
A punctuator would help show the separation between an identifier and a type clearly.
There was a problem hiding this comment.
I don't understand what you mean. There are languages that do this. The `int` token is a reserved keyword in such languages and can't be used as a user-defined identifier. I would need evidence from you of the confusion that might arise from this syntax production. I do, however, agree with you on the need for a punctuator. I will work on that.
There was a problem hiding this comment.
The example I gave:
`var x y = 23`. Both `x` and `y` are valid identifiers, so it is easy to confuse which one is the type and which one is the subject name.
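As one illustration of the punctuator idea, Rust separates the subject name from its type with a colon. The sketch below uses a made-up type alias `Y` to show that the separation stays clear even when the type name is user-defined rather than a reserved keyword.

```rust
/// A user-defined type name (hypothetical), so the type is not a reserved keyword.
type Y = i32;

fn annotated() -> Y {
    // The colon separates the subject name (`x`) from its type (`Y`),
    // so a reader never has to guess which token is which.
    let x: Y = 23;
    x
}

fn inferred() -> i32 {
    // Without an annotation, the type is inferred from the literal.
    let x = 23;
    x
}
```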
There was a problem hiding this comment.
I do not support the syntax `var int identification = 60`. If we go with that syntax, the declaration keyword `var` becomes useless extra typing.
There was a problem hiding this comment.
Alright... we do agree on one thing though: the variables defined or declared can have implicit or explicit static type(s). I think we should go with @appcypher's proposal on this.
>mutable arrays (lists) are defined using `var`
```js
var int<4> list = [1, 2, 3, 4]
```
There was a problem hiding this comment.
The syntax `int<8>` looks special to the array type, so I am assuming this creates a collection of type array with 8 elements.
What if I have my own collection type, say a hashset? How do I create a collection of type hashset with 8 elements? Can I take advantage of this special syntax?
Is there a way to generalize this to other collection types?
There was a problem hiding this comment.
To your first question: Yes
To your second question: No... but this could be further discussed
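One way other languages generalize a length parameter to user-defined collections is to make the length an ordinary generic parameter. The Rust sketch below uses const generics for this; `FixedSet` is a toy type invented for the example (a linear-scan set, not a real hash set) and is not part of any proposal in this thread.

```rust
/// A hypothetical user-defined collection whose capacity `N` is part of its type,
/// the same way the builtin array type `[i32; 4]` carries its length.
struct FixedSet<T, const N: usize> {
    items: [Option<T>; N],
    len: usize,
}

impl<T: PartialEq + Copy, const N: usize> FixedSet<T, N> {
    fn new() -> Self {
        FixedSet { items: [None; N], len: 0 }
    }

    /// Inserts `value` if it is absent and there is room; returns whether it was added.
    fn insert(&mut self, value: T) -> bool {
        if self.len == N || self.items[..self.len].contains(&Some(value)) {
            return false;
        }
        self.items[self.len] = Some(value);
        self.len += 1;
        true
    }

    /// The capacity is known from the type alone.
    fn capacity(&self) -> usize {
        N
    }
}
```

With this shape, `FixedSet<i32, 8>` reads much like `int<8>`, but the same mechanism is available to any user-defined type.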
>the pipe operator `~>` can also be used to sort the resulting list
```elixir
var int<8> numbers = [1,3,5,7] ++ [0,2,4,6] ~> sort
```
There was a problem hiding this comment.
If this is an array, which I assume in your proposal means it has a fixed length, then how do you determine statically (i.e., at compile time) that this array will have a length of 8 items if you do a runtime concatenation operation on it?
If I assumed wrong and your arrays don't have a fixed length at compile time, then what would be the point of specifying it?
Will there be runtime checks to make sure an array's length stays the same? I frankly think that would be inefficient, considering we have runtime bounds checking to worry about as well.
There was a problem hiding this comment.
You are correct... In a language like Java or C++, this program would raise a compile-time error, because the size of the array cannot be determined until runtime.
But I wrote that line in the hope that we could design a multipass, AST-driven compiler in which the AST can be modified at the entry point into code generation (the backend) to find expressions that can be "executed" or "reduced statically" to a literal value. (So in the example above, the AST nodes are reduced to "[" "1" "," "3" "," "5" "," "7" "," "0" "," "2" "," "4" "," "6" "]" "~" ">" "sort".)
However, this feature would work only for literal primitives and not for non-literal primitives, to avoid the excessive overhead of discovering such expressions and "reducing" them. There are other use cases for such a feature, but this is just a simple example.
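Some existing languages already offer a restricted form of this "reduce statically" idea. As a sketch, Rust's `const fn` lets the compiler evaluate the concatenation and the sort of literal arrays at compile time, with the resulting length checked against the declared type. The helper names `concat` and `sort` are made up for this example, and the sizes are hard-coded to keep it short.

```rust
/// Concatenates two 4-element arrays; usable at compile time because it is a `const fn`.
const fn concat(a: [i32; 4], b: [i32; 4]) -> [i32; 8] {
    let mut out = [0; 8];
    let mut i = 0;
    while i < 4 {
        out[i] = a[i];
        out[i + 4] = b[i];
        i += 1;
    }
    out
}

/// Insertion sort written with `while` loops so it can run in a `const` context.
const fn sort(mut xs: [i32; 8]) -> [i32; 8] {
    let mut i = 1;
    while i < 8 {
        let mut j = i;
        while j > 0 && xs[j - 1] > xs[j] {
            let tmp = xs[j];
            xs[j] = xs[j - 1];
            xs[j - 1] = tmp;
            j -= 1;
        }
        i += 1;
    }
    xs
}

// Evaluated by the compiler; the length 8 is part of the type and checked statically.
const NUMBERS: [i32; 8] = sort(concat([1, 3, 5, 7], [0, 2, 4, 6]));
```

The restriction is the same one described above: this works because the inputs are literals, so nothing has to be discovered or checked at runtime.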
```js
const [...chars] = "Happy Birthday!"
```
There was a problem hiding this comment.
What will the type and value of `chars` be after the destructure?
There was a problem hiding this comment.
This is another example of the feature I spoke of. The literal ("Happy Birthday!") on the RHS will be "reduced statically" into the LHS. On the AST, "Happy Birthday!" will be "pre-executed" and we will have the following:
chars -> identifier
char<15> -> type
[ "H", "a", "p", "p", "y", " ", "B", "i", "r", "t", "h", "d", "a", "y", "!" ] -> value
There was a problem hiding this comment.
This feature allows programmers to write less verbose code. I am not naive, however: I believe it might be very challenging to implement, yet it is worth discussing and considering.
There was a problem hiding this comment.
Nice idea; however, it is going to be inconsistent with the semantics of other destructuring syntax.
`...` is like the rest or spread syntax in JavaScript.

    let [first, ...remaining] = [1, 2, 3]
    first == 1 // type Int
    remaining == [2, 3] // type List

So if at all we want that string destructuring syntax, then it should follow the same semantics.

    let [first, ...remaining] = "abc"
    first == 'a' // type Char
    remaining == "bc" // type String

However, I have an issue with that syntax, because it is already reserved for list destructuring. It would be nice to find another syntax for string destructuring.
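For comparison, Rust has rest patterns for slices but none for strings, so the string case has to be spelled out with an iterator. The sketch below keeps the semantics argued for above: the first element plus a remainder of the same collection type. The function names are made up for this example.

```rust
/// List destructuring with a rest pattern: the first element and the remaining slice.
fn split_list(xs: &[i32]) -> Option<(i32, &[i32])> {
    match xs {
        [first, remaining @ ..] => Some((*first, remaining)),
        [] => None,
    }
}

/// The string analogue: the first `char` and the remaining text, still a string.
fn split_str(s: &str) -> Option<(char, &str)> {
    let mut chars = s.chars();
    let first = chars.next()?;
    // `as_str` returns the part of the original string not yet consumed.
    Some((first, chars.as_str()))
}
```

Here `split_str("abc")` gives `('a', "bc")`, which matches the `Char` plus `String` split proposed in the comment.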
There was a problem hiding this comment.
I agree @appcypher.
OK, I will work on that and bring my alternate proposal for string destructuring back here by tomorrow evening (Saturday, 16 Nov 2019) so we can have further discussions.
```cargo
var num = 5
until num === 0 {
```
There was a problem hiding this comment.
What is the triple equal operator for? It wasn't explained.
```js
var char<5,4> names = ["Steve", "Kunle", "Chris", "Azeez"] // a list of strings with a length of 4 and each string a length of 5
```
There was a problem hiding this comment.
So we have some sort of nested length specification.
A collection has 4 elements. In turn, each element is a collection of 5 elements.
How is this verified statically though? Are there going to be runtime checks looping deep into collections to verify their lengths?
If that isn't the case, why specify them statically if they cannot be statically verified?
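For literal initializers, nested lengths can be verified statically when both lengths are part of the type. The Rust sketch below shows this with an array of arrays; it stores the names as ASCII bytes, which is an assumption made for the example and sidesteps the string-encoding question raised elsewhere in this review.

```rust
/// Four names, each exactly five bytes long; both lengths are part of the type.
static NAMES: [[u8; 5]; 4] = [*b"Steve", *b"Kunle", *b"Chris", *b"Azeez"];

// Replacing an entry with `*b"Stephen"` (7 bytes) is a compile-time type error,
// so no runtime loop is needed to verify the lengths.

/// Converts one fixed-width name back to text (hypothetical helper).
fn name_at(i: usize) -> &'static str {
    // The entries above are ASCII, so this conversion cannot fail here.
    std::str::from_utf8(&NAMES[i]).expect("names are ASCII")
}
```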
* `&&` (and)
* `and` (and)
* `||` (or)
* `or` (or)
There was a problem hiding this comment.
I advocate one obvious way. I'd say we go with one here. I don't see the point of having the pairs. I'd go with the keyword operators, because they are explicit.
There was a problem hiding this comment.
OK... I think that is fair. `and` and `or` are okay.
* long > two or more `int` type(s) put together as a sequence (e.g. 1267383994747489948947333344747)
* float >
* char > a single symbol from the standard character set ('$')
* str > two or more `char` type(s) put together ('$rw32')
There was a problem hiding this comment.
I disagree with this approach to implementing strings. Modern languages support UTF-8 strings. UTF-8 strings aren't just a sequence of fixed-size characters, because the width of a character (actually a code point) varies: the encoding uses between 1 and 4 bytes per code point.
Your `char` type cannot be 8-bit, because then it cannot represent the entire Unicode range.
If your `char` type is 16-bit, then it can hold a UTF-16 code unit (which is what Java did), but that is limited as well, because Unicode (as of 2019) has code points that can't fit in two bytes. So most languages tend to opt for 32-bit `char` types, because that covers all of Unicode.
The issue with making your string just a sequence of `char` values is that the encoding is going to be either very wasteful or very limited. UTF-8 solves the wastage issue and gives up O(1) access. The internet (heck, the world) is behind UTF-8.
In essence, a string should be UTF-8 encoded, not an array of characters.
I'd reserve this type of implementation for ASCII-only string types or Go-like rune types.
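The storage trade-off can be checked in Rust, where strings are UTF-8 and `char` is 32 bits wide. This is a small sketch with made-up helper names; it compares the bytes a string takes as UTF-8 with what the same text would take as an array of 32-bit `char` values.

```rust
use std::mem::size_of;

/// Returns (bytes as UTF-8, number of code points, bytes as an array of 32-bit chars).
fn storage_costs(s: &str) -> (usize, usize, usize) {
    let utf8_bytes = s.len(); // `str::len` counts bytes, not characters
    let code_points = s.chars().count(); // O(n): must walk the variable-width encoding
    let as_char_array = code_points * size_of::<char>(); // 4 bytes per `char`
    (utf8_bytes, code_points, as_char_array)
}

/// Encoded width in bytes of each code point in `s`.
fn widths(s: &str) -> Vec<usize> {
    s.chars().map(char::len_utf8).collect()
}
```

For instance, `widths("a\u{df}\u{20ac}\u{1f600}")` covers one code point of each UTF-8 width, from 1 byte up to 4, while every one of them would occupy 4 bytes in a `char` array.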
There was a problem hiding this comment.
Yes, you are very correct! That is why I included the `str` type against all restraint on my part. I wanted to discuss all the edge cases with the rest of the team (as you have mentioned with code points, especially those for Arabic or German characters) and seek out the best implementation.
I believe you have made a strong enough case against a fixed-width encoding, whether 8-bit or 16-bit; a multi-byte encoding is perhaps the way to go.