Isocroft-syntax.md by ladaposamuel · Pull Request #3 · theratioproject/syntax-proposals

ladaposamuel · 2019-11-12T21:59:40Z

No description provided.

appcypher · 2019-11-12T22:06:42Z

isocroft-syntax.md

+- The `const` keyword is used for subject which value cannot change.
+    ```cpp
+    const name = "John"
+    name = "Doe" // Invalid!
+    ```


I don't support using `const` for constants because it is verbose and would be tiresome to keep typing, since it is going to be used a lot in code. `let` for constants is not a new syntax on the block; it has been popularised by Swift and Rust. We could also use `val`, as popularised by Kotlin and Scala, but I think `let` is more expressive.

In modern languages, constants tend to be used quite a lot because of this notion of immutability by default, so these languages try to keep the keyword as short as possible, usually a three-letter word, `val` and `let` being the notable ones.

If you do a quick grep on any repo with code written in modern languages, you will notice constants are declared way more than variables.

I think the case you present for `let` or `val` is quite subjective, maybe because you prefer, or are more used to, `let` than `const`. Nonetheless, `const` (I believe) is more descriptive of the use. However, whatever we choose for Ratio will have to be debated on concrete, objective reasons rather than subjective preference.

"let is used by several modern languages" is a good enough objective reason to choose it. It is true that I prefer let over const but if const is what the majority wants, then we go with it.

appcypher · 2019-11-12T22:27:14Z

isocroft-syntax.md

+* int > a sequence of symbols from the standard number set (e.g. 2489)
+* long > two or more `int` type(s) put together as a sequence (e.g. 1267383994747489948947333344747)
+* float >
+* char > a single symbol from the standard chracter set ('$')
+* str > two or more `char` type(s) put together ('$rw32')
+* bool > true, false
+* byte > |01011010, |10010011


I advocate consistency. I suggest that builtin types and user-defined types look the same. In short, all types ought to start with uppercase letters. There shouldn't be a distinction.

Part of this principle is reinforced by having operator overloading to blur the difference between builtin types and user-defined types. Whatever you can do with a builtin type, you should be able to do with a user-defined one. Builtin types shouldn't have special properties or names.

For example, why should `bool` be treated any differently from a user-defined type named `Switch`?

type Switch { value: byte, }

They are not really different. They occupy the same space in memory.

Another point is that the primitive types defined above are not enough for a statically-typed language. Statically-typed languages usually provide signed and unsigned variants of integers, and integers with different bit widths, for low-level flexibility.
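To make the bit-width point concrete, here is a minimal Python sketch of the value ranges that signed and unsigned integers of common widths cover. The helper `int_range` and the `i8`/`u8`-style labels are illustrative assumptions, not names proposed anywhere in this thread.

```python
def int_range(bits, signed):
    """Return the (min, max) values representable in `bits` bits.

    Signed ranges assume two's complement, which is what mainstream
    hardware uses.
    """
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# Hypothetical type labels (i8, u8, ...) used only for display.
for bits in (8, 16, 32, 64):
    print(f"i{bits}: {int_range(bits, True)}  u{bits}: {int_range(bits, False)}")
```

A single `int` type forces every value into one of these ranges, which is why low-level code usually wants to pick the width explicitly.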

I also have a case against the str data type, but that will be discussed below.

A byte is 8 bits (10010110) and a bool (1 or 0 | true or false) is a flag or 1 bit. I need more clarification on how you arrived at the conclusion that byte and bool occupy the same memory space.

Yes, I did intentionally leave that out, as the focus was syntax and not semantics or compiler specifics. I would be happy to add info about signed and unsigned integers.

About `str`, I actually restrained myself a lot from including it. I knew that an array of `char` could do the job, as it does in C. However, I need many more ideas and more discussion around this.

bool is in fact a byte. Practically all modern CPUs are byte-addressable.

The smallest unit you can load from or store to memory is 8 bits. So bools are stored as bytes.

https://stackoverflow.com/questions/4626815/why-is-a-boolean-1-byte-and-not-1-bit-of-size

Oh... well, yes, you are right (in terms of addressable memory, though not actual storage). You do know that the remaining seven bits are wasted, especially when we could make use of bit fields as in C.

Can't we?

I mean actual storage. bools are stored as bytes. You need bitwise operations to access a bit in a byte. Bitwise operations are expensive. It is faster to just access a byte, so bools are stored as bytes, and this is what most languages do.

Bitfields are opt-in feature available to the user through bitwise ops.

This is how C++ does it. The same goes for practically every other statically-typed language. Some languages even store it in spaces larger than a byte. In C's stdbool.h, bool is represented by an int which is larger than a byte. https://sites.uclouvain.be/SystInfo/usr/include/stdbool.h.html
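The trade-off discussed above can be sketched in Python, under two assumptions: `ctypes.c_bool` is used as a stand-in for a C `_Bool`, and the helpers `set_flag` and `get_flag` are hypothetical names for the opt-in bit packing. The sketch shows that a bool takes a whole byte on mainstream platforms, and that packing eight flags into one byte costs a shift and a mask on every access.

```python
import ctypes

# Size in bytes of a C `_Bool` as seen through ctypes.
# Expected to be 1 on mainstream platforms; this is an assumption about
# the host ABI, not a guarantee of the C standard.
BOOL_SIZE = ctypes.sizeof(ctypes.c_bool)


def set_flag(packed, index, value):
    """Return the byte `packed` with bit `index` (0-7) set or cleared."""
    mask = 1 << index
    if value:
        return packed | mask
    return packed & ~mask & 0xFF


def get_flag(packed, index):
    """Read bit `index` (0-7) out of the byte `packed`."""
    return ((packed >> index) & 1) == 1


# Pack two flags (bits 0 and 3) into a single byte.
flags = 0
flags = set_flag(flags, 0, True)
flags = set_flag(flags, 3, True)
print(BOOL_SIZE, bin(flags), get_flag(flags, 3), get_flag(flags, 1))
```

Storing each bool in its own byte avoids the shift and mask entirely, which is the reason given above for the default layout.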

appcypher · 2019-11-12T22:34:25Z

isocroft-syntax.md

+- Subjects can be explicitly type-annotated.
+
+    ```js
+    var int identification = 60


I don't support this annotation method because it is confusing.

var x = 23

x is an identifier here.

var x y = 23

It is easy to confuse x as the identifier here as well.

A punctuator would help vividly show the separation between an identifier and a type.

I don't understand what you mean. There are languages that do this. The `int` token is a reserved keyword in such languages and can't be used as a user-defined identifier. I would need evidence from you of the confusion that might arise from this syntax production. I do, however, agree with you on the need for a punctuator. I will work on that.

The example I gave

var x y = 23

Both x and y are valid identifiers, so it is easy to confuse which one is the type and which one is the subject name.

I do not support the syntax `var int identification = 60`. If we go with that syntax, the declaration keyword `var` becomes useless and is just extra typing.

Alright... we do agree on one thing though: the variables defined or declared can have implicit or explicit static types. I think we should go with @appcypher's proposal on this.

appcypher · 2019-11-12T22:42:49Z

isocroft-syntax.md

+>mutable arrays (lists) are defined using `var`
+```js
+
+var int<4> list = [1, 2, 3, 4]


The syntax int<8> looks special to the array type. So I am assuming this creates a collection of type array with 8 elements.

What if I have my own collection type, say hashset? How do I create a collection of type hashset with 8 elements? Can I take advantage of this special syntax?

Is there a way to generalize this to other collection types?

To your first question: Yes

To your second question: No... but this could be further discussed

appcypher · 2019-11-12T22:47:54Z

isocroft-syntax.md

+>the pipe operator `~>` can also be used to sort the resulting list
+```elixir
+
+var int<8> numbers = [1,3,5,7] ++ [0,2,4,6] ~> sort


If this is an array, which I assume in your proposal means it has a fixed length, then how do you determine statically (i.e., at compile time) that this array will have a length of 8 items if you perform a runtime concatenation operation on it?

If I assumed wrong and your arrays don't have fixed length at compile-time, then what would be the point of specifying it?

Will there be runtime checks to make sure an array's length stays the same? Frankly, I think that would be inefficient, considering we have runtime bounds checking to worry about as well.

You are correct... This program would raise a compile-time error in a language like Java or C++, because the size of the array cannot be determined until runtime.

But I wrote that line in the hope that we could design a multipass, AST-driven compiler in which the AST can be modified at the entry point into code generation (the backend) to find expressions that can be "executed" or "reduced statically" to a literal value. (So in the example above, the AST nodes are reduced to "[" "1" "," "3" "," "5" "," "7" "," "0" "," "2" "," "4" "," "6" "]" "~" ">" "sort".)

However, this feature will work only for literal primitives, not non-literal ones, to avoid the excessive overhead of discovering such expressions and "reducing" them. There are other use cases for such a feature, but this is just a simple example.
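The "reduce statically" idea described above is essentially constant folding. Below is a minimal sketch using Python's own `ast` module as a stand-in for a Ratio AST. It assumes Python's `+` in place of `++` and a call to a hypothetical `sort(...)` in place of `~> sort`; the functions `fold` and `static_length` are illustrative names, not part of any proposal.

```python
import ast


def fold(node):
    """Reduce a literal-only expression to its value.

    Raises ValueError as soon as a non-literal (e.g. a variable name)
    is found, mirroring the "literal primitives only" restriction.
    """
    if isinstance(node, ast.Expression):
        return fold(node.body)
    if isinstance(node, ast.Constant):
        return node.value
    if isinstance(node, ast.List):
        return [fold(element) for element in node.elts]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        # Stand-in for the proposal's `++` concatenation.
        return fold(node.left) + fold(node.right)
    if (
        isinstance(node, ast.Call)
        and isinstance(node.func, ast.Name)
        and node.func.id == "sort"  # stand-in for `~> sort`
        and len(node.args) == 1
        and not node.keywords
    ):
        return sorted(fold(node.args[0]))
    raise ValueError("not a compile-time literal expression")


def static_length(source):
    """Length of the folded list, known before the program runs."""
    return len(fold(ast.parse(source, mode="eval")))


print(static_length("sort([1, 3, 5, 7] + [0, 2, 4, 6])"))
```

With the expression folded, a declared length such as `int<8>` could be compared against the folded length at compile time; an operand like a variable name makes `fold` give up, and the length would stay unknown until runtime.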

appcypher · 2019-11-12T22:48:41Z

isocroft-syntax.md

+
+```js
+
+  const [...chars] = "Happy Birthday!"


What will chars type and value be after the destructure?

This is another example of the feature I spoke of. The literal ("Happy Birthday!") on the RHS will be "reduced statically" into the LHS. On the AST, the "Happy Birthday!" will be "pre-executed" and we will have the below:

chars -> identifier
char<15> -> type
[ "H", "a", "p", "p", "y", " ", "B", "i", "r", "t", "h", "d", "a", "y", "!" ] -> value

This feature allows programmers to write less verbose code. I am not naive, however: I believe this feature might be very challenging to implement, yet it is worth discussing and considering.

Nice idea; however, it is going to be inconsistent with the semantics of other destructuring syntax.

... is like a rest or spread syntax in JavaScript.

let [first, ...remaining] = [1, 2, 3]
first == 1 // type Int
remaining == [2, 3] // type List

So if at all we want that string destructuring syntax, then it should follow the same semantics.

let [first, ...remaining] = "abc"
first == 'a' // type Char
remaining == "bc" // type String

However, I have an issue with that syntax, because it is already reserved for list destructuring. It would be nice to find another nice syntax for string destructuring.
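For comparison, Python's star-unpacking shows what happens when list destructuring syntax is reused on strings without string-specific semantics: the rest is always a list, even when the source is a string. This is a sketch of existing Python behaviour, not of Ratio; the helper `split_first` is a hypothetical illustration of the Char/String split argued for above.

```python
# Rest-unpacking a list: the head is an element, the rest is a list.
first, *remaining = [1, 2, 3]

# Rest-unpacking a string with the same syntax: the rest is a list of
# one-character strings, not a string.
s_first, *s_remaining = "abc"


def split_first(text):
    """Hypothetical helper: split a non-empty string into its first
    character and the remainder, keeping the remainder a string."""
    return text[0], text[1:]


print(first, remaining)
print(s_first, s_remaining)
print(split_first("abc"))
```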

I agree @appcypher.

OK, I will work on that and bring my alternative proposal for string destructuring back here by tomorrow evening (Saturday, 16 Nov 2019) so we can have further discussions.

appcypher · 2019-11-12T22:49:27Z

isocroft-syntax.md

+```cargo
+
+var num = 5
+until num === 0 {


What is the triple equal operator for? It wasn't explained.

appcypher · 2019-11-12T22:54:55Z

isocroft-syntax.md

+
+```js
+
+    var char<5,4> names = ["Steve", "Kunle", "Chris", "Azeez"] // a list of strings with a length of 4 and each string a length of 5


So we have some sort of nested length specification.

A collection has 4 elements. In turn, each element is a collection of 5 elements.

How is this verified statically though? Are there going to be runtime checks looping deep into collections to verify their length?

If that isn't the case, why specify them statically if they cannot be statically-verified?
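To show what such a runtime check would involve, here is a minimal Python sketch. The function `has_shape` is a hypothetical name, and the shape is given outermost length first, so the proposal's `char<5,4>` corresponds to `(4, 5)` here.

```python
def has_shape(value, shape):
    """Check that nested sequences match `shape`, outermost length first.

    Every collection above the innermost level is visited, so the cost
    grows with the number of nested collections.
    """
    if not shape:
        return True
    if len(value) != shape[0]:
        return False
    return all(has_shape(item, shape[1:]) for item in value)


names = ["Steve", "Kunle", "Chris", "Azeez"]
print(has_shape(names, (4, 5)))
print(has_shape(["Steve", "Al"], (2, 5)))
```

A check like this would have to run again after any operation that can change a length, on top of ordinary bounds checking.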

appcypher · 2019-11-12T22:56:17Z

isocroft-syntax.md

+* `&&` (and)
+* `and` (and)
+* `||` (or)
+* `or` (or)


I advocate one obvious way. I'd say we go with one here. I don't see the point of having the pairs. I'd go with the keyword operators, because they are explicit.

OK... I think that is fair. `and` and `or` are okay.

appcypher · 2019-11-12T23:15:58Z

isocroft-syntax.md

+* long > two or more `int` type(s) put together as a sequence (e.g. 1267383994747489948947333344747)
+* float >
+* char > a single symbol from the standard chracter set ('$')
+* str > two or more `char` type(s) put together ('$rw32')


I disagree with this approach to implementing strings. Modern languages support UTF-8 strings. UTF-8 strings aren't just a sequence of characters, because the bit width of a character (actually a code point) varies. They have a multi-byte encoding which uses between 1 and 4 bytes per code point.

Your char type cannot be 8-bit because it cannot represent the entire Unicode range.

If your char type is 16-bit, then your char can represent a UTF-16 code unit (which is what Java did), but that is really limited as well, because Unicode as of 2019 has code points that can't fit in two bytes. So most languages tend to opt for 32-bit char types, because that covers the entire Unicode range.

The issue with making your string just a sequence of char types is that the encoding is either going to be very wasteful or very limited. UTF-8 solves the wastage issue but gives up O(1) access. The internet (heck, the world) is in support of UTF-8.

In essence, a string should be UTF-8 encoded, not an array of characters.

I'd reserve this type of implementation for ASCII-only string types or Go-like rune types.
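The variable width is easy to observe in Python. The sketch below encodes four sample characters and counts UTF-8 bytes and UTF-16 code units; the helper names `utf8_width` and `utf16_units` are illustrative assumptions.

```python
# Sample characters: ASCII letter, Latin-1 letter, euro sign, emoji.
samples = ["a", "\u00e9", "\u20ac", "\U0001F600"]


def utf8_width(ch):
    """Number of bytes the character occupies in UTF-8."""
    return len(ch.encode("utf-8"))


def utf16_units(ch):
    """Number of 16-bit code units the character occupies in UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


for ch in samples:
    print(f"U+{ord(ch):04X}: {utf8_width(ch)} UTF-8 byte(s), "
          f"{utf16_units(ch)} UTF-16 unit(s)")

# Byte offsets and character indices diverge, which is why UTF-8
# gives up O(1) indexing by character.
text = "a\u20acb"
print(len(text), len(text.encode("utf-8")))
```

The emoji needs two UTF-16 code units (a surrogate pair), which is the case a 16-bit char type cannot represent on its own.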

Yes, you are very, very correct! Which is why I included the `str` type against all restraint on my part. I wanted to discuss all the edge cases with the rest of the team (as you have mentioned with code points, especially code points from Arabic or German characters) and seek out the best implementation.

I believe you have just made a strong enough case not to go with a fixed-byte encoding (8-bit or 16-bit) and to prefer a multi-byte encoding, perhaps.

Create isocroft-syntax.md

6cdf659
