– Typee Documentation –

3 – Typee Language Description

3.2 – Types

(Notice: we use colored code for all examples in this section, while this is not yet the case in some of the following sections. This is a transitory state. All sections of this documentation will eventually get colored code for their examples.)

Here are the different kinds of types that are available in Typee – see below.

<TYPE>           ::= ['const'] <TYPE'>   
<TYPE'>          ::= <auto type>                  |
                     <container type>             |
                     <enum type>                  |
                     <NONE>                       |
                     <scalar type> [<dimensions>] |
                     <types list>

<auto type>      ::= '?' ['in' <types list>]

<container type> ::= <array type> | <file type> |
                     <list type>  | <map type>  | <set type>
<array type>     ::= 'array' '<' <TYPE> '>'

<file type>      ::= 'file' ['<' <TYPE> '>']
<list type>      ::= 'list' ['<' <TYPE> '>']
<map type>       ::= 'map'  ['<' <TYPE> '>']
<set type>       ::= 'set'  ['<' <TYPE> '>']

<enum type>      ::= 'enum'

<NONE>           ::= 'None'  |  'none'    

<scalar type>    ::= 'bool'  |  'char'   |  'char16' |
                     'float32' |  'float64' |
                     'int8'  |  'int16'  |  'int32'  |  'int64'   |
                     'uint8' |  'uint16' |  'uint32' |  'uint64'  |
                     '_float_' |  '_int_'   |  '_uint_'  |  '_numeric_' |
                     'slice' |  'str'    |  'str16'
<dimensions>     ::= ( '[' <integer number> | <dotted name> ']' )*

<types list>     ::= '(' <TYPE> (',' <TYPE>)* ')'

Note that some of these types may be templated. Templates are well known to C++ programmers. In Java they are named generics. The closest equivalent in Python is to pass a type or a class name as an argument to functions and methods.
We do not describe any templated notation right now, since templates are specified later in this documentation.

That said, and before dealing with any other concept of Typee, we first talk about built-in types. These are very simple to master. You will use them to specify the type of every variable (variables are described in the next section) as well as the type of the value returned by every function and class method.

3.2.1 Booleans

The boolean type in Typee is bool. This type is part of scalar types.

Booleans take one of two values, True and False.
To ease typing and to accommodate programming habits, these boolean constants may also be written true and false.

Booleans can only be compared with booleans. Some OOP languages consider False to be similar to 0 (e.g. in C++) or to None (e.g. in Python), while anything different will be considered similar to True. This is NOT the case in Typee for which boolean variables cannot be compared with variables of any other type.

The only valid operators with booleans are:

assignment operator
equality and non-equality operators
boolean operators and, or and not

These are explained elsewhere, but they act in Typee as in most programming languages (same precedence and same results).

Typee makes no assumption about the memory space used for storing booleans. This depends entirely on the targeted programming language, at translation time and then at compilation time.

3.2.2 Integers

Typee defines integers as coded on 8, 16, 32 and 64 bits, signed or unsigned. To be fully unambiguous about storage size and signed/unsigned status, eight type names are specified, which should be self-explanatory. These types are part of scalar types.

int8 and uint8 – 8-bits coded, signed / unsigned
int16 and uint16 – 16-bits coded, signed / unsigned
int32 and uint32 – 32-bits coded, signed / unsigned
int64 and uint64 – 64-bits coded, signed / unsigned
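
As an illustration, the value ranges implied by these eight type names can be computed with a short Python sketch. It assumes two's-complement coding for the signed types, which this documentation does not state explicitly; the names int_range and TYPEE_INTEGERS are hypothetical and exist only for this example.

```python
def int_range(bits, signed):
    """Return (min, max) for an integer coded on `bits` bits.

    Assumption: signed integers use two's-complement coding.
    """
    if signed:
        return -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return 0, (1 << bits) - 1


# Typee integer type names mapped to their (bits, signed) description.
TYPEE_INTEGERS = {
    f"{'' if signed else 'u'}int{bits}": (bits, signed)
    for bits in (8, 16, 32, 64)
    for signed in (True, False)
}

for name, (bits, signed) in TYPEE_INTEGERS.items():
    low, high = int_range(bits, signed)
    print(f"{name:>6}: {low} .. {high}")
```

Under that assumption, int8 covers -128 .. 127 and uint8 covers 0 .. 255.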

A special case concerns generic scalar types. Three of them apply to integers:

_int_
_uint_
_numeric_

These generic scalar types have been added to the initial syntax to shorten lists of types when all listed types are integers. _int_ stands for all signed integer types. _uint_ stands for all unsigned integer types. These keywords are enclosed in underscores to mark them as unusual types for Typee. Programmers are strongly encouraged to use them only as shortcuts in code such as, for instance: ? in (int8, int16, int32, int64) (the whole can be safely replaced with _int_, and the same holds for unsigned types).

_numeric_ can be substituted for lists of ALL integer types, signed and unsigned, plus floats. This is a very generic numeric type which may greatly shorten long code, but at the cost of much less precise type checking.

All integer types are compatible with all other integer types, but overflow may result at run-time when operating on integers. This is not managed by Typee. Integers may also be operated with floats (see next sub-section on floats). They will first be promoted to floats before operation takes place.

Integer constants are coded as usual. They must begin with a digit (from 0 to 9), followed by further digits with separating underscores allowed, and they must end with a digit, not with an underscore. Constants may be binary, octal, decimal or hexadecimal. Here are examples:

binary     : 0b0101, 0b1111_1111
octal      : 017, 0123_777
decimal    : 123, 9_876_543_210
hexadecimal: 0x1f, 0xffff_ffff_ffff_ffff
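
To make the notation concrete, here is a minimal Python sketch that converts such constants to their integer values. It is not part of Typee: the function name parse_typee_int is hypothetical, and the sketch assumes that prefixes 0b and 0x are case-insensitive and that a leading 0 followed by further digits denotes octal, as the examples above suggest.

```python
def parse_typee_int(text):
    """Convert an integer constant written as described above to an int.

    Assumptions: 0b / 0x prefixes are matched case-insensitively, and a
    leading 0 followed by more digits means octal (as in C).
    """
    # A constant must begin with a digit and must not end with an underscore.
    if not text or not text[0].isdigit() or text.endswith("_"):
        raise ValueError(f"invalid integer constant: {text!r}")
    lowered = text.lower()
    if lowered.startswith("0b"):
        base, digits = 2, lowered[2:]
    elif lowered.startswith("0x"):
        base, digits = 16, lowered[2:]
    elif len(lowered) > 1 and lowered[0] == "0":
        base, digits = 8, lowered[1:]
    else:
        base, digits = 10, lowered
    # Underscores are only visual separators and carry no value.
    return int(digits.replace("_", ""), base)


print(parse_typee_int("0b1111_1111"))  # prints 255
print(parse_typee_int("017"))          # prints 15
print(parse_typee_int("0x1f"))         # prints 31
```

The sketch is lenient about where underscores appear inside a constant; a real translator may apply stricter rules.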

Operators applicable on integers are:

assignment operator
all comparison operators
arithmetic operators
bitwise operators
shifting operators

Operator precedence is the same as in most other programming languages. This is fully described later, when discussing Typee operators.

3.2.3 Floats

Floating numbers in Typee are 32- and 64- bits coded. Corresponding types are resp. float32 and float64. These types are part of scalar types.

Both types are compatible but overflow as well as underflow and precision loss may happen when mixing floating types. When integer and floating numbers appear in arithmetic expressions, integers are converted first into the largest floating type used in this expression.

A special case concerns generic scalar types. Two of them apply to floats:

_float_
_numeric_

These generic types have been added to the initial syntax to shorten lists of types when all listed types are floats. The keywords associated with generic scalar types are enclosed in underscores to mark them as unusual types for Typee. Programmers are strongly encouraged to use them only as shortcuts in code such as, for instance: ? in (float32, float64) (the whole can be safely replaced with _float_).

_numeric_ can be substituted for lists of ALL float types plus ALL integer types, signed and unsigned. This is a very generic numeric type which may greatly shorten long code, but at the cost of much less precise type checking.
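
The relation between generic scalar types and the concrete types they stand for can be summarized with a small Python sketch. The table GENERIC_SCALAR_TYPES and the helper expand_types are hypothetical names introduced only for this illustration; they do not describe how a Typee translator is actually implemented.

```python
SIGNED_INTS = ("int8", "int16", "int32", "int64")
UNSIGNED_INTS = ("uint8", "uint16", "uint32", "uint64")
FLOATS = ("float32", "float64")

# Each generic scalar type stands for a list of concrete scalar types.
GENERIC_SCALAR_TYPES = {
    "_int_": SIGNED_INTS,
    "_uint_": UNSIGNED_INTS,
    "_float_": FLOATS,
    "_numeric_": SIGNED_INTS + UNSIGNED_INTS + FLOATS,
}


def expand_types(names):
    """Replace every generic scalar type with the concrete types it covers."""
    expanded = []
    for name in names:
        for concrete in GENERIC_SCALAR_TYPES.get(name, (name,)):
            if concrete not in expanded:  # keep each type only once
                expanded.append(concrete)
    return expanded


print(expand_types(["_float_", "str"]))  # prints ['float32', 'float64', 'str']
```

Expanding _numeric_ yields ten concrete types, which shows why it weakens type checking compared with a narrower list.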

Floating constants are coded as usual, with separating underscores allowed as for integer constants. Examples are:

0.1, 1., 1.5e-1, 123_456e+18, 5.678_901e207

Operators applicable on floats are:

assignment operator
all comparison operators
arithmetic operators

Operator precedence is the same as in most other programming languages. This is fully described later, when discussing Typee operators.

The built-in library Maths defines many functions and methods that can be applied to floating numbers. This library is described later in this documentation.

3.2.4 Slices

Slices are well known to Python programmers. They are a built-in type in Typee.
Their formal EBNF syntax is:

<slice clause> ::= '[' <expression> ':' [<expression>]
                         [ ':' [<expression>] ]
                         ( ']' | ')' )

The first expression is the starting index of the slice.

The second expression, after : and if present, is the ending index of the slice. This end is included if the slice is closed with a ] and is excluded if the slice is closed with a ). If this second expression is not present after :, the ending index is the very last available index in the container, or the max positive number when looping on numbers. With the current version of Typee, this is the max positive value of an int64, i.e. 0x7fff_ffff_ffff_ffff.

The third expression, after the second : and if present, is the step to be applied when running through numbers from the starting index to the ending index. If this last expression is not present, either because the related : is not present in the slice or because the step value itself is not present, its default value is 1.

Indexes may be negative. Index -1 refers to the very last item of a string or of a container. A negative step indicates a backward run through numbers (indexes) from start down to end; the end value must then be lower than the start value.

We provide full examples of slicing in sub-section dedicated to strings, since slices are used to index parts of strings and other containers. Here is a short summary on slices:

[start:end]
// means "indexing of all items ranging
// from index start up to index end included".

[start:end)
// means "indexing of all items ranging
// from index start up to index end excluded".

[start:end:step]
// means "indexing of all items ranging
// from index start up to index end included
// by steps of value step".

[start:end:step)
// means "indexing of all items ranging
// from index start up to index end excluded
// by steps of value step".

Notice: slices may be used in for statements also, to loop on numbers, from start to end with stepping – in which case numbers may either be integers or floats.

3.2.5 Characters

Typee defines two types of characters: char and char16. These types are part of scalar types.

Characters are NOT strings. This type only defines variables which will contain a single character. Type char corresponds to 8-bits characters (e.g. ASCII ones). Type char16 corresponds to 16-bits characters (e.g. Unicode ones).

Characters can only be compared with characters and with strings (see below for strings). The only valid operators with characters are:

assignment operator
all comparison operators

Assignments are valid when assigning a char to a char or to a char16, and when assigning a char16 to a char16.
Assigning a char16 to a char will raise an overflow error.

Comparison between char and char16 is always possible, but is strongly discouraged since chars coding may be really different, leading to unpredictable results.

Constant values for characters are enclosed between either single quotes or double quotes; both notations are valid in Typee. A few examples of such constants are: 'a', "Z", '\n', "\\", '\0x08', '\0x007f', "\0377". This notation is common among programming languages and Typee does not depart much from it. The formal specification of character constants can be found in the Extended BNF description of Typee.

Type casting is possible on characters. Type casting is a programming concept that allows for the modification of the type of a constant or of a variable. The only allowed type casting with characters is their casting to integers. See an example in the sub-section on integers.
A dedicated built-in function returns the 8-bits or 16-bits coding of a character. This is function ord, which is also usual in programming languages. It returns an 8-bits or a 16-bits unsigned integer according to the type of the character.

3.2.6 Strings

Typee defines two types for strings: str and str16. These types are part of scalar types.
Strings of type str may embed only characters of type char. Strings of type str16 may embed only characters of type char16. They can be modified by applying built-in operators and functions to them. If a char is assigned to a string of type str16, it is transformed into its 16-bits coded equivalent. Conversely, no 16-bits character may be applied to an 8-bits string.

Constant values for strings are embedded between quotes or double-quotes:

'This is a valid string constant'
"This is another valid string constant"

Constant strings may also be split on many code lines:

"This is a "
"multi-lines string.\n"
"This kind of notation is useful when "
"specifying very long string constants."

As shown in above example, classical escape characters may be used in strings: e.g. '\'', "\"", '\0xfe', '\0777'

Available operators for strings are:

assignment operator
all comparison operators
concatenation operator
indexing operator
slicing operator
containing operator

Assignment, comparison, concatenation and containing operators may only be applied to same type of strings. Mixing str and str16 variables with them will raise a type error at translation time.

Let’s have a very short explanation about concatenation, indexing, slicing and containing operators. These concepts will be fully explained later.

The concatenation operator for strings is operator +. It creates a new string which contains the content of the left string followed by the content of the right string. Do not attempt to concatenate strings of different character sizes (str with str16): this is a type error.

The indexing operator is, as usual, []. It is placed at the right of a string and contains a single signed integer value. Examples: "abcdef"[0] is character 'a', "abcdef"[3] is character 'd', "abcdef"[-1] is character 'f' (the last one), "abcdef"[-3] is character 'd'. So, indexing a string returns a character: a char when indexing a str, or a char16 when indexing a str16.

The slicing operator will be of no difficulty for Python programmers. It returns a string of the same type as the sliced string. This operator is the indexing one but with a slice specified in it. Examples:
"abcdef"[0:2] returns "abc" (CAUTION: very different from Python, for which the last index is considered as being one above the targeted index); "abcdef"[0:5:2] returns "ace"; "abcdef"[5:3:-1] returns "fed". To get slicing as usual in Python, one must use the specific indexing operator [), which is more usual in mathematical notation: the last index is excluded from the sliced interval. Examples: "abcdef"[0:2) returns "ab" (as with Python slicing expression [0:2]); "abcdef"[5:3:-1) returns "fe".

The containing operator is in. It is used as with Python. Examples: 'a' in "abcdef" returns True; 'z' in "abcdef" returns False; "abc" in "abcdef" returns True; "cdez" in "abcdef" returns False; "fe" in "abcdef"[-1:0:-1] returns True. There, characters as well as sub-strings may be searched into a string, but characters and strings must be of the same characters size, either both 8-bits or both 16-bits. Any mixing of sizes will raise a type error.
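
The difference between the two closing brackets can be modelled with a short Python sketch. This is only an illustration of the semantics described above, not Typee code: the function typee_slice and its parameter end_included are hypothetical, the sketch requires an explicit end index, and it assumes that out-of-range indexes are errors.

```python
def typee_slice(text, start, end, step=1, end_included=True):
    """Model Typee string slicing.

    end_included=True  models text[start:end:step]  (end index included)
    end_included=False models text[start:end:step)  (end index excluded)
    """
    if step == 0:
        raise ValueError("step must not be zero")
    size = len(text)
    # Negative indexes count from the end: -1 is the last character.
    if start < 0:
        start += size
    if end < 0:
        end += size
    # Assumption: indexes outside the string are rejected.
    if not (0 <= start < size and 0 <= end < size):
        raise IndexError("slice index out of range")
    # Python ranges exclude their stop value, so move the stop one position
    # further in the running direction when the end index is included.
    direction = 1 if step > 0 else -1
    stop = end + direction if end_included else end
    return "".join(text[i] for i in range(start, stop, step))


print(typee_slice("abcdef", 0, 2))                       # prints abc
print(typee_slice("abcdef", 0, 2, end_included=False))   # prints ab
print(typee_slice("abcdef", 5, 3, -1))                   # prints fed
print("fe" in typee_slice("abcdef", -1, 0, -1))          # prints True
```

The printed values match the examples given in this sub-section, including the last one, where the reversed string "fedcba" contains "fe".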

Finally, the built-in library String provides functions and methods for the manipulation of strings. For instance, to get the length (number of characters) of a string, to format it with values, to put it in upper case or lower case, to find first or all indexes of a sub-string in a string, etc. We describe library String later in this documentation. Just be aware that it exists.

3.2.7 Specific Types

Typee offers three other specific types, which are also available in other programming languages such as C++.

Automatic type evaluation at run time

It may happen that a variable cannot get a static type at its declaration time. This is the case when keyword auto is used in C++. We will later see this when explaining some of the looping Typee statements (for, just to name only one).

The declaration of variables, functions and methods whose type will be known at run-time only uses token ? preceding the identifier of the variable, function or method in the declaration clause. A list of allowed types may be specified to limit possibilities and to help type checking at translation time. The formal EBNF specification is:

<auto type>      ::= '?' ['in' <types list>]
<types list>     ::= '(' <TYPE> (',' <TYPE>)* ')'

Example:

my_list = [0, 1., "a-string"];
for( ? obj in my_list )
    print( obj );  // prints 0 1.0 a-string
for( ? in (float32, str) obj in my_list )
    print( obj );  // prints 0.0 1.0 a-string due to integer
                   // promotion of value 0 to float 0.0

Here, variable obj will be first an integer, second a float and third a string. The for loop will be fully described later in this documentation and is used right now just for illustration purpose.

Variable number of arguments

This concept has been available for a while, starting with language C (varargs). It allows the declaration of a variable number of arguments. In Typee, this is considered as a built-in type. It is named with an ellipsis: ... (three dots). This very specific type name has to precede the identifier of the variable number of arguments, in the list of arguments of functions or methods.

Example:

int32 sum( ... nums ){
    int32 s = 0;
    for( int32 num in nums )
        s += num;
    return s;
}
print( sum(1,2,3), sum(10,20), sum(100,110,120,130,140) );

The above code, once translated into some other programming language and run, should print

6 30 600

Here, variable num is declared as being of type int32, so variable argument nums should only contain values of types that are compatible with type int32. Type checking might not be possible at translation time. This is the reason why this kind of coding is discouraged. Nevertheless, it may also be useful, and it is provided as a convenience for careful programmers.

When discussing functions, methods and arguments declaration, we will see that a variable argument need not be the last argument in the list of arguments, but that this should be programmed with care.
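
For readers coming from Python, the sum example above corresponds roughly to the following sketch, in which *nums plays the role of the ellipsis argument. The function is named total here only to avoid shadowing Python's built-in sum; this is an analogy, not a description of what the Typee translator generates.

```python
def total(*nums: int) -> int:
    """Sum a variable number of integer arguments."""
    s = 0
    for num in nums:
        s += num
    return s


print(total(1, 2, 3), total(10, 20), total(100, 110, 120, 130, 140))
# prints 6 30 600
```

As in Typee, nothing here guarantees at the call site that every argument is an integer, which is why such code calls for care.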

No returned value

It will eventually happen that functions and methods run some task without returning any value. This is the case when printing values, for instance, or when displaying results. In such cases, keyword none is used, preceding the identifier of the function or of the method when it is declared. Keyword None is also accepted, since it is a usual keyword in Python (although this is not how it is used in that language).

Example:

forward none print( ...args );

Well, we introduce here a new concept: forward. It will be fully explained later and is used here just for illustration purpose. forward allows forward declarations.

none is something like void in C and C++. We have decided to force its use when functions or methods do not return any value. While it might appear to be a little bit verbose, it helps certify that a function or a method will NOT return any value. In former versions of C, not using keyword void could lead to ambiguities: a function declared with no return type was assumed to return an integer value, and when no value was actually returned the caller could silently receive a meaningless one. We prefer things to be very clear in Typee, even if they get more verbose.

3.2.8 Enumerated Type

Enumerations are sets of identifiers which each get a different internal value automatically set by the system (here, the Typee translator). Those identifiers may be considered as labels. These labels are not scalars, but can be compared for equality. Within containers (see next sub-section about containers types), stored items that have some enumerated labels as attributes may be clustered according to these labels (i.e. grouped together within the container by sorting them on the labels values).

<enum type> ::= 'enum'

enum declares an enumerated type. We will see how enumerated types are declared when explaining what compound statements are. For now, let’s just say that keyword enum is followed by the identifier of the enumerated type and then by a list of identifiers (the “labels”), separated by commas and enclosed within curly brackets.

3.2.9 Container Types

Containers in Typee are of many kinds. They all share one characteristic: they contain items. Contained items can be of any type. We describe here the container types, not their use.

<container type> ::= <array type> | <file type> |
                     <list type>  | <map type>  | <set type>

<array type>     ::= 'array' '<' <types args> '>'
<file type>      ::= 'file' ['<' <types args> '>']
<list type>      ::= 'list' ['<' <types args> '>']
<map type>       ::= 'map'  ['<' <types args> '>']
<set type>       ::= 'set'  ['<' <types args> '>']

<types args>     ::= <TYPE> (',' <TYPE>)*

array is a strongly typed container. The types of the contained items have to be declared, associated with the keyword. The declared types can be any kind of types: scalar types, container types, class names and type aliases (see the later sub-section about type aliasing).
Arrays are one- and multi-dimensional containers. Items are stored contiguously in them. 2-dimensional arrays are well suited to store pixels of an image, while 1-dimensional arrays are vectors, for instance.
Arrays are indexable, which means that the contained items can be directly accessed by indexing the array, one index per declared dimension.

Within a single dimension of an array, items can be sorted if they are comparable. This is the case for scalar types, for instance; we will see how instances of a class can be made comparable items when we present classes in Typee, later in this documentation. We will also see there all the mechanisms and goodies associated with arrays when discussing containers.

Notice: 1-dimensional arrays are somewhat like lists whose contained items are forced to be typed, while this is not mandatory for lists.

list is the type of linear containers which may contain any type of objects. This is the equivalent of a 1-dimensional array with no constraints on types of contained items. Meanwhile, Typee specifies the possibility of declaring types for contained items.

Lists may be sorted, as long as the contained items are all mutually comparable. Built-in operators and methods are specified for copying, referencing, manipulating and running through lists. They will be described when describing lists in Typee.

We will see how to declare lists variables and initialize them when presenting compound statements, later in this document.

set is the type of mathematical sets. A set is something like a bag. It contains strictly unordered items of any type. Moreover, it contains one and only one instance of each item. For instance, inserting the integer 1 twice in a set will insert it only once. Meanwhile, Typee also specifies the declaration of types for the items that are inserted in a set.

So, sets may not be sorted. Built-in operators and methods are specified for copying, referencing, manipulating and running-through sets. They will be described when describing sets in Typee.

See the section compound statements, later in this documentation, to get an overview of how to declare and initialize sets.
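
Python's built-in set follows the same mathematical model and can serve as a rough analogy for the uniqueness rule described above. The snippet below is Python, not Typee, and the variable name items is arbitrary.

```python
items = set()
items.add(1)
items.add(1)       # a second insertion of the same value has no effect
items.add("one")   # items of different types may coexist in the same set

print(len(items))  # prints 2
print(1 in items)  # prints True
```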

map is the type of containers that index their content with keys. It works exactly like a dictionary, in which each word corresponds to its definition. In a map, every contained item corresponds to its own unique key. This way, maps may be indexed on the keys and items can be retrieved very easily.
While contained items may be of any kind in a same map instance, Typee specifies also the declaration of types for the contained items as well as for their keys.

Maps may be sorted, either on their keys or on their items as long as keys or items are comparable. Built-in operators and methods associated with maps will be described later in this document.

The way to declare and to initialize maps will be described in later section compound statements of this documentation.

Notice: specific kinds of maps might be specified in Typee in the future. The first one will be hashmap, for which keys will be automatically evaluated as the hash value of the inserted items.

file is a specific container type. As the reader may guess, files are linear structures that are stored on permanent memory. It deals with the file system of the Operating System. Files entities store characters but they may store any other kind of objects if the types of these objects are specified at declaration time. Furthermore, if contained objects are all of the same type, files may be indexed and items may be directly addressed.

Files are not supposed to be sorted. They can be read and written, one item after the other or whole items at a time. Built-in operators and methods associated with files will be described later in this document.

Declaration of files will be described in later section compound statements of this documentation.

3.2.10 Type aliasing

In Typee, types may be aliased. A type alias is a declaration of a synonym for a type. This makes it possible to shorten type names as well as to rename a built-in Typee type into some more conventional name.

The formal EBNF specification for this is:

<type alias> ::= 'type' <TYPE> 'as' <identifier> 
                     (',' <TYPE> 'as' <identifier>)*

Examples:

type float32 as float
type float64 as double
type uint8 as byte
type ClassA.ClassB.ClassC as My_ClassABC

3.2.11 Built-in module types.ty

Typee provides a built-in module that can be found in directory Libs: types.ty. This built-in module, when imported in your code, offers pre-defined aliases for Typee types. These type aliases should help programmers program in Typee while keeping their own habits for naming types.

Programmers are nevertheless encouraged to prefer Typee types everywhere in their own code. This should help the exchange of code between programmers of a same team or company, and it should ease the delivery of code to Open Source projects.

Here is the code of the types.ty built-in module, with its copyright notice.

/***
Copyright (c) 2018-2019 Philippe Schmouker, Typee project, http://www.typee.ovh

Permission is hereby granted,  free of charge,  to any person obtaining a copy
of this software and associated documentation files (the "Software"),  to deal
in the Software without restriction, including  without  limitation the rights
to use,  copy,  modify,  merge,  publish,  distribute, sublicense, and/or sell
copies of the Software,  and  to  permit  persons  to  whom  the  Software  is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS",  WITHOUT WARRANTY OF ANY  KIND,  EXPRESS  OR
IMPLIED,  INCLUDING  BUT  NOT  LIMITED  TO  THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT  SHALL  THE
AUTHORS  OR  COPYRIGHT  HOLDERS  BE  LIABLE  FOR  ANY CLAIM,  DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT,  TORT OR OTHERWISE, ARISING FROM,
OUT  OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
***/

//=============================================================================
// Built-in Types Aliases

type int8    as byte  ;
type uint8   as ubyte ;
type int16   as short ;
type uint16  as ushort;
type int32   as int   ;
type uint32  as uint  ;
type int64   as long  ;
type uint64  as ulong ;
type float32 as float ;
type float64 as double;

//=====   end of built-in module   types.ty   =====//

To use these types aliases, just import types.ty module in your code. We have not yet seen how to do this, but we nevertheless provide a simple example here, to be fully understood later:

from types import all;
  // automatically searches in directory Libs a module types.ty
long   val64;  // this is a 64-bits signed integer variable
double flt64;  // this is a 64-bits floating point variable
ubyte  uval8;  // this is an 8-bits unsigned byte: 0..255

The next section formally explains the construction of literals, which we have named ‘constants’ up to now in the current section.
