Posted by: Alex Rakov
« on: April 15, 2015, 08:33:11 am »

In this tutorial we will see how to read a literal character by character.

Let's consider a simple example: we will read each character of the literal and print it on the screen.

Code: [Select]
#define system.
#define extensions.

// ...

        #var l := "Hello world".
        #var i := 0.
        #loop (i < l length)?
        [
            console write:(l@i).
           
            i := i + 1.
        ].

So far so good. Let's now make our example a little bit more difficult:

Code: [Select]
        l := "Привет Мир".
        i := 0.
        #loop (i < l length)?
        [
            console write:(l@i).
           
            i := i + 1.
        ].

The first loop works but the second one fails.

Let's find what breaks our code.

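Starting from 1.9.19, LiteralValue is UTF-8, so the literal is almost twice as long in bytes as it is in characters: every Russian character is encoded with two bytes. So why was the code not broken in the first loop? Because every character of "Hello world" takes a single byte in UTF-8, so each index points at the start of a character. CharValue itself is UTF-32 and has enough room to store any Unicode character. But in the second loop, i + 1 lands on the second byte of a two-byte character, and CharValue raises an exception because the code is invalid.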
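
To make the byte counts concrete, here is a small sketch in Python (not ELENA; it only illustrates the UTF-8 encoding itself). The helper names utf8_length and is_continuation_byte are made up for this illustration. It compares the number of characters with the number of UTF-8 bytes for both literals and checks that the second byte of the Russian literal is a continuation byte, which cannot start a character.

```python
# Illustration in Python, not ELENA: character count versus UTF-8 byte count.
ascii_text = "Hello world"
russian_text = "Привет Мир"


def utf8_length(text):
    """Return the number of bytes the text occupies in UTF-8."""
    return len(text.encode("utf-8"))


def is_continuation_byte(byte):
    """A UTF-8 continuation byte has the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


raw = russian_text.encode("utf-8")

print(len(ascii_text), utf8_length(ascii_text))      # 11 11
print(len(russian_text), utf8_length(russian_text))  # 10 19
print(is_continuation_byte(raw[0]))                  # False: first byte of a character
print(is_continuation_byte(raw[1]))                  # True: second byte of the same character
```

The Russian literal has 10 characters but 19 bytes: nine letters at two bytes each plus one space.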

Note that it would work well if "l" were a WideLiteralValue. But we would have a similar problem with Chinese symbols.
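
A rough Python sketch of that note (not ELENA), under the assumption that WideLiteralValue stores UTF-16 and that the problem meant here is surrogate pairs. The helper utf16_units is a made-up name. Russian characters fit into one 16-bit unit each, while a CJK character outside the Basic Multilingual Plane, such as U+20000, needs two units, so indexing by unit would break in the same way.

```python
# Illustration in Python, not ELENA. Assumption: a wide literal is UTF-16.
def utf16_units(text):
    """Return the number of 16-bit code units the text occupies in UTF-16."""
    return len(text.encode("utf-16-le")) // 2


print(utf16_units("Привет Мир"))  # 10: one unit per character
print(utf16_units("\U00020000"))  # 2: one character stored as a surrogate pair
```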

Fortunately we can easily fix the problem with the CharValue.length method. It returns how many bytes it takes to encode the symbol.

Code: [Select]
        i := 0.
        #loop (i < l length)?
        [
            console write:(l@i).
           
            i := i + l@i length.
        ].
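
The same idea can be sketched in Python (not ELENA) on raw UTF-8 bytes. The functions encoded_length and characters are hypothetical helpers written for this illustration, not part of any ELENA library. The index advances by the encoded length of the current character instead of by one, so it always lands on the first byte of the next character.

```python
# Illustration in Python, not ELENA: step through UTF-8 bytes character by character.
def encoded_length(lead_byte):
    """Return how many bytes a UTF-8 sequence occupies, judged by its first byte."""
    if lead_byte < 0x80:
        return 1  # 0xxxxxxx: single-byte (ASCII) character
    if lead_byte >> 5 == 0b110:
        return 2  # 110xxxxx: two-byte sequence, e.g. Russian letters
    if lead_byte >> 4 == 0b1110:
        return 3  # 1110xxxx: three-byte sequence
    if lead_byte >> 3 == 0b11110:
        return 4  # 11110xxx: four-byte sequence
    raise ValueError("not the first byte of a UTF-8 sequence")


def characters(raw):
    """Walk over UTF-8 bytes, advancing by the encoded length of each character."""
    result = []
    i = 0
    while i < len(raw):
        step = encoded_length(raw[i])
        result.append(raw[i:i + step].decode("utf-8"))
        i += step
    return result


print(characters("Привет Мир".encode("utf-8")))
```

Calling encoded_length on a continuation byte raises an error, which mirrors the exception we got when the index was increased by one.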

Or we could use an enumerator:

Code: [Select]
        #var enum := l enumerator.
        #loop (enum next)?
        [
            console write:(enum get).
        ].

And the simplest way is to use the extensions'control helper:

Code: [Select]
       control run:l &forEach: ch [ console write:ch. ].