ToUpper() ?
03-09-2015, 01:02 PM
Post: #1
ToUpper() ?
Hello,

I was wondering whether the Prime already has a ToUpper() function. I didn't find one in the catalogue, but chances are high that I overlooked it.

ToUpper("aaa") -> "AAA"
03-09-2015, 06:41 PM
Post: #2
RE: ToUpper() ?
I don't know of a ready-made function, but a combination of ASC and CHAR gives the same result.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".
03-09-2015, 07:50 PM (This post was last modified: 03-09-2015 07:51 PM by PANAMATIK.)
Post: #3
RE: ToUpper() ?
(03-09-2015 06:41 PM)Thomas_Sch Wrote:  I don't know of a ready-made function, but a combination of ASC and CHAR gives the same result.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

This obviously doesn't work with mixed upper and lower case strings like "Abcdefg".
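
To make the failure concrete, here is a small sketch in C rather than in the Prime's language, assuming plain ASCII input; the function names are made up for illustration. It contrasts a blind subtraction with a range-checked one:

```c
#include <stddef.h>
#include <string.h>

/* Blind version: subtracts 32 from every character,
   in the spirit of CHAR(ASC(s)-32). */
void upper_blind(char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i)
        s[i] = (char)(s[i] - 32);
}

/* Guarded version: only touches 'a'..'z' (codes 97..122). */
void upper_guarded(char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i)
        if (s[i] >= 'a' && s[i] <= 'z')
            s[i] = (char)(s[i] - 32);
}
```

With "Abcdefg" the blind version turns the leading 'A' (code 65) into code 33, which is '!', while the guarded version leaves it alone.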

That's one small step for a man - one giant leap for mankind.
03-09-2015, 08:25 PM (This post was last modified: 03-09-2015 08:26 PM by Mark Hardman.)
Post: #4
RE: ToUpper() ?
(03-09-2015 07:50 PM)PANAMATIK Wrote:  
(03-09-2015 06:41 PM)Thomas_Sch Wrote:  I don't know of a ready-made function, but a combination of ASC and CHAR gives the same result.
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

This obviously doesn't work with mixed upper and lower case strings like "Abcdefg".

Nor would it work with accented Latin characters and other non-ASCII characters (e.g. œ to Œ).

The case mapping and other special cases (e.g. for the German eszett [ß]) are fully defined and supported by the Unicode Standard under Case Mappings. It would certainly not be a trivial task to implement.
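
The ß case shows why a purely character-by-character mapping cannot be complete: its full uppercase form is the two letters "SS", so the result can be longer than the input. A rough sketch in C, assuming Latin-1 (ISO 8859-1) bytes rather than the Prime's character set; the function name is hypothetical:

```c
#include <stddef.h>

/* Uppercase a Latin-1 string from `in` into `out` (capacity `cap`,
   terminator included). Returns the output length, or 0 if `out`
   is too small. Illustration only, not a full Unicode mapping. */
size_t upper_latin1(const unsigned char *in, unsigned char *out, size_t cap)
{
    size_t n = 0;

    if (cap == 0)
        return 0;
    for (; *in != '\0'; ++in) {
        unsigned char c = *in;

        if (c == 0xDF) {                    /* ß expands to "SS" */
            if (n + 2 >= cap)
                return 0;
            out[n++] = 'S';
            out[n++] = 'S';
            continue;
        }
        if ((c >= 'a' && c <= 'z') ||
            (c >= 0xE0 && c <= 0xFE && c != 0xF7))  /* à..þ, skipping ÷ */
            c = (unsigned char)(c - 32);
        /* ÿ (0xFF) is left alone: its uppercase form is outside Latin-1. */
        if (n + 1 >= cap)
            return 0;
        out[n++] = c;
    }
    out[n] = '\0';
    return n;
}
```

The point is only that the output buffer needs room to grow; the other special cases in the Unicode tables are not covered here.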

Ceci n'est pas une signature.
03-09-2015, 08:45 PM
Post: #5
RE: ToUpper() ?
How about this:

- Define an array variable like Table(i) with the required dimension, for instance 256 entries (or any other size, depending on the character table to be covered).
- Initialize each array position with the corresponding uppercase value.
- For each input character, use its character code as an index into Table(i) and read off the corresponding uppercase value.

I didn't try to implement this kind of algorithm in the Prime myself, though.
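
For what it's worth, the three steps above could look roughly like this in C. It is only a sketch, assuming a 256-entry table and single-byte character codes; the names are invented:

```c
#include <stddef.h>

/* One entry per character code 0..255. */
static unsigned char upper_table[256];

/* Fill the table once: identity for every code, except 'a'..'z',
   which map to 'A'..'Z'. */
void init_upper_table(void)
{
    for (int i = 0; i < 256; ++i)
        upper_table[i] = (unsigned char)i;
    for (int c = 'a'; c <= 'z'; ++c)
        upper_table[c] = (unsigned char)(c - 32);
}

/* Convert in place by indexing the table with each character code. */
void upper_by_table(unsigned char *s)
{
    for (size_t i = 0; s[i] != '\0'; ++i)
        s[i] = upper_table[s[i]];
}
```

Call init_upper_table() once before the first conversion. Accented letters could be supported by filling in more table entries, without touching the conversion loop.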

Jose Mesquita
RadioMuseum.org member

03-10-2015, 06:27 AM (This post was last modified: 03-10-2015 06:28 AM by Angus.)
Post: #6
RE: ToUpper() ?
I went with the following, mainly because I am interested in the alphanumeric keyboard. Initially I had problems while playing with integers, and I wanted to have string constants converted to upper case. I described the problem poorly. My fault.

Code:

//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  LOCAL off, ch;

  IF TYPE(s)==2 THEN            // type 2 = string
    FOR off FROM 1 TO SIZE(s) DO
      ch:=s(off);               // character code at position off
      IF ch>=97 AND ch<=122 THEN
        ch:=ch-32;
      END;
      s(off):=ch;
    END;
  END;
  RETURN s;                     // hand the converted string back explicitly
END;
03-10-2015, 06:28 AM
Post: #7
RE: ToUpper() ?
Hello,

There are ~65536 chars on the Prime, so your solution would use a 128 KB table...
Clearly the fastest solution, but not the most memory friendly one...
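
The arithmetic behind that figure, written out as a tiny C sketch (assuming one 16-bit entry per character code; the names are only illustrative):

```c
#include <stddef.h>
#include <stdint.h>

/* Number of distinct 16-bit character codes. */
enum { CODE_COUNT = 65536 };

/* Size of a full lookup table with one 16-bit entry per code. */
size_t full_table_bytes(void)
{
    return (size_t)CODE_COUNT * sizeof(uint16_t);   /* 65536 * 2 bytes */
}
```

That comes to 131072 bytes, i.e. 128 KB, compared with 256 bytes for a table that covers single-byte codes only.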

Cyrille
03-10-2015, 07:51 AM
Post: #8
RE: ToUpper() ?
(03-09-2015 06:41 PM)Thomas_Sch Wrote:  
Code:
CHAR(ASC("abcdefg")-32)
results in "ABCDEFG".

(03-10-2015 06:27 AM)Angus Wrote:  
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  LOCAL off, ch;

  IF TYPE(s)==2 THEN
    FOR off FROM 1 TO SIZE(s) DO
      ch:=s(off);
      IF ch>=97 AND ch<=122 THEN
         ch:=ch-32;
      END;
      s(off):=ch;
    END;
  END;
END

Combining the two above:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  IFTE(TYPE(s)==2, CHAR(EXECON("IFTE(&1>=97 AND &1<=122,&1-32,&1)",ASC(s))),s);
END;
03-10-2015, 08:10 AM (This post was last modified: 03-10-2015 08:25 AM by Angus.)
Post: #9
RE: ToUpper() ?
Hello Didier,

Do you have any experience with whether avoiding loops is a good idea, performance-wise? As in MATLAB, I mean.
03-10-2015, 08:10 AM
Post: #10
RE: ToUpper() ?
(03-10-2015 07:51 AM)Didier Lachieze Wrote:  Combining the two above:
Code:
//convert 'a'..'z' to 'A'..'Z'. Leave others unchanged
EXPORT TOUPPER(s)
BEGIN
  IFTE(TYPE(s)==2, CHAR(EXECON("IFTE(&1>=97 AND &1<=122,&1-32,&1)",ASC(s))),s);
END;
Great!!
Many thanks,
I was thinking about a simple solution, but I tripped over my own feet ;-)
Please, please write a little tutorial about that, specifically with EXECON examples!
03-10-2015, 08:26 AM
Post: #11
RE: ToUpper() ?
Yes, I forgot. THANK YOU! Great inspiration, indeed.
03-10-2015, 08:34 AM
Post: #12
RE: ToUpper() ?
(03-10-2015 08:10 AM)Angus Wrote:  Hello Didier,

Do you have any experience with whether avoiding loops is a good idea, performance-wise? As in MATLAB, I mean.
I would say that it depends on what you do in the loop. Here, as it is a simple operation, the loop is faster than the combination of string to list (ASC), list processing (EXECON) and list to string (CHAR).
03-10-2015, 09:16 AM
Post: #13
RE: ToUpper() ?
Just a word about EXECON:

I would address its elements as &11 instead of &1, even if there is only a single list. When dealing with a second list, the &21 notation is used. In my eyes that is not consistent, and a consistent, clear notation is worth a lot. I tried &11 with a single list and it works.
Do I understand correctly that you can only use relative offsets up to 9? I mean, &11 is interpreted as first list, first element, and not as the 11th relative element of an implied first list.
03-10-2015, 12:12 PM (This post was last modified: 03-10-2015 12:17 PM by Didier Lachieze.)
Post: #14
RE: ToUpper() ?
(03-10-2015 09:16 AM)Angus Wrote:  Just a word about EXECON:
I would address its elements as &11 instead of &1, even if there is only a single list. When dealing with a second list, the &21 notation is used. In my eyes that is not consistent, and a consistent, clear notation is worth a lot. I tried &11 with a single list and it works.
The notation is quite flexible, and you can choose to specify the list number and the relative element position in the list explicitly.

For example, with one list a single number after '&' specifies the relative position in the list:
EXECON("&1+&2",{1,3,5}) is the same as
EXECON("&11+&12",{1,3,5})

With two lists a single number after '&' specifies the list used (the relative position is 1 by default):
EXECON("&1+&2",{1,3,5},{2,4,6}) is the same as
EXECON("&11+&21",{1,3,5},{2,4,6})
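
For readers more used to C, the two cases above behave roughly like the following loops. This is only an analogy based on my reading of the examples: the function names are invented, and EXECON itself evaluates an arbitrary expression string, which this sketch does not attempt.

```c
#include <stddef.h>

/* One list, "&1+&2": &1 is the current element and &2 the next one,
   so n inputs give n-1 results. Returns the number of results. */
size_t sum_with_next(const int *a, size_t n, int *out)
{
    if (n < 2)
        return 0;
    for (size_t i = 0; i + 1 < n; ++i)
        out[i] = a[i] + a[i + 1];
    return n - 1;
}

/* Two lists, "&1+&2" (same as "&11+&21"): element i of each list. */
size_t sum_pairwise(const int *a, const int *b, size_t n, int *out)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
    return n;
}
```

Under that reading, the single list {1,3,5} gives two sums (1+3 and 3+5), while the two lists {1,3,5} and {2,4,6} give three (1+2, 3+4 and 5+6).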

(03-10-2015 09:16 AM)Angus Wrote:  Do I understand correctly that you can only use relative offsets up to 9? I mean, &11 is interpreted as first list, first element, and not as the 11th relative element of an implied first list.
Yes, this is clearly stated in the manual and the calculator help.

Btw there is an error in the latest manual (but not in the calculator help): in the first example a '1' is missing after '&'.
[Image: a0np6m]

[Image: a0np68]
03-10-2015, 12:55 PM
Post: #15
RE: ToUpper() ?
(03-10-2015 12:12 PM)Didier Lachieze Wrote:  Btw there is an error in the latest manual (but not in the calculator help): in the first example a '1' is missing after '&'.

Thanks. Looks like they missed that during the file conversion work last time.

TW

Although I work for HP, the views and opinions I post here are my own.
03-11-2015, 11:44 PM
Post: #16
RE: ToUpper() ?
Old-timer assembler programmers would do something like this:
Code:
CHAR(BITOR(ASC("ABcde"),#20h)) -> "abcde"

and

CHAR(BITAND(ASC("ABcde"),BITNOT(#20h))) -> "ABCDE"
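
One caveat with the bit trick, shown here as a C sketch (hypothetical function names, ASCII assumed): bit 5 (20h) is also set in characters that are not letters, such as digits and the space, so clearing it blindly changes them too.

```c
/* Clear bit 5 (0x20): maps 'a'..'z' to 'A'..'Z', but also alters
   any other character that happens to have that bit set. */
unsigned char clear_bit5(unsigned char c)
{
    return (unsigned char)(c & ~0x20u);
}

/* Same trick, applied only to lowercase ASCII letters. */
unsigned char upper_masked(unsigned char c)
{
    return (c >= 'a' && c <= 'z') ? clear_bit5(c) : c;
}
```

For example, '1' (31h) becomes 11h and a space (20h) becomes 00h with the unguarded version, so the masking is only safe when the input is known to contain letters only.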
03-12-2015, 06:25 AM (This post was last modified: 03-12-2015 06:30 AM by bobkrohn.)
Post: #17
RE: ToUpper() ?
Here are my versions of UCase and LCase.
They are easy to understand.
Actually, this concept can be adapted to many other uses, like a filter.
The BITAND and BITOR versions above didn't work correctly.
This version works with spaces, numbers, etc.

Code:

EXPORT UCase(MyText)
BEGIN

LOCAL i,test,temp,c;

test:="abcdefghijklmnopqrstuvwxyz";
temp:="";
c:="";

FOR i FROM 1 TO DIM(MyText) DO
  c := MID(MyText,i,1);
  IF INSTRING(test,c) THEN
    temp := temp + CHAR((ASC(c)-32)); 
  ELSE
    temp := temp + c; 
  END;  
END;

RETURN temp; 
//-----
END;



EXPORT LCase(MyText)
BEGIN

LOCAL i,test,temp,c;

test := "ABCDEFGHIJKLMNOPQRSTUVWXYZ";
temp := "";
c := "";

FOR i FROM 1 TO DIM(MyText) DO
  c := MID(MyText,i,1);
  IF INSTRING(test,c) THEN
    temp := temp + CHAR((ASC(c)+32)); 
  ELSE
    temp := temp + c; 
  END;  
END;

RETURN temp; 
//-----
END;
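
The same "look it up in an alphabet string" idea, generalized into a small filter, might look like this in C. It is a sketch only: the translate name and its parameters are made up, and it assumes single-byte characters.

```c
#include <stddef.h>
#include <string.h>

/* Replace each character found in `from` with the character at the
   same position in `to`; everything else passes through unchanged.
   `from` and `to` must have the same length. */
void translate(char *s, const char *from, const char *to)
{
    for (size_t i = 0; s[i] != '\0'; ++i) {
        const char *hit = strchr(from, s[i]);
        if (hit != NULL)
            s[i] = to[hit - from];
    }
}
```

Passing the lowercase alphabet as `from` and the uppercase alphabet as `to` gives UCase, swapping the two gives LCase, and other pairs of strings give other character filters.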
03-12-2015, 03:49 PM
Post: #18
RE: ToUpper() ?
Hello,
I think that this should work...

for A:= 1 to size(string) do
  if instring("abce....yz", string(A)) then
    string(A):= string(A)-32;
  end;
end;

Cyrille
03-12-2015, 04:19 PM (This post was last modified: 03-12-2015 04:58 PM by Claudio L..)
Post: #19
RE: ToUpper() ?
(03-10-2015 06:28 AM)cyrille de brébisson Wrote:  Hello,

There are ~65536 chars on the Prime, so your solution would use a 128 KB table...
Clearly the fastest solution, but not the most memory friendly one...

Cyrille

Here's my solution in C (can somebody translate it to Prime? Han, perhaps?). I prepared these tables myself from the Unicode standard, with help from a public document about case folding. I did it a few years back, so more symbols/ranges may have been added since.

Code:

static const struct {
   unsigned short start;
   unsigned short end;
   signed int diff;
} folding_table16[] = {
{0x0041,0x005A,32},
{0x00B5,0x00B5,775},
{0x00C0,0x00D6,32},
{0x00D8,0x00DE,32},
{0x0100,0x012E,1},
{0x0132,0x0136,1},
{0x0139,0x0147,1},
{0x014A,0x0176,1},
{0x0178,0x0178,-121},
{0x0179,0x017D,1},
{0x017F,0x017F,-268},
{0x0181,0x0181,210},
{0x0182,0x0184,1},
{0x0186,0x0186,206},
{0x0187,0x0187,1},
{0x0189,0x018A,205},
{0x018B,0x018B,1},
{0x018E,0x018E,79},
{0x018F,0x018F,202},
{0x0190,0x0190,203},
{0x0191,0x0191,1},
{0x0193,0x0193,205},
{0x0194,0x0194,207},
{0x0196,0x0196,211},
{0x0197,0x0197,209},
{0x0198,0x0198,1},
{0x019C,0x019C,211},
{0x019D,0x019D,213},
{0x019F,0x019F,214},
{0x01A0,0x01A4,1},
{0x01A6,0x01A6,218},
{0x01A7,0x01A7,1},
{0x01A9,0x01A9,218},
{0x01AC,0x01AC,1},
{0x01AE,0x01AE,218},
{0x01AF,0x01AF,1},
{0x01B1,0x01B2,217},
{0x01B3,0x01B5,1},
{0x01B7,0x01B7,219},
{0x01B8,0x01B8,1},
{0x01BC,0x01BC,1},
{0x01C4,0x01C4,2},
{0x01C5,0x01C5,1},
{0x01C7,0x01C7,2},
{0x01C8,0x01C8,1},
{0x01CA,0x01CA,2},
{0x01CB,0x01DB,1},
{0x01DE,0x01EE,1},
{0x01F1,0x01F1,2},
{0x01F2,0x01F4,1},
{0x01F6,0x01F6,-97},
{0x01F7,0x01F7,-56},
{0x01F8,0x021E,1},
{0x0220,0x0220,-130},
{0x0222,0x0232,1},
{0x023A,0x023A,10795},
{0x023B,0x023B,1},
{0x023D,0x023D,-163},
{0x023E,0x023E,10792},
{0x0241,0x0241,1},
{0x0243,0x0243,-195},
{0x0244,0x0244,69},
{0x0245,0x0245,71},
{0x0246,0x024E,1},
{0x0345,0x0345,116},
{0x0370,0x0372,1},
{0x0376,0x0376,1},
{0x0386,0x0386,38},
{0x0388,0x038A,37},
{0x038C,0x038C,64},
{0x038E,0x038F,63},
{0x0391,0x03A1,32},
{0x03A3,0x03AB,32},
{0x03C2,0x03C2,1},
{0x03CF,0x03CF,8},
{0x03D0,0x03D0,-30},
{0x03D1,0x03D1,-25},
{0x03D5,0x03D5,-15},
{0x03D6,0x03D6,-22},
{0x03D8,0x03EE,1},
{0x03F0,0x03F0,-54},
{0x03F1,0x03F1,-48},
{0x03F4,0x03F4,-60},
{0x03F5,0x03F5,-64},
{0x03F7,0x03F7,1},
{0x03F9,0x03F9,-7},
{0x03FA,0x03FA,1},
{0x03FD,0x03FF,-130},
{0x0400,0x040F,80},
{0x0410,0x042F,32},
{0x0460,0x0480,1},
{0x048A,0x04BE,1},
{0x04C0,0x04C0,15},
{0x04C1,0x04CD,1},
{0x04D0,0x0526,1},
{0x0531,0x0556,48},
{0x10A0,0x10C5,7264},
{0x10C7,0x10C7,7264},
{0x10CD,0x10CD,7264},
{0x1E00,0x1E94,1},
{0x1E9B,0x1E9B,-58},
{0x1E9E,0x1E9E,-7615},
{0x1EA0,0x1EFE,1},
{0x1F08,0x1F0F,-8},
{0x1F18,0x1F1D,-8},
{0x1F28,0x1F2F,-8},
{0x1F38,0x1F3F,-8},
{0x1F48,0x1F4D,-8},
{0x1F59,0x1F59,-8},
{0x1F5B,0x1F5B,-8},
{0x1F5D,0x1F5D,-8},
{0x1F5F,0x1F5F,-8},
{0x1F68,0x1F6F,-8},
{0x1F88,0x1F8F,-8},
{0x1F98,0x1F9F,-8},
{0x1FA8,0x1FAF,-8},
{0x1FB8,0x1FB9,-8},
{0x1FBA,0x1FBB,-74},
{0x1FBC,0x1FBC,-9},
{0x1FBE,0x1FBE,-7173},
{0x1FC8,0x1FCB,-86},
{0x1FCC,0x1FCC,-9},
{0x1FD8,0x1FD9,-8},
{0x1FDA,0x1FDB,-100},
{0x1FE8,0x1FE9,-8},
{0x1FEA,0x1FEB,-112},
{0x1FEC,0x1FEC,-7},
{0x1FF8,0x1FF9,-128},
{0x1FFA,0x1FFB,-126},
{0x1FFC,0x1FFC,-9},
{0x2126,0x2126,-7517},
{0x212A,0x212A,-8383},
{0x212B,0x212B,-8262},
{0x2132,0x2132,28},
{0x2160,0x216F,16},
{0x2183,0x2183,1},
{0x24B6,0x24CF,26},
{0x2C00,0x2C2E,48},
{0x2C60,0x2C60,1},
{0x2C62,0x2C62,-10743},
{0x2C63,0x2C63,-3814},
{0x2C64,0x2C64,-10727},
{0x2C67,0x2C6B,1},
{0x2C6D,0x2C6D,-10780},
{0x2C6E,0x2C6E,-10749},
{0x2C6F,0x2C6F,-10783},
{0x2C70,0x2C70,-10782},
{0x2C72,0x2C72,1},
{0x2C75,0x2C75,1},
{0x2C7E,0x2C7F,-10815},
{0x2C80,0x2CE2,1},
{0x2CEB,0x2CED,1},
{0x2CF2,0x2CF2,1},
{0xA640,0xA66C,1},
{0xA680,0xA696,1},
{0xA722,0xA72E,1},
{0xA732,0xA76E,1},
{0xA779,0xA77B,1},
{0xA77D,0xA77D,-35332},
{0xA77E,0xA786,1},
{0xA78B,0xA78B,1},
{0xA78D,0xA78D,-42280},
{0xA790,0xA792,1},
{0xA7A0,0xA7A8,1},
{0xA7AA,0xA7AA,-42308},
{0xFF21,0xFF3A,32},
{0,0,0}
};
static const struct {
   int start;
   int end;
   signed int diff;
} folding_table32[] = {
{0x10400,0x10427,40},
{0,0,0}
};


Each table entry has the start and end codes of a range, and an offset value. If your character is within that range (both ends included), you add the offset to obtain the equivalent lowercase character (this table is for case folding, i.e. tolower()).
To use it for toupper(), simply add the offset to the start and end values to get the lowercase range, check whether your character is within it, and subtract the offset instead of adding it.
EDIT: Forgot to mention, when the offset is 1, the uppercase and lowercase symbols alternate within the range. See the example code below.

There are actually two tables: one for 16-bit Unicode values and a second one for 32-bit Unicode characters (only one foldable range is defined above 16 bits).

Code:

#include <stdint.h>

uint32_t casefold(uint32_t character)
{
// THESE TABLES ARE FOR PROPER UNICODE CASE-INSENSITIVE COMPARISON
// TABLES ADD ABOUT 1400 BYTES TO LIBRARY
#include "folding_table.h"
    int idx;

    if(character<0x10000) {
        for(idx=0;folding_table16[idx].start!=0;++idx)
        {
            // ranges are sorted by start code, so stop at the first range past the character
            if(character<folding_table16[idx].start) return character;
            if(character<=folding_table16[idx].end) {
                if(folding_table16[idx].diff==1) {
                    // alternating range: odd positions are already lowercase
                    if( (character-folding_table16[idx].start)&1) return character;
                }
                return character+folding_table16[idx].diff;
            }
        }
        return character;
    }
    // same search in the table for codes above 16 bits
    for(idx=0;folding_table32[idx].start!=0;++idx)
    {
        if(character<folding_table32[idx].start) return character;
        if(character<=folding_table32[idx].end) {
            if(folding_table32[idx].diff==1) {
                if( (character-folding_table32[idx].start)&1) return character;
            }
            return character+folding_table32[idx].diff;
        }
    }

    return character;
}
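
The toupper() direction described above could be sketched along these lines. This is a rough, untuned illustration, not part of the original library: it uses a made-up function name and only a handful of rows copied from the 16-bit table, so most characters are not covered.

```c
#include <stddef.h>
#include <stdint.h>

/* Same layout as the folding tables above. */
struct fold_range {
    uint16_t start;   /* first uppercase code of the range  */
    uint16_t end;     /* last uppercase code of the range   */
    int32_t  diff;    /* offset from uppercase to lowercase */
};

/* Excerpt only: a few rows copied from folding_table16. */
static const struct fold_range excerpt[] = {
    {0x0041, 0x005A, 32},
    {0x00C0, 0x00D6, 32},
    {0x00D8, 0x00DE, 32},
    {0x0100, 0x012E, 1},
    {0x0178, 0x0178, -121},
    {0x0391, 0x03A1, 32},
    {0x0410, 0x042F, 32},
};

uint32_t toupper_sketch(uint32_t c)
{
    const size_t n = sizeof excerpt / sizeof excerpt[0];

    for (size_t i = 0; i < n; ++i) {
        /* Shift the uppercase range by the offset to get the lowercase range. */
        const int64_t lo = (int64_t)excerpt[i].start + excerpt[i].diff;
        const int64_t hi = (int64_t)excerpt[i].end + excerpt[i].diff;
        const int64_t ch = (int64_t)c;

        /* The shifted ranges are no longer sorted, so no early exit here. */
        if (ch < lo || ch > hi)
            continue;
        /* Offset 1 means alternating pairs: odd distances from lo are uppercase. */
        if (excerpt[i].diff == 1 && ((ch - lo) & 1))
            continue;
        return (uint32_t)(ch - excerpt[i].diff);
    }
    return c;   /* no mapping in this excerpt */
}
```

Note that several rows of the full table fold onto the same lowercase code (for example 0x017F and 'S' both fold to 's'), so a complete version has to decide which uppercase form wins; this sketch simply takes the first matching row.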

I'll leave it to the Prime gurus to make a proper Unicode-compliant ToUpper() and ToLower(). In C the tables are only about 1400 bytes; I'm not sure how much space they would need on the Prime. It's not the fastest method, but it's the most embeddable one I could find. Feel free to use it.

Claudio
03-12-2015, 04:44 PM (This post was last modified: 03-12-2015 05:06 PM by bobkrohn.)
Post: #20
RE: ToUpper() ?
Cyrille, my Prime doesn't accept

string(A):= string(A)+32;

I did not know that SIZE could be used on a string; the Help says nothing about it.
I have not experimented with the idea of using "string(A)+n".
I'll have to play with it!
How many other functions have undocumented uses?
Find all posts by this user
Quote this message in a reply
Post Reply 




User(s) browsing this thread: 1 Guest(s)