utf-8 for RPL source code
04-17-2015, 01:04 PM
Post: #4
RE: utf-8 for RPL source code
(04-07-2015 01:06 AM)Claudio L. Wrote:  At this stage I'm debating whether it is worth trying to deal with combining marks, or whether we should just ignore all that and only attempt to display the 256 characters that were originally used in the calculator; any other codes would either be ignored or shown as a square.

To follow up on this, I created a routine that does NFC normalization in streaming mode, so that no temporary memory needs to be allocated at all.
It will take a significant amount of space in tables, and it is quite complex (as in, not going to be fast enough at 6 MHz).
Simpler versions simply couldn't handle the complexity (it's painted as easy in the Unicode specification, but there are so many corner cases, all handled with extra tables...).
My code still doesn't include Hangul compositions, so it only passes 7000 of the more than 18000 tests. Basically, it works properly for all languages except the East Asian ideographic languages (Chinese, Korean, etc.), which require additional huge tables or extra algorithms (Hangul).
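For reference, the Hangul part is the one piece that needs arithmetic rather than tables. The sketch below follows the syllable composition algorithm published in the Unicode Standard; it is not the code from my routine, and the function name `hangul_compose` and its "return 0 when nothing composes" convention are assumptions made up for this example. It composes one pair at a time, which is the shape a streaming normalizer would need.

```c
#include <stdint.h>

/* Constants from the Unicode Standard's Hangul syllable composition algorithm. */
#define S_BASE  0xAC00u   /* first precomposed syllable */
#define L_BASE  0x1100u   /* first leading consonant (L) */
#define V_BASE  0x1161u   /* first vowel (V) */
#define T_BASE  0x11A7u   /* one before the first trailing consonant (T) */
#define L_COUNT 19u
#define V_COUNT 21u
#define T_COUNT 28u
#define N_COUNT (V_COUNT * T_COUNT)   /* 588 */
#define S_COUNT (L_COUNT * N_COUNT)   /* 11172 */

/* Hypothetical helper: try to compose two adjacent code points.
 * Returns the composed syllable, or 0 if the pair does not compose. */
uint32_t hangul_compose(uint32_t first, uint32_t second)
{
    /* Case 1: leading consonant + vowel -> LV syllable */
    if (first >= L_BASE && first < L_BASE + L_COUNT &&
        second >= V_BASE && second < V_BASE + V_COUNT) {
        uint32_t l = first - L_BASE;
        uint32_t v = second - V_BASE;
        return S_BASE + (l * V_COUNT + v) * T_COUNT;
    }

    /* Case 2: LV syllable (no trailing consonant yet) + trailing consonant */
    if (first >= S_BASE && first < S_BASE + S_COUNT &&
        (first - S_BASE) % T_COUNT == 0 &&
        second > T_BASE && second < T_BASE + T_COUNT) {
        return first + (second - T_BASE);
    }

    return 0;   /* not a composable Hangul pair */
}
```

As a usage example, `hangul_compose(0x1112, 0x1161)` gives U+D558, and feeding that result back in with U+11AB gives U+D55C (한). The Chinese side has no such shortcut, which is where the big tables come from.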
I still haven't tried to pack the tables in the most compact way possible; I merely got a working routine, but it doesn't look like it will be fast enough for my purposes.
I think I'll trash the project and go back to my original idea: all strings will be assumed to be already NFC-normalized. newRPL will make no attempt at normalization.
The character subset will use its Unicode code points, and all strings will be UTF-8 encoded. Since the existing character mappings consist only of single characters, they are guaranteed to be already NFC-normalized.
Any strange characters will be displayed as a box or question mark and skipped, not altered by any routines.
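To make the "tolerant" behaviour concrete, here is a minimal sketch of what the display side could look like. It is not newRPL code: `utf8_decode`, `font_has_glyph`, `glyphs_to_draw` and `BOX_GLYPH` are hypothetical names, and `font_has_glyph` is only a stand-in that pretends the font covers U+0000 to U+00FF instead of the real calculator character table. The string bytes are never modified; only the glyph chosen for drawing changes.

```c
#include <stddef.h>
#include <stdint.h>

#define BOX_GLYPH 0x25A1u   /* WHITE SQUARE, assumed placeholder glyph */

/* Decode one code point from s[0..len). Returns the number of bytes
 * consumed (0 only if len == 0). Malformed input yields U+FFFD and
 * consumes a single byte so the caller always makes progress. */
size_t utf8_decode(const uint8_t *s, size_t len, uint32_t *cp)
{
    static const uint32_t min_cp[5] = {0, 0, 0x80, 0x800, 0x10000};
    size_t n, i;
    uint32_t c;

    *cp = 0xFFFD;
    if (len == 0) return 0;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    else if ((s[0] & 0xE0) == 0xC0) { n = 2; c = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { n = 3; c = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { n = 4; c = s[0] & 0x07; }
    else return 1;                       /* stray continuation or bad lead byte */

    if (n > len) return 1;               /* truncated sequence */
    for (i = 1; i < n; i++) {
        if ((s[i] & 0xC0) != 0x80) return 1;
        c = (c << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and out-of-range values. */
    if (c < min_cp[n] || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) return 1;
    *cp = c;
    return n;
}

/* Hypothetical stand-in for the calculator's character table. */
int font_has_glyph(uint32_t cp)
{
    return cp < 0x100;
}

/* Fill out[] with the glyph codes to draw; returns how many were written.
 * Unknown or malformed characters become BOX_GLYPH. */
size_t glyphs_to_draw(const uint8_t *s, size_t len, uint32_t *out, size_t max_out)
{
    size_t pos = 0, count = 0;
    while (pos < len && count < max_out) {
        uint32_t cp;
        pos += utf8_decode(s + pos, len - pos, &cp);
        out[count++] = font_has_glyph(cp) ? cp : BOX_GLYPH;
    }
    return count;
}
```

With this stand-in font, the bytes `61 C3 A9 E2 82 AC` ("a", "é", "€") produce three glyphs: U+0061, U+00E9 and a box for the euro sign.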
So it will be Unicode-aware (tolerant) and UTF-8 encoded, but by no means will I try to get into this mess on embedded systems. When porting to larger hardware, it should be easy to add a normalization step using a standards-compliant library (which takes something like 12 MB of space).
I feel I should be spending time and effort on calculations, not text handling.


Messages In This Thread
utf-8 for RPL source code - Claudio L. - 04-07-2015, 01:06 AM
RE: utf-8 for RPL source code - Claudio L. - 04-07-2015, 01:33 PM
RE: utf-8 for RPL source code - Claudio L. - 04-17-2015, 01:04 PM
RE: utf-8 for RPL source code - Claudio L. - 04-21-2015, 02:06 PM
RE: utf-8 for RPL source code - Claudio L. - 04-24-2015, 01:42 PM
RE: utf-8 for RPL source code - Helix - 04-22-2015, 01:06 AM
RE: utf-8 for RPL source code - Claudio L. - 04-22-2015, 02:08 AM
RE: utf-8 for RPL source code - Helix - 04-22-2015, 10:40 PM
RE: utf-8 for RPL source code - Claudio L. - 04-23-2015, 01:10 PM
RE: utf-8 for RPL source code - Claudio L. - 04-23-2015, 03:52 PM
RE: utf-8 for RPL source code - Helix - 04-23-2015, 11:00 PM
RE: utf-8 for RPL source code - Claudio L. - 04-24-2015, 01:06 PM


