Re: JSON reader/writer in PostScript (second version)

From news@zzo38computer.org.invalid
Newsgroups comp.lang.postscript
Subject Re: JSON reader/writer in PostScript (second version)
Date 2019-09-15 18:34 +0000
Organization Aioe.org NNTP Server
Message-ID <1568569984.bystand@zzo38computer.org>
References <1567559086.bystand@zzo38computer.org> <1567977875.bystand@zzo38computer.org> <1ac2aa42-301c-4343-99de-2943b33fe7b0@googlegroups.com> <a148ab80-3ae6-4bd4-aa86-9c6e8533d8eb@googlegroups.com>


luser droog <luser.droog@gmail.com> wrote:
> 
> The next issue is: I don't understand what to do with unicode 
> characters if they are discovered. It appears that OP's code
> reads in the multibyte sequences, constructs the codepoint in
> an int, and then truncates that to 8 bits and stores it in a
> string. That doesn't seem right, but I can't really think of
> anything better. Maybe an option either to leave the utf8 alone,
> or convert to arrays of integers? It's not clear to me what
> a PostScript program could hope to do with unicode data.
> 
> So I haven't written any utf8 handling. If I do add it, I think
> it should be added to the parser library itself as an input
> filter. The C version has these already.

The first version of my program treats \u escapes in the way you
mention: only the low 8 bits of each codepoint are kept.
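
To make the truncation concrete, here is a minimal Python sketch of that
behaviour. It is not the posted PostScript code; the helper name
`unescape_low8` is hypothetical, and the regular expression is a
simplification that ignores escaped backslashes such as `\\u`.

```python
import re

# Hypothetical helper, not the poster's code: matches a \u escape followed
# by exactly four hex digits (simplification: "\\u" is not special-cased).
_U_ESCAPE = re.compile(rb"\\u([0-9A-Fa-f]{4})")

def unescape_low8(raw: bytes) -> bytes:
    """Replace each \\uXXXX escape with one byte holding codepoint & 0xFF."""
    return _U_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16) & 0xFF]), raw)

print(unescape_low8(rb"caf\u00e9"))  # U+00E9 fits in one byte, so nothing is lost
print(unescape_low8(rb"\u20ac"))     # U+20AC loses its high byte, leaving 0xAC
```

Codepoints up to U+00FF survive as single Latin-1-style bytes; anything
above that is silently folded onto an unrelated byte value.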

The second version of my program has an option to convert \u escapes
into UTF-8 encoding instead. (However, it will not combine surrogate pairs
into astral characters.)
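
A rough Python sketch of such an option follows. It assumes that each \u
escape is encoded on its own, so the two halves of a surrogate pair each
become a separate 3-byte sequence; the post does not say exactly what the
PostScript program emits for surrogates, so treat that part as a guess.
The names `utf8_bytes` and `unescape_utf8` are hypothetical.

```python
import re

# Hypothetical helpers, not the poster's code.
_U_ESCAPE = re.compile(rb"\\u([0-9A-Fa-f]{4})")

def utf8_bytes(cp: int) -> bytes:
    """Encode one codepoint in 0..0xFFFF as UTF-8 without rejecting surrogates."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    return bytes([0xE0 | (cp >> 12),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])

def unescape_utf8(raw: bytes) -> bytes:
    """Replace each \\uXXXX escape with the UTF-8 bytes of that one codepoint."""
    return _U_ESCAPE.sub(lambda m: utf8_bytes(int(m.group(1), 16)), raw)

euro = unescape_utf8(rb"\u20ac")         # one BMP character, three bytes
pair = unescape_utf8(rb"\ud83d\ude00")   # surrogate halves encoded separately (assumption)
print(euro, pair, len(pair))
```

Under this assumption the surrogate pair comes out as six bytes rather than
the four-byte UTF-8 form of U+1F600, so a strict UTF-8 decoder would reject
it.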

Regardless of the version and of the option, any unescaped non-ASCII
characters in the input are passed through as is; the program does not
interpret UTF-8 input at all.

You might be able to write UTF-8 text on the page by using a CMap with
suitable codespace ranges, so maybe Unicode data can be used in that way.
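
As a rough illustration of the codespace idea: a CMap's codespace ranges
tell the interpreter how many bytes make up each character code, with every
byte of a code checked against the matching byte of the range bounds. The
Python sketch below splits a byte string that way. The ranges in `CODESPACE`
are illustrative values chosen to resemble UTF-8, not copied from any real
CMap resource, and the function names are hypothetical. A real interpreter
handles invalid codes differently; this sketch just raises an error.

```python
# Illustrative codespace ranges resembling UTF-8 (an assumption, not taken
# from any real CMap). Each entry is (low bound, high bound), equal lengths.
CODESPACE = [
    (b"\x00", b"\x7f"),
    (b"\xc2\x80", b"\xdf\xbf"),
    (b"\xe0\x80\x80", b"\xef\xbf\xbf"),
    (b"\xf0\x80\x80\x80", b"\xf4\xbf\xbf\xbf"),
]

def in_range(code: bytes, lo: bytes, hi: bytes) -> bool:
    """True if code has the range's length and every byte lies within its bounds."""
    return len(code) == len(lo) and all(l <= c <= h for c, l, h in zip(code, lo, hi))

def split_codes(data: bytes) -> list:
    """Split data into character codes using the first codespace range that matches."""
    codes, i = [], 0
    while i < len(data):
        for lo, hi in CODESPACE:
            code = data[i:i + len(lo)]
            if in_range(code, lo, hi):
                codes.append(code)
                i += len(lo)
                break
        else:
            raise ValueError("no codespace range matches at offset %d" % i)
    return codes

print(split_codes("a\u00e9\u20ac".encode("utf-8")))
```

Each resulting code would then be mapped to a glyph by the rest of the CMap,
which this sketch leaves out.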

If you want to add UTF-8 handling to your own program, though, you can do
it whichever way you think is best.

-- 
Note: I am not always able to read/post messages during Monday-Friday.
