| Date | 2022-02-28 21:39 +1100 |
|---|---|
| Subject | Re: Composite fonts for Unicode strings (was: Final request for feedback) |
| Newsgroups | comp.lang.postscript |
| References | <6211b169$1@news.ausics.net> <014e6696-760d-4aa0-a79c-f2c684db6f31n@googlegroups.com> <20220226014436.0000276a@cvkm.cz> |
| From | David Newall <davidn@davidnewall.com> |
| Message-ID | <621ca662$1@news.ausics.net> |
| Organization | Ausics - https://www.ausics.net |
Hi Carlos,
On 26/2/22 11:44, Carlos wrote:
> A simpler approach is to reencode the UTF-8 string
What an elegant decoder, and I like the iterator with its clever use of
an array.
Invalid sequences should produce U+FFFD. Add:
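    % name unget -   step the named iterator back one byte; assumes element 0
    % of the procedure is its state array, with the read index at position 0
    /unget {
        load 0 get dup 0 get dup 0 gt
        { 1 sub 0 exch put } { pop pop } ifelse
    } def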
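and then only two changes to the decoder. In the invalid-sequence case:
    pop 16#FFFD 0 % invalid sequence
and in the continuation-byte loop:
    6 bitshift nextch not { pop 16#FFFD exit } if   % input ended early
    dup 2#11000000 and 2#10000000 ne                % not a continuation byte?
    { /nextch unget pop pop 16#FFFD exit } if       % put it back; drop byte and partial value
    2#00111111 and add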
It still accepts overlong sequences but gives output consistent with the
input.
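To see the effect end to end, here is a minimal self-contained sketch. The
decoder in it is a hypothetical stand-in, not Carlos's code: the names state,
nextch and utf8decode, and the [ index string ] state layout, are assumptions
made for illustration (unget only needs the state array to be element 0 of
nextch, with the index at position 0). In this sketch the non-continuation
branch discards both the byte and the partial value before pushing U+FFFD.

```postscript
% Hypothetical iterator state: [ read-index source-string ].
/state [ 0 () ] def

% - nextch byte true | false
% //state embeds the state array itself as element 0 of this procedure.
/nextch {
    //state dup 0 get 1 index 1 get     % st idx str
    2 copy length lt {
        1 index get                     % st idx byte
        3 1 roll 1 add 0 exch put       % byte   (index advanced by one)
        true
    } {
        pop pop pop false               % end of input
    } ifelse
} def

% name unget -   (definition as in the post)
/unget {
    load 0 get dup 0 get dup 0 gt
    { 1 sub 0 exch put } { pop pop } ifelse
} def

% string utf8decode array   (assumed decoder shape, for illustration only)
/utf8decode {
    //state exch 1 exch put  //state 0 0 put    % install string, rewind
    [ {
        nextch not { exit } if
        dup 16#80 lt { 0 } {
        dup 2#11100000 and 2#11000000 eq { 2#00011111 and 1 } {
        dup 2#11110000 and 2#11100000 eq { 2#00001111 and 2 } {
        dup 2#11111000 and 2#11110000 eq { 2#00000111 and 3 } {
            pop 16#FFFD 0               % invalid sequence
        } ifelse } ifelse } ifelse } ifelse
        {
            6 bitshift nextch not { pop 16#FFFD exit } if
            dup 2#11000000 and 2#10000000 ne
            { /nextch unget pop pop 16#FFFD exit } if
            2#00111111 and add
        } repeat
    } loop ]
} def

% Output shown as Ghostscript prints it.
(A\303\251) utf8decode ==   % [65 233]    valid two-byte sequence
(\303A)     utf8decode ==   % [65533 65]  U+FFFD, then the ungot "A"
(\342\202)  utf8decode ==   % [65533]     input ends mid-sequence
(\300\200)  utf8decode ==   % [0]         overlong encoding still accepted
```

The second line is the case unget exists for: the "A" that ends the broken
sequence is put back and decoded on the next pass instead of being swallowed.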
Regards,
David