


Re: Only partial handling of Unicode in `with points pointtype "CHAR" `?

From Ethan A Merritt <sfeam@users.sourceforge.net>
Newsgroups comp.graphics.apps.gnuplot
Subject Re: Only partial handling of Unicode in `with points pointtype "CHAR" `?
Date 2015-09-28 17:35 -0700
Organization gnuplot development
Message-ID <mucm88$712$1@dont-email.me>
References <851d6499-ce49-4434-9c6e-40281b7443a0@googlegroups.com>



Kalin Kozhuharov wrote:

> Hello,
> 
> I was answering another question here when I might have found a bug, that
> is hard to describe, but relatively easy to reproduce...
> 
> I tested on 2 different Linux machines with gnuplot-4.6.5 and
> gnuplot-5.0.1 and a few terminals (qt, x11, png) the following commands
> (you'll need UTF-8 to see some of them):
> 
> gnuplot -p -e 'plot sin(x) w p pt "A";'
> gnuplot -p -e 'plot sin(x) w p pt "щ";'
> gnuplot -p -e 'plot sin(x) w p pt "猫";'
> 
> The 5.0.1 works fine (4.6.5 is not supposed to allow this pt, so no
> issues), drawing a sine wave with the glyph specified.
> 
> However, the one below (UNICODE CAT FACE, U+1F431) does not work, no
> matter what I tried:
> 
> gnuplot -p -e 'plot sin(x) w p pt "🐱";'

Bug or limitation, whichever you want to call it.

The structure that holds line and point characteristics reserves
one unsigned long (4 bytes on most machines) to hold the UTF-8 byte
stream representing the character. Currently it is handled internally
as a C language string, which means that the 4th byte is required 
to be \0, leaving three bytes for the UTF-8 representation.
That is sufficient for Unicode code points up to U+FFFF, which covers
the "Basic Multilingual Plane", a.k.a. "plane 0". It is not sufficient
for code points higher than that, such as your CAT FACE at U+1F431.
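
The byte counts are easy to check. The following is a minimal Python
sketch, not gnuplot code: FIELD_BYTES, utf8_len and fits_as_c_string
are names made up for illustration, and the 4-byte field size is taken
from the description above. It encodes each test character as UTF-8 and
asks whether the bytes plus a terminating \0 fit in the field.

```python
# Illustration only: mimics a 4-byte field that holds a NUL-terminated
# UTF-8 string. This is not how gnuplot itself is written.
FIELD_BYTES = 4  # assumed field size, per the description above


def utf8_len(ch: str) -> int:
    """Number of bytes in the UTF-8 encoding of ch."""
    return len(ch.encode("utf-8"))


def fits_as_c_string(ch: str) -> bool:
    """True if the UTF-8 bytes plus a trailing NUL fit in the field."""
    return utf8_len(ch) + 1 <= FIELD_BYTES


if __name__ == "__main__":
    # The four point characters from the original report.
    for ch in ["A", "щ", "猫", "\U0001F431"]:
        print(f"U+{ord(ch):04X}: {utf8_len(ch)} byte(s), "
              f"fits: {fits_as_c_string(ch)}")
```

Of the four characters, only U+1F431 needs four bytes, so it is the
only one that leaves no room for the terminator.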

The entire set of Unicode planes could be handled either by
enlarging the lp_style_type->p_char field or by revising the code
that uses it to treat it explicitly as 4 bytes rather than as
a generic string.  Certainly possible, but I don't think it is
a very high priority change.  As you found, using those higher-plane
Unicode characters in labels and other text strings works fine.
Only their use as a point type is limited.
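
To illustrate the second option (treating the field explicitly as 4
bytes), here is a minimal Python sketch. It is an assumption about how
such a revision could work, not a description of any gnuplot code:
seq_len, pack and unpack are hypothetical names. The idea is that the
lead byte of a UTF-8 sequence already encodes the sequence length, so
no terminating \0 is needed to find where the character ends.

```python
# Illustration only: store one UTF-8 character in a fixed 4-byte field
# and read it back without relying on a terminating NUL.
FIELD_BYTES = 4  # assumed field size


def seq_len(lead: int) -> int:
    """Length of a UTF-8 sequence, derived from its lead byte."""
    if lead < 0x80:            # 0xxxxxxx
        return 1
    if lead >> 5 == 0b110:     # 110xxxxx
        return 2
    if lead >> 4 == 0b1110:    # 1110xxxx
        return 3
    if lead >> 3 == 0b11110:   # 11110xxx
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#04x}")


def pack(ch: str) -> bytes:
    """Encode a single character into a zero-padded 4-byte field."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    return ch.encode("utf-8").ljust(FIELD_BYTES, b"\0")


def unpack(field: bytes) -> str:
    """Decode the character, using the lead byte to find its length."""
    return field[:seq_len(field[0])].decode("utf-8")


if __name__ == "__main__":
    for ch in ["A", "щ", "猫", "\U0001F431"]:
        field = pack(ch)
        print(field.hex(), unpack(field) == ch)
```

With this layout all four bytes are available for data, which is enough
for any single Unicode code point, since UTF-8 never uses more than
four bytes per code point.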

	Ethan

> Any ideas? How can that be worded better, assuming it is a bug?
> 
> Cheers,
> Kalin.



Thread

Only partial handling of Unicode in `with points pointtype "CHAR" `? Kalin Kozhuharov <me.kalin@gmail.com> - 2015-09-27 12:10 -0700
  Re: Only partial handling of Unicode in `with points pointtype "CHAR" `? Karl Ratzsch <mail.kfr@gmx.net> - 2015-09-27 23:02 +0200
  Re: Only partial handling of Unicode in `with points pointtype "CHAR" `? Ethan A Merritt <sfeam@users.sourceforge.net> - 2015-09-28 17:35 -0700
    Re: Only partial handling of Unicode in `with points pointtype "CHAR" `? Kalin Kozhuharov <me.kalin@gmail.com> - 2015-09-28 20:50 -0700
