

Groups > comp.lang.python > #31253 > unrolled thread

pyw program not displaying unicode characters properly

Started by: jjmeric <jjmeric@free.fr>
First post: 2012-10-14 18:55 +0200
Last post: 2012-10-14 16:30 -0400
Articles: 12 (7 participants)



Contents

  pyw program not displaying unicode characters properly jjmeric <jjmeric@free.fr> - 2012-10-14 18:55 +0200
    Re: pyw program not displaying unicode characters properly Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-10-14 19:19 +0200
      Re: pyw program not displaying unicode characters properly Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-14 21:01 +0000
        Re: pyw program not displaying unicode characters properly Dennis Lee Bieber <wlfraed@ix.netcom.com> - 2012-10-14 21:41 -0400
          Re: pyw program not displaying unicode characters properly Steven D'Aprano <steve+comp.lang.python@pearwood.info> - 2012-10-15 07:35 +0000
          Re: pyw program not displaying unicode characters properly Roy Smith <roy@panix.com> - 2012-10-15 07:45 -0400
        Re: pyw program not displaying unicode characters properly Alain Ketterlin <alain@dpt-info.u-strasbg.fr> - 2012-10-15 07:42 +0200
    Re: pyw program not displaying unicode characters properly MRAB <python@mrabarnett.plus.com> - 2012-10-14 18:31 +0100
      Re: pyw program not displaying unicode characters properly jjmeric <jjmeric@free.fr> - 2012-10-14 21:36 +0200
        Re: pyw program not displaying unicode characters properly Ian Kelly <ian.g.kelly@gmail.com> - 2012-10-14 15:19 -0600
          Re: pyw program not displaying unicode characters properly jjmeric <jjmeric@free.fr> - 2012-10-14 23:39 +0200
      Re: pyw program not displaying unicode characters properly Roy Smith <roy@panix.com> - 2012-10-14 16:30 -0400

#31253 — pyw program not displaying unicode characters properly

From: jjmeric <jjmeric@free.fr>
Date: 2012-10-14 18:55 +0200
Subject: pyw program not displaying unicode characters properly
Message-ID: <MPG.2ae50ce060f7e130989681@news.free.fr>

Hi everybody!

Our language lab at INALCO uses a nice language parsing and analysis
program written in Python. As you well know, many languages use
characters that can only be handled with Unicode.

Here is an example of the problem we have on some Windows computers.
In the attached screenshot (DELETED), the Bambara character (a sort of
epsilon) is displayed as a square.

The fact that it works fine on some computers and fails to display the
characters on others suggests that it is a user configuration issue.
Recent observations: it's OK on Windows 7 but not on Vista computers;
it's OK on some Windows XP computers but not on others.

On the computers where it fails, we've tried to play with options in the 
International settings, but are not able to fix it.

Any idea that would help us go in the right direction, or just fix it,
is welcome!

Thanks!
I ni ce! (in Bambara, a language spoken in Mali, West Africa)



#31254

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2012-10-14 19:19 +0200
Message-ID: <87626dc6oq.fsf@dpt-info.u-strasbg.fr>
In reply to: #31253

jjmeric <jjmeric@free.fr> writes:

> Our language lab at INALCO is using a nice language parsing and analysis 
> program written in Python. As you well know a lot of languages use 
> characters that can only be handled by unicode.
>
> Here is an example of the problem we have on some Windows computers.
> In the attached screen-shot (DELETED), 

Usenet has no attachments. Put your document on a publicly
accessible web server, if needed.

> the bambara character (a sort of epsilon)  is displayed as a square.
>
> The fact that it works fine on some computers and fails to display the 
> characters on others suggests that it is a user configuration issue:
> Recent observations: it's OK on Windows 7 but not on Vista computers,
> it's OK on some Windows XP computers, it's not on others Windows XP...

You need a font that has glyphs for all the Unicode characters you use.
See http://en.wikipedia.org/wiki/Unicode_font for a start. I don't know
enough about Windows to give you a name. Anyone?

-- Alain.

P.S.: this has not much to do with Python, which will happily send
out any Unicode character, and cannot know which ones your
terminal/whatever will be able to display.



#31260

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-10-14 21:01 +0000
Message-ID: <507b280e$0$6512$c3e8da3$5496439d@news.astraweb.com>
In reply to: #31254

On Sun, 14 Oct 2012 19:19:33 +0200, Alain Ketterlin wrote:

> Usenet has no attachments. 

*snarfle*

You almost owed me a new monitor. I nearly sprayed my breakfast all over 
it.

"Usenet has no attachments" -- that's like saying that the Web has no 
advertisements. Maybe the websites you visit have no advertisements, but 
there's a *vast* (and often disturbing) part of the WWW that has 
advertisements, some sites are nothing but advertisements.

And so it is with Usenet: there is a vast (and often disturbing) area of 
Usenet containing attachments, and often nothing but attachments. The 
volume of all these attachments is such that it is getting hard to 
find ISPs that provide free access to binary newsgroups, but some still 
do, and dedicated for-fee Usenet providers do too.


-- 
Steven



#31274

From: Dennis Lee Bieber <wlfraed@ix.netcom.com>
Date: 2012-10-14 21:41 -0400
Message-ID: <mailman.2191.1350265325.27098.python-list@python.org>
In reply to: #31260

On 14 Oct 2012 21:01:03 GMT, Steven D'Aprano
<steve+comp.lang.python@pearwood.info> declaimed the following in
gmane.comp.python.general:

> 
> "Usenet has no attachments" -- that's like saying that the Web has no 
> advertisements. Maybe the websites you visit have no advertisements, but 
> there's a *vast* (and often disturbing) part of the WWW that has 
> advertisements, some sites are nothing but advertisements.
> 
> And so it is with Usenet, there is a vast (and often disturbing) area of 
> Usenet containing attachments, and often nothing but attachments. The 
> vast volume of all these attachments are such that it is getting hard to 
> find ISPs that provide free access to binary newsgroups, but some still 
> do, and dedicated for-fee Usenet providers do too.

	Classically, NNTP did not have "attachments" as seen in MIME email.

	It did have "binaries" in some encoding -- UUE, BASE64, or some
newer format, but these encodings were the raw body of the post(s), not
something "attached" as a separate file along with a text body.
-- 
	Wulfraed                 Dennis Lee Bieber         AF6VN
        wlfraed@ix.netcom.com    HTTP://wlfraed.home.netcom.com/
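
To illustrate the point about binaries being carried as the raw body of
a post, here is a minimal sketch using Python's standard base64 module.
The function names encode_for_post and decode_from_post are hypothetical,
chosen only for this illustration; real binary posts also used other
encodings (such as UUE) and were often split across several articles,
which this sketch does not attempt.

```python
import base64


def encode_for_post(data):
    """Turn raw bytes into base64 text lines usable as a plain-text body."""
    # encodebytes() wraps its output into lines of at most 76 characters.
    return base64.encodebytes(data).decode("ascii")


def decode_from_post(body):
    """Recover the original bytes from a base64 text body."""
    return base64.decodebytes(body.encode("ascii"))


payload = bytes(range(256))          # stand-in for a binary file
body = encode_for_post(payload)      # this text would be the post body
restored = decode_from_post(body)    # what a news reader would do on receipt
```

The body is ordinary ASCII text, so the transport never needs to know
that it carries a binary file.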



#31285

From: Steven D'Aprano <steve+comp.lang.python@pearwood.info>
Date: 2012-10-15 07:35 +0000
Message-ID: <507bbcc6$0$29884$c3e8da3$5496439d@news.astraweb.com>
In reply to: #31274

On Sun, 14 Oct 2012 21:41:51 -0400, Dennis Lee Bieber wrote:

> On 14 Oct 2012 21:01:03 GMT, Steven D'Aprano
> <steve+comp.lang.python@pearwood.info> declaimed the following in
> gmane.comp.python.general:
> 
> 
>> "Usenet has no attachments" -- that's like saying that the Web has no
>> advertisements. Maybe the websites you visit have no advertisements,
>> but there's a *vast* (and often disturbing) part of the WWW that has
>> advertisements, some sites are nothing but advertisements.
>> 
>> And so it is with Usenet, there is a vast (and often disturbing) area
>> of Usenet containing attachments, and often nothing but attachments.
>> The vast volume of all these attachments are such that it is getting
>> hard to find ISPs that provide free access to binary newsgroups, but
>> some still do, and dedicated for-fee Usenet providers do too.
> 
> 	Classically, NNTP did not have "attachments" as seen in MIME email.
> 
> 	It did have "binaries" in some encoding -- UUE, BASE64, or some
> newer format, but these encodings were the raw body of the post(s), not
> something "attached" as a separate file along with a text body.


"A rose by any other name..."


A mere implementation detail. The intention is identical: to attach a
non-text file to a message that would otherwise be text. And the
interface is close enough as makes no difference.

You can even have a text part and binary parts in the same news posting.



-- 
Steven



#31289

From: Roy Smith <roy@panix.com>
Date: 2012-10-15 07:45 -0400
Message-ID: <roy-B6203E.07454515102012@news.panix.com>
In reply to: #31274

In article <mailman.2191.1350265325.27098.python-list@python.org>,
 Dennis Lee Bieber <wlfraed@ix.netcom.com> wrote:

> 	Classically, NNTP did not have "attachments" as seen in MIME email.

NNTP (Network News Transfer Protocol) and SMTP (Simple Mail Transfer 
Protocol) are both just ways of shipping around messages.  Neither one 
really knows about attachments.  In both mail and news, "attachments" 
are a higher-level concept encoded inside the message content and 
managed by the various user applications.

> 	It did have "binaries" in some encoding -- UUE, BASE64, or some
> newer format, but these encodings were the raw body of the post(s), not
> something "attached" as a separate file along with a text body.

This is all true of both mail and news, with only trivial changes of the 
formats and names of the encodings.
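
As a rough illustration of "attachments" being a higher-level concept
encoded inside the message content, the sketch below builds a MIME
message with Python's standard email package. The helper name
build_post, the sample text, the fake image bytes, and the filename
shot.png are all assumptions made up for this example; nothing here
talks to an NNTP or SMTP server.

```python
from email import message_from_string, policy
from email.message import EmailMessage


def build_post(text, data, filename):
    """Build a message with a text part and one binary attachment."""
    msg = EmailMessage()
    msg["Subject"] = "screenshot"
    msg.set_content(text)
    # Adding an attachment converts the message to multipart/mixed; the
    # bytes are base64-encoded and stored inside the message body.
    msg.add_attachment(data, maintype="image", subtype="png",
                       filename=filename)
    return msg


fake_png = b"\x89PNG\r\n\x1a\n" + b"not a real image"
msg = build_post("See the attached screenshot.", fake_png, "shot.png")

# What a transport would actually ship: one block of plain text.
wire = msg.as_string()

# A user application parses that text to find the "attachment" again.
parsed = message_from_string(wire, policy=policy.default)
attachments = list(parsed.iter_attachments())
```

Printing wire shows the boundary markers and the base64 block that a
mail or news client later presents as a separate file.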



#31283

From: Alain Ketterlin <alain@dpt-info.u-strasbg.fr>
Date: 2012-10-15 07:42 +0200
Message-ID: <87y5j8b8ar.fsf@dpt-info.u-strasbg.fr>
In reply to: #31260

Steven D'Aprano <steve+comp.lang.python@pearwood.info> writes:

> On Sun, 14 Oct 2012 19:19:33 +0200, Alain Ketterlin wrote:
>
>> Usenet has no attachments. 
>
> *snarfle*
>
> You almost owed me a new monitor. I nearly sprayed my breakfast all over 
> it. [...]

I owe you nothing, and you can do whatever you want with your breakfast.

> "Usenet has no attachments" -- that's like saying that the Web has no 
> advertisements. Maybe the websites you visit have no advertisements, but 
> there's a *vast* (and often disturbing) part of the WWW that has 
> advertisements, some sites are nothing but advertisements.[...]

I really don't know what you are ranting about here. See Dennis' response.

Any idea about a reasonably complete Unicode font on Windows? /That/
would be helpful.

-- Alain.



#31255

From: MRAB <python@mrabarnett.plus.com>
Date: 2012-10-14 18:31 +0100
Message-ID: <mailman.2178.1350235875.27098.python-list@python.org>
In reply to: #31255

On 2012-10-14 17:55, jjmeric wrote:
>
> Hi everybody !
>
> Our language lab at INALCO is using a nice language parsing and analysis
> program written in Python. As you well know a lot of languages use
> characters that can only be handled by unicode.
>
> Here is an example of the problem we have on some Windows computers.
> In the attached screen-shot (DELETED),
> the bambara character (a sort of epsilon)  is displayed as a square.
>
> The fact that it works fine on some computers and fails to display the
> characters on others suggests that it is a user configuration issue:
> Recent observations: it's OK on Windows 7 but not on Vista computers,
> it's OK on some Windows XP computers, it's not on others Windows XP...
>
> On the computers where it fails, we've tried to play with options in the
> International settings, but are not able to fix it.
>
> Any idea that would help us go in the right direction, or just fix it,
> is welcome !
>
> Thanks!
> I ni ce! (in bambara, a language spoken in Mali, West Africa)
>
A square is shown when the font being used doesn't contain a visible
glyph for the codepoint.

Which codepoint is it? What is the codepoint's name?

Here's how to find out:

 >>> hex(ord("Ɛ"))
'0x190'
 >>> import unicodedata
 >>> unicodedata.name("Ɛ")
'LATIN CAPITAL LETTER OPEN E'
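
The same lookup can be applied to a whole string, which may help when it
is not obvious which character is the one showing up as a square. This is
a minimal sketch for Python 3; the function name describe_non_ascii and
the sample string are hypothetical, and the sample simply uses the
capital and small forms of the open E as stand-ins for real text.

```python
import unicodedata


def describe_non_ascii(text):
    """List (character, code point, name) for each distinct non-ASCII character."""
    report = []
    seen = set()
    for ch in text:
        if ord(ch) < 128 or ch in seen:
            continue  # skip plain ASCII and characters already reported
        seen.add(ch)
        # The second argument is returned if the character has no name.
        name = unicodedata.name(ch, "<unnamed>")
        report.append((ch, "U+%04X" % ord(ch), name))
    return report


for ch, codepoint, name in describe_non_ascii("aƐbɛƐ"):
    print(codepoint, name)
# U+0190 LATIN CAPITAL LETTER OPEN E
# U+025B LATIN SMALL LETTER OPEN E
```

The code points it reports are the ones the display font needs glyphs
for.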



#31256

From: jjmeric <jjmeric@free.fr>
Date: 2012-10-14 21:36 +0200
Message-ID: <MPG.2ae5305c2cd0f29989682@news.free.fr>
In reply to: #31255

Alain, MRAB,
Thank you for your prompt responses.

What they suggest to me is that I should look into which font is being
used by this Python for Windows program.
I am not the programmer, so I have no idea where to look.
The program settings do not include a choice of display font.

The font used for display resembles a sort of Helvetica, but I have no
idea how to check this.

Is there some sort of default font, or is there in Python or Python for
Windows any ini file where the font used can be seen, and possibly changed
to a more appropriate one with all the required glyphs (like Lucida Sans
Unicode has)?

Thanks again...



#31261

From: Ian Kelly <ian.g.kelly@gmail.com>
Date: 2012-10-14 15:19 -0600
Message-ID: <mailman.2180.1350249596.27098.python-list@python.org>
In reply to: #31256

On Sun, Oct 14, 2012 at 1:36 PM, jjmeric <jjmeric@free.fr> wrote:
> Is there some sort of default font, or is there in Python or Python for
> Windows any ini file where the font used can be seen, and possibly changed
> to a more appropriate one with all the required glyphs (like Lucida Sans
> Unicode has)?

No, this is up to the program and the GUI framework it uses.  Do you
have any idea which one that would be (e.g. Tkinter, wxPython, PyQt,
etc.)?
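
If the program's source is available, one rough way to answer that
question is to look at its import statements. The sketch below is only a
heuristic under stated assumptions: the GUI_MODULES table and the function
name guess_gui_toolkits are made up for this example, the table is far
from complete, and the regular expression only sees the first module named
on an import line.

```python
import re

# Hypothetical, incomplete map from top-level module name to GUI toolkit.
GUI_MODULES = {
    "Tkinter": "Tkinter (Python 2)",
    "tkinter": "Tkinter (Python 3)",
    "wx": "wxPython",
    "PyQt4": "PyQt",
    "PyQt5": "PyQt",
    "PySide": "PySide (Qt)",
    "gtk": "PyGTK",
}

# Matches "import X" or "from X ..." at the start of a line and captures X.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([A-Za-z_][A-Za-z0-9_]*)",
                       re.MULTILINE)


def guess_gui_toolkits(source):
    """Return the GUI toolkits whose modules appear in import statements."""
    found = []
    for module in IMPORT_RE.findall(source):
        toolkit = GUI_MODULES.get(module)
        if toolkit and toolkit not in found:
            found.append(toolkit)
    return found


sample = "import os\nimport wx\nfrom Tkinter import *\n"
print(guess_gui_toolkits(sample))
# ['wxPython', 'Tkinter (Python 2)']
```

Working on the source text rather than importing the program means the
check does not depend on which Python version the program was written for.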



#31263

From: jjmeric <jjmeric@free.fr>
Date: 2012-10-14 23:39 +0200
Message-ID: <MPG.2ae54f71a6093098989683@news.free.fr>
In reply to: #31261

In article <mailman.2180.1350249596.27098.python-list@python.org>, 
ian.g.kelly@gmail.com says...
> 
> On Sun, Oct 14, 2012 at 1:36 PM, jjmeric <jjmeric@free.fr> wrote:
> > Is there some sort of default font, or is there in Python or Python for
> > Windows any ini file where the font used can be seen, and possibly changed
> > to a more appropriate one with all the required glyphs (like Lucida Sans
> > Unicode has)?
> 
> No, this is up to the program and the GUI framework it uses.  Do you
> have any idea which one that would be (e.g. Tkinter, wxPython, PyQt,
> etc.)?

Thanks, Ian.
I have no idea, but thanks to you I now have an interesting question
to ask the team in Russia who works on this... more later!



#31259

From: Roy Smith <roy@panix.com>
Date: 2012-10-14 16:30 -0400
Message-ID: <roy-9529A0.16300814102012@news.panix.com>
In reply to: #31255

In article <mailman.2178.1350235875.27098.python-list@python.org>,
 MRAB <python@mrabarnett.plus.com> wrote:
 
> Which codepoint is it? What is the codepoint's name?
> 
> Here's how to find out:
> 
>  >>> hex(ord("Ɛ"))
> '0x190'
>  >>> import unicodedata
>  >>> unicodedata.name("Ɛ")
> 'LATIN CAPITAL LETTER OPEN E'

Wow, I never knew you could do that.  I usually just google for "unicode 
0190" :-)
