
Re: How do I automate the removal of all non-ascii characters from my code?

Started by: Gary Herron <gherron@islandtraining.com>
First post: 2011-09-12 01:17 -0700
Last post: 2011-09-12 07:39 -0700
Articles: 2 (2 participants)


This discussion starts older than the indexed window; earlier articles aren't shown. The article labeled Started by below is the oldest one visible, not the original post.


Contents

  Re: How do I automate the removal of all non-ascii characters from my code? Gary Herron <gherron@islandtraining.com> - 2011-09-12 01:17 -0700
    Re: How do I automate the removal of all non-ascii characters from my code? jmfauth <wxjmfauth@gmail.com> - 2011-09-12 07:39 -0700

#13160 — Re: How do I automate the removal of all non-ascii characters from my code?

From: Gary Herron <gherron@islandtraining.com>
Date: 2011-09-12 01:17 -0700
Subject: Re: How do I automate the removal of all non-ascii characters from my code?
Message-ID: <mailman.1018.1315815474.27778.python-list@python.org>

On 09/12/2011 12:49 AM, Alec Taylor wrote:
> Good evening,
>
> I have converted ODT to HTML using LibreOffice Writer, because I want
> to convert from HTML to Creole using python-creole. Unfortunately I
> get this error: "File "Convert to Creole.py", line 17
> SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py
> on line 18, but no encoding declared; see
> http://www.python.org/peps/pep-0263.html for details".
>
> Unfortunately I can't post my document yet (it's a research paper I'm
> working on), but I'm sure you'll get the same result if you write up a
> document in LibreOffice Writer and add some End Notes.
>
> How do I automate the removal of all non-ascii characters from my code?
>
> Thanks for all suggestions,
>
> Alec Taylor



This question does not quite make sense.  The error message is 
complaining about a python file.  What does that file have to do with 
ODT to HTML conversion and LibreOffice?

The error message means the python file (wherever it came from) has a
non-ascii character (as you noted), and so it needs something to tell it
what such a character means.  (That is what the encoding is.)

A comment like this in line 1 or 2 will specify an encoding:
   # -*- coding: <encoding name> -*-
but we'll have to know more about the file "Convert to Creole.py" to
guess what encoding name should be specified there.

You might try utf-8 or latin-1.
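
As a rough illustration of the coding comment (a sketch, not the poster's actual file): the source text below is hypothetical and held in memory as bytes. It assumes the file is UTF-8, where the byte '\xe2' from the error message would be the first byte of characters such as curly quotes and dashes that word processors often insert. The standard library's tokenize.detect_encoding reads the same declaration the interpreter looks for. Note that Python 2 assumes ASCII when nothing is declared, while Python 3 assumes UTF-8.

```python
import io
import tokenize

# Hypothetical source file contents (an assumption for illustration).
# b"\xe2\x80\x99" is the UTF-8 encoding of U+2019, a right single quote.
source = (
    b"# -*- coding: utf-8 -*-\n"
    b"note = u'it\xe2\x80\x99s'\n"
)

# detect_encoding reads at most the first two lines, looking for a
# byte-order mark or a coding comment like the one above.
encoding, _ = tokenize.detect_encoding(io.BytesIO(source).readline)
print(encoding)

# Decoding with the declared encoding recovers the intended character.
text = source.decode(encoding)
print(text)
```

If the declared name does not match how the file was really saved, the same bytes will either fail to decode or decode to the wrong characters.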



#13181

From: jmfauth <wxjmfauth@gmail.com>
Date: 2011-09-12 07:39 -0700
Message-ID: <55455989-d2bf-44d6-b3cf-ba50f1e87a01@d14g2000yqb.googlegroups.com>
In reply to: #13160

On 12 sep, 10:17, Gary Herron <gher...@islandtraining.com> wrote:
> On 09/12/2011 12:49 AM, Alec Taylor wrote:
>
>
>
> > Good evening,
>
> > I have converted ODT to HTML using LibreOffice Writer, because I want
> > to convert from HTML to Creole using python-creole. Unfortunately I
> > get this error: "File "Convert to Creole.py", line 17
> > SyntaxError: Non-ASCII character '\xe2' in file Convert to Creole.py
> > on line 18, but no encoding declared; see
> > http://www.python.org/peps/pep-0263.html for details".
>
> > Unfortunately I can't post my document yet (it's a research paper I'm
> > working on), but I'm sure you'll get the same result if you write up a
> > document in LibreOffice Writer and add some End Notes.
>
> > How do I automate the removal of all non-ascii characters from my code?
>
> > Thanks for all suggestions,
>
> > Alec Taylor
>

The coding of characters is a subject in its own right.
It is independent of any OS or application.

When working with (plain) text files, you should
always be aware of the coding of the text you
are working on. If you are using coding directives,
you must ensure your coding directive matches
the real coding of the text files. A coding
directive is only informative: it declares the
coding, it does not set it.
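
One way to test whether a declared coding matches the real bytes is simply to try decoding them. The helper below, decodes_cleanly, is a hypothetical name introduced for this sketch, and the sample bytes are made up. A failed decode proves a mismatch, but a successful one proves nothing for single-byte codings such as latin-1, which accept every byte value.

```python
def decodes_cleanly(data, encoding):
    """Return True if the bytes can be decoded with the given encoding."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# The same text, "café", saved in two different codings.
utf8_bytes = "caf\u00e9".encode("utf-8")      # b'caf\xc3\xa9'
latin1_bytes = "caf\u00e9".encode("latin-1")  # b'caf\xe9'

print(decodes_cleanly(utf8_bytes, "utf-8"))    # True: directive matches
print(decodes_cleanly(latin1_bytes, "utf-8"))  # False: mismatch detected
print(decodes_cleanly(utf8_bytes, "latin-1"))  # True, yet the text is wrong

# The silent failure case: UTF-8 bytes read as latin-1 give mojibake.
garbled = utf8_bytes.decode("latin-1")
print(garbled)
```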

I'm pretty sure your problem comes from this. There
is a mismatch somewhere that you are not aware of.
Removing non-ascii chars is certainly not a worthwhile
solution. It must work. If you are working
properly, it cannot fail to work.
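
Before removing anything, it may help to see which non-ascii characters are actually in the file and where. The function below, find_non_ascii, is a hypothetical helper for this sketch; it works on text that has already been decoded, and the sample string is invented.

```python
import unicodedata


def find_non_ascii(text):
    """List (line, column, character, Unicode name) for each non-ascii char."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ord(ch) > 127:
                name = unicodedata.name(ch, "UNKNOWN")
                hits.append((line_no, col, ch, name))
    return hits


# Invented sample: curly quotes on the second line, as a word processor
# might produce them.
sample = "x = 1\nmsg = \u201chello\u201d\n"
hits = find_non_ascii(sample)
for hit in hits:
    print(hit)
```

With the positions known, each character can be replaced deliberately (for example a curly quote by a straight one) or kept and covered by a matching coding directive.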

From a linguistic point of view, the web has informed
me that Creole (*all the Creoles*) can be composed with
the iso-8859-1 coding. That means iso-8859-1, cp1252 and
all Unicode coding variants are possible coding directives.
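
These codings do not all cover the same characters, which matters when the text came from a word processor. A small sketch, with encodable as a hypothetical helper name: an accented letter fits in all three codings, while a curly apostrophe (U+2019) exists in cp1252 and UTF-8 but not in iso-8859-1.

```python
def encodable(text, encoding):
    """Return True if every character of text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# U+00E9 is e with acute accent; U+2019 is a right single quotation mark.
for ch in ("\u00e9", "\u2019"):
    for enc in ("iso-8859-1", "cp1252", "utf-8"):
        print(repr(ch), enc, encodable(ch, enc))
```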

jmf
