


removing BOM prepended by codecs?

Started by: "J. Bagg" <j.bagg@kent.ac.uk>
First post: 2013-09-24 16:17 +0100
Last post: 2013-09-24 23:34 -0400
Articles: 2 (2 participants)




#54707 — removing BOM prepended by codecs?

From: "J. Bagg" <j.bagg@kent.ac.uk>
Date: 2013-09-24 16:17 +0100
Subject: removing BOM prepended by codecs?
Message-ID: <mailman.298.1380037108.18130.python-list@python.org>

I've checked the original files using od and they don't have BOMs.

I'll remove them in the servlet. The overhead is probably small enough 
unless somebody is doing a massive search. We have a limit anyway to 
prevent somebody stealing the entire set of data.

I started writing the Python search because the ancient C search had 
started putting out BOMs. I'm actually mystified because our home Linux 
box does not add BOMs even though it runs 2.7 but my work one does even 
though it has the same version. The only difference is Fedora 18 v 
Fedora 17.

The BOMs are certainly there:

<86> <AD><FB>%R 10C0203z-621
%A François-Xavier Le_Bourdonnec

0000000 206     255 373   %   R       1   0   C   0   2   0   3   z   -

J
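
One way to test the leading bytes in the dump above is to compare them with the BOM constants in Python's standard codecs module. The sketch below is only an illustration: the helper name leading_bom is hypothetical, and the byte string assumes that the gap between 206 and 255 in the od output is a space (0x20), as the "<86> <AD><FB>" rendering suggests.

```python
import codecs

# BOM constants from the standard library. The UTF-32 entries come first
# so that UTF-32 LE (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
KNOWN_BOMS = [
    ("utf-32-le", codecs.BOM_UTF32_LE),
    ("utf-32-be", codecs.BOM_UTF32_BE),
    ("utf-8", codecs.BOM_UTF8),
    ("utf-16-le", codecs.BOM_UTF16_LE),
    ("utf-16-be", codecs.BOM_UTF16_BE),
]


def leading_bom(data):
    """Return the name of the BOM that data starts with, or None."""
    for name, bom in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None


# Octal 206, 255, 373 from od; the space after the first byte is an assumption.
mystery = bytes([0o206, 0x20, 0o255, 0o373])

print(mystery.hex())
print(leading_bom(mystery))
print(leading_bom(codecs.BOM_UTF8 + b"%R 10C0203z-621"))
```

If the assumption about the bytes holds, the prefix matches none of the standard BOMs, so stripping a BOM would not remove it.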



#54718

From: Piet van Oostrum <piet@vanoostrum.org>
Date: 2013-09-24 23:34 -0400
Message-ID: <m2r4cdingz.fsf@cochabamba.vanoostrum.org>
In reply to: #54707

"J. Bagg" <j.bagg@kent.ac.uk> writes:

> I've checked the original files using od and they don't have BOMs.
>
> I'll remove them in the servlet. The overhead is probably small enough
> unless somebody is doing a massive search. We have a limit anyway to
> prevent somebody stealing the entire set of data.
>
> I started writing the Python search because the ancient C search had
> started putting out BOMs. I'm actually mystified because our home Linux
> box does not add BOMs even though it runs 2.7 but my work one does even
> though it has the same version. The only difference is Fedora 18 v
> Fedora 17.
>
> The BOMs are certainly there:
>
> <86> <AD><FB>%R 10C0203z-621
> %A François-Xavier Le_Bourdonnec
>
> 0000000 206     255 373   %   R       1   0   C   0   2   0   3   z   -
>
That is not a BOM or SIG. It isn't even valid utf-8.
-- 
Piet van Oostrum <piet@vanoostrum.org>
WWW: http://pietvanoostrum.com/
PGP key: [8DAE142BE17999C4]
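
The point in the reply can be checked with a short Python sketch. It assumes the leading bytes are 86 20 AD FB (octal 206, a space, 255, 373, as read from the quoted od output); the helper name is_valid_utf8 is hypothetical. The UTF-8 signature is the three bytes EF BB BF, and a byte such as 0x86 can only appear as a continuation byte, never at the start of a UTF-8 sequence.

```python
import codecs


def is_valid_utf8(data):
    """Return True if data decodes cleanly as UTF-8, else False."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


# Leading bytes as read from the od dump; the 0x20 is an assumption.
prefix = b"\x86 \xad\xfb"

print(codecs.BOM_UTF8.hex())                  # the real UTF-8 signature
print(prefix.startswith(codecs.BOM_UTF8))     # is the prefix that signature?
print(is_valid_utf8(prefix))                  # does the prefix decode at all?

# For comparison: a genuine signature is dropped by the utf-8-sig codec.
signed = codecs.BOM_UTF8 + "%A François-Xavier".encode("utf-8")
print(signed.decode("utf-8-sig"))
```

Since the prefix is neither a signature nor decodable, the utf-8-sig codec would not help here; the stray bytes would have to be traced to whatever writes them.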




