
removing BOM prepended by codecs?

Date Tue, 24 Sep 2013 16:17:18 +0100
From "J. Bagg" <j.bagg@kent.ac.uk>
Organization Dept of Anthropology, University of Kent
Subject removing BOM prepended by codecs?
Newsgroups comp.lang.python
Message-ID <mailman.298.1380037108.18130.python-list@python.org>


I've checked the original files using od and they don't have BOMs.
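
For anyone wanting to repeat the od check from Python, here is a minimal sketch. It assumes "BOM" means one of the standard marks defined in the codecs module; the names KNOWN_BOMS, detect_bom and file_bom are made up for illustration and are not part of the search code discussed here.

```python
import codecs

# Standard byte order marks from the codecs module. The UTF-32 marks come
# first so that UTF-32 LE (FF FE 00 00) is not mistaken for UTF-16 LE (FF FE).
KNOWN_BOMS = [
    ("utf-32-le", codecs.BOM_UTF32_LE),
    ("utf-32-be", codecs.BOM_UTF32_BE),
    ("utf-8", codecs.BOM_UTF8),
    ("utf-16-le", codecs.BOM_UTF16_LE),
    ("utf-16-be", codecs.BOM_UTF16_BE),
]


def detect_bom(data):
    """Return the name of the BOM that `data` starts with, or None."""
    for name, bom in KNOWN_BOMS:
        if data.startswith(bom):
            return name
    return None


def file_bom(path):
    """Read the first four bytes of a file and report any BOM found."""
    with open(path, "rb") as handle:
        return detect_bom(handle.read(4))
```

Calling file_bom() on each source file should return None if the files really are clean, which is what od reported.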

I'll remove them in the servlet. The overhead is probably small enough 
unless somebody is doing a massive search. We have a limit anyway to 
prevent somebody from stealing the entire data set.
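
The same removal could also be done on the Python side instead of in the servlet. This is only a sketch, and it assumes the unwanted prefix is the UTF-8 BOM (EF BB BF), which has not been confirmed here; strip_utf8_bom and decode_without_bom are illustrative names.

```python
import codecs


def strip_utf8_bom(data):
    """Drop one leading UTF-8 BOM (EF BB BF) from a byte string, if present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_without_bom(data):
    """Decode UTF-8 bytes; the 'utf-8-sig' codec skips a leading BOM if present."""
    return data.decode("utf-8-sig")
```

Both functions leave data without a BOM unchanged, so the cost is one prefix comparison per record.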

I started writing the Python search because the ancient C search had 
started putting out BOMs. I'm actually mystified: our home Linux box 
does not add BOMs even though it runs Python 2.7, but my work one does 
even though it has the same version. The only difference is Fedora 18 
versus Fedora 17.

The BOMs are certainly there:

<86> <AD><FB>%R 10C0203z-621
%A François-Xavier Le_Bourdonnec

0000000 206     255 373   %   R       1   0   C   0   2   0   3   z   -
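
To compare the dump with the standard marks, the octal values printed by od can be converted to hex. A small sketch follows; it uses only the three octal fields visible in the line above and does not guess at the blank field after 206. The helper names are illustrative.

```python
import codecs


def od_octal_to_bytes(tokens):
    """Convert octal byte values as printed by od (e.g. '206') to a bytearray."""
    return bytearray(int(token, 8) for token in tokens)


def to_hex(data):
    """Format bytes as space-separated uppercase hex pairs."""
    return " ".join("%02X" % byte for byte in bytearray(data))


# Only the octal fields visible in the dump; the blank field is left out.
visible = od_octal_to_bytes(["206", "255", "373"])

print(to_hex(visible))               # 86 AD FB
print(to_hex(codecs.BOM_UTF8))       # EF BB BF
print(to_hex(codecs.BOM_UTF16_LE))   # FF FE
print(to_hex(codecs.BOM_UTF16_BE))   # FE FF
```

The converted values agree with the <86> <AD><FB> shown above the dump. None of them matches the first byte of the UTF-8 or UTF-16 marks, so it may be worth checking whether the prefix is a standard BOM at all; the dump is truncated, so this is only a hint.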

J
