| From | Ian Kelly <ian.g.kelly@gmail.com> |
|---|---|
| Newsgroups | comp.lang.python |
| Subject | Re: non printable (moving away from Perl) |
| Date | 2016-03-11 10:08 -0700 |
| Message-ID | <mailman.28.1457716135.26429.python-list@python.org> |
| References | <nbt27u$fe7$1@gioia.aioe.org> <mailman.17.1457698399.26429.python-list@python.org> <nbukcd$gs2$1@gioia.aioe.org> <nbus27$1si$1@ger.gmane.org> |
On Fri, Mar 11, 2016 at 9:34 AM, Wolfgang Maier
<wolfgang.maier@biologie.uni-freiburg.de> wrote:
> On 11.03.2016 15:23, Fillmore wrote:
>>
>> On 03/11/2016 07:13 AM, Wolfgang Maier wrote:
>>>
>>> One lesson for Perl regex users is that in Python many things can be
>>> solved without regexes.
>>> How about defining:
>>>
>>> printable = {chr(n) for n in range(32, 127)}
>>>
>>> then using:
>>>
>>> if (set(my_string) - set(printable)):
>>>     break
>>
>>
>> seems computationally heavy. I have a file with about 70k lines, of
>> which only 20 contain "funny" chars.
>>
>
> Not sure what you call computationally heavy. I just test-parsed a 30 MB
> file (28k lines) with:
>
> with open(my_file) as i:
>     for line in i:
>         if set(line) - printable:
>             continue
>
> and it finished in less than a second.
Did your test file contain on the order of 100 unique characters, or
on the order of 100,000? Granted that most input data would likely
fall into the former category.
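
For what it's worth, here is a rough sketch of a variant that never builds a per-line set, so its cost should not depend on how many distinct characters a line contains. The function names are made up for illustration, and I have not timed either version against real data. Note also that a line read from a file still ends in "\n", which is outside range(32, 127), so I am assuming the newline gets stripped before the check.

```python
# Printable ASCII, as defined earlier in the thread.
printable = {chr(n) for n in range(32, 127)}


def has_nonprintable(line):
    """Return True as soon as one character outside `printable` is seen.

    any() short-circuits, so no intermediate set of the line's
    characters is ever constructed.
    """
    return any(ch not in printable for ch in line)


def has_nonprintable_set(line):
    """The set-difference check quoted above, wrapped for comparison.

    This first builds a set holding every unique character in the line.
    """
    return bool(set(line) - printable)


if __name__ == "__main__":
    samples = ["plain ascii text", "tab\there", "caf\u00e9", ""]
    for s in samples:
        # Strip the trailing newline first when the string comes from a file.
        print(repr(s), has_nonprintable(s.rstrip("\n")))
```

Both functions should give the same answer for any string; the only difference is how much work is done before that answer is known.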