| Header | Value |
|---|---|
| From | Stephen Hansen <me+python@ixokai.io> |
| Newsgroups | comp.lang.python |
| Subject | Re: Whittle it on down |
| Date | Thu, 05 May 2016 12:09:42 -0700 |
| Message-ID | <mailman.414.1462475384.32212.python-list@python.org> |
On Thu, May 5, 2016, at 11:03 AM, Steven D'Aprano wrote:
> - Nobody could possibly want to support non-ASCII text. (Apart from the
> approximately 6.5 billion people in the world that don't speak English of
> course, an utterly insignificant majority.)
Oh, I'd absolutely want to support non-ASCII text. If I have Unicode
input, though, I unfortunately have to rely on
https://pypi.python.org/pypi/regex as 're' doesn't support matching on
character properties.
I keep hoping it'll replace "re"; then we could do:
    pattern = regex.compile(r"^[\p{Lu}\s&]+$")
where \p{property} matches against character properties in the Unicode
database.
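For readers without the third-party regex module, here is a rough
standard-library sketch of the same check. It is not the approach
proposed above: it swaps the \p{Lu} pattern for a per-character lookup
with unicodedata.category(), and the helper name is_upper_words is made
up for illustration. str.isspace() is only an approximation of \s.

```python
import unicodedata


def is_upper_words(text):
    """Hypothetical helper: True if text is non-empty and consists only of
    uppercase letters (Unicode category Lu), whitespace, and '&'."""
    return bool(text) and all(
        unicodedata.category(ch) == "Lu" or ch.isspace() or ch == "&"
        for ch in text
    )


if __name__ == "__main__":
    # Cyrillic and Greek capitals are category Lu, so this passes.
    print(is_upper_words("МОСКВА & ΑΘΗΝΑ"))
    # Lowercase letters are category Ll, so this fails.
    print(is_upper_words("Москва"))
```

This walks the string in Python rather than in a compiled matcher, so it
will likely be slower than a real \p{Lu} pattern on large inputs.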
> - Data validity doesn't matter, because there's no possible way that you
> might accidentally scrape data from the wrong part of a HTML file and end
> up with junk input.
Um, no one said that. I was arguing that the *regular expression*
doesn't need to be responsible for validation.
> - Even if you do somehow end up with junk, there couldn't possibly be any
> real consequences to that.
No one said that either...
> - It doesn't matter if you match too much, or too little, that just means
> the specs are too pedantic.
Or that...
--
Stephen Hansen
m e @ i x o k a i . i o