| From | alister <alister.ware@ntlworld.com> |
|---|---|
| Subject | Re: Whittle it on down |
| Newsgroups | comp.lang.python |
| References | <ngejmj$gc4$1@dont-email.me> <1462426755.15465.598690257.42990546@webmail.messagingengine.com> <mailman.397.1462426759.32212.python-list@python.org> <nggku4$p6n$1@dont-email.me> |
| Message-ID | <AXZWy.264624$GG.250375@fx36.am4> |
| Organization | virginmedia.com |
| Date | 2016-05-06 10:01 +0000 |
On Thu, 05 May 2016 19:31:33 -0400, DFS wrote:

> On 5/5/2016 1:39 AM, Stephen Hansen wrote:
>
>> Given:
>>
>> >>> input = [u'Espa\xf1ol', 'Health & Fitness Clubs (36)',
>>     'Health Clubs & Gymnasiums (42)', 'Health Fitness Clubs', 'Name',
>>     'Atlanta city guide', 'edit address', 'Tweet',
>>     'PHYSICAL FITNESS CONSULTANTS & TRAINERS',
>>     'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS',
>>     'www.custombuiltpt.com/', 'RACQUETBALL COURTS PRIVATE',
>>     'www.lafitness.com', 'GYMNASIUMS', 'HEALTH & FITNESS CLUBS',
>>     'www.lafitness.com', 'HEALTH & FITNESS CLUBS', 'www.lafitness.com',
>>     'PERSONAL FITNESS TRAINERS', 'HEALTH CLUBS & GYMNASIUMS',
>>     'EXERCISE & PHYSICAL FITNESS PROGRAMS', 'FITNESS CENTERS',
>>     'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS',
>>     'PERSONAL FITNESS TRAINERS', '5', '4', '3', '2', '1',
>>     'Yellow Pages', 'About Us', 'Contact Us', 'Support',
>>     'Terms of Use', 'Privacy Policy', 'Advertise With Us',
>>     'Add/Update Listing', 'Business Profile Login', 'F.A.Q.']
>>
>> Then:
>>
>> >>> pattern = re.compile(r"^[A-Z\s&]+$")
>> >>> output = [x for x in input if pattern.match(x)]
>> >>> output
>> ['PHYSICAL FITNESS CONSULTANTS & TRAINERS',
>>  'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS',
>>  'RACQUETBALL COURTS PRIVATE', 'GYMNASIUMS',
>>  'HEALTH & FITNESS CLUBS', 'HEALTH & FITNESS CLUBS',
>>  'PERSONAL FITNESS TRAINERS', 'HEALTH CLUBS & GYMNASIUMS',
>>  'EXERCISE & PHYSICAL FITNESS PROGRAMS', 'FITNESS CENTERS',
>>  'HEALTH CLUBS & GYMNASIUMS', 'HEALTH CLUBS & GYMNASIUMS',
>>  'PERSONAL FITNESS TRAINERS']
>
> Should've looked earlier. Their master list of categories
> http://www.usdirectory.com/cat/g0 shows a few commas, a bunch of dashes,
> and the ampersands we talked about.
>
> "OFFICE SERVICES, SUPPLIES & EQUIPMENT" gets removed because of the
> comma.
>
> "AUTOMOBILE - DEALERS" gets removed because of the dash.
>
> I updated your regex and it seems to have fixed it.
>
> orig: (r"^[A-Z\s&]+$")
> new : (r"^[A-Z\s&,-]+$")
>
> Thanks again.

It looks to me like this system is trying to prevent SQL injection
attacks by blacklisting certain characters. This is not the correct way
to block such attacks, and it is probably not a good indicator of the
quality of the rest of the application.

--
When love is gone, there's always justice.
And when justice is gone, there's always force.
And when force is gone, there's always Mom.
Hi, Mom!
		-- Laurie Anderson
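For anyone following along, here is a minimal sketch of the two points above. It compares the original and updated patterns on a few strings taken from the thread, then stores the matches using a bound parameter, which is the usual alternative to character blacklisting when the concern is SQL injection. The `categories` table, its schema, and the in-memory SQLite database are assumptions made purely for illustration; nothing in the thread says how or where the data is actually stored.

```python
import re
import sqlite3

# The two patterns quoted in the thread.
ORIG = re.compile(r"^[A-Z\s&]+$")
NEW = re.compile(r"^[A-Z\s&,-]+$")   # trailing "-" in a class is a literal dash

# Small sample: two entries from the quoted list plus the two
# category names DFS says were being dropped.
sample = [
    "Health & Fitness Clubs (36)",
    "HEALTH CLUBS & GYMNASIUMS",
    "OFFICE SERVICES, SUPPLIES & EQUIPMENT",
    "AUTOMOBILE - DEALERS",
]

kept_orig = [s for s in sample if ORIG.match(s)]
kept_new = [s for s in sample if NEW.match(s)]

# Hypothetical storage step (table name and schema are assumptions).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE categories (name TEXT)")

# The "?" placeholder makes the driver bind each value as data,
# so no character in the string is ever parsed as SQL.
conn.executemany(
    "INSERT INTO categories (name) VALUES (?)",
    [(s,) for s in kept_new],
)
rows = [r[0] for r in conn.execute("SELECT name FROM categories ORDER BY rowid")]

print(kept_orig)
print(kept_new)
print(rows)
```

The regex here only selects which strings look like category names; it does nothing for safety. If the values do end up in a database, the placeholder does that job regardless of which characters the strings contain.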