| Path | csiph.com!v102.xanadu-bbs.net!xanadu-bbs.net!news.albasani.net!news.stack.nl!newsfeed.xs4all.nl!newsfeed2.news.xs4all.nl!xs4all!post.news.xs4all.nl!not-for-mail |
|---|---|
| Return-Path | <sam@giraffetech.biz> |
| X-Original-To | python-list@python.org |
| Delivered-To | python-list@mail.python.org |
| MIME-Version | 1.0 |
| X-Received | by 10.182.148.69 with SMTP id tq5mr38890obb.97.1381214011449; Mon, 07 Oct 2013 23:33:31 -0700 (PDT) |
| X-Originating-IP | [98.234.114.38] |
| Date | Mon, 7 Oct 2013 23:33:31 -0700 |
| Subject | Re for Apache log file format |
| From | Sam Giraffe <sam@giraffetech.biz> |
| To | python-list@python.org |
| Content-Type | multipart/alternative; boundary=089e013a0ad082ef5704e834f21f |
| X-Mailman-Approved-At | Tue, 08 Oct 2013 09:06:17 +0200 |
| X-BeenThere | python-list@python.org |
| X-Mailman-Version | 2.1.15 |
| Precedence | list |
| List-Id | General discussion list for the Python programming language <python-list.python.org> |
| List-Unsubscribe | <https://mail.python.org/mailman/options/python-list>, <mailto:python-list-request@python.org?subject=unsubscribe> |
| List-Archive | <http://mail.python.org/pipermail/python-list/> |
| List-Post | <mailto:python-list@python.org> |
| List-Help | <mailto:python-list-request@python.org?subject=help> |
| List-Subscribe | <https://mail.python.org/mailman/listinfo/python-list>, <mailto:python-list-request@python.org?subject=subscribe> |
| Newsgroups | comp.lang.python |
| Message-ID | <mailman.832.1381215979.18130.python-list@python.org> (permalink) |
| Lines | 103 |
| NNTP-Posting-Host | 2001:888:2000:d::a6 |
| X-Trace | 1381215979 news.xs4all.nl 15950 [2001:888:2000:d::a6]:51486 |
| X-Complaints-To | abuse@xs4all.nl |
| Xref | csiph.com comp.lang.python:56356 |
Hi,
I am trying to split the re pattern for the Apache log file format across
several lines, and I am having trouble getting Python to handle the
multi-line pattern:
#!/usr/bin/python
import re
#this is a single line
string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
#trying to break up the pattern match for easy to read code
pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
                     r'(?P<ident>\-)\s+'
                     r'(?P<username>\-)\s+'
                     r'(?P<TZ>\[(.*?)\])\s+'
                     r'(?P<url>\"(.*?)\")\s+'
                     r'(?P<httpcode>\d{3})\s+'
                     r'(?P<size>\d+)\s+'
                     r'(?P<referrer>\"\")\s+'
                     r'(?P<agent>\((.*?)\))')
match = re.search(pattern, string)
if match:
    print match.group('ip')
else:
    print 'not found'
When I step through with pdb, the Python interpreter skips from the
're.compile' line straight to 'match = re.search' and then to the 'if'
statement. It seems to look only at <ip>, instead of moving on to
<ident> and so on.
mybox:~ user$ python -m pdb /Users/user/Documents/Python/apache.py
> /Users/user/Documents/Python/apache.py(3)<module>()
-> import re
(Pdb) n
> /Users/user/Documents/Python/apache.py(5)<module>()
-> string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
(Pdb) n
> /Users/user/Documents/Python/apache.py(7)<module>()
-> pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
(Pdb) n
> /Users/user/Documents/Python/apache.py(17)<module>()
-> match = re.search(pattern, string)
(Pdb)
Thank you.
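
A possible reading of the session above, offered as a sketch rather than a definitive answer: adjacent string literals are joined into one string when the source is compiled, so the whole re.compile(...) call is a single statement and pdb steps over it in one go. The 'not found' result more likely comes from the pattern itself. The referrer group only accepts an empty "" while the sample line has "-", and the agent group expects parentheses where the sample has a quoted string. The version below keeps the group names from the post but loosens those groups. The character classes are my own assumptions about the log format, and LOG_PATTERN and line are illustrative names. It is written to run on Python 2 and Python 3.

```python
import re

# Sample log line from the post, built from two adjacent literals
# that Python joins into a single string.
line = ('192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" '
        '302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"')

# Adjacent raw strings are concatenated into one pattern before
# re.compile() is called, so this is one statement to the debugger.
LOG_PATTERN = re.compile(
    r'(?P<ip>\d{1,3}(?:\.\d{1,3}){3})\s+'
    r'(?P<ident>\S+)\s+'            # "-" or an identity string
    r'(?P<username>\S+)\s+'         # "-" or a user name
    r'\[(?P<TZ>[^\]]+)\]\s+'        # timestamp without the brackets
    r'"(?P<url>[^"]*)"\s+'          # request line without the quotes
    r'(?P<httpcode>\d{3})\s+'
    r'(?P<size>\d+|-)\s+'           # assumption: size may be logged as "-"
    r'"(?P<referrer>[^"]*)"\s+'     # accepts "-" as well as a real referrer
    r'"(?P<agent>[^"]*)"'           # agent is quoted, not parenthesised
)

match = LOG_PATTERN.search(line)
if match:
    print(match.group('ip'))        # prints 192.168.122.3
    print(match.group('agent'))
else:
    print('not found')
```

Calling match.groupdict() on a successful match returns all named fields at once, which can be handier than fetching each group by name.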
Re for Apache log file format Sam Giraffe <sam@giraffetech.biz> - 2013-10-07 23:33 -0700
Re: Re for Apache log file format Neil Cerutti <neilc@norwich.edu> - 2013-10-08 12:50 +0000
Re: Re for Apache log file format Denis McMahon <denismfmcmahon@gmail.com> - 2013-10-08 15:48 +0000
Re: Re for Apache log file format Skip Montanaro <skip@pobox.com> - 2013-10-08 10:59 -0500
Re: Re for Apache log file format Piet van Oostrum <piet@vanoostrum.org> - 2013-10-09 13:33 -0400