
Re: Re for Apache log file format

From Neil Cerutti <neilc@norwich.edu>
Newsgroups comp.lang.python
Subject Re: Re for Apache log file format
Date 8 Oct 2013 12:50:22 GMT
Organization Norwich University
Lines 43
Message-ID <bbidceF44feU1@mid.individual.net> (permalink)
References <mailman.832.1381215979.18130.python-list@python.org>



On 2013-10-08, Sam Giraffe <sam@giraffetech.biz> wrote:
>
> Hi,
>
> I am trying to split up the re pattern for Apache log file format and seem
> to be having some trouble in getting Python to understand multi-line
> pattern:
>
> #!/usr/bin/python
>
> import re
>
> #this is a single line
> string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0"
> 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
>
> #trying to break up the pattern match for easy to read code
> pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
>                      r'(?P<ident>\-)\s+'
>                      r'(?P<username>\-)\s+'
>                      r'(?P<TZ>\[(.*?)\])\s+'
>                      r'(?P<url>\"(.*?)\")\s+'
>                      r'(?P<httpcode>\d{3})\s+'
>                      r'(?P<size>\d+)\s+'
>                      r'(?P<referrer>\"\")\s+'
>                      r'(?P<agent>\((.*?)\))')

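I recommend using the re.VERBOSE flag when spelling out a long re.
In verbose mode, unescaped whitespace in the pattern is ignored and
# starts a comment, so the pattern can span several lines of a single
string. It'll make your life incrementally easier.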

pattern = re.compile(
     r"""(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+
         (?P<ident>\-)\s+
         (?P<username>\-)\s+
         (?P<TZ>\[(.*?)\])\s+    # You can even insert comments.
         (?P<url>\"(.*?)\")\s+
         (?P<httpcode>\d{3})\s+
         (?P<size>\d+)\s+
         (?P<referrer>\"\")\s+
         (?P<agent>\((.*?)\))""", re.VERBOSE)
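
A minimal sketch of the verbose pattern applied to the sample line from
the question. Note that the pattern as posted does not appear to match
that line: the referrer group expects "" while the line contains "-", and
the agent group expects the field to begin with a parenthesis rather than
a quote. The version below loosens both to "any quoted string"; that
change, and capturing the fields without their brackets or quotes, are
assumptions made for illustration, not part of the original pattern.

```python
import re

# Sample line from the question, joined back into a single line.
line = ('192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" '
        '302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"')

# Assumption: referrer and agent accept any quoted string so that the
# sample line matches. Group names follow the pattern in the question.
pattern = re.compile(
    r"""(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+
        (?P<ident>-)\s+
        (?P<username>-)\s+
        \[(?P<TZ>.*?)\]\s+          # timestamp, brackets not captured
        "(?P<url>.*?)"\s+           # request line, quotes not captured
        (?P<httpcode>\d{3})\s+
        (?P<size>\d+)\s+
        "(?P<referrer>.*?)"\s+
        "(?P<agent>.*?)"
    """, re.VERBOSE)

match = pattern.match(line)
if match:
    fields = match.groupdict()
    print(fields["ip"], fields["httpcode"], fields["agent"])
```

One caveat with re.VERBOSE: a literal space or # in the pattern has to be
escaped or placed inside a character class, since both are otherwise
treated as layout.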

-- 
Neil Cerutti


