Re: Re for Apache log file format

From Neil Cerutti <neilc@norwich.edu>
Newsgroups comp.lang.python
Subject Re: Re for Apache log file format
Date 2013-10-08 12:50 +0000
Organization Norwich University
Message-ID <bbidceF44feU1@mid.individual.net>
References <mailman.832.1381215979.18130.python-list@python.org>


On 2013-10-08, Sam Giraffe <sam@giraffetech.biz> wrote:
>
> Hi,
>
> I am trying to split up the re pattern for Apache log file format and seem
> to be having some trouble in getting Python to understand multi-line
> pattern:
>
> #!/usr/bin/python
>
> import re
>
> #this is a single line
> string = '192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0"
> 302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"'
>
> #trying to break up the pattern match for easy to read code
> pattern = re.compile(r'(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+'
>                      r'(?P<ident>\-)\s+'
>                      r'(?P<username>\-)\s+'
>                      r'(?P<TZ>\[(.*?)\])\s+'
>                      r'(?P<url>\"(.*?)\")\s+'
>                      r'(?P<httpcode>\d{3})\s+'
>                      r'(?P<size>\d+)\s+'
>                      r'(?P<referrer>\"\")\s+'
>                      r'(?P<agent>\((.*?)\))')

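I recommend using the re.VERBOSE flag when spelling out a long re.
With it, unescaped whitespace in the pattern is ignored and # starts
a comment, so you can lay the pattern out one piece per line in a
single string. It'll make your life incrementally easier.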

pattern = re.compile(
     r"""(?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+
         (?P<ident>\-)\s+
         (?P<username>\-)\s+
         (?P<TZ>\[(.*?)\])\s+    # You can even insert comments.
         (?P<url>\"(.*?)\")\s+
         (?P<httpcode>\d{3})\s+
         (?P<size>\d+)\s+
         (?P<referrer>\"\")\s+
         (?P<agent>\((.*?)\))""", re.VERBOSE)
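
One caveat, as far as I can tell: the referrer and agent groups above are carried over unchanged from the quoted pattern, and they will not match the sample line, where the referrer is "-" rather than "" and the agent is quoted rather than parenthesised. Below is a minimal sketch of a verbose pattern that does match that one sample line. The group names `time` and `request`, the use of \S+ for ident and username, and the assumption that referrer and agent are always double-quoted fields are my own choices for illustration, not part of the original pattern, and real logs may need something looser.

```python
import re

# Sample line from the question, joined back onto a single line.
line = ('192.168.122.3 - - [29/Sep/2013:03:52:33 -0700] "GET / HTTP/1.0" '
        '302 276 "-" "check_http/v1.4.16 (nagios-plugins 1.4.16)"')

# Assumption: referrer and agent are double-quoted fields. The brackets
# and quotes are kept outside the named groups so that the captured
# values come back without their delimiters.
log_pattern = re.compile(r"""
    (?P<ip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+
    (?P<ident>\S+)\s+
    (?P<username>\S+)\s+
    \[(?P<time>.*?)\]\s+        # timestamp, brackets excluded
    "(?P<request>.*?)"\s+       # request line, quotes excluded
    (?P<httpcode>\d{3})\s+
    (?P<size>\d+|-)\s+          # a size of "-" is allowed as well
    "(?P<referrer>.*?)"\s+
    "(?P<agent>.*?)"
    """, re.VERBOSE)

match = log_pattern.match(line)
# groupdict() maps each named group to the text it captured.
fields = match.groupdict() if match else {}
print(fields)
```

Checking the result of match() for None before calling groupdict() is worth keeping, since a line that does not fit the pattern gives no match at all rather than a partial one.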

-- 
Neil Cerutti


