From: Peter Otten <__peter__@web.de>
Newsgroups: comp.lang.python
Subject: Re: Regular expressions
Date: Tue, 03 Nov 2015 10:25:26 +0100
Message-ID: <mailman.10.1446542744.8789.python-list@python.org>
References: <662g3blobme52hfoududj27err185v2npm@4ax.com> <20151102204237.6a78abdf@bigbox.christie.dr> <56382F33.8050905@gmail.com>

Michael Torrie wrote:

> On 11/02/2015 07:42 PM, Tim Chase wrote:
>> On 2015-11-02 20:09, Seymore4Head wrote:
>>> How do I make a regular expression that returns true if the end of
>>> the line is an asterisk
>> 
>> Why use a regular expression?
>> 
>>   if line[-1] == '*':
>>     yep(line)
>>   else:
>>     nope(line)
> 
> Indeed, Jamie Zawinski's quote is often quite appropriate:
> 
>     Some people, when confronted with a problem, think "I know, I'll use
>     regular expressions." Now they have two problems.

Incidentally, the code example has two "problems", too.

- What about the empty string?
- What about lines with a trailing "\n", i.e. as they are usually delivered
  when iterating over a file?
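
Both cases are easy to reproduce. A quick sketch, where last_char_is_star
is just a made-up name for the check quoted above:

```python
def last_char_is_star(line):
    # The check from the quoted example: index the last character directly.
    return line[-1] == "*"


# Empty string: there is no last character, so indexing raises IndexError.
try:
    last_char_is_star("")
    empty_result = "no exception"
except IndexError:
    empty_result = "IndexError"

# Line as delivered by file iteration: the last character is the newline,
# so the asterisk in front of it goes unnoticed.
newline_result = last_char_is_star("foo*\n")

print(empty_result, newline_result)
```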

Below is a comparison of some of your options. The "one obvious way" 
line.rstrip("\n").endswith("*") is not included ;)
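
For reference, that omitted variant could be sketched like this. The helper
name ends_with_star is hypothetical, and the sample lines are the same ones
the comparison script uses:

```python
def ends_with_star(line):
    # Drop trailing newlines only, then test the end of what is left.
    # endswith() on an empty string simply returns False, so no exception.
    return line.rstrip("\n").endswith("*")


samples = ["", "\n", "foo\n", "*\n", "*", "foo*", "foo*\n", "foo*\nbar"]
results = [ends_with_star(line) for line in samples]

for line, result in zip(samples, results):
    print(repr(line), result)
```

Note that rstrip("\n") removes every trailing newline, not just one, so a
line like "foo*\n\n" would count as ending in an asterisk as well.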

$ cat starry_table.py 
import re


def show_table(data, header):
    rows = [header]
    rows.extend([str(c) for c in row] for row in data)
    widths = [max(len(row[i]) for row in rows) for i in range(len(header))]
    template = "  ".join("{:%d}" % w for w in widths)
    for row in rows:
        print(template.format(*row))


def compare(sample_lines):
    for line in sample_lines:
        got_re = bool(re.compile(r"\*$").search(line))
        got_re_M = bool(re.compile(r"\*$", re.M).search(line))
        got_endswith = line.endswith("*")
        got_endswith2 = line.endswith(("*", "*\n"))
        got_substring = line[-1:] == "*"
        try:
            got_char = line[-1] == "*"
        except IndexError:
            got_char = "#exception"
        results = (
            got_re, got_re_M,
            got_endswith, got_endswith2,
            got_substring, got_char)
        yield (
            ["", "X"][len(set(results)) > 1],
            repr(line)) + results


SAMPLE = ["", "\n", "foo\n", "*\n", "*", "foo*", "foo*\n", "foo*\nbar"]
HEADER = [
    "", "line", "regex", "re.M",
    "endswith", 'endswith(("*", "*\\n"))',
    "substring", "char"]

if __name__ == "__main__":
    show_table(compare(SAMPLE), HEADER)


$ python3 starry_table.py 
   line         regex  re.M   endswith  endswith(("*", "*\n"))  substring  char      
X  ''           False  False  False     False                   False      #exception
   '\n'         False  False  False     False                   False      False     
   'foo\n'      False  False  False     False                   False      False     
X  '*\n'        True   True   False     True                    False      False     
   '*'          True   True   True      True                    True       True      
   'foo*'       True   True   True      True                    True       True      
X  'foo*\n'     True   True   False     True                    False      False     
X  'foo*\nbar'  False  True   False     False                   False      False
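
The regex and re.M columns follow from how "$" is defined in the re module:
by default it matches at the end of the string and also just before a
newline that ends the string; with re.M it matches before every newline.
A small sketch of the difference, with "\Z" (match only at the very end of
the string) added for contrast; it is not part of the script above:

```python
import re

line = "foo*\n"

# "$" also matches just before a trailing newline, hence True in the table.
dollar = bool(re.search(r"\*$", line))

# "\Z" matches only at the very end of the string, so the newline is in the way.
strict = bool(re.search(r"\*\Z", line))

# With re.M, "$" matches before any newline, not only the final one.
multi = bool(re.search(r"\*$", "foo*\nbar", re.M))

print(dollar, strict, multi)
```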