Groups | Search | Server Info | Login | Register


Groups > comp.unix.shell > #1716

The First Pure Shell Contest (PUSH): relativepath

From Jens Schweikhardt <schweikh@schweikhardt.net>
Newsgroups comp.unix.shell, comp.unix.programmer, comp.programming.contests
Subject The First Pure Shell Contest (PUSH): relativepath
Followup-To poster
Date 2011-08-19 17:23 +0000
Message-ID <9b7kg7F3njU1@mid.individual.net> (permalink)

Cross-posted to 3 groups.

Followups directed to: poster

Show all headers | View raw


    The First Pure Shell Contest (PUSH)

              == Motivation ==

The shell is not known for powerful string processing. Seasoned
programmers use various combinations of sed, awk, cut and others to
perform the more sophisticated transformations, or sacrifice portability
by using their shell's extensions, like arrays. But calls to external
utilities and use of shell extensions are often not needed. Knowing what
the POSIX shell can do may turn a slow non-portable script into a
blazingly fast and portable hacker's delight. The goal is to show the
absurdity of forking and nonportable constructs when a nice and small
pure shell solution solves the problem just as well.

              == Task ==

Write a POSIX shell function named 'relativepath' which takes as
arguments two absolute canonicalized pathnames, and stores the relative
path from the first to the second in the variable "result" or "." when
the arguments are the same. Example: relativepath /foo/bar /foo/baz sets
result to "../baz". A canonicalized pathname is one that contains no "."
or ".." directories and no occurrences of "//"; only the root directory
may end with "/". If an argument is relative or noncanonical, the
function's exit status shall be 1 and the "result" variable shall
contain the string "<%s> not absolute or canonical" where %s is replaced
with the argument in question. For successful invocations the exit
status shall be 0.

More formally, relativepath must successfully pass the unit_test function
in evaluate.sh below.

              == Entry format ==

Entries must consist of a single file implementing the relativepath function.
The first line should be a comment identifying the submitter.

The code should be indented with four spaces (no tabs) as shown in the
sample entry below to

 1) keep it readable
 2) make it easy to determine a "statement count"
 3) avoid writing one long line to improve performance measurements

Entries will be stripped of comments and leading whitespace before
evaluation so that well commented entries suffer no performance penalty,
however slight, due to parsing more characters. The strip_comments
function is quite simplistic: comments must start a line or begin
with " # [A-Z]", i.e. space, hash, space, uppercase letter. It is
your responsibility to verify that this particular way of preprocessing
leaves your entry functional.

              == Sample Entry ==

------------------------------------------------------------------------
# J. Random Hacker (or Anonymous if you prefer it.)
helperfunc() {
    printf '42\n'
}
relativepath() {
    # Note: not considered exhaustive!
    # See <URL:http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html>
    helperfunc
    for foo in bar baz; do     # Another comment.
        something > output
    done
    while test ...; do
        somethingelse < input
    done
    case word in
    ...)
        foo
        bar
        ;;
    ...)
        baz
    esac
    if test ...; then
        when true
    elif [ ... ]; then
        when true
    else
        when false
    fi
    command arg1 arg2 ...
    builtin arg1 arg2 ...
    pipe1 |
        pipe2 |
        pipe3
    a &&
        b &&
        c ||
        d
    bg1 &
    bg2 &
    wait
    empty=    # One assignment per line.
    result=...
    name1=val1 \
        name2=val2 \
        oneshotcmd with 2 variable assignments
    eval "
        code as above indented an additional level, e.g.
        case word in
        pat1)
            foo
            bar
            ;;
        pat2)
            baz
        esac
    "
    trap "
        code as above indented an additional level, e.g.
        case word in
        pat1)
            foo
            bar
            ;;
        pat2)
            baz
        esac
    " SIG
}
------------------------------------------------------------------------

Slight variations are acceptable, but try to stick to a general "one
command per line" principle and do not attempt to outsmart the
simplistic statement counter described below.

              == How to submit your entry ==

Submit entries via email to schweikh AT schweikhardt DOT net
(Followup-To: poster is set). The "Subject:" should contain the string
"relativepath entry 1" (without the quotes). Each contestant may submit
up to three entries; adjust the number in the subject accordingly. A
submission with the same number replaces an earlier submission. The
contest closes September 1st 2011, 00:00:01 UTC. By submitting an entry
the submitter assures that the submission is original work and agrees
that his/her work may be modified and published in usenet newsgroups and
other media, free of charge.

              == Entry Evaluation ==

Entries compete in three categories:

    1) Least statement count (tiebreaker: character count, then performance)
    2) Least character count (tiebreaker: statement count, then performance)
    3) Best performance      (tiebreaker: statement count, then character count)

Entries are evaluated by running

    $ env -i BC=$(which bc) PRINTF=printf SORT=$(which sort) RUNS=10 LOOPS=1000 ./evaluate.sh ./yourentry
    Statements: 999  Characters: 9999  Performance: 999999 # J. Random Hacker

Lower numbers are better. The evaluate.sh script is attached below. The
current version of evaluate.sh can be downloaded from
<URL:http://www.schweikhardt.net/evaluate.sh>. You are encouraged to use
evaluate.sh to test your entry. env is the POSIX utility,
<URL:http://pubs.opengroup.org/onlinepubs/9699919799/utilities/env.html>.
The number of RUNS and LOOPS may differ but will be the same for all
entries. If printf is not built-in to your shell, you can provide an
absolute path name, e.g. PRINTF=/usr/bin/printf.

The statement count is computed by the statement_count function. In words:

    2 for each conditional control structure (if, elif, while)
    1 for each for or case
    1 for each break or continue
    1 for each case label
    1 for each name=value or name=    Put them on separate lines. See sample entry.
    1 for each built-in
    1 for each function call
    eval: 1 + Statement count of argument
    trap: 2 + Statement count of action argument
    here-docs: 1 per line; lines include terminator

The character count is computed by the character_count function. In words:

    * Whitespace and semicolons are free.
    * Comments on a line by themselves are free.
    * Comments on a code line with '#' preceded by whitespace are free.

The performance is the sum of all fields of the POSIX 'times' utility,
converted to milliseconds, after running a loop with a suitable number
of iterations. Several runs are performed and the minimum number is used
as the actual performance. Note that performance measurement is
impossible if your shell's times command uses a non-standard output
format. It must look like this

    zsh/bash/ash $ times
    0m0.20s 0m0.18s
    10m2.81s 1m58.70s

Additional digits of precision are okay, but these formats are not:

    ksh93 $ times
    user    0m1.20s
    sys     0m0.30s

    pdksh $ times
    Shell:     0.10s user     0.20s system
    Kids:      0.20s user     0.50s system


              == Other tidbits ==

    1 Make no assumptions about the value of $0. Your script is run (sourced)
      using an arbitrary file name.
    2 Do not assume that $PWD is a writable directory; use the /tmp directory
      if you need to write and read back files.
    3 You can assume that 'test' and '[' are built-ins. If this is not true
      for your shell you may set PATH with env -i PATH=... to the
      directory with the 'test' and '[' executables for the sole purpose of
      using these external commands. It must run without PATH on a shell where
      these are built-ins (i.e. on the contest evaluation machine).
    4 Your entry should work in the general case, not just for the
      particular invocations tested by the test_suite function.

    VERBOTEN! NOT ALLOWED! DON'T EVEN THINK OF IT!

    5 Invoking external commands with $(command) or `command`. This should
      fail anyway due to PATH being unset.
    6 Invoking an external with an absolute name with /path/to/command.
    7 Abusing any variable used by evaluate.sh
    8 (Re)defining or interfering with unit_test, performance_test,
      statement_count or character_count to improve evaluation results.
    9 Setting PATH (except for 3 above).

    Entries doing so will be disqualified.

              == Announcement of Winners ==

Winners will be announced during September 2011 in comp.unix.shell.

For each of the three categories, the best three submissions are awarded
the gold, silver and bronze medal, respectively. An entry may win in
more than one category. The original (indented and commented) winning
entries are posted to comp.unix.shell, naming the submitter as stated in
the entry's "# J. Random Hacker" comment.

              == Prizes ==

Fame and admiration on the 'Net, especially in comp.unix.shell. Maybe an
offer to write or coauthor a book on shell programming? You're in it for
the honor, not for profit.

              == Contest Operator's Entries ==

The contest operator reserves the right to participate in the contest.
Because it would be an unfair advantage if the contest operator looked
at competing submissions and stole the best ideas, the contest operator
must finish his submissions before the contest is announced and make
public the MD5 sums of his submission(s) in the contest announcement.
Should any of the contest operator's entries win, the public is able to
verify that it was not due to modifications after contest announcement.

Contest operator's entry MD5 sums:
MD5 (entry1) = 02e80db5a7885a4f28dcd1f559057cbf
MD5 (entry2) = 4f84167130276a7f290493f52cf517dc
MD5 (entry3) = ac7a0d00c25358fd278eb51d4352809f

Happy hacking everybody!

  Jens Schweikhardt -- Contest Operator

             == evaluate.sh ==

#!/bin/sh
#!/usr/local/bin/bash
#!/usr/local/bin/zsh
# ksh93 and pdksh -- Bogus performance due to non-standard times(1) output.
#
# evaluate.sh - test and evaluate a contest entry
#
# $Id: evaluate.sh,v 1.7 2011/08/19 17:00:28 schweikh Exp schweikh $

unit_test () {
    while read FROM TO RESULT; do
        relativepath "$FROM" "$TO"
        XS=$?
        if test "$XS $result" != "$RESULT"; then
            $PRINTF '%s\n' "OOPS! Expected '$RESULT' for '$FROM' to '$TO' but got '$XS $result'"
           return 1
        fi
    done <<EOF
/            foo          1 <foo> not absolute or canonical
/            ./foo        1 <./foo> not absolute or canonical
/            ../foo       1 <../foo> not absolute or canonical
bar          /            1 <bar> not absolute or canonical
./bar        /            1 <./bar> not absolute or canonical
../bar       /            1 <../bar> not absolute or canonical
//           /foo         1 <//> not absolute or canonical
/foo//bar    /foo         1 </foo//bar> not absolute or canonical
/.           /foo         1 </.> not absolute or canonical
./           /foo         1 <./> not absolute or canonical
/foo/../bar  /foo         1 </foo/../bar> not absolute or canonical
/foo/./bar   /foo         1 </foo/./bar> not absolute or canonical
/foo/bar/    /foo         1 </foo/bar/> not absolute or canonical
/foo/bar/.   /foo         1 </foo/bar/.> not absolute or canonical
/            /            0 .
/-           /-           0 .
/-x          /-x          0 .
/            /-x          0 -x
/-n          /-n          0 .
/            /-n          0 -n
/-z          /-z          0 .
/            /-z          0 -z
/--          /--          0 .
/--          /--/--       0 --
/--/--       /--          0 ..
/?           /?           0 .
/            /?           0 ?
/??          /??          0 .
/            /??          0 ??
/???         /???         0 .
/            /???         0 ???
/?*          /?*          0 .
/            /?*          0 ?*
/*           /*           0 .
/            /*           0 *
/\ .         /\ .         0 .
/*           /**          0 ../**
/*           /***         0 ../***
/*.*         /*.**        0 ../*.**
/*.???       /*.??        0 ../*.??
/[]          /[]          0 .
/][          /][          0 .
/[a-z]*      /[0-9]*      0 ../[0-9]*
/'           /"           0 ../"
/""''        /''""        0 ../''""
/"'"'        /'"'"        0 ../'"'"
/foo         /foo         0 .
/foo         /            0 ..
/            /foo         0 foo
/            /foo/bar     0 foo/bar
/            /foo/bar/baz 0 foo/bar/baz
/foo         /foo/bar/baz 0 bar/baz
/foo/bar     /            0 ../..
/foo/bar     /foo/bar     0 .
/foo/bar     /foo         0 ..
/foo/bar     /foo/baz     0 ../baz
/foo/bar     /bar/foo     0 ../../bar/foo
/foo/bar/baz /gnarf/blurfl/blubb 0 ../../../gnarf/blurfl/blubb
/foo/bar/baz /gnarf       0 ../../../gnarf
/foo/bar/baz /foo/baz     0 ../../baz
/foo.        /bar.        0 ../bar.
/.foo        /.bar        0 ../.bar
/foo..       /bar..       0 ../bar..
/..foo       /..bar       0 ../..bar
/..foo..     /..bar..     0 ../..bar..
/baz/..foo.. /baz/..bar.. 0 ../..bar..
/.baz./..foo.. /.baz./..bar.. 0 ../..bar..
/a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p /1/2/3/4/5/6/7/8/9/A/B/C/D/E/F 0 ../../../../../../../../../../../../../../../../1/2/3/4/5/6/7/8/9/A/B/C/D/E/F
/f$$-$PPID   /t$PPID-$$   0 ../t$PPID-$$
EOF
    FUNKYNAME=$($PRINTF '/ -n -z C:\\\a\b\f\n\r\t\v\\ \b%% $ & ( ) = ? $(touch x) * ~ # _ ; : . , ! { } [ ] ^ | < >')
    relativepath "/ A name with/spaces in it!" "$FUNKYNAME"
    XS=$?
    if test "$XS $result" != "0 ../..$FUNKYNAME"; then
        $PRINTF '%s\n' "OOPS! Expected '0 ../..$FUNKYNAME' but got '$XS $result'"
        return 1
    fi
}

statement_count () {
    FILE=$1
    COUNT=0
    while read LINE; do
        case $LINE in
        '' | 'relativepath'* | '"' | 'done' | 'else' | 'fi' | 'esac' | '}' | '#'*)
            ;;
        'if '* | 'elif '* | 'while '*)
            COUNT=$(($COUNT + 2))
            ;;
        *)
            COUNT=$(($COUNT + 1))
            ;;
        esac
    done < "$FILE"
    $PRINTF 'Statements: %3u' $COUNT
}

character_count () {
    FILE=$1
    COUNT=0
    while read -r LINE; do
        case $LINE in
        '#'*)
            continue
            ;;
        esac
        set -- $LINE
        for WORD in $*; do
            case $WORD in
            *';'*)
                while test -n "$WORD"; do
                    if test "${WORD%${WORD#?}}" != ";"; then
                        COUNT=$(($COUNT + 1))
                    fi
                    WORD=${WORD#?}
                done
                ;;
            '#'*)
                break
                ;;
            *)
                COUNT=$(($COUNT + ${#WORD}))
                ;;
            esac
        done
    done < "$FILE"
    $PRINTF '  Characters: %4u' $COUNT
}

performance_test () {
    L=$LOOPS
    while test $L -gt 0; do
        unit_test
        L=$(($L - 1))
    done
    times
}

strip_comments () {
    FILE=$1
    while read -r LINE; do
        case $LINE in
        '#'*)
            continue
            ;;
        *)
            LINE=${LINE%% # [A-Z]*}
            $PRINTF '%s\n' "${LINE%${LINE##*[^ ]}}"
            ;;
        esac
    done < "$FILE"
}

: ${RUNS:=10} ${LOOPS:=100}
: ${PRINTF:=printf} ${BC:=/usr/bin/bc} ${SORT:=/usr/bin/sort}

test -n "$ZSH_VERSION" && setopt shwordsplit
strip_comments "$1" > "$1.nc"
. "$1.nc"
unit_test || exit 1
statement_count "$1.nc"
character_count "$1.nc"
while test $RUNS -gt 0; do
    performance_test | {
        IFS="ms" read M1 S1 M2 S2 REST
        IFS="ms" read M3 S3 M4 S4 REST
        $PRINTF 'scale=0;60000*(%d+%d+%d+%d)+1000*(%.3f+%.3f+%.3f+%.3f)/1\n' \
             $M1 $M2 $M3 $M4 $S1 $S2 $S3 $S4 |
        $BC
    }
    RUNS=$(($RUNS - 1))
done |
$SORT -n | {
    read MILLISECONDS
    read SUBMITTER < "$1"
    $PRINTF '  Performance: %6u %s\n' $MILLISECONDS "$SUBMITTER"
}
exit 0

# vi: tabstop=4 shiftwidth=4 expandtab fileformat=unix

Back to comp.unix.shell | Previous | NextNext in thread | Find similar


Thread

The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <schweikh@schweikhardt.net> - 2011-08-19 17:23 +0000
  Re: The First Pure Shell Contest (PUSH): relativepath Stephane CHAZELAS <stephane_chazelas@yahoo.fr> - 2011-08-19 18:13 +0000
    Re: The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <usenet@schweikhardt.net> - 2011-08-19 19:12 +0000
      Re: The First Pure Shell Contest (PUSH): relativepath pk <pk@pk.invalid> - 2011-08-19 21:14 +0200
        Re: The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <usenet@schweikhardt.net> - 2011-08-19 19:29 +0000
          Re: The First Pure Shell Contest (PUSH): relativepath pk <pk@pk.invalid> - 2011-08-19 21:37 +0200
            Re: The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <usenet@schweikhardt.net> - 2011-08-19 20:04 +0000
              Re: The First Pure Shell Contest (PUSH): relativepath pk <pk@pk.invalid> - 2011-08-19 22:25 +0200
                Re: The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <usenet@schweikhardt.net> - 2011-08-19 21:08 +0000
  Re: The First Pure Shell Contest (PUSH): relativepath Rainer Weikusat <rweikusat@mssgmbh.com> - 2011-08-19 19:58 +0100
    Re: The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <usenet@schweikhardt.net> - 2011-08-19 19:23 +0000
      Re: The First Pure Shell Contest (PUSH): relativepath Rainer Weikusat <rweikusat@mssgmbh.com> - 2011-08-19 22:18 +0100
  Re: The First Pure Shell Contest (PUSH): relativepath Stephane CHAZELAS <Stephane.CHAZELAS@free.fr> - 2011-08-20 20:07 +0000
    Re: The First Pure Shell Contest (PUSH): relativepath Jens Schweikhardt <usenet@schweikhardt.net> - 2011-08-21 10:43 +0000
    Re: The First Pure Shell Contest (PUSH): relativepath Rainer Weikusat <rweikusat@mssgmbh.com> - 2011-08-21 15:14 +0100
      Re: The First Pure Shell Contest (PUSH): relativepath Janis Papanagnou <janis_papanagnou@hotmail.com> - 2011-08-21 18:28 +0300
      Re: The First Pure Shell Contest (PUSH): relativepath Stephane CHAZELAS <stephane_chazelas@yahoo.fr> - 2011-08-21 16:19 +0000
        Re: The First Pure Shell Contest (PUSH): relativepath Stephane CHAZELAS <stephane_chazelas@yahoo.fr> - 2011-08-21 20:12 +0000
  Re: The First Pure Shell Contest (PUSH): relativepath Rainer Weikusat <rweikusat@mssgmbh.com> - 2011-08-22 21:15 +0100
    Re: The First Pure Shell Contest (PUSH): relativepath Stephane CHAZELAS <stephane_chazelas@yahoo.fr> - 2011-08-23 18:23 +0000
      Re: The First Pure Shell Contest (PUSH): relativepath Rainer Weikusat <rweikusat@mssgmbh.com> - 2011-08-23 19:52 +0100
        Re: The First Pure Shell Contest (PUSH): relativepath Stephane CHAZELAS <stephane_chazelas@yahoo.fr> - 2011-08-23 21:56 +0000

csiph-web