Path: csiph.com!x330-a1.tempe.blueboxinc.net!newsfeed.hal-mli.net!feeder3.hal-mli.net!news.alt.net!newsfeed.xs4all.nl!newsfeed5.news.xs4all.nl!xs4all!feeder.news-service.com!fu-berlin.de!uni-berlin.de!individual.net!not-for-mail From: Jens Schweikhardt Newsgroups: comp.unix.shell,comp.unix.programmer,comp.programming.contests Subject: The First Pure Shell Contest (PUSH): relativepath Followup-To: poster Date: 19 Aug 2011 17:23:19 GMT Lines: 466 Message-ID: <9b7kg7F3njU1@mid.individual.net> X-Trace: individual.net PacWxKTUOmPDHHPAjGzFPgoaExnOLKu9miimefj9hCFCrq8FGD Summary: Pure Shell Keywords: relativepath X-Orig-Path: not-for-mail Cancel-Lock: sha1:xHvlxMANWHaH+26Nq8J0Ghpkd9A= User-Agent: tin/1.9.6-20101126 ("Burnside") (UNIX) (FreeBSD/9.0-CURRENT (i386)) Xref: x330-a1.tempe.blueboxinc.net comp.unix.shell:1716 comp.unix.programmer:1156 comp.programming.contests:7 The First Pure Shell Contest (PUSH) == Motivation == The shell is not known for powerful string processing. Seasoned programmers use various combinations of sed, awk, cut and others to perform the more sophisticated transformations, or sacrifice portability by using their shell's extensions, like arrays. But calls to external utilities and use of shell extensions are often not needed. Knowing what the POSIX shell can do may turn a slow non-portable script into a blazingly fast and portable hacker's delight. The goal is to show the absurdity of forking and nonportable constructs when a nice and small pure shell solution solves the problem just as well. == Task == Write a POSIX shell function named 'relativepath' which takes as arguments two absolute canonicalized pathnames, and stores the relative path from the first to the second in the variable "result" or "." when the arguments are the same. Example: relativepath /foo/bar /foo/baz sets result to "../baz". A canonicalized pathname is one that contains no "." or ".." directories and no occurrences of "//"; only the root directory may end with "/". If an argument is relative or noncanonical, the function's exit status shall be 1 and the "result" variable shall contain the string "<%s> not absolute or canonical" where %s is replaced with the argument in question. For successful invocations the exit status shall be 0. More formally, relativepath must successfully pass the unit_test function in evaluate.sh below. == Entry format == Entries must consist of a single file implementing the relativepath function. The first line should be a comment identifying the submitter. The code should be indented with four spaces (no tabs) as shown in the sample entry below to 1) keep it readable 2) make it easy to determine a "statement count" 3) avoid writing one long line to improve performance measurements Entries will be stripped of comments and leading whitespace before evaluation so that well commented entries suffer no performance penalty, however slight, due to parsing more characters. The strip_comments function is quite simplistic: comments must start a line or begin with " # [A-Z]", i.e. space, hash, space, uppercase letter. It is your responsibility to verify that this particular way of preprocessing leaves your entry functional. == Sample Entry == ------------------------------------------------------------------------ # J. Random Hacker (or Anonymous if you prefer it.) helperfunc() { printf '42\n' } relativepath() { # Note: not considered exhaustive! # See helperfunc for foo in bar baz; do # Another comment. something > output done while test ...; do somethingelse < input done case word in ...) foo bar ;; ...) baz esac if test ...; then when true elif [ ... ]; then when true else when false fi command arg1 arg2 ... builtin arg1 arg2 ... pipe1 | pipe2 | pipe3 a && b && c || d bg1 & bg2 & wait empty= # One assignment per line. result=... name1=val1 \ name2=val2 \ oneshotcmd with 2 variable assignments eval " code as above indented an additional level, e.g. case word in pat1) foo bar ;; pat2) baz esac " trap " code as above indented an additional level, e.g. case word in pat1) foo bar ;; pat2) baz esac " SIG } ------------------------------------------------------------------------ Slight variations are acceptable, but try to stick to a general "one command per line" principle and do not attempt to outsmart the simplistic statement counter described below. == How to submit your entry == Submit entries via email to schweikh AT schweikhardt DOT net (Followup-To: poster is set). The "Subject:" should contain the string "relativepath entry 1" (without the quotes). Each contestant may submit up to three entries; adjust the number in the subject accordingly. A submission with the same number replaces an earlier submission. The contest closes September 1st 2011, 00:00:01 UTC. By submitting an entry the submitter assures that the submission is original work and agrees that his/her work may be modified and published in usenet newsgroups and other media, free of charge. == Entry Evaluation == Entries compete in three categories: 1) Least statement count (tiebreaker: character count, then performance) 2) Least character count (tiebreaker: statement count, then performance) 3) Best performance (tiebreaker: statement count, then character count) Entries are evaluated by running $ env -i BC=$(which bc) PRINTF=printf SORT=$(which sort) RUNS=10 LOOPS=1000 ./evaluate.sh ./yourentry Statements: 999 Characters: 9999 Performance: 999999 # J. Random Hacker Lower numbers are better. The evaluate.sh script is attached below. The current version of evaluate.sh can be downloaded from . You are encouraged to use evaluate.sh to test your entry. env is the POSIX utility, . The number of RUNS and LOOPS may differ but will be the same for all entries. If printf is not built-in to your shell, you can provide an absolute path name, e.g. PRINTF=/usr/bin/printf. The statement count is computed by the statement_count function. In words: 2 for each conditional control structure (if, elif, while) 1 for each for or case 1 for each break or continue 1 for each case label 1 for each name=value or name= Put them on separate lines. See sample entry. 1 for each built-in 1 for each function call eval: 1 + Statement count of argument trap: 2 + Statement count of action argument here-docs: 1 per line; lines include terminator The character count is computed by the character_count function. In words: * Whitespace and semicolons are free. * Comments on a line by themselves are free. * Comments on a code line with '#' preceded by whitespace are free. The performance is the sum of all fields of the POSIX 'times' utility, converted to milliseconds, after running a loop with a suitable number of iterations. Several runs are performed and the minimum number is used as the actual performance. Note that performance measurement is impossible if your shell's times command uses a non-standard output format. It must look like this zsh/bash/ash $ times 0m0.20s 0m0.18s 10m2.81s 1m58.70s Additional digits of precision are okay, but these formats are not: ksh93 $ times user 0m1.20s sys 0m0.30s pdksh $ times Shell: 0.10s user 0.20s system Kids: 0.20s user 0.50s system == Other tidbits == 1 Make no assumptions about the value of $0. Your script is run (sourced) using an arbitrary file name. 2 Do not assume that $PWD is a writable directory; use the /tmp directory if you need to write and read back files. 3 You can assume that 'test' and '[' are built-ins. If this is not true for your shell you may set PATH with env -i PATH=... to the directory with the 'test' and '[' executables for the sole purpose of using these external commands. It must run without PATH on a shell where these are built-ins (i.e. on the contest evaluation machine). 4 Your entry should work in the general case, not just for the particular invocations tested by the test_suite function. VERBOTEN! NOT ALLOWED! DON'T EVEN THINK OF IT! 5 Invoking external commands with $(command) or `command`. This should fail anyway due to PATH being unset. 6 Invoking an external with an absolute name with /path/to/command. 7 Abusing any variable used by evaluate.sh 8 (Re)defining or interfering with unit_test, performance_test, statement_count or character_count to improve evaluation results. 9 Setting PATH (except for 3 above). Entries doing so will be disqualified. == Announcement of Winners == Winners will be announced during September 2011 in comp.unix.shell. For each of the three categories, the best three submissions are awarded the gold, silver and bronze medal, respectively. An entry may win in more than one category. The original (indented and commented) winning entries are posted to comp.unix.shell, naming the submitter as stated in the entry's "# J. Random Hacker" comment. == Prizes == Fame and admiration on the 'Net, especially in comp.unix.shell. Maybe an offer to write or coauthor a book on shell programming? You're in it for the honor, not for profit. == Contest Operator's Entries == The contest operator reserves the right to participate in the contest. Because it would be an unfair advantage if the contest operator looked at competing submissions and stole the best ideas, the contest operator must finish his submissions before the contest is announced and make public the MD5 sums of his submission(s) in the contest announcement. Should any of the contest operator's entries win, the public is able to verify that it was not due to modifications after contest announcement. Contest operator's entry MD5 sums: MD5 (entry1) = 02e80db5a7885a4f28dcd1f559057cbf MD5 (entry2) = 4f84167130276a7f290493f52cf517dc MD5 (entry3) = ac7a0d00c25358fd278eb51d4352809f Happy hacking everybody! Jens Schweikhardt -- Contest Operator == evaluate.sh == #!/bin/sh #!/usr/local/bin/bash #!/usr/local/bin/zsh # ksh93 and pdksh -- Bogus performance due to non-standard times(1) output. # # evaluate.sh - test and evaluate a contest entry # # $Id: evaluate.sh,v 1.7 2011/08/19 17:00:28 schweikh Exp schweikh $ unit_test () { while read FROM TO RESULT; do relativepath "$FROM" "$TO" XS=$? if test "$XS $result" != "$RESULT"; then $PRINTF '%s\n' "OOPS! Expected '$RESULT' for '$FROM' to '$TO' but got '$XS $result'" return 1 fi done < not absolute or canonical / ./foo 1 <./foo> not absolute or canonical / ../foo 1 <../foo> not absolute or canonical bar / 1 not absolute or canonical ./bar / 1 <./bar> not absolute or canonical ../bar / 1 <../bar> not absolute or canonical // /foo 1 not absolute or canonical /foo//bar /foo 1 not absolute or canonical /. /foo 1 not absolute or canonical ./ /foo 1 <./> not absolute or canonical /foo/../bar /foo 1 not absolute or canonical /foo/./bar /foo 1 not absolute or canonical /foo/bar/ /foo 1 not absolute or canonical /foo/bar/. /foo 1 not absolute or canonical / / 0 . /- /- 0 . /-x /-x 0 . / /-x 0 -x /-n /-n 0 . / /-n 0 -n /-z /-z 0 . / /-z 0 -z /-- /-- 0 . /-- /--/-- 0 -- /--/-- /-- 0 .. /? /? 0 . / /? 0 ? /?? /?? 0 . / /?? 0 ?? /??? /??? 0 . / /??? 0 ??? /?* /?* 0 . / /?* 0 ?* /* /* 0 . / /* 0 * /\ . /\ . 0 . /* /** 0 ../** /* /*** 0 ../*** /*.* /*.** 0 ../*.** /*.??? /*.?? 0 ../*.?? /[] /[] 0 . /][ /][ 0 . /[a-z]* /[0-9]* 0 ../[0-9]* /' /" 0 ../" /""'' /''"" 0 ../''"" /"'"' /'"'" 0 ../'"'" /foo /foo 0 . /foo / 0 .. / /foo 0 foo / /foo/bar 0 foo/bar / /foo/bar/baz 0 foo/bar/baz /foo /foo/bar/baz 0 bar/baz /foo/bar / 0 ../.. /foo/bar /foo/bar 0 . /foo/bar /foo 0 .. /foo/bar /foo/baz 0 ../baz /foo/bar /bar/foo 0 ../../bar/foo /foo/bar/baz /gnarf/blurfl/blubb 0 ../../../gnarf/blurfl/blubb /foo/bar/baz /gnarf 0 ../../../gnarf /foo/bar/baz /foo/baz 0 ../../baz /foo. /bar. 0 ../bar. /.foo /.bar 0 ../.bar /foo.. /bar.. 0 ../bar.. /..foo /..bar 0 ../..bar /..foo.. /..bar.. 0 ../..bar.. /baz/..foo.. /baz/..bar.. 0 ../..bar.. /.baz./..foo.. /.baz./..bar.. 0 ../..bar.. /a/b/c/d/e/f/g/h/i/j/k/l/m/n/o/p /1/2/3/4/5/6/7/8/9/A/B/C/D/E/F 0 ../../../../../../../../../../../../../../../../1/2/3/4/5/6/7/8/9/A/B/C/D/E/F /f$$-$PPID /t$PPID-$$ 0 ../t$PPID-$$ EOF FUNKYNAME=$($PRINTF '/ -n -z C:\\\a\b\f\n\r\t\v\\ \b%% $ & ( ) = ? $(touch x) * ~ # _ ; : . , ! { } [ ] ^ | < >') relativepath "/ A name with/spaces in it!" "$FUNKYNAME" XS=$? if test "$XS $result" != "0 ../..$FUNKYNAME"; then $PRINTF '%s\n' "OOPS! Expected '0 ../..$FUNKYNAME' but got '$XS $result'" return 1 fi } statement_count () { FILE=$1 COUNT=0 while read LINE; do case $LINE in '' | 'relativepath'* | '"' | 'done' | 'else' | 'fi' | 'esac' | '}' | '#'*) ;; 'if '* | 'elif '* | 'while '*) COUNT=$(($COUNT + 2)) ;; *) COUNT=$(($COUNT + 1)) ;; esac done < "$FILE" $PRINTF 'Statements: %3u' $COUNT } character_count () { FILE=$1 COUNT=0 while read -r LINE; do case $LINE in '#'*) continue ;; esac set -- $LINE for WORD in $*; do case $WORD in *';'*) while test -n "$WORD"; do if test "${WORD%${WORD#?}}" != ";"; then COUNT=$(($COUNT + 1)) fi WORD=${WORD#?} done ;; '#'*) break ;; *) COUNT=$(($COUNT + ${#WORD})) ;; esac done done < "$FILE" $PRINTF ' Characters: %4u' $COUNT } performance_test () { L=$LOOPS while test $L -gt 0; do unit_test L=$(($L - 1)) done times } strip_comments () { FILE=$1 while read -r LINE; do case $LINE in '#'*) continue ;; *) LINE=${LINE%% # [A-Z]*} $PRINTF '%s\n' "${LINE%${LINE##*[^ ]}}" ;; esac done < "$FILE" } : ${RUNS:=10} ${LOOPS:=100} : ${PRINTF:=printf} ${BC:=/usr/bin/bc} ${SORT:=/usr/bin/sort} test -n "$ZSH_VERSION" && setopt shwordsplit strip_comments "$1" > "$1.nc" . "$1.nc" unit_test || exit 1 statement_count "$1.nc" character_count "$1.nc" while test $RUNS -gt 0; do performance_test | { IFS="ms" read M1 S1 M2 S2 REST IFS="ms" read M3 S3 M4 S4 REST $PRINTF 'scale=0;60000*(%d+%d+%d+%d)+1000*(%.3f+%.3f+%.3f+%.3f)/1\n' \ $M1 $M2 $M3 $M4 $S1 $S2 $S3 $S4 | $BC } RUNS=$(($RUNS - 1)) done | $SORT -n | { read MILLISECONDS read SUBMITTER < "$1" $PRINTF ' Performance: %6u %s\n' $MILLISECONDS "$SUBMITTER" } exit 0 # vi: tabstop=4 shiftwidth=4 expandtab fileformat=unix