Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.forth > #11798

Re: Google CodeJam?

From hughaguilar96@yahoo.com
Newsgroups comp.lang.forth
Subject Re: Google CodeJam?
Date 2012-04-30 23:54 -0700
Organization http://groups.google.com
Message-ID <31940905.52.1335855283770.JavaMail.geo-discussion-forums@ynen8> (permalink)
References (1 earlier) <ef8dcfcf-f7b6-474f-a20d-197d269575cd@f37g2000yqc.googlegroups.com> <13201023.2427.1335046423871.JavaMail.geo-discussion-forums@ynbi17> <1kiyc5r.cu8iaa1g5nh99N%awegel@arcor.de> <1kj32wv.bjuk8i1utlkq5N%awegel@arcor.de> <1kj7r60.3aht9b1sk2bgbN%awegel@arcor.de>

Show all headers | View raw


On Friday, April 27, 2012 6:15:49 AM UTC-6, Alex Wegel wrote:
> For those still there - here's a final(?) version of my approach to the
> alien program.
> Now it's down to 150 words of source - i think that's how i leave it.
> 
> #! /usr/local/bin/gforth-fast
> \ : key pad dup 1 stdin read-file throw drop c@ ; ( fast vsn. of KEY)
> : <n> 0 begin key [char] 0 - dup 10 u< while swap 10 * + repeat drop ;
> 0 value L
> : <dims> <n> to L <n> <n> ;
> 
> : ord [char] a - ;
> : az? ord dup 26 u< ;
> : az* begin key az? while c, repeat drop ;
> : <dic> 0 do az* loop align ;
> 
> : >msk [ -1 dup 1 rshift xor ] literal swap rshift invert ;
> : az*) -1 begin key az? while >msk and repeat drop ;
> : -pat -1 cells L * allot ;
> : 
> 
>   begin key
>     az? if
>       >msk
>     else
>       [char] ( ord = if az*) else exit then
>     then ,
>   again ;
> 
> : m?
>    here dup rot cells - do
>     i @ over c@ lshift
>     0< if unloop drop false exit then
>     char+
>   cell +loop
>   drop true ;
> 
> : #m 0 -rot L * bounds do i L m? - L +loop ;
> : .m ." Case #" 1 .r ." : " 1 .r cr ;
> : chk 0 do 2dup 
>  #m -pat i 1+ .m loop 2drop ;
> 
> here <dims> over <dic> chk bye

Here is a new version of my program, including these upgrades:
1.) I implemented a SSEQ data type that is similar to my SEQ data type except that it allows for big (>255 char) strings.
2.) I replaced SEQ with SSEQ throughout the ALIEN program as needed so that the program will now handle the large sample file.
3.) I switched to bit-masks similar to Alex's program.
4.) I fixed a stack-picture comment that was incorrect.

This is not plain Forth --- I've never written a plain Forth program in my life. I find plain Forth to be largely unreadable (although I can *decipher* it given enough time). If I were offered a job maintaining plain Forth software, I would refuse and just stick with cab driving  --- unless it paid a *lot* of money, but that is unrealistic because the employer would just have the program rewritten from scratch in a readable language for less money.

My program seems rather slow on the large sample file. I think this is due to my heavy use of the heap, which is quite slow on all Forth systems (I don't know why). I also compile functions at run-time, which may be slow depending upon which compiler is being used and how much optimization is being done.


This is SSEQ which will go into the novice package:

list
    w field .cnt
    w field .chars      \ pointer to heap
constant sseq

: sseqable ( adr cnt -- new-adr cnt )       \ put string in heap for SSEQ so <KILL-SSEQ> will work on the SSEQ
    >r
    r@ alloc                        \ -- adr new-adr        \r: -- cnt
    tuck  r@ cmove>                 \ -- new-adr
    r> ;

: init-sseq ( adr cnt node -- node )
    init-list >r
    sseqable    r@ .cnt !  r@ .chars !
    r> ;

: new-sseq ( adr cnt -- node )    
    sseq alloc
    init-sseq ;
    
: <kill-sseq> ( node -- )    
    dup .chars @  dealloc
    dealloc ;
    
: kill-sseq ( head -- )    
    each[  <kill-sseq>  ]each ;
    
macro: <sseq> ( node -- adr cnt )    
    dup .chars @  swap .cnt @ ;
    
: <show-sseq> ( node -- )
    <sseq> type  cr ;

: show-sseq ( head -- )
    cr
    ['] <show-sseq>  each ;
    
1000 value sseq-size    \ making this too big could be a problem if such a large block can't be found in the heap
    
: <read-sseq> ( adr cnt -- head )
    r/o <open-file> >r                                          \ r: -- file-id
    nil begin                               \ -- head
        sseq-size 2+  alloc                 \ -- head chars
        dup sseq-size r@  read-line  abort" *** READ-SSEQ failed to read line ***"
        while                               \ -- head chars cnt
            dup >r  realloc  r>             \ -- head new-chars cnt
            new-sseq                        \ -- head node
            link  repeat 2drop
    r> <close-file> ;

: read-sseq ( name -- head )
    count <read-sseq> ;
    
: <write-sseq> ( head adr cnt -- )
    w/o <create-file>  swap                 \ -- file-id head
    each[  <sseq>  rover  write-line abort" *** <WRITE-SSEQ> failed to write ***"  ]each
    <close-file> ;

: write-sseq ( head name -- )
    count <write-sseq> ;

    
This is the new ALIEN program:    

\ This is the solution to the "alien language" example problem from Google CodeJam.
\ http://code.google.com/codejam/contest/90101/dashboard#s=p0

\ Written by Hugh Aguilar --- copyright (c) 2012 --- BSD license

\ requires novice.4th and list.4th

marker alien.4th


\ ****** 
\ ****** input and output
\ ****** 

variable #letters
variable #words 
variable #patterns

: sseqs ( name-str -- word-sseq pattern-sseq )
    read-sseq
    dup <sseq>  evaluate 
    #patterns !  #words !  #letters !
    .fore @                                 \ -- word-seq 
    dup #words @ nth                        \ -- word-seq pattern-seq
    delink 
    dup #patterns @ nth                     \ -- word-seq pattern-seq extraneous-seq
    delink  kill-sseq                       \ -- word-seq pattern-seq
                                                                                            \ error checking
    over length  #words @       <> abort" *** bad word-sseq ***"
    dup  length  #patterns @    <> abort" *** bad pattern-sseq ***" 
    over each[  .cnt @  #letters @  <> abort" *** string in word-sseq is wrong length ***"  ]each ;

: dump-result ( pattern-list name-str -- )    
    <cstr  +cstr  c" .result" +cstr  cstr>  write-sseq ;
    
    
\ ****** 
\ ****** convert pattern string into any-seq list
\ ****** each node in the list represents one char in the target string
\ ****** the .LINE string of each node contains all of the character that would match
\ ****** 

: check-any-str { pattern-adr pattern-cnt head -- }
    head length  #letters @  <> if
        cr ." *** any-seq is the wrong length ***" 
        cr  pattern-adr pattern-cnt type
        cr  true abort" *** aborting ***  
        then ;

: <make-any-seq> ( head str -- head )
    dup c@ if   new-seq link                \ str has characters in it
    else        drop            then ;

: make-any-seq { pattern-adr pattern-cnt | group? -- any-seq }
    nil  <cstr
    pattern-adr pattern-cnt  bounds ?do     \ -- head
        group? if                                                                               \ if inside of ( ) group
            I c@  [char] )  = if    false to group?     cstr> <make-any-seq>  <cstr
            else                    I c@ char+cstr                                      then
        else                                                                                    \ else outside of ( ) group
            I c@  [char] (  = if    true to group?
            else                    I c@ char+cstr      cstr> <make-any-seq>  <cstr     then
            then
        loop 
    cstr> <make-any-seq>  
    pattern-adr pattern-cnt rover check-any-str ;
    

\ ****** 
\ ****** generate pattern matcher
\ ****** 

char & comment  \ this is an example of what will get generated

:noname ( mask-adr -- match? )              \ this is for: a(bc)

    dup @  1 and  0= if  drop  false exit then
    w +                                     \ -- new-mask-adr

    dup @  6 and  0= if  drop  false exit then
    w +                                     \ -- new-mask-adr

    drop  true ;

&

: char>mask ( char -- mask )                \ this assumes that there are < 32 chars, and they start with 'a'
    [char] a -                              \ -- ordinal
    1 swap lshift ;
    
: <generate-pattern> ( node -- )
    >r
    s" dup @ "                                                          evaluate
    0  r> .line @ count  bounds do          \ -- mask
        I c@ char>mask  or  loop  lit,      \ --
    s" and  0= if  drop  false exit then      w + "                     evaluate ;

: generate-pattern ( any-seq -- xt )
    >r
    s" :noname ( char-adr -- match? ) "                                 evaluate
    r>  ['] <generate-pattern> each
    s" drop  true ; "                                                   evaluate ;


\ ****** 
\ ****** upgrade word-seq
\ ****** 

list
    w field .mask       \ pointer to array of words
constant mask

: <kill-mask> ( node -- )
    dup .mask @  dealloc
    dealloc ;
    
: kill-mask ( head -- )    
    each[  <kill-mask>  ]each ;
    
: init-mask ( word-adr word-cnt node -- node )    
    init-list >r
    dup w * alloc                           \ -- word-adr word-cnt mask-array
    dup r@ .mask !
    -rot bounds do                          \ -- mask-element
        I c@ char>mask  over !
        w +  loop drop                      \ --
    r> ;

: new-mask ( word-adr word-cnt -- )
    mask alloc
    init-mask ;
    
: upgrade-word ( word-sseq -- mask-list )               \ create a MASK list given a SSEQ list
    nil swap                                            \ -- mask-list word-sseq
    each[  <sseq> new-mask  link  ]each ;
    

\ ****** 
\ ****** upgrade pattern-seq
\ ****** 
    
sseq                    \ starts out as pattern-str, later gets changed to result-str
    w field .any        \ pointer to any-seq
    w field .xt         \ xt of generated pattern matcher
    w field .matches    \ count of matches for this pattern    
constant pattern

: <kill-pattern> ( node -- )    
    dup .any @  kill-seq
    <kill-sseq> ;
    
: kill-pattern ( head -- )
    each[  <kill-pattern>  ]each ;    

: init-pattern ( pattern-adr pattern-cnt node -- node )
    init-sseq >r
    r@ .chars @  r@ .cnt @  make-any-seq            r@ .any !
    r@ .any @   generate-pattern                    r@ .xt !   
    0                                               r@ .matches !
    r> ;
    
: new-pattern ( adr cnt -- node )     
    pattern alloc
    init-pattern ;

: upgrade-pattern ( pattern-sseq -- pattern-list )      \ create a PATTERN list given a SSEQ list
    nil swap                                            \ -- pattern-list pattern-sseq
    each[  <sseq> new-pattern  link  ]each ;
    
    
\ ****** 
\ ****** pattern-match
\ ****** 

: <check-word> ( pattern-list-node mask-node -- pattern-list-node )    
    .mask @                                             \ -- pattern-list-node mask-adr
    over .xt @  execute                                 \ -- pattern-list-node match?
    if  1  over .matches +!  then ;                     \ -- pattern-list-node
    
: <check-pattern> ( mask-list pattern-list-node -- mask-list )
    over  ['] <check-word> each                         \ -- word-sseq pattern-list-node
    drop ;
    
: check-pattern ( mask-list pattern-list -- )
    ['] <check-pattern> each                            \ -- mask-list
    drop ;

    
\ ****** 
\ ****** make result strings
\ ****** 

: u>str ( u -- adr cnt )
    u>d <# #s #> ;

: <fill-result> ( pattern# pattern-list-node -- new-pattern# )
    dup .chars @ dealloc                                                    \ get rid of pattern-str
    <cstr
        c" Case #"              +cstr  
        over u>str              <+cstr>
        c" : "                  +cstr
        dup .matches @ u>str    <+cstr>
    cstr> count sseqable  rover .cnt !  swap .chars !   \ -- pattern#       \ set result-str
    1+ ;                                                \ -- new-pattern# 
    
: fill-result ( pattern-list -- )
    1  swap  ['] <fill-result> each                     \ -- pattern#
    drop ;
    
    
\ ****** 
\ ****** main program
\ ****** 

: alien  ( name-str -- )     
    dup sseqs  { name-str word-sseq pattern-sseq | mask-list pattern-list -- }
    s" marker upgrade-pattern-stuff " evaluate                              \ so we can get rid of the UPGRADE-PATTERN words
    word-sseq upgrade-word          to mask-list
    pattern-sseq upgrade-pattern    to pattern-list
    mask-list pattern-list check-pattern
    pattern-list fill-result
    pattern-list name-str dump-result
                                                                            \ clean up
    word-sseq       kill-sseq
    pattern-sseq    kill-sseq            
    mask-list       kill-mask
    pattern-list    kill-pattern
    s" upgrade-pattern-stuff " evaluate ;
    

Back to comp.lang.forth | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Re: Google CodeJam? hughaguilar96@yahoo.com - 2012-04-21 15:13 -0700
  Re: Google CodeJam? Bernd Paysan <bernd.paysan@gmx.de> - 2012-04-22 02:19 +0200
  Re: Google CodeJam? "WJ" <w_a_x_man@yahoo.com> - 2012-04-22 04:03 +0000
    Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-22 23:08 -0700
    Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-22 23:17 -0700
      Re: Google CodeJam? "WJ" <w_a_x_man@yahoo.com> - 2012-04-23 19:27 +0000
        Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-23 22:08 -0700
          Re: Google CodeJam? "WJ" <w_a_x_man@yahoo.com> - 2012-04-24 08:30 +0000
            Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-24 23:00 -0700
              Re: Google CodeJam? "WJ" <w_a_x_man@yahoo.com> - 2012-04-26 09:19 +0000
                Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-26 03:17 -0700
                Re: Google CodeJam? vandys@vsta.org - 2012-04-26 16:45 +0000
                Re: Google CodeJam? Paul Rubin <no.email@nospam.invalid> - 2012-04-26 10:47 -0700
                Re: Google CodeJam? Paul Rubin <no.email@nospam.invalid> - 2012-04-27 02:26 -0700
                Re: Google CodeJam? vandys@vsta.org - 2012-04-27 16:37 +0000
                Re: Google CodeJam? Paul Rubin <no.email@nospam.invalid> - 2012-04-27 10:17 -0700
                Re: Google CodeJam? vandys@vsta.org - 2012-04-27 18:15 +0000
        Re: Google CodeJam? Rugxulo <rugxulo@gmail.com> - 2012-04-26 16:00 -0700
  Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-22 12:12 +0200
    Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-22 14:11 +0200
      Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-22 14:17 +0200
        Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-25 14:25 +0200
    Re: Google CodeJam? mhx@iae.nl (Marcel Hendrix) - 2012-04-22 18:23 +0200
      Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-22 22:24 +0200
        Re: Google CodeJam? mhx@iae.nl (Marcel Hendrix) - 2012-04-23 20:59 +0200
          Re: Google CodeJam? Paul Rubin <no.email@nospam.invalid> - 2012-04-23 13:56 -0700
    Re: Google CodeJam? mhx@iae.nl (Marcel Hendrix) - 2012-04-22 18:29 +0200
      Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-22 21:36 +0200
    Re: Google CodeJam? Bernd Paysan <bernd.paysan@gmx.de> - 2012-04-22 23:08 +0200
      Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-22 23:06 -0700
      Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-23 23:19 +0200
        Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-23 22:35 -0700
          Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-25 01:59 +0200
            Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-24 22:16 -0700
              Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-25 14:25 +0200
                Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-25 11:58 -0700
                Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-26 00:23 +0200
                Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-25 11:58 -0700
                Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-26 00:23 +0200
                Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-25 11:58 -0700
                Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-26 00:23 +0200
                Re: Google CodeJam? Albert van der Horst <albert@spenarnc.xs4all.nl> - 2012-04-26 12:36 +0000
              Re: Google CodeJam? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-04-25 13:39 +0000
                Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-04-25 10:05 -0700
                Re: Google CodeJam? anton@mips.complang.tuwien.ac.at (Anton Ertl) - 2012-04-26 16:30 +0000
              Re: Google CodeJam? Bernd Paysan <bernd.paysan@gmx.de> - 2012-04-27 15:21 +0200
        Re: Google CodeJam? Paul Rubin <no.email@nospam.invalid> - 2012-04-24 00:51 -0700
          Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-25 01:59 +0200
    Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-25 01:59 +0200
      Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-27 14:15 +0200
        Re: Google CodeJam? hughaguilar96@yahoo.com - 2012-04-30 23:54 -0700
          Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-05-02 01:44 +0200
            Re: Google CodeJam? Hugh Aguilar <hughaguilar96@yahoo.com> - 2012-05-01 20:45 -0700
              Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-05-02 23:38 +0200
  Re: Google CodeJam? Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2012-04-24 12:50 +0100
    Re: Google CodeJam? Paul Rubin <no.email@nospam.invalid> - 2012-04-24 08:22 -0700
    Re: Google CodeJam? mhx@iae.nl (Marcel Hendrix) - 2012-04-24 22:28 +0200
      Re: Google CodeJam? Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2012-04-25 09:50 +0100
    Re: Google CodeJam? awegel@arcor.de (Alex Wegel) - 2012-04-25 01:59 +0200
      Re: Google CodeJam? Gerry Jackson <gerry@jackson9000.fsnet.co.uk> - 2012-04-25 09:49 +0100

csiph-web