Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.lang.java.programmer > #19530

Re: startsWith datastructure

From Robert Klemme <shortcutter@googlemail.com>
Newsgroups comp.lang.java.programmer
Subject Re: startsWith datastructure
Date 2012-10-27 13:56 +0200
Message-ID <af20epFsifeU1@mid.individual.net> (permalink)
References <rhen88pmvervpnbdkg32cgokujup957b08@4ax.com>

Show all headers | View raw


On 27.10.2012 12:50, Roedy Green wrote:
> Lets say you had a list of book publishers and the block of ISBN
> numbers they own. e.g. Prentice-Hall owns all isbns of the form
> 013XXXXXXXXXX,
> or put another way, own all ISBNs that startsWith "013".
> Some publishers such as O'Reilly own multiple blocks.
>
>       013, Prentice Hall
>       014, Penguin
>      0201, Addison-Wesley
>      0393, H. W. Norton
>      0471, John Wiley & Sons
>      0553, Bantam
>    059600, O'Reilly
>    156592, O'Reilly
>      0060, Harper
>     06723, Sams
>     07356, Microsoft Press
>     07432, Simon & Schuster
>     07710, McClelland & Stewart
>     08125, Tor
>    086571, New Society Publishers
>   0915972, Love Line
> 15795106, Ronin Publishing
>
> etc.
>
> Your list is incomplete. It has only the major publishers.
>
> The problem is, given an ISBN, can you make at guess at its publisher?
>
> What sort of  datastructure/lookup mechanism would you use?
>
> Obviously, you could just do a linear search looking for a match, and
> unless the list got very long, that should suffice,  but, just for
> fun, say you wanted something faster, what would you do?
>
> This is just an example of a class of problem I wanted to write a
> little essay on for the Java glossary with possible solutions.

That's a typical use case for a Trie:
http://en.wikipedia.org/wiki/Trie

Other than that you could use a TreeSet and make use of method subSet() 
where the lower bound is the fragment you got and the upper bound is the 
fragment with the last character increased "by one" or a "max" character 
appended.

Or you use a sorted array and use Arrays.binarySearch().  The Trie is 
more efficient though.

Cheers

	robert


-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

Back to comp.lang.java.programmer | Previous | NextPrevious in thread | Find similar | Unroll thread


Thread

startsWith datastructure Roedy Green <see_website@mindprod.com.invalid> - 2012-10-27 03:50 -0700
  Re: startsWith datastructure Roedy Green <see_website@mindprod.com.invalid> - 2012-10-27 04:13 -0700
  Re: startsWith datastructure "Chris Uppal" <chris.uppal@metagnostic.REMOVE-THIS.org> - 2012-10-27 12:50 +0100
  Re: startsWith datastructure Robert Klemme <shortcutter@googlemail.com> - 2012-10-27 13:56 +0200

csiph-web