Re: script to search list of strings in files/directories

From "Dave \"Crash\" Dummy" <invalid@invalid.invalid>
Newsgroups microsoft.public.scripting.vbscript
Subject Re: script to search list of strings in files/directories
Date 2016-09-01 13:41 -0400
Organization A noiseless patient Spider
Message-ID <nq9p7g$2a2$1@dont-email.me>
References (6 earlier) <nq7u9e$ie7$1@dont-email.me> <nq84dk$1f5$1@dont-email.me> <nq9ch8$ikr$1@dont-email.me> <nq9gbg$19l$1@dont-email.me> <nq9m0f$lvk$1@dont-email.me>


Mayayana wrote:
> "Dave "Crash" Dummy" wrote
> 
> | > | >   The case-insensitive search takes more time. With
> | > | > a single operation it's not discernible, but with hundreds
> | > | > of calls using case-sensitive search can speed things up.
> |
> | I don't know that normalizing the string prior to running the InStr
> | function is any faster. If the case insensitive option is selected the
> | function is going to normalize the string before doing the search,
> | anyway. It may even be slower to normalize the string in a separate
> | operation before running InStr.
> |
> 
> Your speculation seems reasonable, but Microsoft apparently didn't
> think the same way. I think what the InStr function probably does is
> to search numerically. So a CS search for "A" will look for byte 65.
> A non-CS search will look for 65 or 97. Then that will get less 
> efficient as the string gets longer and each character adds a dual
> search. If 65 or 97 is found then look for 66 or 98. If any of those
> 4 combinations are found then look for 67 or 99. Etc.
> 
> Here's a simple test:
> 
> 400 iterations of searching a text file, 573 KB. The nonsense word
> "AggyDaggy" (to ensure uniqueness) was added near the end and then a
> search was run.
> 
> -------------------------------------------
> Dim FSO, Arg, TS, s1, x1, x2, i, Ret
>
> Arg = WScript.Arguments(0)
> Set FSO = CreateObject("Scripting.FileSystemObject")
> Set TS = FSO.OpenTextFile(Arg, 1)
> s1 = TS.ReadAll
> TS.Close
> Set TS = Nothing
>
> x1 = Timer
> s1 = UCase(s1)
> For i = 1 to 400
>   Ret = InStr(1, s1, "AGGYDAGGY", 0)
> Next
> x2 = Timer
>
> MsgBox x2 - x1
> --------------------------------------
> 
> Case sensitive:                       .234375 seconds
> UCase followed by case sensitive:     .53125 seconds
> non-case sensitive:                   3.875 seconds
> 
> I've consistently found that two things can greatly increase the
> speed of scripts that have to do extensive work with strings:
> 
> 1) non-case sensitive string search using UCase.
> 2) Build strings with an array rather than concatenation.
> 
> The latter method uses an array member for each concatenation.
> Instead of doing s = s & "more text" it does A(x) = "more text". Then
> it uses Join at the end. I actually got that idea from Matthew
> Curland's book. He was one of the original VB6 designers and pointed 
> out that Join walks the whole array, measuring the content, then
> allocates a single string to accommodate it all. Concatenating must
> allocate a new string every time, so adding "more" to a 3 MB ANSI
> string requires allocating a new string of 3 MB + 4 bytes. Memory 
> allocation takes a lot more time than calculations, and slows as it
> gets bigger.
> 
> In typical usage it doesn't much matter. One InStr call will be
> insignificant no matter which way it's done. A half dozen 
> concatenations don't cost much. But it's not unusual to need to
> optimize. Using the two methods above seems like more work but can
> actually cut a lot of time out of operations. Also, Replace is
> extremely slow, probably because of the same concatenation problem.
> It's actually often much faster to write a complex tokenizing routine
> than to run a few Replace operations.
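
For illustration, here is a minimal sketch of the array-plus-Join idea described above, next to plain concatenation. It assumes the number of pieces is known in advance and that the script runs under cscript/wscript (WScript.Echo). The names BuildWithJoin and BuildWithConcat and the count of 20000 are made up for the example; no timings are claimed here, so run it to see what your machine does.

```vbscript
Option Explicit

' Hypothetical helper: fill one array slot per piece, then Join once.
Function BuildWithJoin(Piece, Count)
    Dim A(), i
    ReDim A(Count - 1)            ' size is known up front in this sketch
    For i = 0 To Count - 1
        A(i) = Piece              ' A(x) = "more text" instead of s = s & ...
    Next
    BuildWithJoin = Join(A, "")   ' empty delimiter: pieces are glued together
End Function

' Same result built by repeated concatenation, for comparison.
Function BuildWithConcat(Piece, Count)
    Dim s, i
    s = ""
    For i = 1 To Count
        s = s & Piece             ' each pass produces a new, longer string
    Next
    BuildWithConcat = s
End Function

Dim t0, t1, t2, r1, r2
t0 = Timer
r1 = BuildWithJoin("more text", 20000)
t1 = Timer
r2 = BuildWithConcat("more text", 20000)
t2 = Timer

WScript.Echo "Join:   " & (t1 - t0) & " seconds"
WScript.Echo "Concat: " & (t2 - t1) & " seconds"
WScript.Echo "Same result: " & (r1 = r2)
```

Both functions return the same string; only the way it is assembled differs. Timer has coarse resolution, so small counts may show 0 for both.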

How do you predict the required size of the array? Using "redim
preserve" for each entry seems kind of awkward.
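
One possible way around the sizing problem, sketched here as an assumption and not as anything proposed in the thread: start with a guess and double the array whenever it fills up, so "ReDim Preserve" runs only a handful of times instead of once per entry. The names Buf, BufCount, Append and Finish and the starting capacity of 16 are hypothetical.

```vbscript
Option Explicit

' Hypothetical growable buffer. Capacity doubles when full, so the number of
' ReDim Preserve calls grows with log2 of the entry count, not the count itself.
Dim Buf(), BufCount
ReDim Buf(15)                     ' arbitrary starting capacity: 16 slots
BufCount = 0

Sub Append(Piece)
    If BufCount > UBound(Buf) Then
        ReDim Preserve Buf(UBound(Buf) * 2 + 1)   ' 16 -> 32 -> 64 ...
    End If
    Buf(BufCount) = Piece
    BufCount = BufCount + 1
End Sub

Function Finish()
    If BufCount = 0 Then
        Finish = ""
    Else
        ReDim Preserve Buf(BufCount - 1)   ' drop the unused tail slots
        Finish = Join(Buf, "")
    End If
End Function

Dim i
For i = 1 To 1000
    Append "line " & i & vbCrLf
Next
WScript.Echo Len(Finish())
```

With 1000 entries the array is resized only a few times on the way up, plus one trim at the end. If the count can be computed beforehand (for example from a file's line count), a single ReDim with that size avoids even those.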
-- 
Crash

"If the world was perfect, it wouldn't be."
~ Yogi Berra ~


Thread

script to search list of strings in files/directories zmau1962@gmail.com - 2016-08-30 00:29 -0700
  Re: script to search list of strings in files/directories "Evertjan." <exxjxw.hannivoort@inter.nl.net> - 2016-08-30 10:52 +0200
  Re: script to search list of strings in files/directories "Mayayana" <mayayana@invalid.nospam> - 2016-08-31 14:22 -0400
    Re: script to search list of strings in files/directories Mau Z <zmau1962@gmail.com> - 2016-08-31 13:05 -0700
      Re: script to search list of strings in files/directories "Evertjan." <exxjxw.hannivoort@inter.nl.net> - 2016-08-31 22:16 +0200
        Re: script to search list of strings in files/directories "Mayayana" <mayayana@invalid.nospam> - 2016-08-31 16:39 -0400
          Re: script to search list of strings in files/directories "Evertjan." <exxjxw.hannivoort@inter.nl.net> - 2016-08-31 22:52 +0200
            Re: script to search list of strings in files/directories "Mayayana" <mayayana@invalid.nospam> - 2016-08-31 20:54 -0400
              Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-08-31 22:39 -0400
                Re: script to search list of strings in files/directories "Mayayana" <mayayana@invalid.nospam> - 2016-09-01 10:03 -0400
                Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-01 11:09 -0400
                Re: script to search list of strings in files/directories "Mayayana" <mayayana@invalid.nospam> - 2016-09-01 12:45 -0400
                Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-01 13:41 -0400
                Re: script to search list of strings in files/directories "Mayayana" <mayayana@invalid.nospam> - 2016-09-01 15:19 -0400
      Re: script to search list of strings in files/directories "R.Wieser" <address@not.available> - 2016-09-01 09:25 +0200
        Re: script to search list of strings in files/directories Mau Z <zmau1962@gmail.com> - 2016-09-05 07:36 -0700
  Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-01 14:35 -0400
    Re: script to search list of strings in files/directories Mau Z <zmau1962@gmail.com> - 2016-09-05 07:21 -0700
      Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-05 14:43 -0400
      Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-05 14:54 -0400
        Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-05 15:41 -0400
          Re: script to search list of strings in files/directories Mau Z <zmau1962@gmail.com> - 2016-09-05 13:07 -0700
          Re: script to search list of strings in files/directories Mau Z <zmau1962@gmail.com> - 2016-09-05 13:09 -0700
            Re: script to search list of strings in files/directories "Dave \"Crash\" Dummy" <invalid@invalid.invalid> - 2016-09-06 11:33 -0400
  Re: script to search list of strings in files/directories Dr J R Stockton <reply1600@merlyn.demon.co.uk.invalid> - 2016-09-07 23:43 +0100
