| From | "Mayayana" <mayayana@invalid.nospam> |
|---|---|
| Newsgroups | microsoft.public.scripting.vbscript |
| Subject | Re: script to search list of strings in files/directories |
| Date | 2016-09-01 12:45 -0400 |
| Organization | A noiseless patient Spider |
| Message-ID | <nq9m0f$lvk$1@dont-email.me> |
| References | (5 earlier) <XnsA675E89F7DE68eejj99@194.109.6.166> <nq7u9e$ie7$1@dont-email.me> <nq84dk$1f5$1@dont-email.me> <nq9ch8$ikr$1@dont-email.me> <nq9gbg$19l$1@dont-email.me> |
"Dave "Crash" Dummy" wrote
| > | > The case-insensitive search takes more time. With
| > | > a single operation it's not discernible, but with hundreds
| > | > of calls using case-sensitive search can speed things up.
|
| I don't know that normalizing the string prior to running the InStr
| function is any faster. If the case insensitive option is selected the
| function is going to normalize the string before doing the search,
| anyway. It may even be slower to normalize the string in a separate
| operation before running InStr.
|
Your speculation seems reasonable, but Microsoft
apparently didn't implement it that way. My guess is that
the InStr function searches numerically.
So a case-sensitive search for "A" looks for character code 65.
A case-insensitive search looks for 65 or 97. That gets less
efficient as the search string gets longer, since each character
adds a dual comparison: if 65 or 97 is found, look for
66 or 98; if any of those 4 combinations is found,
look for 67 or 99; and so on.
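To make that guess concrete, here is a minimal sketch of a search that accepts either case code for each character. It only illustrates the speculation; it is not how InStr is actually implemented, and the function name DualCodeFind is made up for the example.

```vbscript
Option Explicit

' Hypothetical illustration only: NOT the real InStr implementation.
' Returns the 1-based position of sFind in sText, or 0 if it is absent.
' Each character of sFind is accepted in either its lower-case or
' upper-case code, which is the "65 or 97" test described above.
Function DualCodeFind(sText, sFind)
    Dim i, j, c, lo, hi, matched
    DualCodeFind = 0
    If Len(sFind) = 0 Then Exit Function
    For i = 1 To Len(sText) - Len(sFind) + 1
        matched = True
        For j = 1 To Len(sFind)
            c = AscW(Mid(sText, i + j - 1, 1))
            lo = AscW(LCase(Mid(sFind, j, 1)))
            hi = AscW(UCase(Mid(sFind, j, 1)))
            ' Two comparisons per character instead of one.
            If c <> lo And c <> hi Then
                matched = False
                Exit For
            End If
        Next
        If matched Then
            DualCodeFind = i
            Exit Function
        End If
    Next
End Function

' Finds the word at position 3 regardless of case.
WScript.Echo DualCodeFind("xxAggyDaggyxx", "aGGYdAGGY")
```

A case-sensitive version would need only the single test c <> AscW(Mid(sFind, j, 1)), which is the extra cost being guessed at here.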
Here's a simple test:
400 iterations of searching a text file, 573 KB.
The nonsense word "AggyDaggy" (to ensure uniqueness)
was added near the end and then a search was run.
-------------------------------------------
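Dim FSO, Arg, TS, s1, x1, x2, i, Ret
' Path of the text file to search, passed on the command line.
Arg = WScript.Arguments(0)
Set FSO = CreateObject("Scripting.FileSystemObject")
Set TS = FSO.OpenTextFile(Arg, 1)   ' 1 = ForReading
s1 = TS.ReadAll
TS.Close
Set TS = Nothing
x1 = Timer
' Normalize once; the UCase call is included in the timing.
s1 = UCase(s1)
For i = 1 to 400
  ' Last argument 0 = binary (case-sensitive) comparison.
  Ret = InStr(1, s1, "AGGYDAGGY", 0)
Next
x2 = Timer
MsgBox x2 - x1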
--------------------------------------
Case sensitive: 0.234375 seconds
UCase followed by case sensitive: 0.53125 seconds
Case insensitive: 3.875 seconds
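For anyone who wants to repeat the comparison without a test file, here is a rough sketch that times all three variants in one run. The synthetic string is an assumption standing in for the 573 KB file, so the numbers will differ from the ones above. Run it with cscript.exe so that WScript.Echo prints to the console.

```vbscript
Option Explicit
Const ITERATIONS = 400

Dim s1, sUp, i, Ret, t0

' Synthetic stand-in for the text file (an assumption, not the original data).
' The nonsense word sits near the end, as in the test above.
s1 = String(570000, "x") & "AggyDaggy" & String(3000, "x")

' 1) Case-sensitive search for the exact spelling.
t0 = Timer
For i = 1 To ITERATIONS
    Ret = InStr(1, s1, "AggyDaggy", vbBinaryCompare)
Next
WScript.Echo "Case sensitive: " & (Timer - t0)

' 2) UCase once (included in the timing), then case-sensitive search.
t0 = Timer
sUp = UCase(s1)
For i = 1 To ITERATIONS
    Ret = InStr(1, sUp, "AGGYDAGGY", vbBinaryCompare)
Next
WScript.Echo "UCase, then case sensitive: " & (Timer - t0)

' 3) Let InStr do the case-insensitive comparison itself.
t0 = Timer
For i = 1 To ITERATIONS
    Ret = InStr(1, s1, "aggydaggy", vbTextCompare)
Next
WScript.Echo "Case insensitive: " & (Timer - t0)
```

Timer counts seconds since midnight, so a run that crosses midnight would give a meaningless difference.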
I've consistently found that two things can greatly
increase the speed of scripts that have to do extensive
work with strings:
1) Do case-insensitive searches by calling UCase once and then searching case-sensitively.
2) Build strings with an array rather than concatenation.
The latter method uses an array member for each
concatenation. Instead of doing s = s & "more text"
it does A(x) = "more text". Then it uses Join at the
end. I actually got that idea from Matthew Curland's book.
He was one of the original VB6 designers and pointed
out that Join walks the whole array, measuring the
content, then allocates a single string to accommodate
it all. Concatenating must allocate a new string every
time, so adding "more" to a 3 MB ANSI string requires
allocating a new string of 3 MB + 4 bytes. Memory
allocation takes a lot more time than calculations,
and slows as it gets bigger.
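A minimal sketch of the array method described above, assuming the number of pieces is known up front. The count of 10000 and the variable names are only for illustration.

```vbscript
Option Explicit

Dim A(), i, s

' One array member per piece instead of one concatenation per piece.
ReDim A(9999)
For i = 0 To 9999
    A(i) = "more text"
Next

' Join builds the final string in one step at the end.
s = Join(A, "")
WScript.Echo Len(s)
```

If the count isn't known in advance, the array can be grown with ReDim Preserve, preferably in large steps rather than one element at a time, since each ReDim Preserve is itself a reallocation.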
In typical usage it doesn't much matter. One InStr call
will be insignificant no matter which way it's done. A half dozen
concatenations don't cost much. But it's not unusual to need
to optimize. Using the two methods above seems like more
work but can actually cut a lot of time out of operations.
Also, Replace is extremely slow, probably because of the
same concatenation problem. It's actually often much faster
to write a complex tokenizing routine than to run a few
Replace operations.
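As a simple example of that kind of routine, here is a sketch that walks the string once with InStr, collects the pieces in an array and joins them at the end. ReplaceViaJoin is a made-up name, and whether it actually beats Replace for a given job would need to be timed.

```vbscript
Option Explicit

' Hypothetical helper: replaces every occurrence of sFind in sText with sNew.
' Pieces are collected in an array and joined once, rather than concatenated.
Function ReplaceViaJoin(sText, sFind, sNew)
    Dim parts(), n, pos, hit
    If Len(sFind) = 0 Then
        ReplaceViaJoin = sText
        Exit Function
    End If
    ReDim parts(15)
    n = 0
    pos = 1
    hit = InStr(pos, sText, sFind, vbBinaryCompare)
    Do While hit > 0
        ' Grow the array in doubling steps when two more slots won't fit.
        If n + 1 > UBound(parts) Then ReDim Preserve parts(UBound(parts) * 2 + 1)
        parts(n) = Mid(sText, pos, hit - pos)   ' text before the match
        parts(n + 1) = sNew                     ' the replacement
        n = n + 2
        pos = hit + Len(sFind)
        hit = InStr(pos, sText, sFind, vbBinaryCompare)
    Loop
    ' Size the array to exactly n + 1 members and store the tail.
    ReDim Preserve parts(n)
    parts(n) = Mid(sText, pos)
    ReplaceViaJoin = Join(parts, "")
End Function

WScript.Echo ReplaceViaJoin("a-b-c", "-", "+")   ' a+b+c
```

The same loop can be extended to look for several different tokens in one pass, which is where it saves the most over running Replace once per token.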