| Started by | Wojtek <nowhere@a.com> |
|---|---|
| First post | 2013-01-26 01:14 -0800 |
| Last post | 2013-03-15 10:31 -0700 |
| Articles | 20 on this page of 40 — 12 participants |
The first 10 files Wojtek <nowhere@a.com> - 2013-01-26 01:14 -0800
Re: The first 10 files Roedy Green <see_website@mindprod.com.invalid> - 2013-01-26 02:44 -0800
Re: The first 10 files Lew <lewbloch@gmail.com> - 2013-01-26 10:20 -0800
Re: The first 10 files "John B. Matthews" <nospam@nospam.invalid> - 2013-01-26 06:31 -0500
Re: The first 10 files Wojtek <nowhere@a.com> - 2013-01-26 15:42 -0800
Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 17:13 -0700
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:21 -0500
Re: The first 10 files "John B. Matthews" <nospam@nospam.invalid> - 2013-01-26 22:05 -0500
Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-26 08:24 -0400
Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-26 08:25 -0400
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 13:26 -0500
Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-01-26 22:15 +0100
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 16:25 -0500
Re: The first 10 files Eric Sosman <esosman@comcast-dot-net.invalid> - 2013-01-26 17:06 -0500
Re: The first 10 files Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2013-01-26 15:21 -0800
Re: The first 10 files Eric Sosman <esosman@comcast-dot-net.invalid> - 2013-01-26 20:42 -0500
Re: The first 10 files Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2013-01-26 17:56 -0800
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:29 -0500
Re: The first 10 files Eric Sosman <esosman@comcast-dot-net.invalid> - 2013-01-26 21:56 -0500
Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 20:51 -0700
Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 20:47 -0700
Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-26 22:02 -0400
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:35 -0500
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:43 -0500
Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-01-27 13:55 +0100
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-02-24 17:50 -0500
Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-02-25 21:53 +0100
Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 20:57 -0700
Re: The first 10 files Wojtek <nowhere@a.com> - 2013-01-26 21:20 -0800
Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-27 07:23 -0400
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-27 20:36 -0500
Re: The first 10 files Wojtek <nowhere@a.com> - 2013-01-28 16:28 -0800
Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:23 -0500
Re: The first 10 files Roedy Green <see_website@mindprod.com.invalid> - 2013-01-26 19:09 -0800
Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 16:00 -0700
Re: The first 10 files Knute Johnson <nospam@knutejohnson.com> - 2013-01-26 18:37 -0800
Re: The first 10 files Wojtek <nowhere@a.com> - 2013-03-14 03:07 -0700
Re: The first 10 files lipska the kat <"nospam at neversurrender dot co dot uk"> - 2013-03-14 12:49 +0000
Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-03-15 11:38 +0100
Re: The first 10 files Wojtek <nowhere@a.com> - 2013-03-15 10:31 -0700
| From | Jim Janney <jjanney@shell.xmission.com> |
|---|---|
| Date | 2013-01-26 20:47 -0700 |
| Message-ID | <ydn38xn1e41.fsf@shell.xmission.com> |
| In reply to | #21766 |
Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> writes:

> On Sat, 26 Jan 2013 20:42:16 -0500, Eric Sosman wrote:
>
>> [...]
>>>> Because the listFiles() method will fetch the information
>>>> for all 30K files from the O/S, will construct 30K File objects
>>>> to represent them, and will submit all 30K File objects to the
>>>> FileFilter, one by one. The FileFilter will (very quickly)
>>>> reject 29.99K of the 30K Files, but ...
>>>
>>> Will it?
>>
>> Necessarily. As far as listFiles() knows, the FileFilter
>> might accept the very last File object given to it. Therefore,
>> listFiles() cannot fail to present that very last File -- and
>> every other File -- for inspection.
>
> Except in the way I already noted, you mean.
>
>> [...]
>>> Indeed, I suppose one could throw an exception from the FileFilter accept()
>>> method to interrupt enumeration, if that's how listFiles() is implemented.
>>> That would avoid the need to enumerate more than the needed number of
>>> actual files.
>>
>> It would also avoid the burden of returning anything from
>> listFiles() -- like, say, the array of accepted files ...
>
> As you've already agreed, it is possible for the FileFilter implementation
> to store the results itself, obviating any need for the listFiles() method
> to return successfully.
>
> If it works (which is not assured...it depends on how listFiles() is
> implemented in the first place), then yes, maybe it's a bit of a kludge.
> But it's an easier, more portable kludge than writing some JNI-based
> component and would in fact get the job done.
>
> Sometimes, when the library you're using doesn't provide exactly the
> features you need, you wind up with a kludge. Oh well...shit happens.
>
> I'm not saying it's a great solution. But it's a far cry from a conclusion
> that it simply cannot be done with the Java API as it exists now.

It's an abuse of the notion of a filter, but yes, it can be made to
work. I stand corrected.

--
Jim Janney
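For reference, the kludge debated above might look roughly like the sketch below. It assumes nothing beyond the documented `File.listFiles(FileFilter)` contract: the filter stores what it sees and throws an unchecked exception once it has enough. The names `FirstFilesByException` and `LimitReached` are made up for illustration. As the posts say, whether this saves any OS work depends on how listFiles() is implemented; if every name is fetched before the filter is first called, the sketch only avoids building the remaining File objects.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any library.
final class FirstFilesByException {

    /** Thrown by the filter purely to abort listFiles(); it signals no error. */
    private static final class LimitReached extends RuntimeException {
        private static final long serialVersionUID = 1L;
    }

    /** Returns at most max entries of dir, in whatever order listFiles() offers them. */
    public static List<File> firstFiles(File dir, final int max) {
        final List<File> seen = new ArrayList<File>();
        if (max <= 0) {
            return seen;
        }
        try {
            dir.listFiles(new FileFilter() {
                @Override
                public boolean accept(File f) {
                    seen.add(f);              // the filter keeps the results itself
                    if (seen.size() >= max) {
                        throw new LimitReached(); // abort the enumeration
                    }
                    return false;             // the returned array is never used
                }
            });
        } catch (LimitReached stop) {
            // expected: enough files were collected
        }
        return seen;
    }
}
```

Calling `FirstFilesByException.firstFiles(new File("/work"), 10)` would return up to ten entries; "first" still means only "whatever the directory listing yields first".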
| From | Arved Sandstrom <asandstrom2@eastlink.ca> |
|---|---|
| Date | 2013-01-26 22:02 -0400 |
| Message-ID | <iV%Ms.125649$Id.75544@newsfe24.iad> |
| In reply to | #21764 |
On 01/26/2013 09:42 PM, Eric Sosman wrote:
> On 1/26/2013 6:21 PM, Peter Duniho wrote:
>> On Sat, 26 Jan 2013 17:06:07 -0500, Eric Sosman wrote:
>>
>>> On 1/26/2013 4:15 PM, Robert Klemme wrote:
>>>> On 26.01.2013 19:26, Arne Vajhøj wrote:
>>>>
>>>>> But I am a bit skeptical about whether a String[] with 30K elements
>>>>> is really the bottleneck.
>>>>>
>>>>> If the real bottleneck is the OS calls to get next file, then
>>>>> a filter like this will not help.
>>>>
>>>> Why?
>>>
>>> Because the listFiles() method will fetch the information
>>> for all 30K files from the O/S, will construct 30K File objects
>>> to represent them, and will submit all 30K File objects to the
>>> FileFilter, one by one. The FileFilter will (very quickly)
>>> reject 29.99K of the 30K Files, but ...
>>
>> Will it?
>
> Necessarily. As far as listFiles() knows, the FileFilter
> might accept the very last File object given to it. Therefore,
> listFiles() cannot fail to present that very last File -- and
> every other File -- for inspection.
[ SNIP ]

I'd have to agree. A simple test shows this to be the case, but your
reasoning precludes having to run such a test in the first place.

My code "gets" the first N files from listFiles(), for some definition
of "first", but it certainly doesn't only get N files from the OS.

Based on Wojtek's later post, I'd be examining the entire problem in
more detail before arriving at a decent solution. I don't think most of
the problem pertaining to offering reasonable batches of files to a Java
program for processing is something that I'd address in Java anyway.

AHS
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-26 21:35 -0500 |
| Message-ID | <5104925e$0$284$14726298@news.sunsite.dk> |
| In reply to | #21767 |
On 1/26/2013 9:02 PM, Arved Sandstrom wrote:
> On 01/26/2013 09:42 PM, Eric Sosman wrote:
>> On 1/26/2013 6:21 PM, Peter Duniho wrote:
>>> On Sat, 26 Jan 2013 17:06:07 -0500, Eric Sosman wrote:
>>>
>>>> On 1/26/2013 4:15 PM, Robert Klemme wrote:
>>>>> On 26.01.2013 19:26, Arne Vajhøj wrote:
>>>>>
>>>>>> But I am a bit skeptical about whether a String[] with 30K elements
>>>>>> is really the bottleneck.
>>>>>>
>>>>>> If the real bottleneck is the OS calls to get next file, then
>>>>>> a filter like this will not help.
>>>>>
>>>>> Why?
>>>>
>>>> Because the listFiles() method will fetch the information
>>>> for all 30K files from the O/S, will construct 30K File objects
>>>> to represent them, and will submit all 30K File objects to the
>>>> FileFilter, one by one. The FileFilter will (very quickly)
>>>> reject 29.99K of the 30K Files, but ...
>>>
>>> Will it?
>>
>> Necessarily. As far as listFiles() knows, the FileFilter
>> might accept the very last File object given to it. Therefore,
>> listFiles() cannot fail to present that very last File -- and
>> every other File -- for inspection.
> [ SNIP ]
>
> I'd have to agree. A simple test shows this to be the case, but your
> reasoning precludes having to run such a test in the first place.
>
> My code "gets" the first N files from listFiles(), for some definition
> of "first", but it certainly doesn't only get N files from the OS.
>
> Based on Wojtek's later post, I'd be examining the entire problem in
> more detail before arriving at a decent solution. I don't think most of
> the problem pertaining to offering reasonable batches of files to a Java
> program for processing is something that I'd address in Java anyway.

If OP happens to be on Java 7, then I will suggest using:

java.nio.file.Files.newDirectoryStream(dir)

It is a straightforward way of getting the first N files.

And it is as likely as the exception hack not to read
all filenames from the OS.

Arne
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-26 21:43 -0500 |
| Message-ID | <51049469$0$293$14726298@news.sunsite.dk> |
| In reply to | #21771 |
On 1/26/2013 9:35 PM, Arne Vajhøj wrote:
> If OP happens to be on Java 7, then I will suggest using:
>
> java.nio.file.Files.newDirectoryStream(dir)
>
> It is a straightforward way of getting the first N files.
>
> And it is as likely as the exception hack not to read
> all filenames from the OS.

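import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Iterator;

public class ListFilesWithLimit {
    public static void main(String[] args) throws IOException {
        // try-with-resources closes the directory handle when done
        try (DirectoryStream<Path> stream =
                Files.newDirectoryStream(Paths.get("/work"))) {
            Iterator<Path> dir = stream.iterator();
            int n = 0;
            // count each entry so the loop stops after 10
            while (dir.hasNext() && n < 10) {
                System.out.println(dir.next());
                n++;
            }
        }
    }
}
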
Arne
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2013-01-27 13:55 +0100 |
| Message-ID | <amkmdqFlumnU1@mid.individual.net> |
| In reply to | #21773 |
On 27.01.2013 03:43, Arne Vajhøj wrote:
> On 1/26/2013 9:35 PM, Arne Vajhøj wrote:
>> If OP happens to be on Java 7, then I will suggest using:
>>
>> java.nio.file.Files.newDirectoryStream(dir)
>>
>> It is a straightforward way of getting the first N files.
>>
>> And it is as likely as the exception hack not to read
>> all filenames from the OS.

For earlier Java versions we could emulate that with a second thread.

package file;

import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.TimeUnit;

public final class ListFileTestThreaded2 {

    /** Runs listFiles() on its own thread and hands over the first maxFiles entries. */
    private static final class CountFilterThread extends Thread
            implements FileFilter {

        private final File dir;
        private final int maxFiles;
        private final BlockingQueue<List<File>> queue;
        // Set to null once the batch has been handed over.
        private List<File> filesSeen = new ArrayList<File>();

        public CountFilterThread(File dir, int maxFiles,
                BlockingQueue<List<File>> queue) {
            this.dir = dir;
            this.maxFiles = maxFiles;
            this.queue = queue;
        }

        @Override
        public void run() {
            try {
                dir.listFiles(this);

                // Fewer than maxFiles entries: send what we have.
                if (filesSeen != null) {
                    send();
                }
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }

        private void send() throws InterruptedException {
            queue.put(filesSeen);
            filesSeen = null;
        }

        @Override
        public boolean accept(final File f) {
            try {
                if (filesSeen != null) {
                    filesSeen.add(f);

                    if (filesSeen.size() == maxFiles) {
                        send();
                        assert filesSeen == null;
                    }
                }

                // Reject everything; results travel through the queue instead.
                return false;
            } catch (InterruptedException e) {
                throw new IllegalStateException(e);
            }
        }
    }

    private static final int[] LIMITS = { 10, 100, 1000, 10000,
            Integer.MAX_VALUE };

    public static void main(String[] args) throws InterruptedException {
        for (final String s : args) {
            System.out.println("Testing: " + s);
            final File dir = new File(s);

            if (dir.isDirectory()) {
                for (final int limit : LIMITS) {
                    final SynchronousQueue<List<File>> queue =
                            new SynchronousQueue<List<File>>();
                    final CountFilterThread cf =
                            new CountFilterThread(dir, limit, queue);
                    cf.setDaemon(true);
                    final long t1 = System.nanoTime();
                    cf.start();
                    final List<File> entries = queue.take();
                    final long delta = System.nanoTime() - t1;
                    System.out.printf(
                            "It took %20dus to retrieve %20d files, %20.5fus/file.\n",
                            TimeUnit.NANOSECONDS.toMicros(delta), entries.size(),
                            (double) TimeUnit.NANOSECONDS.toMicros(delta)
                                    / entries.size());
                }
            } else {
                System.out.println("Not a directory.");
            }
        }

        System.out.println("done");
    }
}

https://gist.github.com/4648256
It's not guaranteed though that this will be faster. And it's
definitely not simpler than the straightforward approach. :-)
Cheers
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-02-24 17:50 -0500 |
| Message-ID | <512a993e$0$289$14726298@news.sunsite.dk> |
| In reply to | #21797 |
On 1/27/2013 7:55 AM, Robert Klemme wrote:
> On 27.01.2013 03:43, Arne Vajhøj wrote:
>> On 1/26/2013 9:35 PM, Arne Vajhøj wrote:
>>> If OP happens to be on Java 7, then I will suggest using:
>>>
>>> java.nio.file.Files.newDirectoryStream(dir)
>>>
>>> It is a straightforward way of getting the first N files.
>>>
>>> And it is as likely as the exception hack not to read
>>> all filenames from the OS.
>
> For earlier Java versions we could emulate that with a second thread.
>
> https://gist.github.com/4648256
>
> It's not guaranteed though that this will be faster. And it's
> definitely not simpler than the straightforward approach. :-)

Is that much different from the throw-an-exception-in-the-filter
solution, except that it requires a lot more code?

Arne
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2013-02-25 21:53 +0100 |
| Message-ID | <ap21b9F52mbU2@mid.individual.net> |
| In reply to | #22489 |
On 24.02.2013 23:50, Arne Vajhøj wrote:
> On 1/27/2013 7:55 AM, Robert Klemme wrote:
>> On 27.01.2013 03:43, Arne Vajhøj wrote:
>>> On 1/26/2013 9:35 PM, Arne Vajhøj wrote:
>>>> If OP happens to be on Java 7, then I will suggest using:
>>>>
>>>> java.nio.file.Files.newDirectoryStream(dir)
>>>>
>>>> It is a straightforward way of getting the first N files.
>>>>
>>>> And it is as likely as the exception hack not to read
>>>> all filenames from the OS.
>>
>> For earlier Java versions we could emulate that with a second thread.
>>
>> https://gist.github.com/4648256
>>
>> It's not guaranteed though that this will be faster. And it's
>> definitely not simpler than the straightforward approach. :-)
>
> Is that much different from the throw-an-exception-in-the-filter
> solution, except that it requires a lot more code?

No.
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
| From | Jim Janney <jjanney@shell.xmission.com> |
|---|---|
| Date | 2013-01-26 20:57 -0700 |
| Message-ID | <ydntxq3z3ao.fsf@shell.xmission.com> |
| In reply to | #21767 |
Arved Sandstrom <asandstrom2@eastlink.ca> writes:

> On 01/26/2013 09:42 PM, Eric Sosman wrote:
>> On 1/26/2013 6:21 PM, Peter Duniho wrote:
>>> On Sat, 26 Jan 2013 17:06:07 -0500, Eric Sosman wrote:
>>>
>>>> On 1/26/2013 4:15 PM, Robert Klemme wrote:
>>>>> On 26.01.2013 19:26, Arne Vajhøj wrote:
>>>>>
>>>>>> But I am a bit skeptical about whether a String[] with 30K elements
>>>>>> is really the bottleneck.
>>>>>>
>>>>>> If the real bottleneck is the OS calls to get next file, then
>>>>>> a filter like this will not help.
>>>>>
>>>>> Why?
>>>>
>>>> Because the listFiles() method will fetch the information
>>>> for all 30K files from the O/S, will construct 30K File objects
>>>> to represent them, and will submit all 30K File objects to the
>>>> FileFilter, one by one. The FileFilter will (very quickly)
>>>> reject 29.99K of the 30K Files, but ...
>>>
>>> Will it?
>>
>> Necessarily. As far as listFiles() knows, the FileFilter
>> might accept the very last File object given to it. Therefore,
>> listFiles() cannot fail to present that very last File -- and
>> every other File -- for inspection.
> [ SNIP ]
>
> I'd have to agree. A simple test shows this to be the case, but your
> reasoning precludes having to run such a test in the first place.
>
> My code "gets" the first N files from listFiles(), for some definition
> of "first", but it certainly doesn't only get N files from the OS.
>
> Based on Wojtek's later post, I'd be examining the entire problem in
> more detail before arriving at a decent solution. I don't think most
> of the problem pertaining to offering reasonable batches of files to a
> Java program for processing is something that I'd address in Java
> anyway.

There's also the problem of starvation, since we have no guarantees
concerning the order of entries in the directory.

--
Jim Janney
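One way to sidestep the starvation concern, sketched here under the assumption that Java 7 NIO is available and that "oldest modification time first" is an acceptable fairness rule (that rule is an assumption, not something the thread settles). The class name `OldestFiles` is hypothetical. Note that this still visits every directory entry, so it addresses ordering only, not the cost of enumerating 30K names.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical helper, not part of any library.
final class OldestFiles {

    private static final class Entry {
        final Path path;
        final FileTime modified;

        Entry(Path path, FileTime modified) {
            this.path = path;
            this.modified = modified;
        }
    }

    /** Returns up to n entries of dir with the oldest modification times, oldest first. */
    public static List<Path> oldest(Path dir, int n) throws IOException {
        List<Path> result = new ArrayList<Path>();
        if (n <= 0) {
            return result;
        }
        // Max-heap on modification time: the head is the newest entry kept so far.
        PriorityQueue<Entry> heap = new PriorityQueue<Entry>(11, new Comparator<Entry>() {
            @Override
            public int compare(Entry a, Entry b) {
                return b.modified.compareTo(a.modified);
            }
        });
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                heap.add(new Entry(p, Files.getLastModifiedTime(p)));
                if (heap.size() > n) {
                    heap.poll(); // drop the newest, keeping the n oldest
                }
            }
        }
        while (!heap.isEmpty()) {
            result.add(heap.poll().path); // comes out newest first
        }
        Collections.reverse(result);      // flip to oldest first
        return result;
    }
}
```

Memory stays bounded by n rather than by the directory size, which is the main reason for the heap instead of sorting a full listing.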
| From | Wojtek <nowhere@a.com> |
|---|---|
| Date | 2013-01-26 21:20 -0800 |
| Message-ID | <mn.d5007dd1cea6abb1.70216@a.com> |
| In reply to | #21767 |
Arved Sandstrom wrote :
> I'd be examining the entire problem in more detail before arriving at a
> decent solution. I don't think most of the problem pertaining to offering
> reasonable batches of files to a Java program for processing is something
> that I'd address in Java anyway.

Events are on a per-user basis, that is to say each user has their own
event list.

The events are observed when the user logs in. Might be today or next week.

To keep server processing reasonable I want to limit the number of
events sent back to the user at a time (10 was just a number I pulled
out of the air, obviously some tuning is required, and might even be
dynamic depending on how busy the rest of the system is).

I have no control over the number of events, how often they occur, nor
how often a user logs in to look at them. 30K might be the high end,
though I need to cover it if I get a busy event set and a lazy user.

I might even set up a DB table for each user and store each event file
as it comes in. Then use the DB to get the file names.

Still white-boarding this...

--
Wojtek :-)
| From | Arved Sandstrom <asandstrom2@eastlink.ca> |
|---|---|
| Date | 2013-01-27 07:23 -0400 |
| Message-ID | <278Ns.137021$pV4.59710@newsfe21.iad> |
| In reply to | #21788 |
On 01/27/2013 01:20 AM, Wojtek wrote:
> Arved Sandstrom wrote :
>> I'd be examining the entire problem in more detail before arriving at
>> a decent solution. I don't think most of the problem pertaining to
>> offering reasonable batches of files to a Java program for processing
>> is something that I'd address in Java anyway.
>
> Events are on a per-user basis, that is to say each user has their own
> event list.
>
> The events are observed when the user logs in. Might be today or next week.
>
> To keep server processing reasonable I want to limit the number of
> events sent back to the user at a time (10 was just a number I pulled
> out of the air, obviously some tuning is required, and might even be
> dynamic depending on how busy the rest of the system is).
>
> I have no control over the number of events, how often they occur, nor
> how often a user logs in to look at them. 30K might be the high end,
> though I need to cover it if I get a busy event set and a lazy user.
>
> I might even set up a DB table for each user and store each event file
> as it comes in. Then use the DB to get the file names.
>
> Still white-boarding this...
>

A file is not actually an unreasonable place to keep info for one event.
You want to store that information *someplace*, and a file is not worse
than a row in a DB table or a message on a queue somewhere. It's just
that we don't want to have tens or hundreds of thousands of files in one
directory.

SIDE NOTE: don't set up a DB table for each user. :-)

Why not use the NIO2 watch service, and observe the event file input
directory for file creation events? On each such event do something with
the event file. Number of options here:

1. Move it into a user-specific directory;
2. Append it to a user-specific event file;
3. Put it in a DB.

etc

I sort of like (2) myself.

What do you mean by keeping server processing reasonable?

AHS
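A rough sketch of the watch-service idea with option 1 (move each new file into a user-specific directory), using the Java 7 `java.nio.file.WatchService` API. Everything about the file naming is an assumption made up for illustration: the thread never says how an event file identifies its user, so the sketch pretends names look like `<user>-<anything>`. The class `EventFileSorter` and its methods are hypothetical as well.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardWatchEventKinds;
import java.nio.file.WatchEvent;
import java.nio.file.WatchKey;
import java.nio.file.WatchService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of any library.
final class EventFileSorter {

    /** Assumed naming convention: "<user>-<anything>", e.g. "alice-0001.evt". */
    public static String userOf(Path fileName) {
        String name = fileName.toString();
        int dash = name.indexOf('-');
        return dash > 0 ? name.substring(0, dash) : "unknown";
    }

    /** Moves one event file into perUserRoot/<user>/ and returns its new location. */
    public static Path sort(Path file, Path perUserRoot) throws IOException {
        Path userDir = perUserRoot.resolve(userOf(file.getFileName()));
        Files.createDirectories(userDir);
        return Files.move(file, userDir.resolve(file.getFileName()),
                StandardCopyOption.REPLACE_EXISTING);
    }

    /** Sorts new files in inbox until no event arrives for idleSeconds; returns the count moved. */
    public static int watch(Path inbox, Path perUserRoot, long idleSeconds)
            throws IOException, InterruptedException {
        int moved = 0;
        try (WatchService watcher = inbox.getFileSystem().newWatchService()) {
            inbox.register(watcher, StandardWatchEventKinds.ENTRY_CREATE);
            WatchKey key;
            while ((key = watcher.poll(idleSeconds, TimeUnit.SECONDS)) != null) {
                for (WatchEvent<?> event : key.pollEvents()) {
                    if (event.kind() == StandardWatchEventKinds.OVERFLOW) {
                        continue; // events were lost; a real service would rescan inbox
                    }
                    Path created = inbox.resolve((Path) event.context());
                    if (Files.isRegularFile(created)) {
                        sort(created, perUserRoot);
                        moved++;
                    }
                }
                if (!key.reset()) {
                    break; // inbox is no longer accessible
                }
            }
        }
        return moved;
    }
}
```

One caveat worth keeping in mind: a create event can arrive before the writer has finished the file, so a producer that writes to a temporary name and renames into the inbox would be safer than this sketch assumes.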
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-27 20:36 -0500 |
| Message-ID | <5105d604$0$281$14726298@news.sunsite.dk> |
| In reply to | #21788 |
On 1/27/2013 12:20 AM, Wojtek wrote:
> Arved Sandstrom wrote :
>> I'd be examining the entire problem in more detail before arriving at
>> a decent solution. I don't think most of the problem pertaining to
>> offering reasonable batches of files to a Java program for processing
>> is something that I'd address in Java anyway.
>
> Events are on a per-user basis, that is to say each user has their own
> event list.
>
> The events are observed when the user logs in. Might be today or next week.
>
> To keep server processing reasonable I want to limit the number of
> events sent back to the user at a time (10 was just a number I pulled
> out of the air, obviously some tuning is required, and might even be
> dynamic depending on how busy the rest of the system is).
>
> I have no control over the number of events, how often they occur, nor
> how often a user logs in to look at them. 30K might be the high end,
> though I need to cover it if I get a busy event set and a lazy user.
>
> I might even set up a DB table for each user and store each event file
> as it comes in. Then use the DB to get the file names.
>
> Still white-boarding this...

Java 6 no DB:

Spread files out over some subdirs.

Java 7 no DB:

Use new NIO capabilities.

DB:

Single table for all users and just use index.

Arne
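For the "Java 6 no DB" option, spreading files over subdirectories could be as simple as hashing the file name into a fixed number of buckets, so no single directory grows to 30K entries. This is only a sketch: the class `ShardedEventStore`, the bucket count, and the two-hex-digit directory names are illustrative assumptions (a per-user subdirectory would be another reasonable split). It uses only `java.io.File`, so it should work on pre-NIO2 Java versions.

```java
import java.io.File;

// Hypothetical helper, not part of any library.
final class ShardedEventStore {
    private final File root;
    private final int shards;

    public ShardedEventStore(File root, int shards) {
        if (shards <= 0) {
            throw new IllegalArgumentException("shards must be positive");
        }
        this.root = root;
        this.shards = shards;
    }

    /** Subdirectory for a file name: root/<bucket as at least two hex digits>. */
    public File shardDirFor(String fileName) {
        // Mask the sign bit so the remainder is never negative.
        int bucket = (fileName.hashCode() & 0x7fffffff) % shards;
        return new File(root, String.format("%02x", bucket));
    }

    /** Location where the named event file should live; creates the shard directory if needed. */
    public File fileFor(String fileName) {
        File dir = shardDirFor(fileName);
        dir.mkdirs();
        return new File(dir, fileName);
    }
}
```

With 256 buckets, 30K files would average a little over a hundred per directory, and listing one bucket at a time gives a natural batch size.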
| From | Wojtek <nowhere@a.com> |
|---|---|
| Date | 2013-01-28 16:28 -0800 |
| Message-ID | <mn.e3dc7dd1c4cbc071.70216@a.com> |
| In reply to | #21812 |
Arne Vajhøj wrote :
> DB:
>
> Single table for all users and just use index.

Sigh, this is what comes out when I am really tired and the fermented
grape juice takes effect.

I have to stop thinking about this stuff on weekends...

--
Wojtek :-)
| From | Arne Vajhøj <arne@vajhoej.dk> |
|---|---|
| Date | 2013-01-26 21:23 -0500 |
| Message-ID | <51048f97$0$284$14726298@news.sunsite.dk> |
| In reply to | #21760 |
On 1/26/2013 6:21 PM, Peter Duniho wrote:
> On Sat, 26 Jan 2013 17:06:07 -0500, Eric Sosman wrote:
>
>> On 1/26/2013 4:15 PM, Robert Klemme wrote:
>>> On 26.01.2013 19:26, Arne Vajhøj wrote:
>>>
>>>> But I am a bit skeptical about whether a String[] with 30K elements
>>>> is really the bottleneck.
>>>>
>>>> If the real bottleneck is the OS calls to get next file, then
>>>> a filter like this will not help.
>>>
>>> Why?
>>
>> Because the listFiles() method will fetch the information
>> for all 30K files from the O/S, will construct 30K File objects
>> to represent them, and will submit all 30K File objects to the
>> FileFilter, one by one. The FileFilter will (very quickly)
>> reject 29.99K of the 30K Files, but ...
>
> Will it?
>
> It is plausible that the implementation of listFiles() uses an OS API that
> enumerates files one at a time. On Windows, getting the first file of the
> enumeration is faster than asking for all the files at once.
>
> Indeed, I suppose one could throw an exception from the FileFilter accept()
> method to interrupt enumeration, if that's how listFiles() is implemented.
> That would avoid the need to enumerate more than the needed number of
> actual files.
>
> Of course, this is all implementation-dependent and since it's not
> explicitly documented, could change at any time anyway. But unless you've
> actually examined the implementation details for listFiles(), it's not a
> foregone conclusion that the technique of using a FileFilter offers no way
> to improve latency.

It is a foregone conclusion that the posted code that Eric commented on
would read all files, because it did not throw an exception.

Code with a different logic could behave differently.

Arne
| From | Roedy Green <see_website@mindprod.com.invalid> |
|---|---|
| Date | 2013-01-26 19:09 -0800 |
| Message-ID | <2469g89u7vpchs8lo0lbc7dh7lrtldslor@4ax.com> |
| In reply to | #21760 |
On Sat, 26 Jan 2013 15:21:53 -0800, Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> wrote, quoted or indirectly quoted someone who said :
>Indeed, I suppose one could throw an exception from the FileFilter accept()
>method to interrupt enumeration, if that's how listFiles() is implemented.
>That would avoid the need to enumerate more than the needed number of
>actual files.
You could resolve that question with some System.nanoTime dumps: how long does the first file take to show up relative to the others?
IIRC it builds the array then feeds it to the Filter, but that could just have been someone explaining how it works conceptually. I do know that Java takes a lot longer to span a disk than C.
Building the array first means less native code needed for multiplatform implementation.
For most applications, you need to run every file name through the filter so it does not matter which you do first. You would save building File objects for items not passing the Filter.
--
Roedy Green Canadian Mind Products http://mindprod.com
The first 90% of the code accounts for the first 90% of the development time. The remaining 10% of the code accounts for the other 90% of the development time. ~ Tom Cargill Ninety-ninety Law
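A minimal sketch of the System.nanoTime probe suggested above, assuming a hypothetical class FilterTimingProbe: it records how long it takes until accept() is called for the first time and compares that with the total time spent in listFiles(). If the first call arrives only shortly before listFiles() returns, that would hint that the names were gathered before filtering started; it is a hint, not proof, and results will vary by JVM and OS.

```java
import java.io.File;
import java.io.FileFilter;

class FilterTimingProbe {
    /**
     * Returns { nanos until the first accept() call (-1 if never called),
     *           nanos until listFiles() returned,
     *           number of accept() calls }.
     */
    static long[] probe(File dir) {
        final long[] firstCall = { -1L };
        final long[] calls = { 0L };
        final long start = System.nanoTime();
        dir.listFiles(new FileFilter() {
            @Override
            public boolean accept(File f) {
                if (firstCall[0] < 0) {
                    firstCall[0] = System.nanoTime() - start;
                }
                calls[0]++;
                return false; // the result array is not needed for the measurement
            }
        });
        long total = System.nanoTime() - start;
        return new long[] { firstCall[0], total, calls[0] };
    }

    public static void main(String[] args) {
        long[] r = probe(new File(args.length > 0 ? args[0] : "."));
        System.out.println("first accept() after " + r[0] + " ns");
        System.out.println("listFiles() returned after " + r[1] + " ns");
        System.out.println("accept() calls: " + r[2]);
    }
}
```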
| From | Jim Janney <jjanney@shell.xmission.com> |
|---|---|
| Date | 2013-01-26 16:00 -0700 |
| Message-ID | <ydnbocb1rdx.fsf@shell.xmission.com> |
| In reply to | #21727 |
Wojtek <nowhere@a.com> writes:
> Using:
>
> int max = 10;
> int count = 0;
>
> for (File thisFile : aDir.listFiles())
> {
> doSomething(thisFile);
>
> if ( ++count >= max )
> break;
> }
>
> gives me the first ten files in aDir. But if aDir contains 30K files,
> then the listFiles() will run for a long time as it builds an array
> for the 30K files.
>
> Is there a way to have Java only get the first "max" files?
As Roedy says, in pure Java there's no way to avoid reading the entire
directory, whether it builds the entire array or not. And if you want
them in any particular order it's necessary to read them all and sort
them anyway.
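The point about ordering can be sketched as follows, assuming a hypothetical helper firstSorted: File.list() makes no guarantee about the order of the names it returns, so getting the "first" entries in a defined order means reading all names, sorting them, and only then truncating.

```java
import java.io.File;
import java.util.Arrays;

class FirstSortedNames {
    /** Returns the first max names of dir in natural String order. */
    static String[] firstSorted(File dir, int max) {
        String[] names = dir.list(); // null if dir is not a directory or cannot be read
        if (names == null) {
            return new String[0];
        }
        Arrays.sort(names); // the whole listing has to be read before it can be sorted
        int keep = Math.min(Math.max(max, 0), names.length);
        return Arrays.copyOf(names, keep);
    }

    public static void main(String[] args) {
        for (String name : firstSorted(new File(args.length > 0 ? args[0] : "."), 10)) {
            System.out.println(name);
        }
    }
}
```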
--
Jim Janney
| From | Knute Johnson <nospam@knutejohnson.com> |
|---|---|
| Date | 2013-01-26 18:37 -0800 |
| Message-ID | <ke23sm$afb$1@dont-email.me> |
| In reply to | #21727 |
On 1/26/2013 1:14 AM, Wojtek wrote:
> Using:
>
> int max = 10;
> int count = 0;
>
> for (File thisFile : aDir.listFiles())
> {
> doSomething(thisFile);
>
> if ( ++count >= max )
> break;
> }
>
> gives me the first ten files in aDir. But if aDir contains 30K files,
> then the listFiles() will run for a long time as it builds an array for
> the 30K files.
>
> Is there a way to have Java only get the first "max" files?
>
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

public class FileSystemsTest {
    public static void main(String[] args) throws IOException {
        long start = System.currentTimeMillis();

        Path dir = FileSystems.getDefault().getPath(".");
        int i = 10;
        // try-with-resources closes the directory handle even when the loop exits early
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path path : stream) {
                System.out.println(path.getFileName());
                if (--i <= 0)
                    break;
            }
        }

        long stop = System.currentTimeMillis();
        System.out.println(stop - start);
    }
}
30003 files in the directory, almost 1.7GB of files, Windows XP, Java 7
and it takes 16 ms to run. Somebody else should try this out on their
computer to see if it works as fast.
.
.
.
01/26/2013 05:46 PM 58,890 9998.txt
01/26/2013 05:46 PM 58,890 9999.txt
01/26/2013 06:31 PM 1,316 FileSystemsTest.class
01/26/2013 06:29 PM 636 FileSystemsTest.java
01/26/2013 05:44 PM 650 MakeFiles.java
30003 File(s) 1,766,702,602 bytes
2 Dir(s) 49,387,085,824 bytes free
C:\Documents and Settings\Knute Johnson\bigdirectory>java FileSystemsTest
0.txt
1.txt
10.txt
100.txt
1000.txt
10000.txt
10001.txt
10002.txt
10003.txt
10004.txt
16
C:\Documents and Settings\Knute Johnson\bigdirectory>
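As a variation on the DirectoryStream approach, the sketch below wraps it in a helper that also applies a glob while iterating, so non-matching entries never count toward the limit. The class and method names (FirstEntries, firstN) are made up for illustration; Files.newDirectoryStream(Path, String) is the standard overload that matches file names against a glob pattern.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class FirstEntries {
    /** Returns at most max entries of dir whose file names match glob. */
    static List<Path> firstN(Path dir, String glob, int max) throws IOException {
        List<Path> result = new ArrayList<Path>();
        if (max <= 0) {
            return result;
        }
        // The stream is closed as soon as the block is left, including via break.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                result.add(p);
                if (result.size() >= max) {
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : firstN(dir, "*.txt", 10)) {
            System.out.println(p.getFileName());
        }
    }
}
```

The iteration order is still whatever the file system hands back, so "first" here means "first encountered", not "first alphabetically".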
--
Knute Johnson
| From | Wojtek <nowhere@a.com> |
|---|---|
| Date | 2013-03-14 03:07 -0700 |
| Message-ID | <mn.70bb7dd3ff682a5b.70216@a.com> |
| In reply to | #21772 |
Knute Johnson wrote :
>
> 30003 files in the directory, almost 1.7GB of files, Windows XP, Java 7 and
> it takes 16 ms to run. Somebody else should try this out on their computer
> to see if it works as fast.
Ok, I'm back :-)
I am using WinXP and Java 7, and the directory holds 30,001 32K files,
920MBytes
The Code:
----------------------------------------------------
package tester;

import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;

public class NewsGroup
{
    public static void main( String[] args ) throws IOException
    {
        int maxFiles = 10;

        System.out.println( "Large File Number Tester" );

        // Guard against a missing argument instead of failing on args[0]
        if (args.length > 0 && args[0].equals( "nio" ))
            nioRun( "C:\\apps\\test", maxFiles );
        else if (args.length > 0 && args[0].equals( "io" ))
            ioRun( "C:\\apps\\test", maxFiles );
        else
            System.out.println( "NewsGroup io|nio" );
    }

    private static void ioRun( String filePath, int maxFiles ) throws IOException
    {
        int i = 1;

        System.out.println( "IO run" );
        long start = System.currentTimeMillis();

        File folder = new File( filePath );
        File[] listOfFiles = folder.listFiles();

        // listFiles() returns null if the path is not a readable directory
        if (listOfFiles == null)
            throw new IOException( "Cannot list " + filePath );

        for (File file : listOfFiles)
        {
            System.out.println( "" + i + ": " + file.getName() );
            if (++i > maxFiles)
                break;
        }

        long stop = System.currentTimeMillis();
        System.out.println( "Elapsed: " + (stop - start) + " ms" );
    }

    private static void nioRun( String filePath, int maxFiles ) throws IOException
    {
        int i = 1;

        System.out.println( "NIO run" );
        long start = System.currentTimeMillis();

        Path dir = FileSystems.getDefault().getPath( filePath );

        // try-with-resources closes the directory stream after the early break
        try (DirectoryStream<Path> stream = Files.newDirectoryStream( dir ))
        {
            for (Path path : stream)
            {
                System.out.println( "" + i + ": " + path.getFileName() );
                if (++i > maxFiles)
                    break;
            }
        }

        long stop = System.currentTimeMillis();
        System.out.println( "Elapsed: " + (stop - start) + " ms" );
    }
}
----------------------------------------------------
A batch file to run it:
----------------------------------------------------
@echo off
java -jar NewsGroup.jar %1
----------------------------------------------------
And the results:
----------------------------------------------------
C:\apps>run io
Large File Number Tester
IO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 156 ms
C:\apps>run io
Large File Number Tester
IO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 140 ms
C:\apps>run io
Large File Number Tester
IO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 156 ms
C:\apps>run nio
Large File Number Tester
NIO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 219 ms
C:\apps>run nio
Large File Number Tester
NIO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 31 ms
C:\apps>run nio
Large File Number Tester
NIO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 31 ms
C:\apps>run nio
Large File Number Tester
NIO run
1: F_000000.jpg
2: F_000001.jpg
3: F_000002.jpg
4: F_000003.jpg
5: F_000004.jpg
6: F_000005.jpg
7: F_000006.jpg
8: F_000007.jpg
9: F_000008.jpg
10: F_000009.jpg
Elapsed: 78 ms
C:\apps>
----------------------------------------------------
So NIO is about 4-5 times faster than IO. The first NIO run looks like
an anomaly, might be some JRE loading happening.
All the runs produce different timings, might be a Windows caching
effect. However the NIO is consistently much faster overall.
--
Wojtek :-)
| From | lipska the kat <"nospam at neversurrender dot co dot uk"> |
|---|---|
| Date | 2013-03-14 12:49 +0000 |
| Message-ID | <RvudnYCk24DlWtzMnZ2dnUVZ8iqdnZ2d@bt.com> |
| In reply to | #22952 |
On 14/03/13 10:07, Wojtek wrote:
> Knute Johnson wrote :
>>
>> 30003 files in the directory, almost 1.7GB of files, Windows XP, Java
>> 7 and it takes 16 ms to run. Somebody else should try this out on
>> their computer to see if it works as fast.
>
> Ok, I'm back :-)
[snip]
Ubuntu Linux 12.04 64 bit, default kernel
Intel® Pentium(R) CPU B960 @ 2.20GHz × 2
java-7-openjdk-amd64
64-Bit Server VM (build 22.0-b10, mixed mode)
4GB RAM
ls /var/images | wc -l
1292
Find below the results for 10 runs for a count of 10 files.
Results for 100 runs for a count of 100 files, omitting the filename output but executing file.getName(); or path.getFileName();, are available here:
http://pastebin.com/tyFni9xA
Large File Number Tester
============= Run 1 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 36 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 9 ms
============= Run 2 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 13 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 2 ms
============= Run 3 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 19 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 1 ms
============= Run 4 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 17 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 2 ms
============= Run 5 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 8 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 1 ms
============= Run 6 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 7 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 0 ms
============= Run 7 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 7 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 1 ms
============= Run 8 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 9 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 2 ms
============= Run 9 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 6 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 2 ms
============= Run 10 ====================
IO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 6 ms
----------
NIO run
1: tn_c0DgCQGwMklC.jpg
2: tn_7O3dRExnb1z5.jpg
3: tn_rljSB5H8yWGr.jpg
4: BpBBTq5FmOem.jpg
5: tn_FSc7Cn8S2KQP.jpg
6: tn_qRTnPNB7BdQq.jpg
7: IGbwhM3DsMlu.jpg
8: tn_VSetEaWqbuMD.jpg
9: IR081OywEjqb.jpg
10: cvNziQ9ecLeq.jpg
Elapsed: 2 ms
lipska
--
Lipska the Kat©: Troll hunter, sandbox destroyer and farscape dreamer of Aeryn Sun
| From | Robert Klemme <shortcutter@googlemail.com> |
|---|---|
| Date | 2013-03-15 11:38 +0100 |
| Message-ID | <aqgc14FilbeU1@mid.individual.net> |
| In reply to | #22952 |
On 14.03.2013 11:07, Wojtek wrote:
> So NIO is about 4-5 times faster than IO. The first NIO run looks like
> an anomaly, might be some JRE loading happening.
>
> All the runs produce different timings, might be a Windows caching
> effect. However the NIO is consistently much faster overall.
I am not convinced that this conclusion is warranted. There are a few factors which I believe make your conclusion questionable:
- You included class loading time in your measurement. For example, assuming that all io functionality is implemented on top of nio it would be logical to expect more classes to be loaded. There are a number of use cases where it matters - but there are also use cases where it doesn't matter (long running servers).
- Generally we are dealing with quite low timings (around 100ms) and relatively high variations. Also the test was made on Windows and the System.currentTimeMillis() is known to be imprecise on that platform (in the order of tens of milliseconds).
- Your io approach does not use FileFilter which some have suggested to be a way to avoid constructing a large result array.
- The test is an artificial situation. With all factors like JVM involved it may be that in a realistic application things look different to an extent that the differences you measured here do not matter any more.
Kind regards
robert
--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/
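One way to address the first two objections (class loading inside the timed region, and the coarse System.currentTimeMillis() clock) is sketched below. It runs both variants in the same JVM, discards a few warm-up runs, times the rest with System.nanoTime(), and reports the median. ListingBenchmark and its method names are hypothetical, and the warm-up and run counts are arbitrary choices; OS-level directory caching still affects the numbers, so this is only a rough comparison.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;

class ListingBenchmark {
    /** java.io variant: lists everything, then touches the first max names. */
    static int ioFirst(File dir, int max) {
        File[] files = dir.listFiles();
        if (files == null || max <= 0) {
            return 0;
        }
        int n = 0;
        for (File f : files) {
            f.getName();
            if (++n >= max) break;
        }
        return n;
    }

    /** java.nio.file variant: iterates lazily and stops after max entries. */
    static int nioFirst(Path dir, int max) throws IOException {
        int n = 0;
        if (max <= 0) {
            return n;
        }
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path p : stream) {
                p.getFileName();
                if (++n >= max) break;
            }
        }
        return n;
    }

    /** Median of runs timed runs in nanoseconds, after warmups untimed runs (runs must be >= 1). */
    static long medianNanos(Path dir, boolean useNio, int max, int warmups, int runs)
            throws IOException {
        long[] samples = new long[runs];
        for (int i = -warmups; i < runs; i++) {
            long start = System.nanoTime();
            if (useNio) {
                nioFirst(dir, max);
            } else {
                ioFirst(dir.toFile(), max);
            }
            long elapsed = System.nanoTime() - start;
            if (i >= 0) {
                samples[i] = elapsed; // negative indices are warm-up runs and are discarded
            }
        }
        Arrays.sort(samples);
        return samples[runs / 2];
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        System.out.println("io  median ns: " + medianNanos(dir, false, 10, 5, 21));
        System.out.println("nio median ns: " + medianNanos(dir, true, 10, 5, 21));
    }
}
```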
| From | Wojtek <nowhere@a.com> |
|---|---|
| Date | 2013-03-15 10:31 -0700 |
| Message-ID | <mn.7a777dd3df9ee7cf.70216@a.com> |
| In reply to | #22958 |
Robert Klemme wrote :
> On 14.03.2013 11:07, Wojtek wrote:
>
>> So NIO is about 4-5 times faster than IO. The first NIO run looks like
>> an anomaly, might be some JRE loading happening.
>>
>> All the runs produce different timings, might be a Windows caching
>> effect. However the NIO is consistently much faster overall.
>
> I am not convinced that this conclusion is warranted. There are a few
> factors which I believe make your conclusion questionable:
>
> - You included class loading time in your measurement. For example, assuming
> that all io functionality is implemented on top of nio it would be logical to
> expect more classes to be loaded. There are a number of use cases where it
> matters - but there are also use cases where it doesn't matter (long running
> servers).
The class loading would be part of a real project. The alternative is to keep an object around which holds a link to a directory. Actually many objects linked to many directories. Also the file list will change as files are added and deleted.
> - Generally we are dealing with quite low timings (around 100ms) and
> relatively high variations. Also the test was made on Windows and the
> System.currentTimeMillis() is known to be imprecise on that platform (in the
> order of tens of milliseconds).
Fair enough.
> - Your io approach does not use FileFilter which some have suggested to be a
> way to avoid constructing a large result array.
Yes, but to filter the result still means loading each file name, then checking to see if it matches the filter.
> - The test is an artificial situation. With all factors like JVM involved it
> may be that in a realistic application things look different to an extent
> that the differences you measured here do not matter any more.
While the absolute times may be questionable, the relative times are consistent.
--
Wojtek :-)