

Groups > comp.lang.java.programmer > #21727 > unrolled thread

The first 10 files

Started by: Wojtek <nowhere@a.com>
First post: 2013-01-26 01:14 -0800
Last post: 2013-03-15 10:31 -0700
Articles: 20 on this page of 40, 12 participants



Contents

  The first 10 files Wojtek <nowhere@a.com> - 2013-01-26 01:14 -0800
    Re: The first 10 files Roedy Green <see_website@mindprod.com.invalid> - 2013-01-26 02:44 -0800
      Re: The first 10 files Lew <lewbloch@gmail.com> - 2013-01-26 10:20 -0800
    Re: The first 10 files "John B. Matthews" <nospam@nospam.invalid> - 2013-01-26 06:31 -0500
      Re: The first 10 files Wojtek <nowhere@a.com> - 2013-01-26 15:42 -0800
        Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 17:13 -0700
        Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:21 -0500
        Re: The first 10 files "John B. Matthews" <nospam@nospam.invalid> - 2013-01-26 22:05 -0500
    Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-26 08:24 -0400
      Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-26 08:25 -0400
      Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 13:26 -0500
        Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-01-26 22:15 +0100
          Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 16:25 -0500
          Re: The first 10 files Eric Sosman <esosman@comcast-dot-net.invalid> - 2013-01-26 17:06 -0500
            Re: The first 10 files Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2013-01-26 15:21 -0800
              Re: The first 10 files Eric Sosman <esosman@comcast-dot-net.invalid> - 2013-01-26 20:42 -0500
                Re: The first 10 files Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com> - 2013-01-26 17:56 -0800
                  Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:29 -0500
                  Re: The first 10 files Eric Sosman <esosman@comcast-dot-net.invalid> - 2013-01-26 21:56 -0500
                    Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 20:51 -0700
                  Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 20:47 -0700
                Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-26 22:02 -0400
                  Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:35 -0500
                    Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:43 -0500
                      Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-01-27 13:55 +0100
                        Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-02-24 17:50 -0500
                          Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-02-25 21:53 +0100
                  Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 20:57 -0700
                  Re: The first 10 files Wojtek <nowhere@a.com> - 2013-01-26 21:20 -0800
                    Re: The first 10 files Arved Sandstrom <asandstrom2@eastlink.ca> - 2013-01-27 07:23 -0400
                    Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-27 20:36 -0500
                      Re: The first 10 files Wojtek <nowhere@a.com> - 2013-01-28 16:28 -0800
              Re: The first 10 files Arne Vajhøj <arne@vajhoej.dk> - 2013-01-26 21:23 -0500
              Re: The first 10 files Roedy Green <see_website@mindprod.com.invalid> - 2013-01-26 19:09 -0800
    Re: The first 10 files Jim Janney <jjanney@shell.xmission.com> - 2013-01-26 16:00 -0700
    Re: The first 10 files Knute Johnson <nospam@knutejohnson.com> - 2013-01-26 18:37 -0800
      Re: The first 10 files Wojtek <nowhere@a.com> - 2013-03-14 03:07 -0700
        Re: The first 10 files lipska the kat <"nospam at neversurrender dot co dot uk"> - 2013-03-14 12:49 +0000
        Re: The first 10 files Robert Klemme <shortcutter@googlemail.com> - 2013-03-15 11:38 +0100
          Re: The first 10 files Wojtek <nowhere@a.com> - 2013-03-15 10:31 -0700



#21727 — The first 10 files

From: Wojtek <nowhere@a.com>
Date: 2013-01-26 01:14 -0800
Subject: The first 10 files
Message-ID: <mn.d04a7dd156c148ef.70216@a.com>
Using:

int max = 10;
int count = 0;

for (File thisFile : aDir.listFiles())
{
  doSomething(thisFile);

  if ( ++count >= max )
    break;
}

gives me the first ten files in aDir. But if aDir contains 30K files, 
then the listFiles() will run for a long time as it builds an array for 
the 30K files.

Is there a way to have Java only get the first "max" files?

-- 
Wojtek :-)



#21728

From: Roedy Green <see_website@mindprod.com.invalid>
Date: 2013-01-26 02:44 -0800
Message-ID: <3sc7g8tm139j2aubd189pheotsbh3aomg1@4ax.com>
In reply to: #21727
On Sat, 26 Jan 2013 01:14:18 -0800, Wojtek <nowhere@a.com> wrote,
quoted or indirectly quoted someone who said :

>Is there a way to have Java only get the first "max" files?

not without jni
-- 
Roedy Green Canadian Mind Products http://mindprod.com
The first 90% of the code accounts for the first 90% of the development time.
The remaining 10% of the code accounts for the other 90% of the development 
time. 
~ Tom Cargill  Ninety-ninety Law 



#21742

From: Lew <lewbloch@gmail.com>
Date: 2013-01-26 10:20 -0800
Message-ID: <fa20dd32-6a30-49fa-bee7-8e49e8179228@googlegroups.com>
In reply to: #21728
Roedy Green wrote:
> Wojtek wrote, quoted or indirectly quoted someone who said :
>> Is there a way to have Java only get the first "max" files?
> 
> not without jni [sic]
 
Several useful answers from other respondents indicate otherwise.

-- 
Lew



#21734

From: "John B. Matthews" <nospam@nospam.invalid>
Date: 2013-01-26 06:31 -0500
Message-ID: <nospam-E6AD85.06312326012013@news.aioe.org>
In reply to: #21727
In article <mn.d04a7dd156c148ef.70216@a.com>, Wojtek <nowhere@a.com> 
wrote:

> Using:
> 
> int max = 10;
> int count = 0;
> 
> for (File thisFile : aDir.listFiles())
> {
>   doSomething(thisFile);
> 
>   if ( ++count >= max )
>     break;
> }
> 
> gives me the first ten files in aDir. But if aDir contains 30K files, 
> then the listFiles() will run for a long time as it builds an array 
> for the 30K files.
> 
> Is there a way to have Java only get the first "max" files?

In a GUI context, one approach uses a SwingWorker to query the file 
system in the background and update a `TableModel` in the worker's 
process() method. A complete example is examined here:

<http://codereview.stackexchange.com/q/4446/6692>

Although it may be beyond your control, you should also critically 
assess a design having tens of thousands of files in a single 
directory. 

-- 
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>



#21761

From: Wojtek <nowhere@a.com>
Date: 2013-01-26 15:42 -0800
Message-ID: <mn.d3ae7dd1c4b8d7a3.70216@a.com>
In reply to: #21734
John B. Matthews wrote :
> Although it may be beyond your control, you should also critically
> assess a design having tens of thousands of files in a single
> directory.

Well of course.

The directory holds files which are uploaded by external events. If 
there are a lot of events between application runs, then the number of 
files can indeed reach large numbers.

Since this is happening on a server, and you can potentially have many 
hundreds of people accessing at the same time (each with their own 
directory), I was hoping to be able to "stage" file processing.

The:

 public boolean accept(File pathname) {
             return maxFiles-- > 0;
         }

in FileFilter is interesting, but the file system nevertheless still 
runs through the entire directory. Maybe FileFilter needs:

public boolean abort(File pathname);

Hmm, maybe I need a timed background process to move files to "holding" 
directories which will be limited to a small number of files.
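
A minimal sketch of that staging idea, assuming Java 7 is available. It uses java.nio.file.DirectoryStream, which is not mentioned above; the stream hands out entries as it iterates instead of building an array of everything first. The class name HoldingDirectoryMover, the method moveBatch and both directory parameters are hypothetical names chosen for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

class HoldingDirectoryMover {

    /**
     * Moves up to batchSize regular files from inbox into holding and
     * returns how many were moved. "First" here only means "whatever the
     * file system reports first"; no ordering is implied.
     */
    static int moveBatch(Path inbox, Path holding, int batchSize) throws IOException {
        if (batchSize <= 0) {
            return 0;
        }
        Files.createDirectories(holding);
        int moved = 0;
        // The stream is read lazily, so the loop can stop early without
        // listing the rest of the directory.
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(inbox)) {
            for (Path entry : entries) {
                if (!Files.isRegularFile(entry)) {
                    continue; // skip subdirectories and other non-files
                }
                // Throws FileAlreadyExistsException if the name is already taken in holding.
                Files.move(entry, holding.resolve(entry.getFileName()));
                if (++moved >= batchSize) {
                    break;
                }
            }
        }
        return moved;
    }
}
```

A timed task could call moveBatch(inbox, holding, 10) for each user directory. The sketch does not check whether an upload is still being written, so a real version would need some convention for that (for example, a temporary name until the upload completes).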

-- 
Wojtek :-)



#21762

From: Jim Janney <jjanney@shell.xmission.com>
Date: 2013-01-26 17:13 -0700
Message-ID: <ydn7gmz1o0w.fsf@shell.xmission.com>
In reply to: #21761
Wojtek <nowhere@a.com> writes:

> John B. Matthews wrote :
>> Although it may be beyond your control, you should also critically
>> assess a design having tens of thousands of files in a single
>> directory.
>
> Well of course.
>
> The directory holds files which are uploaded by external events. If
> there are a lot of events between application runs, then the number of
> files can indeed reach large numbers.
>
> Since this is happening on a server, and you can potentially have many
> hundreds of people accessing at the same time (each with their own
> directory), I was hoping to be able to "stage" file processing.
>
> The:
>
> public boolean accept(File pathname) {
>             return maxFiles-- > 0;
>         }
>
> in FileFilter is interesting, but the file system nevertheless still
> runs through the entire directory. Maybe FileFilter needs:
>
> public boolean abort(File pathname);
>
> Hmm, maybe I need a timed background process to move files to
> "holding" directories which will be limited to a small number of
> files.

You could run a command in a subprocess.  On a Unix system

    ls -U | head -n 10

should run quickly (-U tells it not to sort).  Not sure how to do that
on Windows.
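
A rough sketch of driving that pipeline from Java with ProcessBuilder, assuming a Unix-like system with sh, ls and head on the PATH. The class and method names are made up for illustration, and file names that contain newlines would be split incorrectly.

```java
import java.io.BufferedReader;
import java.io.File;
import java.io.IOException;
import java.io.InputStreamReader;
import java.util.ArrayList;
import java.util.List;

class LsHeadSketch {

    /** Returns up to max entry names from dir, as printed by "ls -U | head". */
    static List<String> firstNames(File dir, int max) throws IOException, InterruptedException {
        // max is an int, so nothing user-controlled reaches the shell command line.
        ProcessBuilder pb = new ProcessBuilder("sh", "-c", "ls -U | head -n " + max);
        pb.directory(dir);                                 // run the pipeline inside dir
        pb.redirectError(ProcessBuilder.Redirect.INHERIT); // keep error messages out of the name list
        Process process = pb.start();

        List<String> names = new ArrayList<>();
        try (BufferedReader reader =
                new BufferedReader(new InputStreamReader(process.getInputStream()))) {
            String line;
            while ((line = reader.readLine()) != null) {
                names.add(line);
            }
        }
        process.waitFor();
        return names;
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        File dir = new File(args.length > 0 ? args[0] : ".");
        for (String name : firstNames(dir, 10)) {
            System.out.println(name);
        }
    }
}
```

The names come back as plain strings relative to dir, so a caller would wrap each one with new File(dir, name) before processing. Unlike the filter approach, ls lists subdirectories as well as files.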

-- 
Jim Janney



#21768

From: Arne Vajhøj <arne@vajhoej.dk>
Date: 2013-01-26 21:21 -0500
Message-ID: <51048f19$0$284$14726298@news.sunsite.dk>
In reply to: #21761
On 1/26/2013 6:42 PM, Wojtek wrote:
> John B. Matthews wrote :
>> Although it may be beyond your control, you should also critically
>> assess a design having tens of thousands of files in a single
>> directory.
>
> Well of course.
>
> The directory holds files which are uploaded by external events. If
> there are a lot of events between application runs, then the number of
> files can indeed reach large numbers.
>
> Since this is happening on a server, and you can potentially have many
> hundreds of people accessing at the same time (each with their own
> directory), I was hoping to be able to "stage" file processing.

No matter how and why these files end up there, you should
consider spreading them out across multiple directories.

Arne



#21776

From: "John B. Matthews" <nospam@nospam.invalid>
Date: 2013-01-26 22:05 -0500
Message-ID: <nospam-7FC141.22053226012013@news.aioe.org>
In reply to: #21761
In article <mn.d3ae7dd1c4b8d7a3.70216@a.com>, Wojtek <nowhere@a.com> 
wrote:

> John B. Matthews wrote :
> > Although it may be beyond your control, you should also critically 
> > assess a design having tens of thousands of files in a single 
> > directory.
> 
> Well of course.
> 
> The directory holds files which are uploaded by external events. If 
> there are a lot of events between application runs, then the number 
> of files can indeed reach large numbers.
> 
> Since this is happening on a server, and you can potentially have 
> many hundreds of people accessing at the same time (each with their 
> own directory), I was hoping to be able to "stage" file processing.
> 
> The:
> 
>  public boolean accept(File pathname) {
>              return maxFiles-- > 0;
>          }
> 
> in FileFilter is interesting, but the file system nevertheless still 
> runs through the entire directory. Maybe FileFilter needs:
> 
> public boolean abort(File pathname);
> 
> Hmm, maybe I need a timed background process to move files to "holding" 
> directories which will be limited to a small number of files.

If Java 7 is available, a java.nio.file.WatchService may be helpful in 
detecting (subsequent) changes while running.

<http://docs.oracle.com/javase/tutorial/essential/io/notification.html>

-- 
John B. Matthews
trashgod at gmail dot com
<http://sites.google.com/site/drjohnbmatthews>



#21736

From: Arved Sandstrom <asandstrom2@eastlink.ca>
Date: 2013-01-26 08:24 -0400
Message-ID: <eWPMs.128009$tG.112190@newsfe15.iad>
In reply to: #21727
On 01/26/2013 05:14 AM, Wojtek wrote:
> Using:
>
> int max = 10;
> int count = 0;
>
> for (File thisFile : aDir.listFiles())
> {
>   doSomething(thisFile);
>
>   if ( ++count >= max )
>     break;
> }
>
> gives me the first ten files in aDir. But if aDir contains 30K files,
> then the listFiles() will run for a long time as it builds an array for
> the 30K files.
>
> Is there a way to have Java only get the first "max" files?
>
One way of doing it, which you can find by Googling but should occur to 
you if you read the File Javadocs carefully, is below.

I've run this in an IDE with the working directory set to where I 
touched a few hundred files.

The files returned will not be in any order; OTOH you didn't indicate 
what you meant by "first".

AHS

-------------------------
package org.ahs.files;

import java.io.File;
import java.io.FileFilter;
import java.util.Arrays;

public class ShortFileList {

     final int maxFiles = 30;

     public static void main(String[] args) {
         if (args.length != 1) {
             System.err.println("Usage: ShortFileList <limit>");
             // stop here; otherwise args[0] below would throw
             System.exit(1);
         }
         // let NFE throw
         int limit = Integer.parseInt(args[0]);

         File testDir = new File(".");
         File[] files = testDir.listFiles(new MyFileFilter(limit));
         System.out.println(files.length);
         System.out.println(Arrays.asList(files));
     }

     static class MyFileFilter implements FileFilter {

         int maxFiles;

         public MyFileFilter(int maxFiles) {
             this.maxFiles = maxFiles;
         }

         @Override
         public boolean accept(File pathname) {
             return maxFiles-- > 0;
         }
     }
}



#21737

From: Arved Sandstrom <asandstrom2@eastlink.ca>
Date: 2013-01-26 08:25 -0400
Message-ID: <9XPMs.128010$tG.98995@newsfe15.iad>
In reply to: #21736
On 01/26/2013 08:24 AM, Arved Sandstrom wrote:
> On 01/26/2013 05:14 AM, Wojtek wrote:
>> Using:
>>
>> int max = 10;
>> int count = 0;
>>
>> for (File thisFile : aDir.listFiles())
>> {
>>   doSomething(thisFile);
>>
>>   if ( ++count >= max )
>>     break;
>> }
>>
>> gives me the first ten files in aDir. But if aDir contains 30K files,
>> then the listFiles() will run for a long time as it builds an array for
>> the 30K files.
>>
>> Is there a way to have Java only get the first "max" files?
>>
> One way of doing it, which you can find by Googling but should occur to
> you if you read the File Javadocs carefully, is below.
>
> I've run this in an IDE with the working directory set to where I
> touched a few hundred files.
>
> The files returned will not be in any order; OTOH you didn't indicate
> what you meant by "first".
>
> AHS
>
> -------------------------
> package org.ahs.files;
>
> import java.io.File;
> import java.io.FileFilter;
> import java.util.Arrays;
>
> public class ShortFileList {
>
>      final int maxFiles = 30;

IGNORE this variable, earlier experimental version.



#21743

From: Arne Vajhøj <arne@vajhoej.dk>
Date: 2013-01-26 13:26 -0500
Message-ID: <51041ff8$0$284$14726298@news.sunsite.dk>
In reply to: #21736
On 1/26/2013 7:24 AM, Arved Sandstrom wrote:
> On 01/26/2013 05:14 AM, Wojtek wrote:
>> Using:
>>
>> int max = 10;
>> int count = 0;
>>
>> for (File thisFile : aDir.listFiles())
>> {
>>   doSomething(thisFile);
>>
>>   if ( ++count >= max )
>>     break;
>> }
>>
>> gives me the first ten files in aDir. But if aDir contains 30K files,
>> then the listFiles() will run for a long time as it builds an array for
>> the 30K files.
>>
>> Is there a way to have Java only get the first "max" files?
>>
> One way of doing it, which you can find by Googling but should occur to
> you if you read the File Javadocs carefully, is below.

>          Integer limit = Integer.parseInt(args[0]);
>          File testDir = new File(".");
>          File[] files = testDir.listFiles(new MyFileFilter(limit));

>      static class MyFileFilter implements FileFilter {
>
>          int maxFiles;
>
>          public MyFileFilter(int maxFiles) {
>              this.maxFiles = maxFiles;
>          }
>
>          @Override
>          public boolean accept(File pathname) {
>              return maxFiles-- > 0;
>          }
>      }
> }

If the problem is as described by the OP, then that must be the
correct solution.

"will run for a long time as it builds an array for the 30K files"

does not happen with this solution.

But I am a bit skeptical about whether a String[] with 30K elements
is really the bottleneck.

If the real bottleneck is the OS calls to get next file, then
a filter like this will not help.

Arne




#21750

From: Robert Klemme <shortcutter@googlemail.com>
Date: 2013-01-26 22:15 +0100
Message-ID: <amivb3Fag8iU1@mid.individual.net>
In reply to: #21743
On 26.01.2013 19:26, Arne Vajhøj wrote:

> But I am a bit skeptical about whether a String[] with 30K elements
> is really the bottleneck.
>
> If the real bottleneck is the OS calls to get next file, then
> a filter like this will not help.

Why?

Kind regards

	robert

-- 
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/



#21752

From: Arne Vajhøj <arne@vajhoej.dk>
Date: 2013-01-26 16:25 -0500
Message-ID: <510449d2$0$285$14726298@news.sunsite.dk>
In reply to: #21750
On 1/26/2013 4:15 PM, Robert Klemme wrote:
> On 26.01.2013 19:26, Arne Vajhøj wrote:
>> But I am a bit skeptical about whether a String[] with 30K elements
>> is really the bottleneck.
>>
>> If the real bottleneck is the OS calls to get next file, then
>> a filter like this will not help.
>
> Why?

A String[] with 30K elements should be blazing fast
compared to anything that hits a disk.

And it would read all filenames and call the filter for
each of them.

Arne



#21756

From: Eric Sosman <esosman@comcast-dot-net.invalid>
Date: 2013-01-26 17:06 -0500
Message-ID: <ke1k0f$nqj$1@dont-email.me>
In reply to: #21750
On 1/26/2013 4:15 PM, Robert Klemme wrote:
> On 26.01.2013 19:26, Arne Vajhøj wrote:
>
>> But I am a bit skeptical about whether a String[] with 30K elements
>> is really the bottleneck.
>>
>> If the real bottleneck is the OS calls to get next file, then
>> a filter like this will not help.
>
> Why?

     Because the listFiles() method will fetch the information
for all 30K files from the O/S, will construct 30K File objects
to represent them, and will submit all 30K File objects to the
FileFilter, one by one.  The FileFilter will (very quickly)
reject 29.99K of the 30K Files, but ...

-- 
Eric Sosman
esosman@comcast-dot-net.invalid



#21760

From: Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com>
Date: 2013-01-26 15:21 -0800
Message-ID: <1iop8bl8ysrfg$.rdxcxhgxuj1r$.dlg@40tude.net>
In reply to: #21756
On Sat, 26 Jan 2013 17:06:07 -0500, Eric Sosman wrote:

> On 1/26/2013 4:15 PM, Robert Klemme wrote:
>> On 26.01.2013 19:26, Arne Vajhøj wrote:
>>
>>> But I am a bit skeptical about whether a String[] with 30K elements
>>> is really the bottleneck.
>>>
>>> If the real bottleneck is the OS calls to get next file, then
>>> a filter like this will not help.
>>
>> Why?
> 
>      Because the listFiles() method will fetch the information
> for all 30K files from the O/S, will construct 30K File objects
> to represent them, and will submit all 30K File objects to the
> FileFilter, one by one.  The FileFilter will (very quickly)
> reject 29.99K of the 30K Files, but ...

Will it?

It is plausible that the implementation of listFiles() uses an OS API that
enumerates files one at a time. On Windows, getting the first file of the
enumeration is faster than asking for all the files at once.

Indeed, I suppose one could throw an exception from the FileFilter accept()
method to interrupt enumeration, if that's how listFiles() is implemented.
That would avoid the need to enumerate more than the needed number of
actual files.

Of course, this is all implementation-dependent and since it's not
explicitly documented, could change at any time anyway.  But unless you've
actually examined the implementation details for listFiles(), it's not a
foregone conclusion that the technique of using a FileFilter offers no way
to improve latency.

All that said, I think John Matthews' comment about the question of what
30K files are doing in a single directory in the first place is perhaps one
of the more useful points in this topic. One doesn't always have control
over that, of course...but if one does, it's certainly worth rethinking
that aspect of the design. There are reasons other than code latency to
avoid so many files in a single directory.

Pete



#21764

From: Eric Sosman <esosman@comcast-dot-net.invalid>
Date: 2013-01-26 20:42 -0500
Message-ID: <ke20lo$sh1$1@dont-email.me>
In reply to: #21760
On 1/26/2013 6:21 PM, Peter Duniho wrote:
> On Sat, 26 Jan 2013 17:06:07 -0500, Eric Sosman wrote:
>
>> On 1/26/2013 4:15 PM, Robert Klemme wrote:
>>> On 26.01.2013 19:26, Arne Vajhøj wrote:
>>>
>>>> But I am a bit skeptical about whether a String[] with 30K elements
>>>> is really the bottleneck.
>>>>
>>>> If the real bottleneck is the OS calls to get next file, then
>>>> a filter like this will not help.
>>>
>>> Why?
>>
>>       Because the listFiles() method will fetch the information
>> for all 30K files from the O/S, will construct 30K File objects
>> to represent them, and will submit all 30K File objects to the
>> FileFilter, one by one.  The FileFilter will (very quickly)
>> reject 29.99K of the 30K Files, but ...
>
> Will it?

     Necessarily.  As far as listFiles() knows, the FileFilter
might accept the very last File object given to it.  Therefore,
listFiles() cannot fail to present that very last File -- and
every other File -- for inspection.

> It is plausible that the implementation of listFiles() uses an OS API that
> enumerates files one at a time. On Windows, getting the first file of the
> enumeration is faster than asking for all the files at once.

     Meh.

> Indeed, I suppose one could throw an exception from the FileFilter accept()
> method to interrupt enumeration, if that's how listFiles() is implemented.
> That would avoid the need to enumerate more than the needed number of
> actual files.

     It would also avoid the burden of returning anything from
listFiles() -- like, say, the array of accepted files ...

     A seriously hackish approach might be to do the processing
of the files within the FileFilter itself, treating it as a
"visit this File" callback instead of as a predicate.  Then if
the FileFilter threw an exception after processing the first N
files -- well, they'd already have been processed, and you were
going to ignore the listFiles() return value anyhow, so ...
But, as I said, that's pretty seriously hackish.
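
For concreteness, the hack described above might look roughly like this. It is only a sketch: the class AbortingFilterSketch and the marker exception EnoughFiles are invented names, and whether aborting actually saves any directory reading depends on how listFiles() is implemented. As far as I know, the OpenJDK implementation fetches all the names first and only then calls the filter, in which case the exception merely skips constructing the remaining File objects.

```java
import java.io.File;
import java.io.FileFilter;
import java.util.ArrayList;
import java.util.List;

class AbortingFilterSketch {

    /** Hypothetical marker exception, used only to break out of listFiles(). */
    static class EnoughFiles extends RuntimeException {
    }

    /** Collects up to max entries of dir inside the filter, then aborts the listing. */
    static List<File> firstFiles(File dir, final int max) {
        final List<File> collected = new ArrayList<>();
        if (max <= 0) {
            return collected;
        }
        try {
            dir.listFiles(new FileFilter() {
                @Override
                public boolean accept(File candidate) {
                    // The filter acts as a "visit this File" callback and
                    // keeps the results itself.
                    collected.add(candidate);
                    if (collected.size() >= max) {
                        throw new EnoughFiles(); // unchecked, so it escapes listFiles()
                    }
                    return false; // the array returned by listFiles() is ignored anyway
                }
            });
        } catch (EnoughFiles expected) {
            // Reached the limit; collected already holds the entries.
        }
        return collected;
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        System.out.println(firstFiles(dir, 10));
    }
}
```

If the directory holds fewer than max entries, listFiles() simply returns normally and the list contains everything that was seen.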

> Of course, this is all implementation-dependent and since it's not
> explicitly documented, could change at any time anyway.

     The performance implications of retrieving information on 30K
files from the O/S are undocumented, true.  But the necessity of
retrieving that information is deducible from what *is* documented.

> But unless you've
> actually examined the implementation details for listFiles(), it's not a
> foregone conclusion that the technique of using a FileFilter offers no way
> to improve latency.

     Maybe this is the disconnect: I understood the O.P.'s concern as
"It's doing three thousand times too much work," not as "It takes
three thousand times as long as it should just to get to the first
File instance."  Either way, though, I think a FileFilter (used in a
non-hackish way) cannot reduce either the total work or the latency.
Observe that listFiles() cannot return anything at all until it has
built the entire array of accepted files; Java's arrays have no way
to say "I hold five elements now, but might grow."

> All that said, I think John Matthews' comment about the question of what
> 30K files are doing in a single directory in the first place is perhaps one
> of the more useful points in this topic. One doesn't always have control
> over that, of course...but if one does, it's certainly worth rethinking
> that aspect of the design. There are reasons other than code latency to
> avoid so many files in a single directory.

     Yeah.  The O.P. said something about external processes dumping
files into the directory, possibly dumping many between (widely-
spaced?) executions of his program.  That seems odd to me, though,
because if there's a backlog of thirty thousand it seems odd to want
to reduce it by only ten ...

     If he's stuck with this overall design, though, I think the
walkFileTree() method of java.nio.file.Files would be a cleaner way
to proceed.  His FileVisitor could return FileVisitResult.TERMINATE
after it had seen ten files, and that would be that.  No hacks.
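
A minimal sketch of that approach, assuming Java 7. The class name FirstFilesWalker and the method firstFiles are hypothetical. The walk is limited to depth 1 so that only the entries of the directory itself are visited, and the visitor stops the walk once enough regular files have been seen.

```java
import java.io.IOException;
import java.nio.file.FileVisitOption;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.List;

class FirstFilesWalker {

    /** Returns up to max regular files found directly inside dir, in no particular order. */
    static List<Path> firstFiles(Path dir, final int max) throws IOException {
        final List<Path> found = new ArrayList<>();
        if (max <= 0) {
            return found;
        }
        // maxDepth 1: visit the entries of dir but do not descend into subdirectories.
        Files.walkFileTree(dir, EnumSet.noneOf(FileVisitOption.class), 1,
                new SimpleFileVisitor<Path>() {
                    @Override
                    public FileVisitResult visitFile(Path file, BasicFileAttributes attrs) {
                        // At the depth limit, subdirectories are also reported here, so filter them out.
                        if (attrs.isRegularFile()) {
                            found.add(file);
                        }
                        return found.size() >= max
                                ? FileVisitResult.TERMINATE
                                : FileVisitResult.CONTINUE;
                    }
                });
        return found;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path file : firstFiles(dir, 10)) {
            System.out.println(file);
        }
    }
}
```

The processing could just as well happen inside visitFile() instead of collecting into a list; the list only keeps the sketch easy to check.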

-- 
Eric Sosman
esosman@comcast-dot-net.invalid



#21766

From: Peter Duniho <NpOeStPeAdM@NnOwSlPiAnMk.com>
Date: 2013-01-26 17:56 -0800
Message-ID: <c14x7pubxr70$.15kwu0dtzqycz$.dlg@40tude.net>
In reply to: #21764
On Sat, 26 Jan 2013 20:42:16 -0500, Eric Sosman wrote:

> [...]
>>>       Because the listFiles() method will fetch the information
>>> for all 30K files from the O/S, will construct 30K File objects
>>> to represent them, and will submit all 30K File objects to the
>>> FileFilter, one by one.  The FileFilter will (very quickly)
>>> reject 29.99K of the 30K Files, but ...
>>
>> Will it?
> 
>      Necessarily.  As far as listFiles() knows, the FileFilter
> might accept the very last File object given to it.  Therefore,
> listFiles() cannot fail to present that very last File -- and
> every other File -- for inspection.

Except in the way I already noted, you mean.

> [...]
>> Indeed, I suppose one could throw an exception from the FileFilter accept()
>> method to interrupt enumeration, if that's how listFiles() is implemented.
>> That would avoid the need to enumerate more than the needed number of
>> actual files.
> 
>      It would also avoid the burden of returning anything from
> listFiles() -- like, say, the array of accepted files ...

As you've already agreed, it is possible for the FileFilter implementation
to store the results itself, obviating any need for the listFiles() method
to return successfully.

If it works (which is not assured...it depends on how listFiles() is
implemented in the first place), then yes, maybe it's a bit of a kludge.
But it's an easier, more portable kludge than writing some JNI-based
component and would in fact get the job done.

Sometimes, when the library you're using doesn't provide exactly the
features you need, you wind up with a kludge. Oh well...shit happens.

I'm not saying it's a great solution. But it's a far cry from a conclusion
that it simply cannot be done with the Java API as it exists now.

Pete



#21770

From: Arne Vajhøj <arne@vajhoej.dk>
Date: 2013-01-26 21:29 -0500
Message-ID: <51049102$0$281$14726298@news.sunsite.dk>
In reply to: #21766
On 1/26/2013 8:56 PM, Peter Duniho wrote:
> On Sat, 26 Jan 2013 20:42:16 -0500, Eric Sosman wrote:
>
>> [...]
>>>>        Because the listFiles() method will fetch the information
>>>> for all 30K files from the O/S, will construct 30K File objects
>>>> to represent them, and will submit all 30K File objects to the
>>>> FileFilter, one by one.  The FileFilter will (very quickly)
>>>> reject 29.99K of the 30K Files, but ...
>>>
>>> Will it?
>>
>>       Necessarily.  As far as listFiles() knows, the FileFilter
>> might accept the very last File object given to it.  Therefore,
>> listFiles() cannot fail to present that very last File -- and
>> every other File -- for inspection.
>
> Except in the way I already noted, you mean.

Except if the code was different from the code he was
commenting on.

>> [...]
>>> Indeed, I suppose one could throw an exception from the FileFilter accept()
>>> method to interrupt enumeration, if that's how listFiles() is implemented.
>>> That would avoid the need to enumerate more than the needed number of
>>> actual files.
>>
>>       It would also avoid the burden of returning anything from
>> listFiles() -- like, say, the array of accepted files ...
>
> As you've already agreed, it is possible for the FileFilter implementation
> to store the results itself, obviating any need for the listFiles() method
> to return successfully.
>
> If it works (which is not assured...it depends on how listFiles() is
> implemented in the first place), then yes, maybe it's a bit of a kludge.
> But it's an easier, more portable kludge than writing some JNI-based
> component and would in fact get the job done.
>
> Sometimes, when the library you're using doesn't provide exactly the
> features you need, you wind up with a kludge. Oh well...shit happens.

If JNI is used then at least it is straightforward logic.

Utilizing Java library classes in a way that they were not intended
to be used, based on an assumption about the underlying implementation,
is not straightforward logic.

> I'm not saying it's a great solution. But it's a far cry from a conclusion
> that it simply cannot be done with the Java API as it exists now.

There is no way in Java API to ensure that it will be done.

We just find it likely that the implementation will work
that way.

Arne



#21774

From: Eric Sosman <esosman@comcast-dot-net.invalid>
Date: 2013-01-26 21:56 -0500
Message-ID: <ke2511$eac$1@dont-email.me>
In reply to: #21766
On 1/26/2013 8:56 PM, Peter Duniho wrote:
>[...]
> I'm not saying it's a great solution. But it's a far cry from a conclusion
> that it simply cannot be done with the Java API as it exists now.

     Did somebody say that?  I certainly didn't -- indeed, part of
what you snipped from my post was a pointer to a perfectly clean
and well-documented Java SE API that does exactly what's needed.

-- 
Eric Sosman
esosman@comcast-dot-net.invalid



#21784

From: Jim Janney <jjanney@shell.xmission.com>
Date: 2013-01-26 20:51 -0700
Message-ID: <ydny5ffz3k2.fsf@shell.xmission.com>
In reply to: #21774
Eric Sosman <esosman@comcast-dot-net.invalid> writes:

> On 1/26/2013 8:56 PM, Peter Duniho wrote:
>>[...]
>> I'm not saying it's a great solution. But it's a far cry from a conclusion
>> that it simply cannot be done with the Java API as it exists now.
>
>     Did somebody say that?  I certainly didn't -- indeed, part of
> what you snipped from my post was a pointer to a perfectly clean
> and well-documented Java SE API that does exactly what's needed.

I said that.  I was wrong.

-- 
Jim Janney


