| From | pacman@kosh.dhis.org (Alan Curry) |
|---|---|
| Newsgroups | comp.os.linux.development.system |
| Subject | Re: how does pipe data chunking work? |
| Date | 2011-01-30 23:17 +0000 |
| Organization | Aioe.org NNTP Server |
| Message-ID | <ii4rip$km2$1@speranza.aioe.org> |
| References | <ii4ovm$ld$1@news.mixmin.net> |
In article <ii4ovm$ld$1@news.mixmin.net>, 1jam <com@example.net> wrote:
>Have to admit I'm not clear on this one.. when piping programs together on a
>terminal, what determines how often the output of the first program is sent
>to the second program? And what amount of data?

The first program decides this itself. In most cases, it decides to use the
output functions from <stdio.h> and the C library then chooses a buffer size.
When the buffer is full, it gets flushed and the data appears in the pipe for
the second program to read.

There is a "stdbuf" command in recent versions of GNU coreutils, which you can
use to alter the behavior of stdio.

>
>It seems that with a command like:
>find . -iname \*bz2 -print0 | xargs -0 bunzip2
>
>..find will execute in entirety, then pass on its output all at once..? I
>tested this with a double zipped file. Like some_file.bz2.bz2. It gets
>decompressed only once, resulting in some_file.bz2.

Even if you could guarantee find flushed the filename to output before
continuing to read the directory, this would still be unlikely to work.

First of all, it's probably reading the directory with a readdir(3)
implementation that calls getdents(2), so the directory reading itself is
effectively buffered. Secondly, it probably takes more time for xargs to
launch the bunzip than it takes for find to do another readdir() call. Third,
directories are not ordered objects, so your some_file.bz2 might end up
appearing in an earlier position in the directory, where find can't see it
without rereading from the beginning. Fourth, xargs is designed to bundle
multiple names into a single command line so it won't actually try to launch
bunzip until it's got several names to work with. (xargs -l1 would fix that
last problem)

>
>Yet a command like:
>dd ... | gzip ...
>
>..will send blocks at a time, and both programs work simultaneously.

dd is special. It has all those weird options, especially bs=N, because it
wants to give you control over buffering.
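
To make the first point concrete (the writer's own user-space buffer decides when data reaches the pipe), here is a rough sketch. It is an illustration only, not what find or the C library actually does: Python's buffered writer stands in for stdio, the helper names `readable` and `demo` are made up for this example, the 4096-byte buffer size is an arbitrary choice, and it assumes a POSIX system where select() works on pipe descriptors.

```python
import os
import select


def readable(fd):
    """Return True if a read() on fd would not block right now."""
    ready, _, _ = select.select([fd], [], [], 0)
    return bool(ready)


def demo(buffer_size):
    """Write one short record into a pipe and report when the reader can see it.

    buffer_size=0 means unbuffered writes; a larger value gives the writer
    a user-space buffer of that size (a stand-in for a stdio buffer).
    """
    r, w = os.pipe()
    writer = os.fdopen(w, "wb", buffering=buffer_size)

    # One NUL-terminated name, like find -print0 would emit.
    writer.write(b"some_file.bz2\0")
    visible_before_flush = readable(r)

    # Flushing hands whatever is buffered to the kernel, i.e. to the pipe.
    writer.flush()
    visible_after_flush = readable(r)

    data = os.read(r, 4096)
    writer.close()
    os.close(r)
    return visible_before_flush, visible_after_flush, data


if __name__ == "__main__":
    print("buffered:  ", demo(4096))
    print("unbuffered:", demo(0))
```

With the buffer in place the name sits in the writer's memory until the flush, so the reader has nothing to act on; with buffering turned off it shows up in the pipe as soon as it is written. Changing that choice from outside the program is roughly what stdbuf is for with stdio-based programs.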
-- Alan Curry