| From | Linda Walsh <bash@tlinx.org> |
|---|---|
| Newsgroups | gnu.bash.bug |
| Subject | Re: Design question(s), re: why use of tmp-files or named-pipes(/dev/fd/N) instead of plain pipes? |
| Date | 2015-10-17 17:43 -0700 |
| Message-ID | <mailman.529.1445129014.7904.bug-bash@gnu.org> |
| References | <56218DA5.8030501@tlinx.org> <5622CDC8.2030102@case.edu> |
Chet Ramey wrote:
> On 10/16/15 7:52 PM, Linda Walsh wrote:
>
>> As I mentioned, my initial take on implementation was
>> using standard pipes instead of named pipes (not having read
>> or perhaps having glossed over the 'named pipes' aspect).
>
> I think you're missing that process substitution is a word expansion
> that is defined to expand to a filename. When it uses /dev/fd, it
> uses pipes and exposes that pipe to the process as a filename in
> /dev/fd. Named pipes are an alternative for systems that don't support
> /dev/fd.

-----

??? I've never seen a usage where it expands to a filename and is treated as such.

Are you meaning:

    readarray foo </etc/passwd

(being a read from filename case), vs.

    readarray foo < <(cat /etc/passwd)

and in this case, "<(...)" is creating a process ... that puts the input file on some "/dev/fd/xx", and that readarray (or read) is then "handling that" as though it were a normal file?

That's what you are meaning by process substitution?... I wasn't understanding the literalness of the wording...(?)

But read or readarray read from "file handle"s, not filenames -- as in:

    cat /etc/passwd | readarray foo

The only conceptual difference between that and

    readarray foo < <(cat /etc/passwd)

is whether or not the readarray is done in the parent or the child, i.e. from a semantic point of view, how is:

    readarray foo < <(cat /etc/passwd)

different from

    shopt -s lastpipe
    cat /etc/passwd | readarray foo

Is there something in the semantics that would require they be implemented differently?

> While using pipes for here-documents is possible, the original file-based
> implementation (and I mean original -- it's almost 28 years old now) is
> simple and works. Inserting another process just to keep writing data to
> a pipe (many here documents are larger than the OS pipe buffer size), deal
> with write errors and SIGPIPE asynchronously, and add a few pipe file
> descriptors to manage, seems like more trouble than benefit. Files are
> simple, can handle arbitrary amounts of data, and don't require an
> additional process or bookkeeping.

----

But they do require a tempstore that wasn't present in a system-start bash script I was writing -- I kept having problems with the different bash communication methods all needing some external-fs names to be present to handle what could have been done in memory.

BTW, dealing with the SIGPIPE is why I kept the output pipe open in the reader process, and used a child-sig handler to close the specific pipe -- i.e. when I wrote similar code in perl, I kept track of child processes->pipe pairs, so my foreground process could close the "writer-pipe" only after the reader-pipe had everything.

Alternatively, one could use a pair of pipes, and the parent could send a message to the child process for more data (or close the pipe) -- a handshake, as it were.

As far as many docs being larger than the buffer size -- that shouldn't be an issue unless you are trying 2-way communication -- but that has lots of warnings about each reader waiting for output from the other as a writer and the need for asynch/non-blocking design.

Writing to a tmp file requires enough room for the entire tmp doc on disk. My tmp right now only has 4.6Available. If some heredoc tried to encode the contents of a video, or Blu-ray data disk, and then something tried to unpack it to tmp first, it would die --- but with pipes, you don't have to send the whole thing at once -- that requires basically full space on sender and receiver, or 2x the HEREDOC space. If it could hold it in memory... my linux box has 2x48G (2 NUMA nodes) or ~96G + ~8G swap.

So I'd say it really is case dependent -- ideally, if bash was going to be enhanced in this area, it would allow the user to set the defaults to use processes/pipes or tmp files.

The problem I had with bash & tmps was that bash scripts started execution off of 'root' before a separate FS for /tmp was mounted. The root partition has even less space available on it than /tmp. So the use of a /tmp file when /tmp wasn't really available yet was an issue.

> here-strings are similar, and there's no reason to have here strings use
> a different mechanism than here documents.
>
> There was also a backwards compatibility issue. Since historical shells
> had used temp files for here-documents, there was a set of applications
> that assumed stdin was seekable as long as it wasn't a terminal, and it was
> important to support them.

----

Oh? What POSIX compatible applications?

> Yes. You are missing the fundamental nature of process substitution:
> a word expansion that results in a filename.

---

Hopefully my q's at top were more on track?

> If you think that you can improve the implementation of here
> documents and here strings by using a pipe instead of a temp
> file, by all means take a shot at it. You might want to get
> ahold of some old Unix `shar' archives to test your implementation
> with very large here-documents.

----

Or, I might read those same shar archives today directly into memory -- depends on system resources.
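
For readers following the exchange, here is a minimal sketch of the two points under discussion. It is not part of the original message. It assumes bash 4.2 or later (for `lastpipe`) run as a non-interactive script, and the `sample` function is a hypothetical stand-in for `cat /etc/passwd`. The first part shows that `<(...)` expands to a filename. The second shows the two `readarray` forms filling an array in the current shell.

```shell
#!/usr/bin/env bash
# Process substitution is a word expansion: the command receives a filename.
# On systems with /dev/fd this is typically a /dev/fd/N path; otherwise bash
# falls back to a named pipe. The exact name is system dependent.
psub_name=$(echo <(true))
echo "process substitution expanded to: $psub_name"

# Hypothetical sample data standing in for /etc/passwd.
sample() { printf '%s\n' alpha beta gamma; }

# Form 1: redirect stdin from the expanded filename. readarray runs in the
# current shell, so the array is still set afterwards.
readarray -t from_psub < <(sample)

# Form 2: an ordinary pipeline. With lastpipe set and job control inactive
# (the default in a script), the last pipeline element also runs in the
# current shell, so the array survives here as well.
shopt -s lastpipe
sample | readarray -t from_pipe

echo "psub: ${#from_psub[@]} lines, pipe: ${#from_pipe[@]} lines"
```

In an interactive shell, job control is normally active, so `lastpipe` has no effect and the second form leaves `from_pipe` empty unless job control is turned off.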