From: Chet Ramey
Newsgroups: gnu.bash.bug
Subject: Re: Design question(s), re: why use of tmp-files or named-pipes(/dev/fd/N) instead of plain pipes?
Date: Sat, 17 Oct 2015 18:38:00 -0400
Organization: ITS, Case Western Reserve University
To: Linda Walsh, bug-bash
Reply-To: chet.ramey@case.edu
In-Reply-To: <56218DA5.8030501@tlinx.org>

On 10/16/15 7:52 PM, Linda Walsh wrote:

> As I mentioned, my initial take on implementation was
> using standard pipes instead of named pipes (not having read
> or perhaps having glossed over the 'named pipes' aspect).

I think you're missing that process substitution is a word expansion
that is defined to expand to a filename.  When it uses /dev/fd, it
uses pipes and exposes that pipe to the process as a filename in
/dev/fd.  Named pipes are an alternative for systems that don't
support /dev/fd.
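As an illustration of the point above, here is a minimal sketch, assuming
bash on a system with /dev/fd or named-pipe support.  The function name
show_word is hypothetical and exists only for this example; the exact
filename printed depends on the system.

```bash
#!/usr/bin/env bash
# Sketch: a process substitution expands to a filename word.
# The name is system-dependent (a /dev/fd entry or a named pipe path).

# show_word (hypothetical helper) prints its first argument without
# opening it, so we see the word the shell produced.
show_word() {
  printf '%s\n' "$1"
}

# The shell replaces <(...) with a path before show_word is called.
word=$(show_word <(echo hello))
echo "expanded to: $word"

# Any program that takes a filename argument can then read the data.
contents=$(cat <(echo hello))
echo "contents: $contents"
```

The first command never reads the pipe; it only shows that the expansion
result is a pathname.  The second shows that a program expecting a file
argument, here cat, can open that pathname and read the process's output.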
> And, similarly, when using "read a b c" from "<<<(rvalue-expr)" or
> < like "read a b c <<< "1 2 3"), bash's use of tmp files instead of
> a separate process (conceptually similar to the above) and piped back
> into the parent (where it would be wordsplit for read or linesplit
> for readarray).

While using pipes for here-documents is possible, the original
file-based implementation (and I mean original -- it's almost 28 years
old now) is simple and works.  Inserting another process just to keep
writing data to a pipe (many here documents are larger than the OS pipe
buffer size), deal with write errors and SIGPIPE asynchronously, and add
a few pipe file descriptors to manage, seems like more trouble than
benefit.  Files are simple, can handle arbitrary amounts of data, and
don't require an additional process or bookkeeping.

here-strings are similar, and there's no reason to have here strings
use a different mechanism than here documents.

There was also a backwards compatibility issue.  Since historical
shells had used temp files for here-documents, there was a set of
applications that assumed stdin was seekable as long as it wasn't a
terminal, and it was important to support them.

> In a similar vein, "proga | progb |progc" in some shell implementations
> on non-unix systems used '/tmp' files to simulate pipes -- though
> most shells today, probably use something akin to pipes.

Pipelines use pipes.

> I thought use of 'tmp' files and such was mostly anachronistic.  I was
> wondering if there was something preventing bash from using
> plain pipes for both the <<< and the '< <(...)' cases, rather
> than relying on 'tmp' files or OS-specific names
> (e.g. /proc or /dev .../fd/x)?

I addressed the here-string case above.  `< <(...)' is a composition of
two separate constructs; it is not something that is implemented as a
unit, and it relies on having a filename.
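The composition described above can be sketched as follows, assuming
bash.  The function producer is a hypothetical stand-in for any command;
nothing here is taken from bash's own source.

```bash
#!/usr/bin/env bash
# Sketch: `< <(...)' is an ordinary input redirection whose target
# word happens to be a process substitution.

producer() {    # hypothetical stand-in for any command
  printf '%s\n' one two three
}

# Step 1: <(producer) expands to a filename.
# Step 2: the leading < redirects the loop's stdin from that filename.
count=0
while read -r line; do
  count=$((count + 1))
done < <(producer)
echo "read $count lines"

# The same two constructs with a different redirection target:
# open the filename on file descriptor 3, read one line, close it.
exec 3< <(producer)
read -r first <&3
exec 3<&-
echo "first line: $first"
```

Because the loop is the target of a redirection rather than the last
stage of a pipeline, it runs in the current shell and the value of count
is still set after the loop ends.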
If you want to expose the output of an asynchronous process to another
process using the file system, your available choices are limited.

> I.e. instead of a=$(cat $(cmd1) $(blah $(nested)) $(cmd3)) having
> to use maybe 5 tmp files, and waiting to run each $() in a hierarchal,
> linear tree, they could all use pipes and in some cases, run in parallel.

Command substitution uses pipes.

> I.e. Is there something I'm missing that would prevent refactoring
> the code to use 1 method (processes w/pipes), rather than the
> current use of multiple methods that may or may not be working
> in some given target env (like early boot where had problems with
> both tmp and /proc|dev/fd usage...) or some reduced-function
> linux for an embedded system, or comparable?

Yes.  You are missing the fundamental nature of process substitution:
a word expansion that results in a filename.

> To me, it just seems like a 'no-brainer' that going with a
> simpler implementation that has fewer external dependencies would
> be an all-around-winner for bash, with no downside...I feel
> like I'm setting myself up to be shot down, but is there
> or are there technical reasons why such shouldn't be done?

If you think that you can improve the implementation of here documents
and here strings by using a pipe instead of a temp file, by all means
take a shot at it.  You might want to get ahold of some old Unix `shar'
archives to test your implementation with very large here-documents.

Chet
-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/