From: Chet Ramey <chet.ramey@case.edu>
Newsgroups: gnu.bash.bug
Subject: Re: Design question(s), re: why use of tmp-files or named-pipes(/dev/fd/N) instead of plain pipes?
Date: Sat, 17 Oct 2015 18:38:00 -0400
Organization: ITS, Case Western Reserve University
Message-ID: <mailman.527.1445121490.7904.bug-bash@gnu.org>
References: <56218DA5.8030501@tlinx.org>

On 10/16/15 7:52 PM, Linda Walsh wrote:

> As I mentioned, my initial take on implementation was
> using standard pipes instead of named pipes (not having read
> or perhaps having glossed over the 'named pipes' aspect).

I think you're missing that process substitution is a word expansion
that is defined to expand to a filename.  When it uses /dev/fd, it
uses pipes and exposes that pipe to the process as a filename in
/dev/fd.  Named pipes are an alternative for systems that don't support
/dev/fd.
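
A minimal sketch of what "expands to a filename" means in practice.  It
assumes a bash built with process substitution support; the commands go
through `bash -c` only so the snippet also works when pasted into a plain
POSIX sh, and the variable names are illustrative.

```bash
# Process substitution is a bash feature, so invoke bash explicitly.
# The word <(true) is replaced by a pathname before echo runs.
name=$(bash -c 'echo <(true)')
printf 'expanded to: %s\n' "$name"   # typically a /dev/fd/N path where /dev/fd exists

# The consumer receives an ordinary filename argument and opens it itself.
out=$(bash -c 'cat <(printf "hello\n")')
printf '%s\n' "$out"
```

On a system without /dev/fd the first line would instead print the path of
a named pipe, but the consumer is used the same way in both cases.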

> And, similarly, when using "read a b c" from "<<<(rvalue-expr)" or
> "<<heredoc...end\nheredoc" -- (*especially* for the 1 line case
> like "read a b c <<< "1 2 3"), bash's use of tmp files instead of
> a separate process (conceptually similar to the above) and piped back
> into the parent (where it would be wordsplit for read or linesplit
> for readarray).

While using pipes for here-documents is possible, the original file-based
implementation (and I mean original -- it's almost 28 years old now) is
simple and works.  Inserting another process just to keep writing data to
a pipe (many here-documents are larger than the OS pipe buffer size),
dealing with write errors and SIGPIPE asynchronously, and adding a few pipe
file descriptors to manage seems like more trouble than benefit.  Files are
simple, can handle arbitrary amounts of data, and don't require an
additional process or bookkeeping.

Here-strings are similar, and there's no reason to have here-strings use
a different mechanism than here-documents.
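
For illustration, the file-based approach can be written out by hand for
the one-line case quoted above.  This is only a sketch of the general
shape (write, open, unlink, read), not bash's actual code; `mktemp` and
file descriptor 3 are arbitrary choices made for the example.

```bash
# Hand-rolled equivalent of: read -r a b c <<< "1 2 3"
tmp=$(mktemp)                    # temp file standing in for the shell's own
printf '%s\n' "1 2 3" > "$tmp"   # write the whole body; no writer process needed
exec 3< "$tmp"                   # open it for reading on fd 3
rm -f "$tmp"                     # unlink; the open descriptor keeps the data alive
read -r a b c <&3                # the reader sees a regular file
exec 3<&-                        # close the descriptor
printf '%s %s %s\n' "$a" "$b" "$c"
```

Everything happens in one process, and the body can be any size the
filesystem will hold.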

There was also a backwards compatibility issue.  Since historical shells
had used temp files for here-documents, there was a set of applications
that assumed stdin was seekable as long as it wasn't a terminal, and it was
important to support them.
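
The distinction those applications depended on can be observed with a
small sketch that reports what kind of object stdin is.  It assumes a
system where /dev/stdin refers to the process's file descriptor 0 (Linux,
for example); `stdin_kind` is a made-up helper name.  A regular file
supports lseek(), while a pipe does not.

```bash
# Report whether standard input is a regular file, a pipe, or something else.
stdin_kind() {
    if [ -f /dev/stdin ]; then
        echo "regular file"
    elif [ -p /dev/stdin ]; then
        echo "pipe"
    else
        echo "other"
    fi
}

tmp=$(mktemp)
printf 'data\n' > "$tmp"
from_file=$(stdin_kind < "$tmp")            # stdin redirected from a file
from_pipe=$(printf 'data\n' | stdin_kind)   # stdin is the read end of a pipe
rm -f "$tmp"

printf 'from file: %s\nfrom pipe: %s\n' "$from_file" "$from_pipe"
```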


> In a similar vein, "proga | progb |progc" in some shell implementations
> on non-unix systems used '/tmp' files to simulate pipes -- though
> most shells today, probably use something akin to pipes.

Pipelines use pipes.

> I thought use of 'tmp' files and such was mostly anachronistic.  I was
> wondering if there was something preventing bash from using
> plain pipes for both the <<< and the '< <(...)' cases, rather
> than relying on 'tmp' files or OS-specific names
> (e.g. /proc or /dev .../fd/x)?

I addressed the here-string case above.  `< <(...)' is a composition of
two separate constructs; it is not something that is implemented as a
unit, and it relies on having a filename.  If you want to expose the
output of an asynchronous process to another process using the file
system, your available choices are limited.

> I.e. instead of a=$(cat $(cmd1) $(blah $(nested)) $(cmd3)) having
> to use maybe 5 tmp files, and waiting to run each $() in a hierarchal,
> linear tree, they could all use pipes and in some cases, run in parallel.

Command substitution uses pipes.
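
A quick way to check this from inside a substitution, as a sketch that
assumes /dev/stdout refers to file descriptor 1 (as on Linux); the
variable names are illustrative.

```bash
# Inside $(...), the command's stdout is the write end of a pipe that the
# shell reads from; no temp file is involved.
kind=$(if [ -p /dev/stdout ]; then echo pipe; else echo not-a-pipe; fi)
printf 'stdout inside a command substitution: %s\n' "$kind"

# Nested substitutions work the same way, one pipe per level.
nested=$(echo "outer $(echo inner)")
printf '%s\n' "$nested"
```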

> I.e. Is there something I'm missing that would prevent refactoring
> the code to use 1 method (processes w/pipes), rather than the
> current use of multiple methods that may or may not be working
> in some given target env (like early boot where had problems with
> both tmp and /proc|dev/fd usage...) or some reduced-function
> linux for an embedded system, or comparable?

Yes.  You are missing the fundamental nature of process substitution:
a word expansion that results in a filename.

> To me, it just seems like a 'no-brainer' that going with a
> simpler implementation that has fewer external dependencies would
> be an all-around-winner for bash, with no downside...I feel
> like I'm setting myself up to be shot down, but is there
> or are there technical reasons why such shouldn't be done?

If you think that you can improve the implementation of here
documents and here strings by using a pipe instead of a temp
file, by all means take a shot at it.  You might want to get
ahold of some old Unix `shar' archives to test your implementation
with very large here-documents.

Chet


-- 
``The lyf so short, the craft so long to lerne.'' - Chaucer
		 ``Ars longa, vita brevis'' - Hippocrates
Chet Ramey, ITS, CWRU    chet@case.edu    http://cnswww.cns.cwru.edu/~chet/