Path: csiph.com!xmission!news.glorb.com!usenet.stanford.edu!not-for-mail
From: Linda Walsh <bash@tlinx.org>
Newsgroups: gnu.bash.bug
Subject: Upgrading shell vars&strings to allow for possibility of FD redirection into a var.
Date: Mon, 23 Nov 2015 14:10:50 -0800
Lines: 85
Approved: bug-bash@gnu.org
Message-ID: <mailman.576.1448316677.31583.bug-bash@gnu.org>
References: <564532BD.60801@tlinx.org>	<CAJnmqwbeSXYrNF6zEJ9nEx2Zyi23X9fxLvkkc1HFOh7JkqZUsw@mail.gmail.com>	<564CC7A1.9090004@tlinx.org> <20151118185629.GJ27325@eeg.ccf.org>	<564CDF9B.6050401@tlinx.org> <CAJnmqwY6=jFfiAJG2yCV1jSrJyLYb9_4ynJso1=sHynNoJUOsw@mail.gmail.com>
NNTP-Posting-Host: lists.gnu.org
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-Trace: usenet.stanford.edu 1448316677 9919 208.118.235.17 (23 Nov 2015 22:11:17 GMT)
X-Complaints-To: action@cs.stanford.edu
Cc: Greg Wooledge <wooledg@eeg.ccf.org>, bug-bash <bug-bash@gnu.org>
To: konsolebox <konsolebox@gmail.com>
Envelope-to: bug-bash@gnu.org
User-Agent: Thunderbird
In-Reply-To: <CAJnmqwY6=jFfiAJG2yCV1jSrJyLYb9_4ynJso1=sHynNoJUOsw@mail.gmail.com>
X-detected-operating-system: by eggs.gnu.org: GNU/Linux 2.2.x-3.x (no timestamps) [generic]
X-Received-From: 173.164.175.65
X-BeenThere: bug-bash@gnu.org
X-Mailman-Version: 2.1.14
Precedence: list
List-Id: Bug reports for the GNU Bourne Again SHell <bug-bash.gnu.org>
List-Unsubscribe: <https://lists.gnu.org/mailman/options/bug-bash>, <mailto:bug-bash-request@gnu.org?subject=unsubscribe>
List-Archive: <http://lists.gnu.org/archive/html/bug-bash>
List-Post: <mailto:bug-bash@gnu.org>
List-Help: <mailto:bug-bash-request@gnu.org?subject=help>
List-Subscribe: <https://lists.gnu.org/mailman/listinfo/bug-bash>, <mailto:bug-bash-request@gnu.org?subject=subscribe>
Xref: csiph.com gnu.bash.bug:11922



konsolebox wrote:
> On Thu, Nov 19, 2015 at 4:29 AM, Linda Walsh <bash@tlinx.org> wrote:
>> However one cannot
>> categorically say that a NUL byte can't be used as an argument.
> 
> Likely only in an escaped format, or just as a single byte or character.
> 
>> Solving
>> other places where it doesn't work might make it so that it *would* work...
> 
> Most parts of bash interpret, store and handle parameters (internal
> data; not shell arguments) as C-strings (null-terminated) (as an
> example, see make_variable_value() in variables.c, or savestring() in
> general.h).  It's the key to understanding why NUL bytes doesn't work
> most of the time in bash.  If you would want it to interpret them as
> strings like C++ strings or Ruby strings, it would need a big
> overhaul.  You would also think about possible compliance issues,
> compatibility issues, complexities and vulnerabilities.  It's great
> but not easy.
----
	Easiest way to explore -- see what it would take to get
bash to compile under g++.

	Then everywhere you use strings, you could declare them
as "string" (std::string).  It carries a count of #chars, so embedded
NULs are allowed.  If you need the C version (0-terminated), you can
use the string.c_str() method -- which returns a pointer to the whole
string (including embedded NULs, if any), terminated by a NUL.  C
routines such as strlen() will still stop at the first embedded NUL,
so the length has to travel alongside the pointer.
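
A minimal sketch of the counted-string behaviour described above, using
only the standard library.  The helper names (make_binary_sample,
c_view_length, demo_embedded_nul) are made up for illustration and are
not part of bash or of any library:

```cpp
#include <cassert>
#include <cstring>
#include <string>

// Build a std::string holding the 7 bytes a b c NUL d e f.
// The (pointer, length) constructor is required here; std::string("abc\0def")
// would stop at the first NUL.
inline std::string make_binary_sample() {
    return std::string("abc\0def", 7);
}

// Length as seen by C code that relies on NUL termination.
inline std::size_t c_view_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Checks that illustrate the difference between the two views.
inline void demo_embedded_nul() {
    const std::string s = make_binary_sample();
    assert(s.size() == 7);                 // the counted length keeps every byte
    assert(c_view_length(s) == 3);         // strlen() stops at the embedded NUL
    assert(s.c_str()[s.size()] == '\0');   // c_str() adds a terminating NUL
    assert(std::memcmp(s.data(), "abc\0def", 7) == 0);  // bytes after the NUL survive
}
```

So the data survives inside the string object, but any interface that
takes only a "char *" would also need to be handed s.size().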

	Of course that all depends on (1) how hard it is to get bash
to compile under g++, and (2) taking a long, but able-to-be-broken-apart,
piecemeal approach of converting one subsystem at a time.  I.e.
pick some small part of 'bash' that has few interfaces with the
rest of the code and convert it, with compatibility provided at
the interfaces.  The neat thing about C++ is that you can create two
overloads of a function -- one that works with old "char *" strings,
and one that works with 'string's.
Ex: a sub to convert a number to a computer-metric prefix

.h:

  char * Binary_Scale (char buf[], double value);
  string Binary_Scale (double value);

.cc:

  char * Binary_Scale (char buf[], double value) { return Scale(buf, value); }
  string Binary_Scale (double value) { return Scale(value); }

Later, since the funcs are so small, one might want to inline them:
move the source lines from the '.cc' into the '.h', replacing the
declarations, and add 'inline'.  In the new .h:

  inline char * Binary_Scale (char buf[], double value) { return Scale(buf, value); }
  inline string Binary_Scale (double value) { return Scale(value); }

----
Either way, any code that includes the header will automatically
call the "right" version of "Binary_Scale", based on the argument
types -- so you can have new and old code living together in the same
program without it being excessively painful...  ;-)
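
A self-contained sketch of that old/new overload pair.  The body is an
assumption: the post never shows what Scale() does, so here the
"char *" overload does the formatting itself, and the explicit buffer
length parameter is an addition for safety that the original signature
does not have:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Old-style interface: the caller supplies the buffer.
// The explicit length parameter is an addition for this sketch.
inline char* Binary_Scale(char* buf, std::size_t len, double value) {
    static const char* const prefixes[] = {"", "Ki", "Mi", "Gi", "Ti"};
    assert(buf != nullptr && len > 0);
    int i = 0;
    while (value >= 1024.0 && i < 4) {
        value /= 1024.0;
        ++i;
    }
    std::snprintf(buf, len, "%.1f%s", value, prefixes[i]);
    return buf;
}

// New-style interface: returns a std::string, reusing the old code path.
inline std::string Binary_Scale(double value) {
    char buf[32];
    return std::string(Binary_Scale(buf, sizeof buf, value));
}
```

Old callers keep passing a buffer, new callers write
"std::string s = Binary_Scale(1536.0);", and the compiler picks the
overload from the arguments.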

If the library types don't work for one, you can roll your own.
I wanted to create a 'ratio' data type that took two 64-bit integers.
I then made all the basic functions work with it (+-*/).  As data
was entered, it reduced the integers by their GCD and stored the
reduced form alongside the originals (in case the original number(s)
were needed).  Subsequent calculations would always try to use the
reduced form to lower the possibility of overflow/underflow -- which
was checked for and flagged.
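
A rough sketch of what such a ratio type could look like.  This is not
the original code: the names (Ratio, gcd64, make_result) are invented,
only * and + are shown, INT64_MIN inputs are not handled, and the
overflow checks assume the GCC/Clang __builtin_*_overflow builtins:

```cpp
#include <cassert>
#include <cstdint>

// Greatest common divisor of |a| and |b| (Euclid's algorithm).
inline std::int64_t gcd64(std::int64_t a, std::int64_t b) {
    if (a < 0) a = -a;
    if (b < 0) b = -b;
    while (b != 0) {
        const std::int64_t t = a % b;
        a = b;
        b = t;
    }
    return a;
}

struct Ratio {
    std::int64_t num = 0, den = 1;            // reduced form, den > 0
    std::int64_t orig_num = 0, orig_den = 1;  // values as entered
    bool overflow = false;                    // set when a result did not fit

    Ratio(std::int64_t n, std::int64_t d) : orig_num(n), orig_den(d) {
        assert(d != 0);
        const std::int64_t g = gcd64(n, d);   // g >= 1 because d != 0
        num = n / g;
        den = d / g;
        if (den < 0) { num = -num; den = -den; }
    }
};

// Build a result from already-checked parts, or a flagged placeholder.
inline Ratio make_result(bool bad, std::int64_t n, std::int64_t d) {
    Ratio r = bad ? Ratio(0, 1) : Ratio(n, d);
    r.overflow = bad;
    return r;
}

inline Ratio operator*(const Ratio& a, const Ratio& b) {
    std::int64_t n = 0, d = 1;
    const bool o1 = __builtin_mul_overflow(a.num, b.num, &n);
    const bool o2 = __builtin_mul_overflow(a.den, b.den, &d);
    return make_result(o1 || o2 || a.overflow || b.overflow, n, d);
}

inline Ratio operator+(const Ratio& a, const Ratio& b) {
    std::int64_t x = 0, y = 0, n = 0, d = 1;
    const bool o1 = __builtin_mul_overflow(a.num, b.den, &x);
    const bool o2 = __builtin_mul_overflow(b.num, a.den, &y);
    const bool o3 = __builtin_add_overflow(x, y, &n);
    const bool o4 = __builtin_mul_overflow(a.den, b.den, &d);
    return make_result(o1 || o2 || o3 || o4 || a.overflow || b.overflow, n, d);
}
```

Because the operands are always in reduced form, the intermediate
products stay as small as they can, and a result that still does not
fit is flagged rather than silently wrapped.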

I didn't want to use floating point, since, while modern floating point
performance is fast, integer ops can still be 10-100x faster, and I was
working entirely with 64-bit data counters -- so I wanted to keep all the
intermediate calculations in integer ratios so I wouldn't lose precision
(it is lost only on final output, as the result was usually converted
to a metric-scaled float).

Anyway -- that's how I'd approach an upgrade like this that would
eventually enable shell vars to contain binary data.  Plus, going from
zero-terminated strings to a 'str+len' representation _should_ make the
code more robust and less subject to buffer overruns.
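
To illustrate the 'str+len' point, here is a small sketch of a bounded
copy.  CountedStr and copy_bounded are hypothetical names for this
example only; nothing like them is claimed to exist in bash:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical counted string: a pointer plus an explicit byte count.
struct CountedStr {
    const char* data;
    std::size_t len;
};

// Copy at most cap bytes from src into dst and return the number copied.
// No scan for a terminator is needed, so embedded NULs are copied like any
// other byte and the destination capacity is never exceeded.
inline std::size_t copy_bounded(char* dst, std::size_t cap, CountedStr src) {
    assert(dst != nullptr || cap == 0);
    const std::size_t n = src.len < cap ? src.len : cap;
    if (n != 0) std::memcpy(dst, src.data, n);
    return n;
}
```

With both lengths known up front, the copy can be clamped before any
byte is written, which is the robustness gain over strcpy()-style code
that only finds out how long the source is while it is copying.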