From: Linda Walsh
Newsgroups: gnu.bash.bug
Subject: Upgrading shell vars&strings to allow for possibility of FD redirection into a var.
Date: Mon, 23 Nov 2015 14:10:50 -0800
References: <564532BD.60801@tlinx.org> <564CC7A1.9090004@tlinx.org> <20151118185629.GJ27325@eeg.ccf.org> <564CDF9B.6050401@tlinx.org>
To: konsolebox
Cc: Greg Wooledge, bug-bash

konsolebox wrote:
> On Thu, Nov 19, 2015 at 4:29 AM, Linda Walsh wrote:
>> However one cannot
>> categorically say that a NUL byte can't be used as an argument.
>
> Likely only in an escaped format, or just as a single byte or character.
>
>> Solving
>> other places where it doesn't work might make it so that it *would* work...
>
> Most parts of bash interpret, store and handle parameters (internal
> data; not shell arguments) as C-strings (null-terminated) (as an
> example, see make_variable_value() in variables.c, or savestring() in
> general.h). It's the key to understanding why NUL bytes doesn't work
> most of the time in bash. If you would want it to interpret them as
> strings like C++ strings or Ruby strings, it would need a big
> overhaul.
> You would also think about possible compliance issues,
> compatibility issues, complexities and vulnerabilities. It's great
> but not easy.
----
Easiest way to explore -- see what it would take to get bash to compile
under g++. Then everywhere you use strings, you could declare them as
"string". It includes a count of #chars, so embedded NULs are allowed.
If you need the C-version (0 terminated), you can use the string.c_str()
method -- which will return the whole string (including embedded NULs,
if any), terminated by a NUL.

Of course that's all dependent on (1) how hard it is to get bash to
compile under g++, and (2) a long, but able-to-be-broken-apart,
piecemeal approach of converting 1 subsystem at a time. By subsystem,
I mean some small part of 'bash' that has few interfaces with the rest
of the code and that could be converted -- with compatibility provided
at the interfaces.

The neat thing about C++ is that you can create 2 copies of a function
-- one that works with old "char *" strings, and one that works with
'strings'.

Ex: a sub to convert a number to a computer-metric prefix

.h:
    char * Binary_Scale (char buf[], double value);
    string Binary_Scale (double value);

.cc:
    char * Binary_Scale (char buf[], double value) {
        return Scale(buf, value);
    }
    string Binary_Scale (double value) {
        return Scale(value);
    }

Later, since the funcs are so small, one might want to inline them, so
move the source lines from the '.cc' to the '.h' to replace the
declarations, and add 'inline'.

In the new .h:
    inline char * Binary_Scale (char buf[], double value) {
        return Scale(buf, value);
    }
    inline string Binary_Scale (double value) {
        return Scale(value);
    }
----
Either way, any code that includes the header will automatically call
the "right" version of "Binary_Scale", based on the argument types -- so
you can have new and old code living together in the same program
without it being excessively painful... ;-)

If the library types don't work for one, you can roll your own.
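
To make the two points above concrete, here is a minimal sketch (not
code from bash, and not the original Binary_Scale). The function bodies,
the buffer-size constant kScaleBufLen and the helper With_Embedded_Nul
are all assumptions made up for illustration; only the two Binary_Scale
signatures come from the text above. It shows a std::string carrying an
embedded NUL, and a "char *" overload living next to a "string" overload
of the same name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Assumption: callers of the char* overload supply at least this many bytes.
constexpr std::size_t kScaleBufLen = 32;

// Old-style overload: formats into a caller-supplied buffer and returns it.
// The scaling logic is a stand-in; the post does not show the real body.
char *Binary_Scale(char buf[], double value) {
    assert(buf != nullptr);
    static const char *const prefixes[] = {"", "Ki", "Mi", "Gi", "Ti"};
    std::size_t i = 0;
    while (value >= 1024.0 && i + 1 < sizeof prefixes / sizeof prefixes[0]) {
        value /= 1024.0;
        ++i;
    }
    std::snprintf(buf, kScaleBufLen, "%.1f%s", value, prefixes[i]);
    return buf;
}

// New-style overload: same name, different argument list. The compiler
// picks between the two based on the arguments at each call site.
std::string Binary_Scale(double value) {
    char buf[kScaleBufLen];
    return std::string(Binary_Scale(buf, value));
}

// Hypothetical helper: a 5-character string whose third character is NUL.
// The explicit length is what lets the NUL survive construction.
std::string With_Embedded_Nul() {
    return std::string("ab\0cd", 5);
}
```

One caveat worth keeping in mind: size() on the string from
With_Embedded_Nul() reports all 5 characters and c_str() points at all of
them plus the terminator, but any C function handed that pointer (strlen,
strcpy, ...) will still stop at the first NUL. So the interfaces between
converted and unconverted code would have to pass a length along, too.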
I wanted to create a 'ratio' data type that took two 64-bit integers. I
then made all the basic functions work with it (+-*/). As data was
entered, it reduced the integers by the GCD and stored the GCD as well
(in case the original number(s) were needed). But for subsequent
calculations, the functions would always try to use the reduced format
to reduce the possibility of overflow/underflow -- which was checked for
and flagged.

I didn't want to use floating point, since, while modern floating point
performance is fast, integer ops are still 10-100x faster and I was
working with all 64-bit data counters -- so I wanted to keep all the
intermediate calculations in integer ratios so I wouldn't lose precision
(which is lost on final output, as it was usually converted to a
metric-scaled float).

Anyway -- that's how I'd approach an upgrade like this that would
eventually enable shell vars to contain binary data. Plus, going to
'str+len' from zero-terminated strings _should_ make the code more
robust and less subject to buffer overruns.
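
For anyone curious what such a roll-your-own type might look like, here
is a rough sketch, not the original code. It assumes unsigned 64-bit
counters, C++17 (for std::gcd), and that "stored it" means keeping the
GCD so the entered values can be rebuilt. All names (Ratio, factor,
Checked_Mul, Orig_Num, Orig_Den, To_Double) are made up for
illustration, only multiplication is shown, and zero denominators are
not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <numeric>  // std::gcd (C++17)

struct Ratio {
    std::uint64_t num;     // reduced numerator, used for all arithmetic
    std::uint64_t den;     // reduced denominator
    std::uint64_t factor;  // GCD removed at entry, kept to rebuild the input
    bool overflow;         // set when a result did not fit in 64 bits

    Ratio(std::uint64_t n, std::uint64_t d)
        : num(n), den(d), factor(std::gcd(n, d)), overflow(false) {
        if (factor != 0) {  // std::gcd(0, 0) is 0; leave that case as-is
            num /= factor;
            den /= factor;
        }
    }
};

// Rebuild the values as they were originally entered.
inline std::uint64_t Orig_Num(const Ratio &r) { return r.num * r.factor; }
inline std::uint64_t Orig_Den(const Ratio &r) { return r.den * r.factor; }

// Stores a*b in out; returns false (leaving out untouched) on overflow.
inline bool Checked_Mul(std::uint64_t a, std::uint64_t b, std::uint64_t &out) {
    if (a != 0 && b > std::numeric_limits<std::uint64_t>::max() / a) {
        return false;
    }
    out = a * b;
    return true;
}

// Multiply using the reduced forms; the constructor reduces the result
// again. On overflow the value is meaningless and the flag is set.
inline Ratio operator*(const Ratio &a, const Ratio &b) {
    std::uint64_t n = 0, d = 0;
    const bool ok = Checked_Mul(a.num, b.num, n) && Checked_Mul(a.den, b.den, d);
    Ratio r(ok ? n : 0, ok ? d : 0);
    r.overflow = a.overflow || b.overflow || !ok;
    return r;
}

// Precision is only given up here, at final output.
inline double To_Double(const Ratio &r) {
    assert(r.den != 0);
    return static_cast<double>(r.num) / static_cast<double>(r.den);
}
```

With this sketch, Ratio(6, 8) is held as 3/4 with a factor of 2, and
multiplying it by Ratio(2, 3) gives 1/2 without ever touching floating
point until To_Double() is called.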