Groups | Search | Server Info | Keyboard shortcuts | Login | Register [http] [https] [nntp] [nntps]


Groups > comp.compilers > #2251

Re: Optimization techniques and undefined behavior

From Christian Gollwitzer <auriocus@gmx.de>
Newsgroups comp.compilers
Subject Re: Optimization techniques and undefined behavior
Date 2019-05-02 22:22 +0200
Organization A noiseless patient Spider
Message-ID <19-05-015@comp.compilers> (permalink)
References (5 earlier) <19-04-042@comp.compilers> <19-04-044@comp.compilers> <19-04-047@comp.compilers> <19-05-004@comp.compilers> <19-05-008@comp.compilers>

Show all headers | View raw


Am 02.05.19 um 16:51 schrieb David Brown:
> On 01/05/2019 14:53, Bart wrote:
>> On 30/04/2019 13:46, David Brown wrote:
>>> It is really incredibly simple (at least in this case).  Do the
>>> multiplications using types that won't overflow.  That might be an
>>> unsigned type if its range is suitable (not because it has defined
>>> overflow behaviour, but use it if it has enough range) or a bigger
>>> signed integer type.
>>
>> That's just kicking the can further down the road.
>>
>
> Yes (especially for the use of a larger signed type).  That's fine.  C
> let's you easily kick the can a /long/ way down the road.  How often do
> you need integers that will overflow a 64-bit type?
>
>> If you have two unknown values A and B, and need to multiply, you won't
>> know if the result will overflow.
>>
> First off, how often do you actually have unknown values to deal with?
> Usually you know something at least.

Well exactly that was my point. In order to parse the file format, you
get unknown values, and similar errors led to buffer overflows in the past:

	https://www.kb.cert.org/vuls/id/523889/

That one was really nasty, because libpng is contained in a lot of
security critical software (browsers, web servers handling uploads...).
So, this IS a real problem, maybe just not common in the code that you
write.


> bytes_per_pixel is not going to be more than, say, 16 - that's for
> 32-bit per colour, including alpha channel.  No picture format is going
> to have more.  Width and height will also be limited - the highest
> resolution image sensors are in the range of 120 megapixels.  If you are
> talking about a panorama picture sewn together from 5 billion such
> camera pictures, you might have a point - and that's just using 64-bit
> types.


To get back to reality. I chose the PGM format just as an example,
because it is so simple that we could actually post the source code of
the parser in this message. However, in my previous job, I had to deal
with multidimensional scientific data sets. The uncompressed file
formats typically have a binary (or even ASCII) header and a block of
memory (the so called "raw" format), and there may be up to 5 dimensions.

A format close to PGm is NRRD:
http://teem.sourceforge.net/nrrd/format.html Software that reads/writes
this can be found here: http://fiji.sc/

The total data set size can be as large as your computer's main memory,
we had machines with up to 1 TB RAM and data sets regularly with up to
100 GB in size.

So let's say you have a 4D data set, then you get, as input

x
y
z
w
datatype

In some formats, if it's only 2 dimensions, then the 3rd and 4th are
simply set to 1. Also, there may be oddly shaped "images" with 100,000
pixels in one dimension and only 2 or 3 in another dimension. So the
"real" task is:

* Read in x, y, z, w and the data type identifier. Each of x,y,z,w must
be checked to be within the range of size_t.

* Then compute the number of bytes by multiplying these numbers. If it
is smaller then the number of bytes in main memory, alloc memory and
load the file.

The easiest way to do that is with unlimited integers, like in Python:

memsize = x*y*z*w*sizeof(datatype)

if memsize > mainmemory:
	error "Out of memory"

In C++, you need to jump through strange hoops to detect the overflow.
You simply can't limit it such that it will not overflow for 64 bit -
because that would mean you restrict each dimension to 65535, which is
definitely NOT enough. Instead, what would be needed is a function
safe_multiply(x,y) which returns x*y if there is no overflow and throws
an error otherwise, so you could write:

try {
	size_t memsize = safe_multiply(x,y,z,w,sizeof(datatype);
} catch (overflow_error & err) {
	std::cerr<<"Not enough memory";
}


Even that safe_multiply() is extremely hard to write, unless your
compiler supports 128 bit math as an extension.

	Christian

Back to comp.compilers | Previous | NextPrevious in thread | Next in thread | Find similar


Thread

Re: Optimization techniques David Brown <david.brown@hesbynett.no> - 2019-04-25 21:58 +0200
  Re: Optimization techniques Kaz Kylheku <847-115-0292@kylheku.com> - 2019-04-26 00:18 +0000
    Re: Optimization techniques David Brown <david.brown@hesbynett.no> - 2019-04-28 23:49 +0200
      Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-04-29 00:31 +0100
        Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-29 17:08 +0200
          Re: Optimization techniques and undefined behavior Christian Gollwitzer <auriocus@gmx.de> - 2019-04-29 18:10 +0200
            Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-30 14:46 +0200
              Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-01 13:53 +0100
                Re: Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-02 11:29 +0100
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 00:48 +0100
                Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-03 10:52 +0100
                Re: Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-04 17:44 -0400
                Re: Bounds checking, Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-05 17:10 -0400
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-06 08:14 +0200
                Re: Optimization techniques and undefined behavior Gene Wirchenko <genew@telus.net> - 2019-05-11 22:25 -0700
                Re: not a lot of memory, was Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 12:45 +0100
                Re: Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-03 13:29 +0100
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 23:10 +0100
                Re: Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-04 10:45 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-05 11:14 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-05 20:44 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-06 10:15 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 11:04 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior "Nuno Lopes" <nuno.lopes@ist.utl.pt> - 2019-05-07 22:38 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-08 01:14 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 09:55 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior "Derek M. Jones" <derek@_NOSPAM_knosof.co.uk> - 2019-05-08 19:08 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-08 01:42 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 10:16 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-09 01:15 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-09 21:56 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 10:03 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-09 09:19 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior Kaz Kylheku <847-115-0292@kylheku.com> - 2019-05-10 03:38 +0000
                Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-08 14:37 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior Christopher F Clark <christopher.f.clark@compiler-resources.com> - 2019-05-06 05:05 -0400
                Re: Bounds checking, Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-05 17:38 -0400
                Re: Bounds checking, Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-06 13:07 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 14:01 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-06 01:15 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior Andy Walker <anw@cuboid.co.uk> - 2019-05-06 14:40 +0100
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 15:05 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-08 10:18 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior Jan Ziak <0xe2.0x9a.0x9b@gmail.com> - 2019-05-06 05:39 -0700
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 15:42 +0200
                Re: Bounds checking, Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-06 16:32 +0200
                Re: Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-04 17:59 -0400
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-02 16:51 +0200
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-02 20:04 +0100
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-03 17:23 +0200
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-03 21:10 +0100
                Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-06 13:25 +0100
                Re: Optimization techniques and undefined behavior "Derek M. Jones" <derek@_NOSPAM_knosof.co.uk> - 2019-05-06 16:32 +0100
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:03 +0200
                Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-08 13:16 +0100
                Re: Optimization techniques and undefined behavior George Neuner <gneuner2@comcast.net> - 2019-05-08 15:13 -0400
                Re: Optimization techniques and undefined behavior "Robin Vowels" <robin51@dodo.com.au> - 2019-05-07 01:22 +1000
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:05 +0200
                Re: Optimization techniques and undefined behavior Christian Gollwitzer <auriocus@gmx.de> - 2019-05-02 22:22 +0200
          Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-04-29 18:15 +0100
            Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-30 15:48 +0200
              Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-01 12:40 +0100
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-02 17:27 +0200
                Re: Optimization techniques and undefined behavior Bart <bc@freeuk.com> - 2019-05-02 18:59 +0100
                Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:16 +0200
              Re: Optimization techniques and undefined behavior Martin Ward <martin@gkc.org.uk> - 2019-05-02 14:54 +0100
      Re: Optimization techniques and runtime checks Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-04-29 22:36 +0200
        Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-07 16:29 +0200
          Re: Optimization techniques and runtime checks Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-08 02:27 +0200
            Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-08 10:31 +0200
              Re: Optimization techniques and runtime checks Hans-Peter Diettrich <DrDiettrich1@netscape.net> - 2019-05-08 22:50 +0200
              Re: Optimization techniques and runtime checks "Robin Vowels" <robin51@dodo.com.au> - 2019-05-11 19:26 +1000
              Re: Optimization techniques and runtime checks Gene Wirchenko <genew@telus.net> - 2019-05-11 22:43 -0700
                Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-12 20:17 +0200
          Re: Optimization techniques and runtime checks Bart <bc@freeuk.com> - 2019-05-08 14:58 +0100
            Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-08 23:02 +0200
              Re: Optimization techniques and runtime checks Bart <bc@freeuk.com> - 2019-05-09 18:28 +0100
                Re: Optimization techniques and runtime checks David Brown <david.brown@hesbynett.no> - 2019-05-09 22:07 +0200
      Re: Optimization techniques Gene Wirchenko <genew@telus.net> - 2019-04-30 18:24 -0700
        Re: Optimization techniques David Brown <david.brown@hesbynett.no> - 2019-05-01 09:20 +0200
          Re: Optimization techniques Kaz Kylheku <847-115-0292@kylheku.com> - 2019-05-02 17:40 +0000
          Re: Optimization techniques and error detection Gene Wirchenko <genew@telus.net> - 2019-05-03 10:16 -0700
          Re: Optimization techniques "Robin Vowels" <robin51@dodo.com.au> - 2019-05-07 01:42 +1000
  Re: Optimization techniques Kaz Kylheku <847-115-0292@kylheku.com> - 2019-04-26 02:26 +0000
    Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-04-29 00:12 +0200
      Re: Optimization techniques and undefined behavior Kaz Kylheku <847-115-0292@kylheku.com> - 2019-05-02 17:18 +0000
        Re: Optimization techniques and undefined behavior David Brown <david.brown@hesbynett.no> - 2019-05-07 16:38 +0200

csiph-web