Re: Why does plotting with point labels make plot generation extremely slow?

Newsgroups comp.graphics.apps.gnuplot
Date 2019-04-30 01:17 -0700
References <0d17edd5-6246-4f25-8edd-df15a0832de8@googlegroups.com> <qa66un$orh$1@dont-email.me> <8c68095b-0cd2-4805-ac23-d0e7cf629d00@googlegroups.com> <qa89jt$qqj$1@dont-email.me>
Message-ID <a9a9f30d-8443-4c56-a1ad-1331e951b605@googlegroups.com>
Subject Re: Why does plotting with point labels make plot generation extremely slow?
From ciro.santilli@gmail.com


On Tuesday, April 30, 2019 at 2:49:51 AM UTC+1, Ethan Merritt wrote:
> On Mon, 29 Apr 2019 01:01:35 -0700, ciro.santilli wrote:
> 
> > On Monday, April 29, 2019 at 7:52:09 AM UTC+1, Ethan Merritt wrote:
> >> On Sun, 28 Apr 2019 02:36:50 -0700, ciro.santilli wrote:
> >> 
> >> > This made me curious, since I wouldn't intuitively expect that just adding point labels would add so much overhead.
> >> > 
> >> > In particular if the point labels are hypertext, in which case they don't even show and no placement calculation needs to be done for them.
> >> 
> >> Several reasons:
> >> 
> >> 1) hypertext labels take fully as much setup and processing as
> >> non-hypertext labels. The label is written as normal but wrapped in an
> >> extra bit of code that sets a "visibility" attribute to "off".
> >> When you mouse over that point the program (actually the qt or wxt
> >> or browser support libraries) has to figure out which point that is
> >> and flip the corresponding attribute to "on", then redraw the plot.
> >> 
> >> 2) font rendering is slow.  Again this is not gnuplot itself.
> >> The font rendering is done by the display system 
> >> (qt/wxt/x11/browser/whatever).
> >> When you say      plot 'foo' using 1:2:3
> >> it plots only points or lines, so no font rendering is required.
> >> As soon as you add "with labels" this changes drastically.
> >> 
> >> 3) Actually a bit of the slow label handling may be gnuplot's
> >> fault if there are a _lot_ of labels, as in your 10^6 label case.
> >> Gnuplot maintains a singly linked list of labels and walks all the
> >> way to the end when adding a new one.  For reasonable numbers of
> >> labels that doesn't matter, but the insertion overhead is O(N^2) so
> >> that must eventually hurt.  If there were a legitimate use case
> >> for a million labels I'm sure that particular bit of overhead could
> >> be reduced to near zero.  Do you have one?
> >> 
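
As a rough sketch of why walking to the end of the list on every insertion hurts (assuming, as described above, that adding the k-th label steps past the k-1 labels already stored, with N the total number of labels), the total number of list steps is:

```latex
\[
  \sum_{k=1}^{N} (k-1) \;=\; \frac{N(N-1)}{2} \;\approx\; \frac{N^2}{2},
  \qquad
  N = 10^6 \;\Rightarrow\; \approx 5 \times 10^{11} \text{ steps}.
\]
```

If the list also remembered its last element, each insertion would take a constant number of steps, so about N steps in total.
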
> > 
> > Thanks all for the reply,
> > 
> > My use case is described in detail at:
> > https://stats.stackexchange.com/questions/376361/how-to-find-the-sample-points-that-have-statistically-meaningful-large-outlier-r
> > 
> > Basically, I have 10 million multidimensional points, and I do an XY
> > scatter plot on the two main dimensions of interest.
> 
> OK.  Upstream development source (version 5.3) now reduces the label
> insertion overhead to zero by tracking both the head and tail of the list.
> Timings before and after:
> 
>      Labels     old (sec)     new (sec)
>     -------------------------------------
>       10^4      1.5           1.0
>       10^5      160           7.3
>       10^6      hopeless      66
> 
> One million hypertext labels took 66 seconds with the wxt terminal,
> slightly longer with qt (hard to compare because qt uses two processes
> running in parallel).  That's to draw the initial plot, however.
> With that many points the time to respond to mouse-over may be a
> substantial fraction of the original draw time.  To help with that,
> the upstream source now short-cuts hypertext generation to skip
> production of a hypertext tag if the text is empty.  So if you can
> manage to provide text for only the "potentially interesting" points
> you may find the mousing speed to be adequate to identify outliers.
> 
> The labels are still profligate in consuming memory, however, as
> there has never been any particular pressure to reduce the size of the
> structure that describes a generic label.  
> According to "top" my 10^6 point trial pinned 2.6 Gbyte of memory.
> I am dubious about the practicality of 10^7 points.
> 
> Let us know if you try, and it works!
> 

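One way to follow the suggestion above of providing text only for the "potentially interesting" points is to blank out the other labels before plotting. This is a minimal sketch, assuming a three-column file like the 1m.dat test file in this thread. The function name keep_outlier_labels, the threshold on column 2 and the output file name are made up for illustration, and it assumes gnuplot reads a quoted "" field as an empty string, which is worth verifying:

```shell
# Hypothetical helper: keep the label (column 3) only for rows whose
# y value (column 2) is above a threshold; write an empty quoted
# string ("") as the label for every other row.
keep_outlier_labels() {
    awk -v t="$1" '{
        if ($2 + 0 > t + 0)
            print $1, $2, $3
        else
            print $1, $2, "\"\""
    }'
}

# Usage sketch (threshold and file names are assumptions):
#   keep_outlier_labels 999990 < 1m.dat > 1m-sparse.dat
```

The resulting file could then be plotted with the same "using 1:2:3 with labels hypertext" command, so that only the rows above the threshold carry hover text.
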
Awesome, thanks!!

I tested master at a59021c8e97e20d00cb9e99d6392fe6bc5b56a07 from GitHub, and it handled my 1m points just fine with wxt: it opens in reasonable time, and zoom / hover works. Is your computer particularly old? That is actually a good thing for a dev, as it means that things will run smoothly on any computer ;-)

Oh, and the compile time for this software is just amazing, so fast.

Me: Lenovo ThinkPad P51 laptop with CPU: Intel Core i7-7820HQ CPU (4 cores / 8 threads), RAM: 2x Samsung M471A2K43BB1-CRC (2x 16GiB), SSD: Samsung MZVLB512HAJQ-000L7 (3,000 MB/s), NVIDIA Quadro M1200 4GB GDDR5 GPU.

10m, however, took about 16 GB of memory and was too slow to be usable in practice. So ultimately I'll have to stick to VisIt / ParaView for this one (they also have that amazing point selection that stays on screen), but I'm sure new applications have been opened up by this patch :-)

It might be easy, however, to implement a feature that dumps hovered labels to stdout on wxt, or to the debug console on HTML. Otherwise I have to remember and copy 6-digit IDs around manually. This would make gnuplot almost good enough for this use case.

I noticed one thing though: when I tried converting 1m points to HTML, the output seemed broken even without labels, whereas it works with the Ubuntu-packaged gnuplot. I configured with just: ./configure --with-wx

Finally, I highly recommend that you move source and issue tracking to GitHub: it would simply attract more users, and I have never seen spam there ;-)

> 	cheers,
> 
> 		Ethan
> 
> 
> 
> 
> 
> > 
> > Then, some points visually stick out, and I want to use information
> > from all dimensions to understand why, so I need to get a precise ID
> > for the point, which is what I tried to use labels for.
> > 
> > I ended up resorting to VisIt to solve that one, but that was a
> > painful experience ;-)
> 
> 
> 
> 
> 
>  
> >> All that said, I get times of roughly 2.3 minutes for either wxt or
> >> qt to draw 10^5 labels.  (I didn't try 10^6).
> >> 
> >> 	Ethan
> >> 
> >> 
> >> 
> >> > Generate test data with 1 million lines:
> >> > 
> >> >     i=0; while (( $i < 1000000 )); do echo "$i $i $i"; i=$((i + 1)); done > 1m.dat
> >> > 
> >> > and here are all the tests that I've tried:
> >> > 
> >> >     #!/usr/bin/env gnuplot
> >> > 
> >> >     #set terminal png size 1024,1024
> >> >     #set output "gnuplot.png"
> >> > 
> >> >     set terminal canvas mousing
> >> >     set output "gnuplot.html"
> >> > 
> >> >     #set terminal wxt size 1024,768
> >> > 
> >> >     #plot "1m.dat" using 1:2
> >> >     plot "1m.dat" using 1:2:3 with labels hypertext
> >> > 
> >> > With all terminal types, the command without "with labels hypertext" works and finishes quickly.
> >> > 
> >> > If I add "with labels hypertext", however, all commands take more than one hour, and I lost patience waiting for them to finish.
> >> > 
> >> > The same goes if I remove "hypertext".
> >> > 
> >> > Tested in gnuplot 5.2 patchlevel 2, Ubuntu 18.10.
