| Newsgroups | comp.graphics.apps.gnuplot |
|---|---|
| Date | 2019-04-30 01:17 -0700 |
| References | <0d17edd5-6246-4f25-8edd-df15a0832de8@googlegroups.com> <qa66un$orh$1@dont-email.me> <8c68095b-0cd2-4805-ac23-d0e7cf629d00@googlegroups.com> <qa89jt$qqj$1@dont-email.me> |
| Message-ID | <a9a9f30d-8443-4c56-a1ad-1331e951b605@googlegroups.com> |
| Subject | Re: Why does plotting with point labels make plot generation extremely slow? |
| From | ciro.santilli@gmail.com |
On Tuesday, April 30, 2019 at 2:49:51 AM UTC+1, Ethan Merritt wrote:
> On Mon, 29 Apr 2019 01:01:35 -0700, ciro.santilli wrote:
>
> > On Monday, April 29, 2019 at 7:52:09 AM UTC+1, Ethan Merritt wrote:
> >> On Sun, 28 Apr 2019 02:36:50 -0700, ciro.santilli wrote:
> >>
> >> > This made me curious, since I wouldn't intuitively expect that just adding point labels would add so much overhead.
> >> >
> >> > In particular if the point labels are hypertext, in which case they don't even show and no placement calculation needs to be done for them.
> >>
> >> Several reasons:
> >>
> >> 1) hypertext labels take fully as much setup and processing as
> >> non-hypertext labels. The label is written as normal but wrapped in an
> >> extra bit of code that sets a "visibility" attribute to "off".
> >> When you mouse over that point the program (actually the qt or wxt
> >> or browser support libraries) has to figure out which point that is
> >> and flip the corresponding attribute to "on", then redraw the plot.
> >>
> >> 2) font rendering is slow. Again this is not gnuplot itself.
> >> The font rendering is done by the display system
> >> (qt/wxt/x11/browser/whatever).
> >> When you say plot 'foo' using 1:2:3
> >> it plots only points or lines, so no font rendering is required.
> >> As soon as you add "with labels" this changes drastically.
> >>
> >> 3) Actually a bit of the slow label handling may be gnuplot's
> >> fault if there are a _lot_ of labels, as in your 10^6 label case.
> >> Gnuplot maintains a singly linked list of labels and walks all the
> >> way to the end when adding a new one. For reasonable numbers of
> >> labels that doesn't matter, but the total insertion overhead is O(N^2) so
> >> that must eventually hurt. If there were a legitimate use case
> >> for a million labels I'm sure that particular bit of overhead could
> >> be reduced to near zero. Do you have one?
> >>
> >
> > Thanks all for the reply,
> >
> > My use case is described in detail at:
> > https://stats.stackexchange.com/questions/376361/how-to-find-the-sample-points-that-have-statistically-meaningful-large-outlier-r
> >
> > Basically, I have 10 million multidimensional points, and I do an XY
> > scatter plot on the two main dimensions of interest.
>
> OK. Upstream development source (version 5.3) now reduces the label
> insertion overhead to zero by tracking both the head and tail of the list.
> Timings before and after:
>
> Labels    old (sec)    new (sec)
> -----------------------------------------
> 10^4      1.5          1.0
> 10^5      160          7.3
> 10^6      hopeless     66
>
> One million hypertext labels took 66 seconds with the wxt terminal,
> slightly longer with qt (hard to compare because qt uses two processes
> running in parallel). That's to draw the initial plot, however.
> With that many points the time to respond to mouse-over may be a
> substantial fraction of the original draw time. To help with that,
> the upstream source now short-cuts hypertext generation to skip
> production of a hypertext tag if the text is empty. So if you can
> manage to provide text for only the "potentially interesting" points
> you may find the mousing speed to be adequate to identify outliers.
>
> The labels are still profligate in consuming memory, however, as
> there has never been any particular pressure to reduce the size of the
> structure that describes a generic label.
> According to "top" my 10^6 point trial pinned 2.6 Gbyte of memory.
> I am dubious about the practicality of 10^7 points.
>
> Let us know if you try, and it works!
>

Awesome, thanks!!

I tested on master at a59021c8e97e20d00cb9e99d6392fe6bc5b56a07 from GitHub, and it handled my 1m points just fine with wxt: it opens in reasonable time, and zoom / hover works.

Is your computer particularly old? That is a good thing for a dev actually, as it means that things will run smoothly on any computer ;-)

Oh, and the compile time for this software is just amazing, so fast.

Me: Lenovo ThinkPad P51 laptop with CPU: Intel Core i7-7820HQ CPU (4 cores / 8 threads), RAM: 2x Samsung M471A2K43BB1-CRC (2x 16GiB), SSD: Samsung MZVLB512HAJQ-000L7 (3,000 MB/s), NVIDIA Quadro M1200 4GB GDDR5 GPU.

10m points, however, took about 16 GB of memory and were too slow to be usable in practice.

So ultimately I'll have to stick to VisIt / Paraview for this one (also they have that amazing point selection that stays on screen), but I'm sure new applications have been opened by this patch :-)

It might be easy, however, to implement functionality that dumps hovered labels to stdout on wxt / to the debug console on HTML. Otherwise I have to remember and copy 6-digit IDs around manually. This would make gnuplot almost good enough for this use case.

I noticed one thing though: when I tried converting 1m points to HTML, the output seemed broken even without labels, but it works on the Ubuntu packaged gnuplot. I configured just with:

    ./configure --with-wx

Finally, I highly recommend that you move source and issue tracking to GitHub, it would simply attract more users, and I have never seen spam there ;-)

> cheers,
>
> Ethan
>
>
> > Then, some points visually stick out, and I want to use information
> > from all dimensions to understand why, so I need to get a precise ID
> > for the point, which is what I tried to use labels for.
> >
> > I ended up resorting to VisIt to solve that one, but that was a
> > painful experience ;-)
> >
> >> All that said, I get times of roughly 2.3 minutes for either wxt or
> >> qt to draw 10^5 labels. (I didn't try 10^6).
> >>
> >> Ethan
> >>
> >> > Generate test data with 1 million lines:
> >> >
> >> > i=0; while (( $i < 1000000 )); do echo "$i $i $i"; i=$((i + 1)); done > 1m.dat
> >> >
> >> > and here are all the tests that I've tried:
> >> >
> >> > #!/usr/bin/env gnuplot
> >> >
> >> > #set terminal png size 1024,1024
> >> > #set output "gnuplot.png"
> >> >
> >> > set terminal canvas mousing
> >> > set output "gnuplot.html"
> >> >
> >> > #set terminal wxt size 1024,768
> >> >
> >> > #plot "1m.dat" using 1:2
> >> > plot "1m.dat" using 1:2:3 with labels hypertext
> >> >
> >> > With all terminal types, the command without "with labels hypertext" works and finishes quickly.
> >> >
> >> > But if I add "with labels hypertext", all commands take more than one hour, and I've lost patience waiting for them to finish.
> >> >
> >> > The same goes if I remove "hypertext".
> >> >
> >> > Tested in gnuplot 5.2 patchlevel 2, Ubuntu 18.10.
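
Editor's note: the list change discussed in the quoted messages (walking a singly linked list to its end on every insert, versus also tracking the tail) can be illustrated with a small sketch. This is Python, not gnuplot's C, and the class and function names are hypothetical; nothing here is taken from the gnuplot source. It only counts how many links get followed, assuming each label is appended at the end of the list as described above. Walking the list on every insert costs time proportional to the current length, so n inserts cost on the order of n^2 steps in total, while the head-and-tail version follows no links at all.

```python
class Node:
    """One list entry; `text` stands in for a much larger label structure."""
    def __init__(self, text):
        self.text = text
        self.next = None


class HeadOnlyList:
    """Keeps only the head, so each append walks to the end of the list."""
    def __init__(self):
        self.head = None
        self.steps = 0  # links followed so far, to make the cost visible

    def append(self, text):
        node = Node(text)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:  # walk all the way to the last node
            cur = cur.next
            self.steps += 1
        cur.next = node


class HeadTailList:
    """Keeps head and tail, so each append is a constant-time pointer update."""
    def __init__(self):
        self.head = None
        self.tail = None
        self.steps = 0  # stays 0: no links are followed on append

    def append(self, text):
        node = Node(text)
        if self.tail is None:
            self.head = self.tail = node
        else:
            self.tail.next = node
            self.tail = node


def count_steps(list_cls, n):
    """Append n labels and return how many links were followed in total."""
    lst = list_cls()
    for i in range(n):
        lst.append(str(i))
    return lst.steps


def to_texts(lst):
    """Collect the label texts in list order (used to check both agree)."""
    out, cur = [], lst.head
    while cur is not None:
        out.append(cur.text)
        cur = cur.next
    return out


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, count_steps(HeadOnlyList, n), count_steps(HeadTailList, n))
```

In this sketch the head-only list follows (n-1)(n-2)/2 links for n appends, which is the quadratic growth that matches the jump from 1.5 s to 160 s between 10^4 and 10^5 labels in the quoted timings. The sketch says nothing about font rendering or memory use, which the thread identifies as separate costs.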
Why does plotting with point labels make plot generation extremely slow? ciro.santilli@gmail.com - 2019-04-28 02:36 -0700
Re: Why does plotting with point labels make plot generation extremely slow? Hans-Bernhard Bröker <HBBroeker@t-online.de> - 2019-04-28 19:11 +0200
Re: Why does plotting with point labels make plot generation extremely slow? Ethan Merritt <eamerritt@gmail.com> - 2019-04-29 06:52 +0000
Re: Why does plotting with point labels make plot generation extremely slow? ciro.santilli@gmail.com - 2019-04-29 01:01 -0700
Re: Why does plotting with point labels make plot generation extremely slow? Ethan Merritt <eamerritt@gmail.com> - 2019-04-30 01:49 +0000
Re: Why does plotting with point labels make plot generation extremely slow? ciro.santilli@gmail.com - 2019-04-30 01:17 -0700
Re: Why does plotting with point labels make plot generation extremely slow? Ethan Merritt <sfeam@users.sf.net> - 2019-04-30 19:06 +0000
Re: Why does plotting with point labels make plot generation extremely slow? ciro.santilli@gmail.com - 2019-04-30 14:26 -0700