Re: Why does plotting with point labels make plot generation extremely slow?

From Ethan Merritt <eamerritt@gmail.com>
Newsgroups comp.graphics.apps.gnuplot
Subject Re: Why does plotting with point labels make plot generation extremely slow?
Date Tue, 30 Apr 2019 01:49:49 -0000 (UTC)
Organization A noiseless patient Spider
Lines 126
Message-ID <qa89jt$qqj$1@dont-email.me> (permalink)
References <0d17edd5-6246-4f25-8edd-df15a0832de8@googlegroups.com> <qa66un$orh$1@dont-email.me> <8c68095b-0cd2-4805-ac23-d0e7cf629d00@googlegroups.com>


On Mon, 29 Apr 2019 01:01:35 -0700, ciro.santilli wrote:

> On Monday, April 29, 2019 at 7:52:09 AM UTC+1, Ethan Merritt wrote:
>> On Sun, 28 Apr 2019 02:36:50 -0700, ciro.santilli wrote:
>> 
>> > This made me curious, since I wouldn't intuitively expect that just adding point labels would add so much overhead.
>> > 
>> > In particular if the point labels are hypertext, in which case they don't even show and no placement calculation needs to be done for them.
>> 
>> Several reasons:
>> 
>> 1) hypertext labels take fully as much setup and processing as
>> non-hypertext labels. The label is written as normal but wrapped in an
>> extra bit of code that sets a "visibility" attribute to "off".
>> When you mouse over that point the program (actually the qt or wxt
>> or browser support library) has to figure out which point that is
>> and flip the corresponding attribute to "on", then redraw the plot.
>> 
>> 2) font rendering is slow.  Again this is not gnuplot itself.
>> The font rendering is done by the display system 
>> (qt/wxt/x11/browser/whatever).
>> When you say      plot 'foo' using 1:2:3
>> it plots only points or lines, so no font rendering is required.
>> As soon as you add "with labels" this changes drastically.
>> 
>> 3) Actually a bit of the slow label handling may be gnuplot's
>> fault if there are a _lot_ of labels, as in your 10^6 label case.
>> Gnuplot maintains a singly linked list of labels and walks all the
>> way to the end when adding a new one.  For reasonable numbers of
>> labels that doesn't matter, but the total insertion overhead is O(N^2) so
>> that must eventually hurt.  If there were a legitimate use case
>> for a million labels I'm sure that particular bit of overhead could
>> be reduced to near zero.  Do you have one?
>> 
> 
> Thanks all for the reply,
> 
> My use case is described in detail at:
> https://stats.stackexchange.com/questions/376361/how-to-find-the-sample-points-that-have-statistically-meaningful-large-outlier-r
> 
> Basically, I have 10 million multidimensional points, and I do an XY
> scatter plot on the two main dimensions of interest.

OK.  Upstream development source (version 5.3) now reduces the label
insertion overhead to zero by tracking both the head and tail of the list.
Timings before and after:

     Labels      old (sec)     new (sec)
     -----------------------------------
       10^4        1.5           1.0
       10^5        160           7.3
       10^6        hopeless      66
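
For anyone curious why the two columns diverge so sharply, here is a
minimal sketch of the two append strategies.  It is written in Python
purely for illustration (gnuplot itself is C), and the class and function
names are made up for this example, not taken from the gnuplot source.
The "steps" counter records how many list nodes each strategy has to walk
past while appending.

```python
class Label:
    """One node of a singly linked list of labels."""
    def __init__(self, text):
        self.text = text
        self.next = None


class WalkList:
    """Old behaviour (as described): walk from the head on every append."""
    def __init__(self):
        self.head = None
        self.steps = 0          # nodes walked past, summed over all appends

    def append(self, text):
        node = Label(text)
        if self.head is None:
            self.head = node
            return
        cur = self.head
        while cur.next is not None:   # O(N) walk to find the last node
            cur = cur.next
            self.steps += 1
        cur.next = node


class TailList:
    """New behaviour (as described): keep a tail pointer, append in O(1)."""
    def __init__(self):
        self.head = None
        self.tail = None
        self.steps = 0          # stays 0: no walking is ever needed

    def append(self, text):
        node = Label(text)
        if self.tail is None:
            self.head = node
        else:
            self.tail.next = node
        self.tail = node


def texts(lst):
    """Collect label texts in list order."""
    out, cur = [], lst.head
    while cur is not None:
        out.append(cur.text)
        cur = cur.next
    return out


def count_steps(list_cls, n):
    """Append n labels and report how many nodes were walked in total."""
    lst = list_cls()
    for i in range(n):
        lst.append(str(i))
    return lst.steps
```

With the walk-to-the-end version, appending N labels walks
(N-1)(N-2)/2 nodes in total, which is the quadratic growth behind the
"old" column.  With a tail pointer the count stays at zero, and both
versions produce the labels in the same order.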

One million hypertext labels took 66 seconds with the wxt terminal,
slightly longer with qt (hard to compare because qt uses two processes
running in parallel).  That's to draw the initial plot, however.
With that many points the time to respond to mouse-over may be a
substantial fraction of the original draw time.  To help with that,
the upstream source now short-cuts hypertext generation to skip
production of a hypertext tag if the text is empty.  So if you can
manage to provide text for only the "potentially interesting" points
you may find the mousing speed to be adequate to identify outliers.
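
One possible way to do that is to preprocess the data so that the label
column is an empty quoted string for every point that is not of interest.
The following is only a sketch in Python: the function names, the
"far from the diagonal" test and its threshold are all invented for
illustration, and it assumes the label is read from column 3 as in the
original "plot ... using 1:2:3 with labels hypertext" command.

```python
def label_rows(rows, is_interesting):
    """Format (x, y, label) rows as data lines for gnuplot.

    The label text is kept only where is_interesting(x, y) is true;
    every other row gets an empty quoted string in column 3.
    """
    lines = []
    for x, y, label in rows:
        text = label if is_interesting(x, y) else ""
        lines.append(f'{x} {y} "{text}"')
    return lines


def far_from_diagonal(x, y, threshold=100):
    """Hypothetical outlier test: point lies far from the line y = x."""
    return abs(x - y) > threshold


if __name__ == "__main__":
    rows = [(0, 0, "id0"), (5, 500, "id1"), (7, 9, "id2")]
    for line in label_rows(rows, far_from_diagonal):
        print(line)
```

Whatever criterion you use in place of the made-up one here, the point
is that only a small fraction of the rows should end up carrying
non-empty text.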

The labels are still profligate in consuming memory, however, as
there has never been any particular pressure to reduce the size of the
structure that describes a generic label.
According to "top" my 10^6 point trial pinned 2.6 Gbyte of memory.
I am dubious about the practicality of 10^7 points.

Let us know if you try, and it works!

	cheers,

		Ethan





> 
> Then, some points visually stick out, and I want to use information
> from all dimensions to understand why, so I need to get a precise ID
> for the point, which is what I tried to use labels for.
> 
> I ended up resorting to VisIt to solve that one, but that was a
> painful experience ;-)





 
>> All that said, I get times of roughly 2.3 minutes for either wxt or
>> qt to draw 10^5 labels.  (I didn't try 10^6).
>> 
>> 	Ethan
>> 
>> 
>> 
>> > Generate test data with 1 million lines:
>> > 
>> >     i=0; while (( $i < 1000000 )); do echo "$i $i $i"; i=$((i + 1)); done > 1m.dat
>> > 
>> > and here are all the tests that I've tried:
>> > 
>> >     #!/usr/bin/env gnuplot
>> > 
>> >     #set terminal png size 1024,1024
>> >     #set output "gnuplot.png"
>> > 
>> >     set terminal canvas mousing
>> >     set output "gnuplot.html"
>> > 
>> >     #set terminal wxt size 1024,768
>> > 
>> >     #plot "1m.dat" using 1:2
>> >     plot "1m.dat" using 1:2:3 with labels hypertext
>> > 
>> > With all terminal types, the command without "with labels hypertext" works and finishes quickly.
>> > 
>> > But if I add "with labels hypertext", all commands take more than an hour, and I lost patience waiting for them to finish.
>> > 
>> > The same goes if I remove "hypertext".
>> > 
>> > Tested in gnuplot 5.2 patchlevel 2, Ubuntu 18.10.
