From: Ben Bacarisse <ben.usenet@bsb.me.uk>
Newsgroups: comp.programming
Subject: Re: Another little puzzle
Date: Thu, 22 Dec 2022 04:10:50 +0000
Organization: A noiseless patient Spider
Lines: 75
Message-ID: <87wn6krsc5.fsf@bsb.me.uk>
References: <puzzle-20221214131815@ram.dialup.fu-berlin.de> <algorithm-20221221130021@ram.dialup.fu-berlin.de> <86pmcczcak.fsf@linuxsc.com>

Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

> ram@zedat.fu-berlin.de (Stefan Ram) writes:
>
>> ram@zedat.fu-berlin.de (Stefan Ram) writes:
>>
>>> Given n times of the 24-hour day, print their average.
>>> For example, the average of "eight o'clock" and
>>> "ten o'clock" (n=2) would be "nine o'clock".
>>> (You can choose any representation, for example "HH:MM"
>>> or "seconds since midnight".)
>>
>>   Thanks for all replies!
>>
>>   I waited a few days before answering to allow
>>   sufficient time to think about the problem.
>>
>>   There were not enough tests written and run.  As a result,
>>   the puzzle has not yet been solved (unless I have overlooked
>>   a contribution or misworded expectations).
>>
>>   So, here are two possible test cases.
>>
>> average( 23.5,  1.5 )==  0.5
>> average( 11.5, 13.5 )== 12.5
>>
>>   (I use hours as units, so "0.5" means, "half past midnight".)
>>
>>   I hope that these test cases encode sensible expectations
>>   for an average of two times on a 24-hour clock in the spirit
>>   of the example given in the OP, which was, "the average of
>>   eight o'clock and ten o'clock would be nine o'clock", since
>>   these test cases just have rotated that example by 3.5 and
>>   15.5 hours.
>>
>>   I believe that I have not seen an algorithm so far in this
>>   thread that would pass these tests.
>
> As before, the problem is underspecified.

Some remarks not specifically in reply to you, Tim...

The input is a collection t(1), ..., t(n) of n > 1 numbers in [0, 24).
The average should be a number, A, in [0, 24) that minimises

  Sum_{i=1,n} distance(A, t(i))

(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms of
variance).  So far, this is just what an average is.  The key question
is which distance (or difference) we want to minimise the sum (or sum of
squares) of.  For times, I would say it is the length of the shorter arc
round an imaginary 24-hour clock face.
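
As a minimal sketch of that distance in Python, with hours as the unit
(the name arc_distance and the period parameter are illustrative, not
something from the thread):

```python
def arc_distance(a, b, period=24.0):
    """Length of the shorter arc between two times on a circular clock."""
    d = abs(a - b) % period      # separation going one way round
    return min(d, period - d)    # going the other way may be shorter
```

With this, 23.5 and 1.5 are two hours apart rather than twenty-two.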

The problem has a natural interpretation in terms of angles.  Whatever
the circular quantity is, convert the values to unit vectors round a
circle.  For times of day, just scale [0, 24) to [0, 2*pi).  The average
is then just the direction of the average vector, converted back to a
time of day.
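
A short Python sketch of the vector approach described above, assuming
times are given in hours (circular_mean is an illustrative name).  It
does not handle the zero-length case.  Strictly speaking, the direction
of the mean vector minimises the sum of squared chord lengths between
the points on the circle, not squared arc lengths; for tightly
clustered times the two nearly agree.

```python
import math

def circular_mean(times, period=24.0):
    """Direction of the mean unit vector, mapped back to [0, period)."""
    # Scale [0, period) to [0, 2*pi).
    angles = [2 * math.pi * t / period for t in times]
    # Components of the average of the unit vectors.
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    # atan2 gives an angle in (-pi, pi]; convert back and wrap.
    return (math.atan2(y, x) * period / (2 * math.pi)) % period
```

Up to floating-point rounding, circular_mean([23.5, 1.5]) gives 0.5 and
circular_mean([11.5, 13.5]) gives 12.5, matching the two test cases
quoted above.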

Sometimes that vector has zero length, and the average is undefined, but
otherwise the length of the vector gives an indication of the
variability of the data.
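
To illustrate, here is a sketch that returns the length of the mean
vector along with its direction, and reports the direction as None when
the length is effectively zero.  The name mean_resultant and the
tolerance eps are my own choices; a suitable tolerance depends on the
data.

```python
import math

def mean_resultant(times, period=24.0, eps=1e-12):
    """Return (length, direction) of the mean unit vector.

    length lies between 0 and 1 (up to rounding); values near 1 mean
    the times are tightly clustered.  direction is a time in
    [0, period), or None when length < eps (average undefined).
    """
    angles = [2 * math.pi * t / period for t in times]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    r = math.hypot(x, y)
    if r < eps:
        return r, None
    return r, (math.atan2(y, x) * period / (2 * math.pi)) % period
```

For example, midnight and noon cancel exactly, so
mean_resultant([0.0, 12.0]) has no direction, while identical times
give a length of 1.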

Why do I consider this a reasonable interpretation of the problem?
Well, given a list of times of day when a train is observed to pass some
station, the circular 24-hour-time average should be our best estimate
of the scheduled time.

Obviously there are other possible readings of the problem, but I was
not able to justify any of them as useful for any real-world
applications.  This is a case where I hope I am wrong and there /are/
other circular averages with practical interpretations.

-- 
Ben.