From: Ben Bacarisse
Newsgroups: comp.programming
Subject: Re: Another little puzzle
Date: Thu, 22 Dec 2022 04:10:50 +0000
Message-ID: <87wn6krsc5.fsf@bsb.me.uk>
References: <86pmcczcak.fsf@linuxsc.com>

Tim Rentsch writes:

> ram@zedat.fu-berlin.de (Stefan Ram) writes:
>
>> ram@zedat.fu-berlin.de (Stefan Ram) writes:
>>
>>> Given n times of the 24-hour day, print their average.
>>> For example, the average of "eight o'clock" and
>>> "ten o'clock" (n=2) would be "nine o'clock".
>>> (You can choose any representation, for example "HH:MM"
>>> or "seconds since midnight".)
>>
>> Thanks for all replies!
>>
>> I waited a few days before answering to allow
>> sufficient time to think about the problem.
>>
>> There were not enough tests written and run. As a result,
>> the puzzle has not yet been solved (unless I have overlooked
>> a contribution or misworded expectations).
>>
>> So, here are two possible test cases.
>>
>> average( 23.5, 1.5 )== 0.5
>> average( 11.5, 13.5 )== 12.5
>>
>> (I use hours as units, so "0.5" means, "half past midnight".)
>>
>> I hope that these test cases encode sensible expectations
>> for an average of two times on a 24-hour clock in the spirit
>> of the example given in the OP, which was, "the average of
>> eight o'clock and ten o'clock would be nine o'clock", since
>> these test cases just have rotated that example by 3.5 and
>> 15.5 hours.
>>
>> I believe that I have not seen an algorithm so far in this
>> thread that would pass these tests.
>
> As before, the problem is underspecified.

Some remarks not specifically in reply to you, Tim...

The input is a collection, t(n), of n > 1 numbers in [0, 24).  The
average should be a number, A, in [0, 24) that minimises

  Sum_{i=1,n} distance(A, t(i))

(or Sum_{i=1,n} difference(A, t(i))^2 if you prefer to think in terms
of variance).  So far, this is just what an average is.  The key point
is what is the distance (or difference) whose sum (or sum of squares)
we want to minimise?  For times, I would say it is the length of the
shorter arc round an imaginary 24-hour clock face.

The problem has a natural interpretation in terms of angles.  Whatever
the circular quantity is, convert the values to unit vectors round a
circle.  For times of day, just scale [0, 24) to [0, 2*pi).  The
average is then just the direction of the average vector, converted
back to a time of day.  Sometimes that vector has zero length, and the
average is undefined, but otherwise the length of the vector gives an
indication of the variability of the data.

Why do I consider this a reasonable interpretation of the problem?
Well, given a list of times of day when a train is observed to pass
some station, the circular 24-hour-time average should be our best
estimate of the scheduled time.

Obviously there are other possible readings of the problem, but I was
not able to justify any of them as useful for any real-world
applications.  This is a case where I hope I am wrong and there /are/
other circular averages with practical interpretations.

-- 
Ben.
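
For anyone who wants to try the unit-vector reading described above, here
is a minimal Python sketch.  It is only an illustration, not code from the
thread: the function name circular_mean_hours, the period parameter and
the eps cut-off for "zero length" are all assumptions made for the
example.  It returns the average time together with the length of the
average vector (1 means all times coincide, values near 0 mean the times
are spread round the clock), and None for the time when the vector is too
short for the direction to mean anything.

```python
import math


def circular_mean_hours(times, period=24.0, eps=1e-9):
    """Average times of day by averaging unit vectors on a circle.

    Returns (mean, length): mean is in [0, period), or None when the
    average vector is (numerically) zero; length is in [0, 1].
    Names and the eps threshold are illustrative assumptions.
    """
    if not times:
        raise ValueError("need at least one time")

    # Scale each time from [0, period) to an angle in [0, 2*pi) and
    # average the corresponding unit vectors component-wise.
    n = len(times)
    x = sum(math.cos(2 * math.pi * t / period) for t in times) / n
    y = sum(math.sin(2 * math.pi * t / period) for t in times) / n

    # Length of the average vector: an indication of variability.
    length = math.hypot(x, y)
    if length < eps:
        return None, length  # direction undefined, e.g. 00:00 and 12:00

    # Direction of the average vector, converted back to a time of day.
    angle = math.atan2(y, x)  # in (-pi, pi]
    mean = (angle * period / (2 * math.pi)) % period
    if mean >= period:  # guard against rounding up to exactly `period`
        mean = 0.0
    return mean, length


if __name__ == "__main__":
    for sample in ([8, 10], [23.5, 1.5], [11.5, 13.5], [0, 12]):
        print(sample, circular_mean_hours(sample))
```

Under this reading both of Stefan's test cases come out as he expects
(0.5 and 12.5, up to floating-point rounding), while midnight and noon
together have no defined average.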