From: Pavlos Parissis
Newsgroups: comp.lang.python
Subject: Re: subclassing collections.Counter
Date: Tue, 15 Dec 2015 23:18:27 +0100
References: <567036A5.6050205@gmail.com> <56703DE8.7010609@gmail.com> <56705129.2020004@gmail.com>

On 15/12/2015 06:55
PM, Ian Kelly wrote:
> On Tue, Dec 15, 2015 at 10:43 AM, Pavlos Parissis
> wrote:
>>> If you want your metrics container to act like a dict, then my
>>> suggestion would be to just use a dict, with pseudo-collections for
>>> the values as above.
>>>
>>
>> If I understood you correctly, you are saying store all metrics in a
>> dict and have a counter key as well to store the times metrics are
>> pushed in, and then have a function to do the math. Am I right?
>
> That would work, although I was actually thinking of something like this:
>
> class SummedMetric:
>     def __init__(self):
>         self.total = 0
>         self.count = 0
>
>     @property
>     def average(self):
>         return self.total / self.count
>
>     def add(self, value):
>         self.total += value
>         self.count += 1
>
> metrics = {}
> for metric_name in all_metrics:
>     metrics[metric_name] = SummedMetric()
>
> For averaged metrics, look at metrics['f'].average, otherwise look at
> metrics['f'].total.
>

With this approach I will have one object per metric, which could
cause performance issues in my case. Let me give some context on what
I am trying to do here.

I want to provide fast retrieval and processing of statistics for
HAProxy. HAProxy exposes stats over a UNIX socket (the stats socket).
HAProxy is a multi-process daemon, and each process can only be
accessed through its own stats socket. There isn't any memory shared
between these processes. That means that if a frontend or backend is
managed by more than one process, you have to collect the metrics from
all processes and compute the sum or the average, depending on the
type of the metric.

Stats are provided in CSV format:
https://gist.github.com/unixsurfer/ba7e3bb3f3f79dcea686

There is one line per frontend and per backend. For servers it is a
bit more complicated.

When there are 100 lines per process, it is easy to do the work even
in setups with 24 processes (24 * 100 = 2.4K lines).
But there are a lot of cases where a stats socket returns 10K lines,
due to the number of backends and of servers in those backends. With
24 processes that is 240K lines to process in order to provide stats
every 10 or 5 seconds.

My plan is to split the processing from the collection. A program will
connect to all UNIX sockets asynchronously and dump the CSV to files,
one per socket, grouped by epoch time. It will dump all files under
one directory named after the time of the retrieval.

Another program, running in multi-process mode [1], will pick up those
files and parse them sequentially to perform the aggregation. For this
program I needed the CounterExt.

I will try your approach as well, as it is very simple and does the
work with fewer lines :-) I will compare both in terms of performance
and select the fastest.

Thank you very much for your assistance, very much appreciated.

[1] pseudo-code

    from multiprocessing import Process, Queue
    import pyinotify

    wm = pyinotify.WatchManager()  # Watch Manager
    mask = pyinotify.IN_CREATE     # watched events


    class EventHandler(pyinotify.ProcessEvent):
        def __init__(self, queue):
            super().__init__()
            self.queue = queue

        def process_IN_CREATE(self, event):
            # Hand the path of each newly created file to the workers.
            self.queue.put(event.pathname)


    def work(queue):
        while True:
            job = queue.get()
            if job == 'STOP':
                break
            print(job)  # placeholder for the parsing and aggregation


    def main():
        pnum = 10
        queue = Queue()
        plist = []
        for i in range(pnum):
            p = Process(target=work, args=(queue,))
            p.start()
            plist.append(p)

        handler = EventHandler(queue)
        notifier = pyinotify.Notifier(wm, handler)
        wdd = wm.add_watch('/tmp/test', mask, rec=True)
        try:
            notifier.loop()
        finally:
            # One 'STOP' per worker so that every process leaves its loop.
            for p in plist:
                queue.put('STOP')
            for p in plist:
                p.join()


    if __name__ == '__main__':
        main()
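To make the aggregation step described above a bit more concrete, here is a minimal sketch of summing or averaging a metric across per-process CSV dumps, using the SummedMetric idea from the quoted suggestion. It is only an illustration under assumptions: the two dumps are made-up sample data cut down to four columns, the aggregate() and value_of() helpers are hypothetical names, and treating rtime as an averaged metric while scur is summed is my own guess at a sensible split, not something the stats format dictates.

```python
import csv
import io
from collections import defaultdict

# Assumption for illustration: columns that are averaged rather than summed.
AVERAGED = {"rtime"}


class SummedMetric:
    """Running total and sample count, as in the quoted suggestion."""
    __slots__ = ("total", "count")  # keeps each per-metric object small

    def __init__(self):
        self.total = 0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    @property
    def average(self):
        # Guard against metrics that never received a value.
        return self.total / self.count if self.count else 0.0


def aggregate(csv_dumps, fields):
    """Merge per-process dumps into {(pxname, svname): {field: SummedMetric}}."""
    result = defaultdict(lambda: defaultdict(SummedMetric))
    for dump in csv_dumps:
        # The header line starts with "# "; strip it so DictReader sees plain names.
        reader = csv.DictReader(io.StringIO(dump.lstrip("# ")))
        for row in reader:
            key = (row["pxname"], row["svname"])
            for field in fields:
                if row.get(field):  # skip empty cells
                    result[key][field].add(int(row[field]))
    return result


def value_of(metric, field):
    """Return the average for averaged fields, the total otherwise."""
    return metric.average if field in AVERAGED else metric.total


# Two made-up dumps, one per process (sample data, not real HAProxy output).
proc1 = "# pxname,svname,scur,rtime\nweb,FRONTEND,10,\napp,BACKEND,4,30\n"
proc2 = "# pxname,svname,scur,rtime\nweb,FRONTEND,5,\napp,BACKEND,6,50\n"

stats = aggregate([proc1, proc2], fields=("scur", "rtime"))
app = stats[("app", "BACKEND")]
print(value_of(app["scur"], "scur"))    # 10
print(value_of(app["rtime"], "rtime"))  # 40.0
```

Whether one small object per metric is fast enough for 240K lines every few seconds is exactly what the planned comparison should tell; __slots__ is used here only as one cheap way to reduce the per-object overhead.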