From: Michael Wojcik
Newsgroups: comp.lang.java.programmer
Subject: Re: analysis of java application logs
Date: Thu, 26 May 2011 14:58:57 -0400
Organization: Micro Focus

Martin Gregorie wrote:
> On Tue, 24 May 2011 15:04:41 +0100, Tom Anderson wrote:
>
>> There are those who advocate awk for this sort of thing, but frankly
>> that seems like overkill.
>
> Besides, I very often find I need to calculate maxima, minima, averages,
> etc as part of the log analysis and find that very easy to do with awk.

Yes, sometimes it's useful to maintain state while analyzing logs, and
that's more convenient with awk (or any of a thousand other scripting
languages) than it is with sed.

A couple of months back I tracked down some tricky locking issues
(multiple multi-threaded processes, with multiple components written by
multiple developers in multiple languages) by analyzing lock files with
an awk script; it made it easy to keep track of contention, reentrant
locking, number of waiters, etc on a per-lock basis.
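
To illustrate the kind of per-lock state tracking described above, here is a minimal sketch in Python rather than awk. It is not the script from the post. The log format (one "thread event lock" triple per line, with events wait, acquire, and release) and the names LockState and analyze are assumptions made up for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LockState:
    """Running state for one lock (hypothetical structure for this sketch)."""
    owner: Optional[str] = None      # thread currently holding the lock
    depth: int = 0                   # reentrant hold count for the owner
    waiters: set = field(default_factory=set)
    max_waiters: int = 0             # peak number of simultaneous waiters
    contended: int = 0               # number of "wait" events seen
    reentrant: int = 0               # acquires by the thread that already owns it


def analyze(lines):
    """Fold an assumed 'thread event lock' log into per-lock statistics."""
    locks = defaultdict(LockState)
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue                 # skip lines that do not match the assumed format
        thread, event, name = parts
        st = locks[name]
        if event == "wait":
            st.waiters.add(thread)
            st.contended += 1
            st.max_waiters = max(st.max_waiters, len(st.waiters))
        elif event == "acquire":
            st.waiters.discard(thread)
            if st.owner == thread:
                st.reentrant += 1
                st.depth += 1
            else:
                st.owner = thread
                st.depth = 1
        elif event == "release" and st.owner == thread:
            st.depth -= 1
            if st.depth == 0:
                st.owner = None
    return dict(locks)


sample = [
    "t1 acquire A",
    "t2 wait A",
    "t3 wait A",
    "t1 acquire A",
    "t1 release A",
    "t1 release A",
    "t2 acquire A",
    "t2 release A",
    "t3 acquire A",
]
result = analyze(sample)
```

The point is only that each log line updates a small record keyed by lock name, which is what an awk associative array gives you for free. A real lock log would need its own parsing and probably timestamps.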

Back in '91 I demonstrated a bug in AIX 3.1 libc's implementation of
free by using awk to turn a malloc/free log into a series of cut-down C
programs that duplicated the allocation access pattern, until I had a
minimal test program that demonstrated the problem.

Come to think of it, just a few weeks ago I was using a little awk
script to churn through the output from my various
sentiment-determination UIMA annotators[1] and compute precision,
recall, and F1 values.

Obviously, I could do any of those tasks in Perl, or Python, or Java, or
COBOL, or what have you. I only use awk because I've been using it for
more than 20 years now and so it has the advantage of familiarity.

[1] ObCLJP: The annotators are written in Java.

-- 
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University