Groups > comp.lang.java.programmer > #4442 > unrolled thread

analysis of java application logs

Started by	Ulrich Scholz <d7@thispla.net>
First post	2011-05-23 00:50 -0700
Last post	2011-05-30 01:08 -0400
Articles	20 on this page of 39 — 14 participants

  analysis of java application logs Ulrich Scholz <d7@thispla.net> - 2011-05-23 00:50 -0700
    Re: analysis of java application logs Robert Klemme <shortcutter@googlemail.com> - 2011-05-23 02:20 -0700
    Re: analysis of java application logs jlp <jlp@jlp.com> - 2011-05-23 11:45 +0200
    Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-23 09:11 -0400
      Re: analysis of java application logs Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2011-05-23 19:16 +0200
        Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-23 15:02 -0400
          Re: analysis of java application logs Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2011-05-23 22:03 +0200
            Re: analysis of java application logs Michael Wojcik <mwojcik@newsguy.com> - 2011-05-26 14:43 -0400
          Re: analysis of java application logs Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-05-23 18:32 -0300
            Re: analysis of java application logs Ulrich Scholz <d7@thispla.net> - 2011-05-25 06:00 -0700
              Re: analysis of java application logs Arved Sandstrom <asandstrom3minus1@eastlink.ca> - 2011-05-25 19:04 -0300
          Re: analysis of java application logs Martin Gregorie <martin@address-in-sig.invalid> - 2011-05-23 22:25 +0000
            Re: analysis of java application logs Nigel Wade <nmw-news@ion.le.ac.uk> - 2011-05-24 12:26 +0100
              Re: analysis of java application logs Martin Gregorie <martin@address-in-sig.invalid> - 2011-05-24 12:29 +0000
                Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-24 08:49 -0400
                  Re: analysis of java application logs Martin Gregorie <martin@address-in-sig.invalid> - 2011-05-24 14:37 +0000
                    Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-24 12:26 -0400
                      Re: analysis of java application logs Jim Gibson <jimsgibson@gmail.com> - 2011-05-24 11:00 -0700
                        Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-24 14:35 -0400
                          Re: analysis of java application logs Nigel Wade <nmw-news@ion.le.ac.uk> - 2011-05-25 09:53 +0100
              Re: analysis of java application logs Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2011-05-24 19:12 +0200
    Re: analysis of java application logs Patricia Shanahan <pats@acm.org> - 2011-05-23 06:17 -0700
      Re: analysis of java application logs Robert Klemme <shortcutter@googlemail.com> - 2011-05-23 20:33 +0200
        Re: analysis of java application logs Martin Gregorie <martin@address-in-sig.invalid> - 2011-05-23 19:07 +0000
    Re: analysis of java application logs CncShipper <anon@nowhere.com> - 2011-05-23 14:56 +0000
      Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-23 11:43 -0400
        Re: analysis of java application logs jlp <jlp@jlp.com> - 2011-05-23 18:00 +0200
          Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-23 12:20 -0400
            Re: analysis of java application logs Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2011-05-23 19:06 +0200
              Re: analysis of java application logs Robert Klemme <shortcutter@googlemail.com> - 2011-05-23 20:27 +0200
                Re: analysis of java application logs Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid> - 2011-05-23 21:02 +0200
                Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-23 15:06 -0400
                  Re: analysis of java application logs Robert Klemme <shortcutter@googlemail.com> - 2011-05-23 22:10 +0200
                    Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-23 17:04 -0400
    Re: analysis of java application logs Tom Anderson <twic@urchin.earth.li> - 2011-05-24 15:04 +0100
      Re: analysis of java application logs Martin Gregorie <martin@address-in-sig.invalid> - 2011-05-24 14:50 +0000
        Re: analysis of java application logs Michael Wojcik <mwojcik@newsguy.com> - 2011-05-26 14:58 -0400
        Re: analysis of java application logs Lawrence D'Oliveiro <ldo@geek-central.gen.new_zealand> - 2011-05-30 16:23 +1200
          Re: analysis of java application logs Lew <noone@lewscanon.com> - 2011-05-30 01:08 -0400

#4442 — analysis of java application logs

From	Ulrich Scholz <d7@thispla.net>
Date	2011-05-23 00:50 -0700
Subject	analysis of java application logs
Message-ID	<bd933ace-5641-4711-9105-4e949a602b87@c1g2000yqe.googlegroups.com>

Hi,

I'm looking for an approach to the problem of analyzing application
log files.

I need to analyse Java log files from applications (i.e., not logs of
web servers). These logs contain Java exceptions, thread dumps, and
free-form log4j messages issued by log statements inserted by
programmers during development. Right now, these man-made log entries
do not have any specific format.

What I'm looking for is a tool and/or strategy that supports lexing/
parsing, tagging, and analysing the log entries. Because there is
little defined syntax and grammar - and because you might not know
what you are looking for - the task requires quickly issuing
queries against the log database. Some sort of visualization would be
nice, too.

Pointers to existing tools and approaches, as well as appropriate tools/
algorithms for developing the required system, would be welcome.

Ulrich

#4443

From	Robert Klemme <shortcutter@googlemail.com>
Date	2011-05-23 02:20 -0700
Message-ID	<ef3e4f7a-539f-4dd3-a82e-5982cd171a51@h9g2000yqk.googlegroups.com>
In reply to	#4442

On 23 May, 09:50, Ulrich Scholz <d...@thispla.net> wrote:
> I'm looking for an approach to the problem of analyzing application
> log files.
>
> I need to analyse Java log files from applications (i.e., not logs of
> web servers). These logs contain Java exceptions, thread dumps, and
> free-form log4j messages issued by log statements inserted by
> programmers during development. Right now, these man-made log entries
> do not have any specific format.
>
> What I'm looking for is a tool and/or strategy that supports in lexing/
> parsing, tagging, and analysing the log entries. Because there is only
> little defined syntax and grammar - and because you might not know
> what you are looking for - the task requires the quick issuing of
> queries against the log data base. Some sort of visualization would be
> nice, too.
>
> Pointers to existing tools and approaches as well as appropriate tools/
> algorithms to develop the required system would be welcome.

I once did a project for our Ruby Best Practices blog.  The code is
over there at github:
https://github.com/rklemme/muppet-laboratories

Explanations can be found in the blog.  This is the first posting of
the series:
http://blog.rubybestpractices.com/posts/rklemme/005_Enter_the_Muppet_Laboratories.html

This works differently from what you want: log files are read and
written out to small log files according to particular criteria.  But
you could reuse the parsing part (including detection of multi line
log statements) and write what you found into a relational database.
If you have it in the DB you can query for at least timestamp, log
level, message content and probably also thread id and class.  If you
want to do custom tagging you could do that once the data is in the
database.

Since we do not know what goal your analysis has and how many
different questions you want to ask of the data, it's not entirely clear
whether that would be the optimal approach for your problem.  One
variant to the above would be to provide the parsing process a number
of regular expressions with a label attached and label all log entries
during insertion into the database.  But since modern relational
databases usually also support full text indexing and regular
expression matches that might also be solved with a view.  If your
data volume is large you need to additionally make sure this remains
efficient.

Kind regards

robert
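
A minimal Java sketch of the parsing-and-labelling idea described above: lines that do not start with a recognisable header are folded into the preceding entry (so stack traces stay attached), and each entry is then labelled by a set of named regular expressions. The header layout, the class name LogEntrySketch and the rule names are assumptions made for illustration; they are not taken from the linked project, and the database insert is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: groups raw lines into entries and labels them by regex. */
final class LogEntrySketch {

    /** One parsed entry; continuation lines (e.g. a stack trace) are appended to message. */
    static final class Entry {
        final String timestamp;
        final String level;
        final String logger;
        final StringBuilder message;
        final List<String> labels = new ArrayList<>();

        Entry(String timestamp, String level, String logger, String firstLine) {
            this.timestamp = timestamp;
            this.level = level;
            this.logger = logger;
            this.message = new StringBuilder(firstLine);
        }
    }

    // Assumed header layout: "2011-05-23 00:50:12,345 ERROR [com.example.Foo] text"
    private static final Pattern HEADER = Pattern.compile(
        "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3}) (\\w+)\\s+\\[([^\\]]+)\\] (.*)$");

    static List<Entry> parse(List<String> lines, Map<String, Pattern> labelRules) {
        List<Entry> entries = new ArrayList<>();
        Entry current = null;
        for (String line : lines) {
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                // A header line starts a new entry.
                current = new Entry(m.group(1), m.group(2), m.group(3), m.group(4));
                entries.add(current);
            } else if (current != null) {
                // Anything else continues the current (multi-line) entry.
                current.message.append('\n').append(line);
            }
            // Lines before the first header are dropped in this sketch.
        }
        // Label each entry with every rule whose pattern occurs in its message.
        for (Entry e : entries) {
            for (Map.Entry<String, Pattern> rule : labelRules.entrySet()) {
                if (rule.getValue().matcher(e.message).find()) {
                    e.labels.add(rule.getKey());
                }
            }
        }
        return entries;
    }
}
```

Each Entry then maps naturally onto one database row (timestamp, level, logger, message), with the labels going into a separate tag table or column.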

#4445

From	jlp <jlp@jlp.com>
Date	2011-05-23 11:45 +0200
Message-ID	<4dda2ca1$0$30771$ba4acef3@reader.news.orange.fr>
In reply to	#4442

On 23/05/2011 09:50, Ulrich Scholz wrote:
> Hi,
>
> I'm looking for an approach to the problem of analyzing application
> log files.
>
> I need to analyse Java log files from applications (i.e., not logs of
> web servers). These logs contain Java exceptions, thread dumps, and
> free-form log4j messages issued by log statements inserted by
> programmers during development. Right now, these man-made log entries
> do not have any specific format.
>
> What I'm looking for is a tool and/or strategy that supports in lexing/
> parsing, tagging, and analysing the log entries. Because there is only
> little defined syntax and grammar - and because you might not know
> what you are looking for - the task requires the quick issuing of
> queries against the log data base. Some sort of visualization would be
> nice, too.
>
> Pointers to existing tools and approaches as well as appropriate tools/
> algorithms to develop the required system would be welcome.
>
> Ulrich
At work (so it is not free), a colleague and I have developed such a tool.

My colleague developed the viewer for CSV files with the JFreeChart
library. The CSV files are time series (dates are, for example, in the
format YYYY/MM/DD:HH:mm:ss).
I developed my own parser that translates native logs => CSV files.
It is written in Java and uses Java regexp patterns.
In a file, we have to find the beginning and the end of a record
(it can be a multi-line record). I can exclude/include records with
Java regexp patterns.

We have to match the pattern of the date (a regexp and a Java DateFormat
pattern).
For every record, we can extract useful values by pattern matching
(I use two-pass matching to simplify the patterns). The values can be
bound to a filter (an HTTP URL, for example).
All this is embedded in Swing components.

I can parse access logs (Apache, Tomcat, WebLogic), log4j logs, verbose
GC logs of the JVM (IBM JVM, OpenJDK 7, ...), Java thread dumps, Hibernate
SQL logs, Tuxedo logs and, more generally, any records that are
implicitly or explicitly dated.
Those are the main points.
It has taken me a long time, and it is still in development, but we have
not found any other tool.
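
As a rough illustration of the record-to-CSV step described above, here is a small Java sketch. It finds the date with a regexp, re-formats it as YYYY/MM/DD:HH:mm:ss, and pulls one value out with two-pass matching (the first pattern isolates a fragment, the second extracts the number from it). The input layout, the "pause=" field, the ";" separator and the class name RecordToCsvSketch are all assumptions for the example, not details of the tool described in the post.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: turns one dated record into a "date;value" CSV line. */
final class RecordToCsvSketch {

    // Assumed input timestamp layout, e.g. "2011-05-23 11:45:00".
    private static final Pattern DATE =
        Pattern.compile("\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2}");
    private static final DateTimeFormatter IN =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    // Output layout matching the time-series format mentioned in the post.
    private static final DateTimeFormatter OUT =
        DateTimeFormatter.ofPattern("yyyy/MM/dd:HH:mm:ss");

    // Pass 1 isolates the interesting fragment; pass 2 extracts the number from it.
    private static final Pattern PASS1 = Pattern.compile("pause=\\S+");
    private static final Pattern PASS2 = Pattern.compile("\\d+(?:\\.\\d+)?");

    /** Returns an empty Optional when the record has no date or no value. */
    static Optional<String> toCsv(String record) {
        Matcher date = DATE.matcher(record);
        Matcher fragment = PASS1.matcher(record);
        if (!date.find() || !fragment.find()) {
            return Optional.empty();
        }
        Matcher value = PASS2.matcher(fragment.group());
        if (!value.find()) {
            return Optional.empty();
        }
        // Note: an impossible date (e.g. month 13) would throw here; not handled in this sketch.
        LocalDateTime when = LocalDateTime.parse(date.group(), IN);
        return Optional.of(OUT.format(when) + ";" + value.group());
    }
}
```

Keeping the second pattern small is the point of the two passes: the first pattern can be swapped per log type while the number extraction stays the same.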

#4447

From	Lew <noone@lewscanon.com>
Date	2011-05-23 09:11 -0400
Message-ID	<irdmdc$hfg$1@news.albasani.net>
In reply to	#4442

Ulrich Scholz wrote:
> I'm looking for an approach to the problem of analyzing application
> log files.
>
> I need to analyse Java log files from applications (i.e., not logs of
> web servers). These logs contain Java exceptions, thread dumps, and
> free-form log4j messages issued by log statements inserted by
> programmers during development. Right now, these man-made log entries
> do not have any specific format.
>
> What I'm looking for is a tool and/or strategy that supports in lexing/
> parsing, tagging, and analysing the log entries. Because there is only
> little defined syntax and grammar - and because you might not know
> what you are looking for - the task requires the quick issuing of
> queries against the log data base. Some sort of visualization would be
> nice, too.
>
> Pointers to existing tools and approaches as well as appropriate tools/
> algorithms to develop the required system would be welcome.

It helps if you have a logging strategy that mandates a consistent logging 
format, specific information in particular positions or marked by particular 
markup, logging levels and other such so that your analysis tool isn't faced 
with a completely open-ended input.  What you describe requires a general 
text-analysis approach, as you indicate that you can make no guarantees about 
the format.  Based on that, your best tool is "less" or equivalent text-file 
reader.

What is a tool supposed to do, read your mind?

It's really hard to extract information from a garbage can where people just 
randomly dumped whatever they individually felt like dumping without regard 
for operational needs.  You can't build a skyscraper on a bad foundation, and 
you can't build a good log analysis off a crappy log.

Fix the logging system, then the analysis problem will be tractable.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

#4472

From	Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid>
Date	2011-05-23 19:16 +0200
Message-ID	<ire4op$fgq$1@dont-email.me>
In reply to	#4447

On 23/05/2011 15:11, Lew allegedly wrote:
> Ulrich Scholz wrote:
>> I'm looking for an approach to the problem of analyzing application
>> log files.
>>
>> I need to analyse Java log files from applications (i.e., not logs of
>> web servers). These logs contain Java exceptions, thread dumps, and
>> free-form log4j messages issued by log statements inserted by
>> programmers during development. Right now, these man-made log entries
>> do not have any specific format.
>>
>> What I'm looking for is a tool and/or strategy that supports in lexing/
>> parsing, tagging, and analysing the log entries. Because there is only
>> little defined syntax and grammar - and because you might not know
>> what you are looking for - the task requires the quick issuing of
>> queries against the log data base. Some sort of visualization would be
>> nice, too.
>>
>> Pointers to existing tools and approaches as well as appropriate tools/
>> algorithms to develop the required system would be welcome.
>
> It helps if you have a logging strategy that mandates a consistent
> logging format, specific information in particular positions or marked
> by particular markup, logging levels and other such so that your
> analysis tool isn't faced with a completely open-ended input. What you
> describe requires a general text-analysis approach, as you indicate that
> you can make no guarantees about the format. Based on that, your best
> tool is "less" or equivalent text-file reader.
>
> What is a tool supposed to do, read your mind?
>
> It's really hard to extract information from a garbage can where people
> just randomly dumped whatever they individually felt like dumping
> without regard for operational needs. You can't build a skyscraper on a
> bad foundation, and you can't build a good log analysis off a crappy log.
>
> Fix the logging system, then the analysis problem will be tractable.
>

I would argue along the same lines.

A while ago I was faced with a situation where some orthogonal
organisational unit wanted to exploit my logs. I told them to GTFO.

My logs are my logs. I put in them what I consider necessary. I often
improve them as I step through the code. I might change the message, fix
the level, &c. I don't want to have them set in stone. Neither do I
generally have enough confidence in them to allow them to be used for
analysis.

"The solution, then, is simple", I told them, "spec out the exact
messages and arguments you want, and the exact situations you want them
logged in, and I'll add them for you. But leave me my precious debugging
logs."

Let me emphasize: IMHO debugging logs and logs for analysis are two
different things and should be kept strictly separated -- possibly
each logged to a different target.
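
One way to picture the "different targets" idea is the following Java sketch. It uses java.util.logging rather than log4j purely so that the example needs nothing beyond the JDK; that substitution, the logger names and the in-memory MemoryTarget handler are assumptions for illustration. With log4j the same separation would normally be done with one appender per logger in the configuration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical sketch: debug and audit messages go to separate targets. */
final class SeparateTargetsSketch {

    /** Stand-in for a real target (file, socket, ...): keeps messages in memory. */
    static final class MemoryTarget extends Handler {
        final List<String> lines = new ArrayList<>();

        @Override public void publish(LogRecord record) {
            if (isLoggable(record)) {
                lines.add(record.getLevel() + " " + record.getMessage());
            }
        }
        @Override public void flush() { }
        @Override public void close() { }
    }

    /** Creates a logger that writes only to its own target, not to its parents. */
    static Logger loggerFor(String name, Level level, Handler target) {
        Logger logger = Logger.getLogger(name);
        logger.setUseParentHandlers(false); // keep this stream out of the parent's handlers
        logger.setLevel(level);
        target.setLevel(Level.ALL);         // let the logger's level do the filtering
        logger.addHandler(target);
        return logger;
    }
}
```

The debugging logger can then change freely, while the messages sent to the audit logger are the ones specified and kept stable for the people who analyse them.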

-- 
DF.
An escaped convict once said to me:
"Alcatraz is the place to be"

#4482

From	Lew <noone@lewscanon.com>
Date	2011-05-23 15:02 -0400
Message-ID	<ireave$1m2$1@news.albasani.net>
In reply to	#4472

On 05/23/2011 01:16 PM, Daniele Futtorovic wrote:
> On 23/05/2011 15:11, Lew allegedly wrote:
>> Ulrich Scholz wrote:
>>> I'm looking for an approach to the problem of analyzing application
>>> log files.
>>>
>>> I need to analyse Java log files from applications (i.e., not logs of
>>> web servers). These logs contain Java exceptions, thread dumps, and
>>> free-form log4j messages issued by log statements inserted by
>>> programmers during development. Right now, these man-made log entries
>>> do not have any specific format.
>>>
>>> What I'm looking for is a tool and/or strategy that supports in lexing/
>>> parsing, tagging, and analysing the log entries. Because there is only
>>> little defined syntax and grammar - and because you might not know
>>> what you are looking for - the task requires the quick issuing of
>>> queries against the log data base. Some sort of visualization would be
>>> nice, too.
>>>
>>> Pointers to existing tools and approaches as well as appropriate tools/
>>> algorithms to develop the required system would be welcome.
>>
>> It helps if you have a logging strategy that mandates a consistent
>> logging format, specific information in particular positions or marked
>> by particular markup, logging levels and other such so that your
>> analysis tool isn't faced with a completely open-ended input. What you
>> describe requires a general text-analysis approach, as you indicate that
>> you can make no guarantees about the format. Based on that, your best
>> tool is "less" or equivalent text-file reader.
>>
>> What is a tool supposed to do, read your mind?
>>
>> It's really hard to extract information from a garbage can where people
>> just randomly dumped whatever they individually felt like dumping
>> without regard for operational needs. You can't build a skyscraper on a
>> bad foundation, and you can't build a good log analysis off a crappy log.
>>
>> Fix the logging system, then the analysis problem will be tractable.
>>
>
> I would argue around the same lines.
>
> I've been faced a while ago with a situation where some orthogonal
> organisational unit wanted to exploit my logs. I told them to GTFO.
>
> My logs are my logs. I put in it what I consider necessary. I often
> improve them as I step through the code. I might change the message, fix
> the level, &c. I don't want to have them set in stone. Neither do I
> generally have enough confidence in them to allow them to be used for
> analysis.
>
> "The solution, then, is simple", I told them, "spec out the exact
> messages and arguments you want, and the exact situations you want them
> logged in, and I'll add them for you. But leave me my precious debugging
> logs."
>
> Let me emphasize: IMHO debugging logs and logs for analysis are two
> different things and should be kept strictly separated -- possibly
> logged to a different target respectively.

That last is rather a brilliant idea, to use different targets.  Heretofore 
I've espoused that logs are primarily an operations tool, not a debugging 
tool, although in service of the former they inevitably and inherently must 
support the latter.  The problem I've always seen is that logging statements 
are left up to the programmer, and not specified for the project.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg

#4490

From	Daniele Futtorovic <da.futt.news@laposte-dot-net.invalid>
Date	2011-05-23 22:03 +0200
Message-ID	<ireehm$n4r$1@dont-email.me>
In reply to	#4482

On 23/05/2011 21:02, Lew allegedly wrote:
> On 05/23/2011 01:16 PM, Daniele Futtorovic wrote:
>> I've been faced a while ago with a situation where some orthogonal
>> organisational unit wanted to exploit my logs. I told them to GTFO.
>>
>> My logs are my logs. I put in it what I consider necessary. I often
>> improve them as I step through the code. I might change the message, fix
>> the level, &c. I don't want to have them set in stone. Neither do I
>> generally have enough confidence in them to allow them to be used for
>> analysis.
>>
>> "The solution, then, is simple", I told them, "spec out the exact
>> messages and arguments you want, and the exact situations you want them
>> logged in, and I'll add them for you. But leave me my precious debugging
>> logs."
>>
>> Let me emphasize: IMHO debugging logs and logs for analysis are two
>> different things and should be kept strictly separated -- possibly
>> logged to a different target respectively.
>
> That last is rather a brilliant idea, to use different targets.
> Heretofore I've espoused that logs are primarily an operations tool, not
> a debugging tool, although in service of the former they inevitably and
> inherently must support the latter. The problem I've always seen is that
> logging statements are left up to the programmer, and not specified for
> the project.
>

I'd call it (what I described): audit logging. I don't know if the 
meaning of that term normally extends beyond databases, but I don't see 
why it shouldn't.

-- 
DF.
An escaped convict once said to me:
"Alcatraz is the place to be"

#4614

From	Michael Wojcik <mwojcik@newsguy.com>
Date	2011-05-26 14:43 -0400
Message-ID	<irmgib12gdn@news4.newsguy.com>
In reply to	#4490

Daniele Futtorovic wrote:
> 
> I'd call it (what I described): audit logging. I don't know if the
> meaning of that term normally extends beyond databases, but I don't see
> why it shouldn't.

It's also used widely in security contexts.

Most of the products I work on have multiple logging targets, for
problem determination (which includes debugging), system history,
auditing, and other operational uses. Sometimes these use different
mechanisms, because they have different requirements - for example,
security audit logs need to be protected, insofar as that's possible,
from tampering by unauthorized users.

-- 
Michael Wojcik
Micro Focus
Rhetoric & Writing, Michigan State University

#4497

From	Arved Sandstrom <asandstrom3minus1@eastlink.ca>
Date	2011-05-23 18:32 -0300
Message-ID	<QnACp.3859$cs1.1878@newsfe15.iad>
In reply to	#4482

On 11-05-23 04:02 PM, Lew wrote:
> On 05/23/2011 01:16 PM, Daniele Futtorovic wrote:
>> On 23/05/2011 15:11, Lew allegedly wrote:
>>> Ulrich Scholz wrote:
>>>> I'm looking for an approach to the problem of analyzing application
>>>> log files.
>>>>
>>>> I need to analyse Java log files from applications (i.e., not logs of
>>>> web servers). These logs contain Java exceptions, thread dumps, and
>>>> free-form log4j messages issued by log statements inserted by
>>>> programmers during development. Right now, these man-made log entries
>>>> do not have any specific format.
>>>>
>>>> What I'm looking for is a tool and/or strategy that supports in lexing/
>>>> parsing, tagging, and analysing the log entries. Because there is only
>>>> little defined syntax and grammar - and because you might not know
>>>> what you are looking for - the task requires the quick issuing of
>>>> queries against the log data base. Some sort of visualization would be
>>>> nice, too.
>>>>
>>>> Pointers to existing tools and approaches as well as appropriate tools/
>>>> algorithms to develop the required system would be welcome.
>>>
>>> It helps if you have a logging strategy that mandates a consistent
>>> logging format, specific information in particular positions or marked
>>> by particular markup, logging levels and other such so that your
>>> analysis tool isn't faced with a completely open-ended input. What you
>>> describe requires a general text-analysis approach, as you indicate that
>>> you can make no guarantees about the format. Based on that, your best
>>> tool is "less" or equivalent text-file reader.
>>>
>>> What is a tool supposed to do, read your mind?
>>>
>>> It's really hard to extract information from a garbage can where people
>>> just randomly dumped whatever they individually felt like dumping
>>> without regard for operational needs. You can't build a skyscraper on a
>>> bad foundation, and you can't build a good log analysis off a crappy
>>> log.
>>>
>>> Fix the logging system, then the analysis problem will be tractable.
>>
>> I would argue around the same lines.
>>
>> I've been faced a while ago with a situation where some orthogonal
>> organisational unit wanted to exploit my logs. I told them to GTFO.
>>
>> My logs are my logs. I put in it what I consider necessary. I often
>> improve them as I step through the code. I might change the message, fix
>> the level, &c. I don't want to have them set in stone. Neither do I
>> generally have enough confidence in them to allow them to be used for
>> analysis.
>>
>> "The solution, then, is simple", I told them, "spec out the exact
>> messages and arguments you want, and the exact situations you want them
>> logged in, and I'll add them for you. But leave me my precious debugging
>> logs."
>>
>> Let me emphasize: IMHO debugging logs and logs for analysis are two
>> different things and should be kept strictly separated -- possibly
>> logged to a different target respectively.
> 
> That last is rather a brilliant idea, to use different targets. 
> Heretofore I've espoused that logs are primarily an operations tool, not
> a debugging tool, although in service of the former they inevitably and
> inherently must support the latter.  The problem I've always seen is
> that logging statements are left up to the programmer, and not specified
> for the project.
> 
General agreement with all. I also am coming off one particular project
where part of the work - not a major part, but an important part - was
to improve logging. One of the first things we did was officially
recognize that we had many different clients of logging output. They
wanted different things at different levels at different times with
different storage stipulations.

The solution was pretty simple, and it's dynamic. I don't propose to get
into a logging framework war, but in this case we saw that JUL wouldn't
cut it, but log4j would do the trick. We had to do some arcane app
server-related stuff for JMX and log4j.xml, also integrate exception
handling with various "global" handlers that could also log, and wrap
log4j calls with a plethora of methods that would result in messages
formatted to our liking, but after that the heavy lifting was and is
done: it's now up to the clients - *not* to the developers - to request
what gets logged and in what manner.

Developers of course are clients themselves.

Again, not to get into a logging framework war, but for these purposes
log4j brings a lot to the table. It's common to need logging on specific
Java packages to be at a certain level, for output of that specific
logging to go to a specific target (like its own file) and have its own
storage policy, and for that logging not to be (or to be, as the case
demands) additive to parent logging. Being able to do this is a
minimum for supporting different clients.

We also added, as part of our log4j method wrappers, an extra field for
all log messages that characterizes a "functional category". This allows
decorating all messages with information as to the identity of a
functional subsystem, and is helpful to post-processing tools like Splunk.

This system has been in production now for about 4 months, and
operational support staff and other clients are very pleased with it.
It's not perfect, because not all the log statements exist in the code
to support every informational requirement (known or unknown), but the
framework is not a problem.

One sidenote: despite doing everything I describe above, you can still
end up with logs that are difficult to interpret, and more log
statements aren't necessarily the answer. This typically happens when
your code itself is a spaghetti tangle. Sometimes to fix a logging
problem you really do need to refactor your logged code.

AHS

#4574

From	Ulrich Scholz <d7@thispla.net>
Date	2011-05-25 06:00 -0700
Message-ID	<1a96b712-59cd-4571-953b-eadaa0ec5322@r20g2000yqd.googlegroups.com>
In reply to	#4497

On 23 May, 23:32, Arved Sandstrom <asandstrom3min...@eastlink.ca> wrote:
>
> The solution was pretty simple, and it's dynamic. [...] We had to [...]
> wrap log4j calls with a plethora of methods that would result in messages
> formatted to our liking, but after that the heavy lifting was and is
> done: it's now up to the clients - *not* to the developers - to request
> what gets logged and in what manner.

Thanks for your answer. Your approach sounds reasonable.

But what exactly do you mean by "wrap log4j calls"?
Couldn't you just use different log4j appenders, one for each client?
The appenders would then (i) decide whether to log a given message for a
particular client and (ii) format it accordingly.

Ulrich

#4582

From	Arved Sandstrom <asandstrom3minus1@eastlink.ca>
Date	2011-05-25 19:04 -0300
Message-ID	<V1fDp.9702$pi2.9482@newsfe11.iad>
In reply to	#4574

On 11-05-25 10:00 AM, Ulrich Scholz wrote:
> On 23 May, 23:32, Arved Sandstrom <asandstrom3min...@eastlink.ca> wrote:
>>
>> The solution was pretty simple, and it's dynamic. [...] We had to [...]
>> wrap log4j calls with a plethora of methods that would result in messages
>> formatted to our liking, but after that the heavy lifting was and is
>> done: it's now up to the clients - *not* to the developers - to request
>> what gets logged and in what manner.
>
> Thanks for your answer. Your approach sounds reasonable.
>
> But what exactly do you mean by "wrap log4j calls"?
> Couldn't you just use different log4j appenders, one for each client?
> The appenders would then (i) decide whether to log a given message for a
> particular client and (ii) format it accordingly.
> 
> Ulrich

Using log4j as an example, if you look at the API for PatternLayout,
you've got a bunch of conversion specifiers, including %m for the
application-supplied message. The wrapping that we did, including those
functional categories I mentioned, is all about formatting the
application-supplied message. This included appropriate formatting of
exception stack traces and so forth.

We certainly also had different log4j conversion specifier combinations
in use for different appenders; the "wrapping" was purely for formatting
the application message, because that is opaque to log4j.

AHS
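
To make the distinction concrete, here is a hedged Java sketch of what such a wrapper might look like. It only builds the application-supplied message (the part a PatternLayout would emit for %m): it prefixes a functional category and flattens an optional stack trace onto one line. The "[cat=...]" prefix, the " <- " separator, the one-line trace format and the class name MessageFormatSketch are invented for illustration; the post does not say how the real wrappers format their output.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

/** Hypothetical sketch: formats only the application message, not the log line around it. */
final class MessageFormatSketch {

    /**
     * Builds the text that would be handed to the logging framework as the message.
     * The category identifies a functional subsystem; error may be null.
     */
    static String format(String category, String text, Throwable error) {
        StringBuilder sb = new StringBuilder();
        sb.append("[cat=").append(category).append("] ").append(text);
        if (error != null) {
            StringWriter trace = new StringWriter();
            error.printStackTrace(new PrintWriter(trace, true));
            // Collapse the multi-line trace so the whole entry stays on one line.
            sb.append(" | ").append(trace.toString().trim().replaceAll("\\R\\s*", " <- "));
        }
        return sb.toString();
    }
}
```

A call site would pass the result to whatever logger is in use, for example logger.error(MessageFormatSketch.format("BILLING", "invoice failed", e)), and the appender's own conversion pattern would then add timestamp, level and thread around it.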

#4499

From	Martin Gregorie <martin@address-in-sig.invalid>
Date	2011-05-23 22:25 +0000
Message-ID	<iremt2$u3k$1@localhost.localdomain>
In reply to	#4482

On Mon, 23 May 2011 15:02:23 -0400, Lew wrote:

> On 05/23/2011 01:16 PM, Daniele Futtorovic wrote:
>> Let me emphasize: IMHO debugging logs and logs for analysis are two
>> different things and should be kept strictly separated -- possibly
>> logged to a different target respectively.
> 
> That last is rather a brilliant idea, to use different targets. 
> Heretofore I've espoused that logs are primarily an operations tool, not
> a debugging tool, although in service of the former they inevitably and
> inherently must support the latter.  The problem I've always seen is
> that logging statements are left up to the programmer, and not specified
> for the project.
>
I tend to use at least two logging streams: debugging and operational. I 
leave debugging statements in production code: they're normally off (of 
course) but can be turned on if needed. Operational logging includes 
informational and error messages to be used by sysadmins, which are always 
enabled and should be fairly infrequent, as well as performance 
measurement messages. The latter can be configured on or off. As others 
have said, the messages need to be designed with both log stream 
selection and ease of parsing for later analysis in mind. 

In a C application for a *NIX OS it's easiest to send all these messages 
to the system logger and let it deal with creating separate logs for the 
various message streams: it's then trivial to use 'tail' to present the 
operational stream to sysadmins. 

If the application is written in a language that doesn't provide easy 
access to the system logger or is run on an OS that doesn't have one, I'd 
include a custom logging process as part of the application.


-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |


#4523

From	Nigel Wade <nmw-news@ion.le.ac.uk>
Date	2011-05-24 12:26 +0100
Message-ID	<941ivgF1amU1@mid.individual.net>
In reply to	#4499

On 23/05/11 23:25, Martin Gregorie wrote:
> On Mon, 23 May 2011 15:02:23 -0400, Lew wrote:
> 
>> On 05/23/2011 01:16 PM, Daniele Futtorovic wrote:
>>> Let me emphasize: IMHO debugging logs and logs for analysis are two
>>> different things and should be kept strictly separated -- possibly
>>> logged to a different target respectively.
>>
>> That last is rather a brilliant idea, to use different targets. 
>> Heretofore I've espoused that logs are primarily an operations tool, not
>> a debugging tool, although in service of the former they inevitably and
>> inherently must support the latter.  The problem I've always seen is
>> that logging statements are left up to the programmer, and not specified
>> for the project.
>>
> I tend to use at least two logging streams: debugging and operational. I
> leave debugging statements in production code: it's normally off (of
> course) but can be turned on if needed.

There is one caveat to leaving debug logging in production code; it may
affect performance. Even with output disabled the string arguments are
still constructed, unless they are constant. If logging is located in a
tight loop, or critical section, it might become significant.
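
To make the cost concrete, here is a small sketch assuming java.util.logging on Java 8 or later. The `render` helper and the logger name are hypothetical; `render` stands in for whatever expensive string building the real message needs. The eager form builds the argument before `fine()` is even entered, while the `Supplier` overload defers that work until the logger has decided the message is wanted.

```java
import java.util.function.Supplier;
import java.util.logging.Level;
import java.util.logging.Logger;

class EagerVersusLazy {
    static int renderCount = 0;

    // Hypothetical stand-in for an expensive toString() or formatting call.
    static String render(int value) {
        renderCount++;
        return "value=" + value;
    }

    // Logs 1000 FINE messages to a logger whose level is INFO and
    // returns how many times the message text was actually built.
    static int countRenders(boolean lazy) {
        Logger logger = Logger.getLogger("demo.eager");
        logger.setLevel(Level.INFO); // FINE output is disabled
        renderCount = 0;
        for (int i = 0; i < 1000; i++) {
            if (lazy) {
                final int n = i;
                Supplier<String> message = () -> render(n);
                logger.fine(message);      // supplier is only invoked if FINE is loggable
            } else {
                logger.fine(render(i));    // argument is built even though it is discarded
            }
        }
        return renderCount;
    }

    public static void main(String[] args) {
        System.out.println("eager renders: " + countRenders(false)); // prints "eager renders: 1000"
        System.out.println("lazy renders:  " + countRenders(true));  // prints "lazy renders:  0"
    }
}
```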

In one particular application I profiled I found this to be eating up
80% of the CPU time. Admittedly, this was a rather special case, but
it's still something which ought to be borne in mind if you leave debug
logging code in production software.

-- 
Nigel Wade


#4529

From	Martin Gregorie <martin@address-in-sig.invalid>
Date	2011-05-24 12:29 +0000
Message-ID	<irg8b1$ibj$1@localhost.localdomain>
In reply to	#4523

On Tue, 24 May 2011 12:26:39 +0100, Nigel Wade wrote:

> On 23/05/11 23:25, Martin Gregorie wrote:
>> On Mon, 23 May 2011 15:02:23 -0400, Lew wrote:
>> 
>>> On 05/23/2011 01:16 PM, Daniele Futtorovic wrote:
>>>> Let me emphasize: IMHO debugging logs and logs for analysis are two
>>>> different things and should be kept strictly separated -- possibly
>>>> logged to a different target respectively.
>>>
>>> That last is rather a brilliant idea, to use different targets.
>>> Heretofore I've espoused that logs are primarily an operations tool,
>>> not a debugging tool, although in service of the former they
>>> inevitably and inherently must support the latter.  The problem I've
>>> always seen is that logging statements are left up to the programmer,
>>> and not specified for the project.
>>>
>> I tend to use at least two logging streams: debugging and operational.
>> I leave debugging statements in production code: it's normally off (of
>> course) but can be turned on if needed.
> 
> There is one caveat to leaving debug logging in production code; it may
> affect performance. Even with output disabled the string arguments are
> still constructed, unless they are constant. If logging is located in a
> tight loop, or critical section, it might become significant.
>
Fair point. However, my usual debugging statement takes the form:

if (debug > 0)
  debugger.trace(result + " = method(" + arg + ")");

Ugly, I know, but quite efficient, since when debugging is off even the
cost of the method call is avoided. I use an integer to control debugging
rather than a boolean so I can control its volume: "java Application -dd" 
would be expected to provide more detailed debugging output than "java 
Application -d"
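
A runnable sketch of that pattern, for illustration only. The `Tracer` class, its `levelFrom` flag parser and the `square` method are hypothetical stand-ins, not the actual class described here; the only things taken from the post are the `if (debug > 0)` guard and the `-d` / `-dd` convention.

```java
// Sketch of an integer-controlled trace guard driven by -d / -dd flags.
class Tracer {
    // 0 = off, 1 = normal detail (-d), 2 = more detail (-dd), and so on.
    static int debug = 0;

    // Derives the debug level from the command line: "-d" gives 1, "-dd" gives 2.
    static int levelFrom(String[] args) {
        int level = 0;
        for (String arg : args) {
            if (arg.matches("-d+")) {
                level = Math.max(level, arg.length() - 1);
            }
        }
        return level;
    }

    static void trace(String message) {
        System.err.println(message);
    }

    static int square(int arg) {
        int result = arg * arg;
        // With debug == 0 neither the string concatenation nor the call happens.
        if (debug > 0)
            trace(result + " = square(" + arg + ")");
        // Extra detail only at the higher volume.
        if (debug > 1)
            trace("  square: no overflow check performed");
        return result;
    }

    public static void main(String[] args) {
        debug = levelFrom(args);
        square(7);
    }
}
```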

> In one particular application I profiled I found this to be eating up
> 80% of the CPU time. Admittedly, this was a rather special case, but
> it's still something which ought to be borne in mind if you leave debug
> logging code in production software.
>
Agreed. 

On occasion I've used a circular buffer to accumulate tracing information
which only gets dumped if an exception occurs. This is great if you're
chasing rarely occurring problems in a high volume, long running process
since it obviates searching through megabytes of tracing info to find
20-30 lines of relevant tracing. You'd expect this too to be quite
expensive to run, so I'd use the same mechanism outlined earlier to
ensure that with tracing set off the circular buffer is never filled,
rather than merely suppressing the buffer dump operation.
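
The circular-buffer idea can be sketched roughly as follows. `TraceRing` is a hypothetical name, the buffer is deliberately not thread-safe, and the capacity is assumed to be positive; a real implementation would need to decide both points for itself.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: keep only the most recent trace lines and dump them on failure.
class TraceRing {
    private final String[] slots;
    private int next = 0;   // index of the slot that the next add() overwrites
    private int count = 0;  // number of valid entries, never more than slots.length

    TraceRing(int capacity) {
        slots = new String[capacity]; // capacity is assumed to be > 0
    }

    void add(String line) {
        slots[next] = line;
        next = (next + 1) % slots.length;
        if (count < slots.length) {
            count++;
        }
    }

    // Returns the retained lines, oldest first.
    List<String> snapshot() {
        List<String> lines = new ArrayList<>(count);
        int start = (next - count + slots.length) % slots.length;
        for (int i = 0; i < count; i++) {
            lines.add(slots[(start + i) % slots.length]);
        }
        return lines;
    }

    public static void main(String[] args) {
        int debug = 1; // same guard as before: with 0 the buffer is never filled
        TraceRing ring = new TraceRing(3);
        try {
            for (int i = 1; i <= 5; i++) {
                if (debug > 0)
                    ring.add("processing record " + i);
                if (i == 5) {
                    throw new IllegalStateException("bad record");
                }
            }
        } catch (IllegalStateException e) {
            // Only the last three trace lines survive to be dumped.
            for (String line : ring.snapshot()) {
                System.err.println(line);
            }
            System.err.println("failed: " + e.getMessage());
        }
    }
}
```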

That said, I used to administer an OS (ICL's George 3) that continuously 
traced its internal operations to a fine-grained and a coarse-grained 
circular buffer which could be dumped after a crash and still managed to 
run fast enough to be a very usable OS.

-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |


#4532

From	Lew <noone@lewscanon.com>
Date	2011-05-24 08:49 -0400
Message-ID	<irg9gr$bqo$1@news.albasani.net>
In reply to	#4529

Martin Gregorie wrote:
> Fair point. However, my usual debugging statement takes the form:
>
> if (debug > 0)
>    debugger.trace(result + " = method(" + arg + ")");
>
> Ugly, I know, but quite efficient, since when debugging is off even the
> cost of the method call is avoided. I use an integer to control debugging
> rather than a boolean so I can control its volume: "java Application -dd"
> would be expected to provide more detailed debugging output than "java
> Application -d"
>

What's wrong with

   if ( logger.isDebugEnabled() ) ...

or

   if ( logger.getLevel().isGreaterOrEqual( Logger.DEBUG )) ...
?
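
For readers on java.util.logging rather than log4j, the corresponding guard is `isLoggable`. The sketch below assumes that API; `GuardedLogging`, `describe` and `process` are hypothetical names used only to show that the message is not built while FINE is disabled.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class GuardedLogging {
    static final Logger LOG = Logger.getLogger(GuardedLogging.class.getName());

    static int describeCalls = 0;

    // Hypothetical stand-in for costly message formatting.
    static String describe(Object value) {
        describeCalls++;
        return String.valueOf(value);
    }

    static int process(int arg) {
        int result = arg + 1;
        // The library answers the "is this level on?" question, so no
        // hand-maintained flag is needed.
        if (LOG.isLoggable(Level.FINE)) {
            LOG.fine(describe(result) + " = process(" + arg + ")");
        }
        return result;
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO);
        process(1);
        System.out.println("describe() calls with FINE off: " + describeCalls); // prints 0
    }
}
```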

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg


#4536

From	Martin Gregorie <martin@address-in-sig.invalid>
Date	2011-05-24 14:37 +0000
Message-ID	<irgfs0$k9d$1@localhost.localdomain>
In reply to	#4532

On Tue, 24 May 2011 08:49:52 -0400, Lew wrote:

> Martin Gregorie wrote:
>> Fair point. However, my usual debugging statement takes the form:
>>
>> if (debug > 0)
>>    debugger.trace(result + " = method(" + arg + ")");
>>
>> Ugly, I know, but quite efficient, since when debugging is off even the
>> cost of the method call is avoided. I use an integer to control debugging
>> rather than a boolean so I can control its volume: "java Application
>> -dd" would be expected to provide more detailed debugging output than
>> "java Application -d"
>>
>>
> What's wrong with
> 
>    if ( logger.isDebugEnabled() ) ...
> 
> or
> 
>    if ( logger.getLevel().isGreaterOrEqual( Logger.DEBUG )) ...
> ?
>
Not a lot, though that does involve method call overheads that you may 
not want in a tight loop. 

I initially created my simple ReportError logging class for Java 1.3, 
before the Logger class was introduced, and, as it does what I need, have 
seen no need to replace it. It was written to provide similar Java  
functionality to a set of C functions that I'd been using for many years.

Its trace() methods are roughly equivalent to Logger.log(Level, String) 
and Logger.log(Level, String, Object[]) with all output sent to stderr.   

-- 
martin@   | Martin Gregorie
gregorie. | Essex, UK
org       |


#4539

From	Lew <noone@lewscanon.com>
Date	2011-05-24 12:26 -0400
Message-ID	<irgm61$944$1@news.albasani.net>
In reply to	#4536

Martin Gregorie wrote:
> Lew wrote:
>> Martin Gregorie wrote:
>>> Fair point. However, my usual debugging statement takes the form:
>>>
>>> if (debug > 0)
>>>     debugger.trace(result + " = method(" + arg + ")");
>>>
>>> Ugly, I know, but quite efficient, since when debugging is off even the
>>> cost of the method call is avoided. I use an integer to control debugging
>>> rather than a boolean so I can control its volume: "java Application
>>> -dd" would be expected to provide more detailed debugging output than
>>> "java Application -d"

>> What's wrong with
>>
>>     if ( logger.isDebugEnabled() ) ...
>>
>> or
>>
>>     if ( logger.getLevel().isGreaterOrEqual( Logger.DEBUG )) ...
>> ?

> Not a lot, though that does involve method call overheads that you may
> not want in a tight loop.

Seems like it's a simple equality test that would be HotSpotted away.

> I initially created my simple ReportError logging class for Java 1.3,
> before the Logger class was introduced, and, as it does what I need, have
> seen no need to replace it. It was written to provide similar Java
> functionality to a set of C functions that I'd been using for many years.
>
> Its trace() methods are roughly equivalent to Logger.log(Level, String)
> and Logger.log(Level, String, Object[]) with all output sent to stderr.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg


#4544

From	Jim Gibson <jimsgibson@gmail.com>
Date	2011-05-24 11:00 -0700
Message-ID	<240520111100099912%jimsgibson@gmail.com>
In reply to	#4539

In article <irgm61$944$1@news.albasani.net>, Lew <noone@lewscanon.com>
wrote:

> Martin Gregorie wrote:
> > Lew wrote:
> >> Martin Gregorie wrote:
> >>> Fair point. However, my usual debugging statement takes the form:
> >>>
> >>> if (debug > 0)
> >>>     debugger.trace(result + " = method(" + arg + ")");
> >>>
> >>> Ugly, I know, but quite efficient, since when debugging is off even the
> >>> cost of the method call is avoided. I use an integer to control debugging
> >>> rather than a boolean so I can control its volume: "java Application
> >>> -dd" would be expected to provide more detailed debugging output than
> >>> "java Application -d"
> 
> >> What's wrong with
> >>
> >>     if ( logger.isDebugEnabled() ) ...
> >>
> >> or
> >>
> >>     if ( logger.getLevel().isGreaterOrEqual( Logger.DEBUG )) ...
> >> ?
> 
> > Not a lot, though that does involve method call overheads that you may
> > not want in a tight loop.
> 
> Seems like it's a simple equality test that would be HotSpotted away.

My understanding of the log4j and java.util.logging Logger classes is that
each logger exists within a hierarchy of loggers, and that to determine
whether or not a specific logger call will produce output, the library
must test each level, starting at the specific logger and traversing
upwards in the hierarchy.
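
As a small illustration of that hierarchy, assuming java.util.logging: loggers are arranged by dotted name, and a logger with no level of its own takes its effective level from the nearest named ancestor. The logger names below are hypothetical. Whether a particular library walks the tree on every call or caches the answer is an implementation detail worth checking before assuming a cost.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LoggerHierarchy {
    // Static fields keep strong references to both loggers.
    static final Logger PARENT = Logger.getLogger("com.example");
    static final Logger CHILD = Logger.getLogger("com.example.net.Reader");

    public static void main(String[] args) {
        PARENT.setLevel(Level.FINE);
        System.out.println(CHILD.getLevel());             // prints "null": nothing set on the child itself
        System.out.println(CHILD.isLoggable(Level.FINE)); // prints "true": inherited from com.example

        PARENT.setLevel(Level.WARNING);
        System.out.println(CHILD.isLoggable(Level.FINE)); // prints "false": the parent's change applies
    }
}
```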

So using a static variable as below to enable and disable logging
output can save some execution time:

static final boolean debug = false;

if( debug ) {
  logger.debug(...);
}

Whether or not that is significant, depends upon the situation, of
course.
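
A self-contained sketch of the static-flag approach. `CompileTimeSwitch`, `expensive` and `work` are hypothetical names. Because `DEBUG` is a compile-time constant, the compiler is free to leave the guarded block out of the class file altogether, and javac does so in practice as far as I know; the price is that turning debugging on means recompiling, unlike a flag or logger level set at run time.

```java
class CompileTimeSwitch {
    // A compile-time constant: changing it requires recompiling the class.
    static final boolean DEBUG = false;

    static int evaluations = 0;

    // Hypothetical stand-in for an expensive diagnostic message.
    static String expensive() {
        evaluations++;
        return "state dump";
    }

    static int work(int n) {
        int sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
            if (DEBUG) {
                // Never reached while DEBUG is false.
                System.err.println(expensive() + " at i=" + i);
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(work(10));     // prints 45
        System.out.println(evaluations);  // prints 0
    }
}
```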

-- 
Jim Gibson


#4545

From	Lew <noone@lewscanon.com>
Date	2011-05-24 14:35 -0400
Message-ID	<irgtpb$qoi$1@news.albasani.net>
In reply to	#4544

On 05/24/2011 02:00 PM, Jim Gibson wrote:
> In article<irgm61$944$1@news.albasani.net>, Lew<noone@lewscanon.com>
> wrote:
>
>> Martin Gregorie wrote:
>>> Lew wrote:
>>>> Martin Gregorie wrote:
>>>>> Fair point. However, my usual debugging statement takes the form:
>>>>>
>>>>> if (debug > 0)
>>>>>      debugger.trace(result + " = method(" + arg + ")");
>>>>>
>>>>> Ugly, I know, but quite efficient, since when debugging is off even the
>>>>> cost of the method call is avoided. I use an integer to control debugging
>>>>> rather than a boolean so I can control its volume: "java Application
>>>>> -dd" would be expected to provide more detailed debugging output than
>>>>> "java Application -d"
>>
>>>> What's wrong with
>>>>
>>>>      if ( logger.isDebugEnabled() ) ...
>>>>
>>>> or
>>>>
>>>>      if ( logger.getLevel().isGreaterOrEqual( Logger.DEBUG )) ...
>>>> ?
>>
>>> Not a lot, though that does involve method call overheads that you may
>>> not want in a tight loop.
>>
>> Seems like it's a simple equality test that would be HotSpotted away.
>
> My understanding of the log4j and java.util.logging Logger classes is that
> each logger exists within a hierarchy of loggers, and that to determine
> whether or not a specific logger call will produce output, the library
> must test each level, starting at the specific logger and traversing
> upwards in the hierarchy.
>
> So using a static variable as below to enable and disable logging
> output can save some execution time:
>
> static final boolean debug = false;
>
> if( debug ) {
>    logger.debug(...);
> }
>
> Whether or not that is significant, depends upon the situation, of
> course.

All this theory indicates what to measure in any real situation, but to echo 
your "of course", you won't know what to conclude prior to that measurement.
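
As a starting point for such a measurement, here is a deliberately crude timing sketch assuming java.util.logging. The logger name and iteration count are arbitrary, and hand-rolled loops like this are easily distorted by JIT compilation and dead-code elimination, so treat the numbers as a hint at best and use a proper benchmark harness or a profiler on the real application before drawing conclusions.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

class LoggingCost {
    static final Logger LOG = Logger.getLogger("demo.cost");

    // Times `iterations` disabled FINE calls, with or without an isLoggable guard.
    static long timeNanos(boolean guarded, int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            if (guarded) {
                if (LOG.isLoggable(Level.FINE)) {
                    LOG.fine("i = " + i);
                }
            } else {
                LOG.fine("i = " + i); // string is built, then discarded
            }
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        LOG.setLevel(Level.INFO); // FINE is off for the whole run
        int n = 1_000_000;
        // Warm-up passes so that both paths have been compiled before timing.
        timeNanos(true, n);
        timeNanos(false, n);
        System.out.println("guarded:   " + timeNanos(true, n) + " ns");
        System.out.println("unguarded: " + timeNanos(false, n) + " ns");
    }
}
```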

Now why you'd *ever* want to put a log statement inside a tight, 
performance-critical loop in the first place is a whole 'nother question.

-- 
Lew
Honi soit qui mal y pense.
http://upload.wikimedia.org/wikipedia/commons/c/cf/Friz.jpg


#4565

From	Nigel Wade <nmw-news@ion.le.ac.uk>
Date	2011-05-25 09:53 +0100
Message-ID	<943uboF6qmU1@mid.individual.net>
In reply to	#4545

On 24/05/11 19:35, Lew wrote:
> On 05/24/2011 02:00 PM, Jim Gibson wrote:
>> In article<irgm61$944$1@news.albasani.net>, Lew<noone@lewscanon.com>
>> wrote:
>>
>>> Martin Gregorie wrote:
>>>> Lew wrote:
>>>>> Martin Gregorie wrote:
>>>>>> Fair point. However, my usual debugging statement takes the form:
>>>>>>
>>>>>> if (debug > 0)
>>>>>>      debugger.trace(result + " = method(" + arg + ")");
>>>>>>
>>>>>> Ugly, I know, but quite efficient, since when debugging is off even
>>>>>> the cost of the method call is avoided. I use an integer to control
>>>>>> debugging
>>>>>> rather than a boolean so I can control its volume: "java Application
>>>>>> -dd" would be expected to provide more detailed debugging output than
>>>>>> "java Application -d"
>>>
>>>>> What's wrong with
>>>>>
>>>>>      if ( logger.isDebugEnabled() ) ...
>>>>>
>>>>> or
>>>>>
>>>>>      if ( logger.getLevel().isGreaterOrEqual( Logger.DEBUG )) ...
>>>>> ?
>>>
>>>> Not a lot, though that does involve method call overheads that you may
>>>> not want in a tight loop.
>>>
>>> Seems like it's a simple equality test that would be HotSpotted away.
>>
>> My understanding of the log4j and java.util.logging Logger classes is that
>> each logger exists within a hierarchy of loggers, and that to determine
>> whether or not a specific logger call will produce output, the library
>> must test each level, starting at the specific logger and traversing
>> upwards in the hierarchy.
>>
>> So using a static variable as below to enable and disable logging
>> output can save some execution time:
>>
>> static final boolean debug = false;
>>
>> if( debug ) {
>>    logger.debug(...);
>> }
>>
>> Whether or not that is significant, depends upon the situation, of
>> course.
> 
> All this theory indicates what to measure in any real situation, but to
> echo your "of course", you won't know what to conclude prior to that
> measurement.

In my specific example above I resolved the "problem" by using my own
class which did pretty much exactly as Lew suggested in his follow-up
post. When I subsequently profiled the code the execution time, and CPU
time, required by the remaining method invocations was negligible.

> 
> Now why you'd *ever* want to put a log statement inside a tight,
> performance-critical loop in the first place is a whole 'nother question.
> 

In my case the loop was reading messages from a network and generating
responses. The logging code was to log the message received, and the
response generated, during development. At this stage it wasn't deemed
to be performance critical, but was essential to verify correct receipt
of the network message.

It wasn't until much later that I was told that the response had to be
returned in less than 40 µs (a fact that wasn't in the original
requirements). So I profiled the code to find out where optimisation
ought to be targeted. I still wanted to keep the logging in the code for
debugging purposes, since it was still under active development, but be
able to disable it (and its overhead) for testing.

-- 
Nigel Wade
