Difference between revisions of "Timeplot"

From HaskellWiki

Revision as of 17:43, 23 October 2009

(This page describes a program that will soon be uploaded to Hackage.)

Timeplot is a program for visualizing data from log files.

Usage scenario

A log file is preprocessed by an application-specific awk one-liner to produce input for timeplot.

Timeplot plots the input as several graphs of various kinds with a common time axis.
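The preprocessing step can be sketched in Python instead of awk. Everything about the log being converted here is an assumption made for illustration: the log layout, the LOGIN/LOGOUT/ERROR/REQUEST keywords, and the track names Users, Errors, ResponseTime and Page are all hypothetical. Only the shape of the emitted lines follows the input format described on this page.

```python
import re

# Hypothetical log layout, assumed only for this sketch:
#   "2009-10-23 21:03:56 LOGIN alice"
#   "2009-10-23 21:04:10 REQUEST /register.php 6.452"
LOG_LINE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+)(?: (.*))?$")


def to_timeplot(log_line):
    """Translate one log line into zero or more timeplot input lines."""
    m = LOG_LINE.match(log_line.strip())
    if m is None:
        return []  # not a line this sketch understands
    time, keyword, rest = m.group(1), m.group(2), m.group(3) or ""
    if keyword == "LOGIN":
        return [f"{time} >Users"]    # bump the counter track up
    if keyword == "LOGOUT":
        return [f"{time} <Users"]    # bump the counter track down
    if keyword == "ERROR":
        return [f"{time} !Errors"]   # pulse event
    if keyword == "REQUEST":
        parts = rest.split()
        if len(parts) != 2:
            return []
        page, response_time = parts
        return [
            f"{time} =ResponseTime {response_time}",  # numeric observation
            f"{time} =Page `{page}",                  # discrete observation
        ]
    return []
```

Applying to_timeplot to every line of a log and writing the results to a file would give something that could be passed to timeplot as its input file, provided the log is already sorted by time.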

Input

Conceptually, the input for timeplot is a collection of tracks corresponding to different concurrent processes occurring in the logs, or to different characteristics of a single process. There are 3 kinds of tracks:

  • Counter/Event tracks: Such a track represents a counter that may be bumped up and down: for example, a user login event bumps the user counter up and a logout event bumps it down. It may also be thought of as a binary event track, where an event starts and ends at some time instants: for example, a periodic reload of some data. Such a track may also have 'pulse' events, which say that something happened on the track at a given moment. For example, we might have a track for errors and pulse it every time an error occurs.
  • Numeric tracks: Such a track represents observations of a numeric value at certain time instants. Examples: HTTP server response time, memory usage.
  • Discrete tracks: Such a track represents observations of a discrete value at certain time instants. Examples: current thread ID, HTTP server response code, request type, log message level.

Physically, the input for timeplot consists of lines, each describing one time instant, sorted by time in ascending order. Each line has a time field, an event type field and a track field. There are 5 kinds of input lines:

Bump up, bump down and pulse at a counter track:

 TIME >TRACK
 TIME <TRACK
 TIME !TRACK
 2009-10-23 21:03:56 >Users
 2009-10-23 22:13:02 <Users
 
 2009-10-23 18:45:00 !IncomingMail

Value observation at a numeric track:

 TIME =TRACK VALUE
 2009-10-23 21:03:56 =ResponseTime 6.452

Discrete observation at a discrete track:

 TIME =TRACK `VALUE
 2009-10-23 21:03:56 =Page `register.php
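The five line kinds can be told apart by the marker character that starts the track field. Below is a minimal Python sketch of such a classifier; it is not timeplot's own parser. The Event type and the kind names ('up', 'down', 'pulse', 'value', 'discrete') are invented here, and the sketch assumes that the time field never contains a space followed by one of the characters > < ! =.

```python
import re
from typing import NamedTuple, Optional


class Event(NamedTuple):
    time: str               # raw time field, left unparsed here
    kind: str               # 'up', 'down', 'pulse', 'value' or 'discrete'
    track: str
    payload: Optional[str]  # VALUE for '=' lines, otherwise None


# The time field may itself contain spaces (as in "2009-10-23 21:03:56"),
# so the time is taken to be everything before the first " >", " <", " !" or " =".
_LINE = re.compile(r"^(.*?) ([<>!=])(\S+)(?: (.*))?$")


def parse_line(line):
    m = _LINE.match(line.rstrip("\n"))
    if m is None:
        raise ValueError(f"not a timeplot input line: {line!r}")
    time, marker, track, rest = m.groups()
    if marker == ">":
        return Event(time, "up", track, None)
    if marker == "<":
        return Event(time, "down", track, None)
    if marker == "!":
        return Event(time, "pulse", track, None)
    # marker == "=": a value must follow the track name
    if rest is None:
        raise ValueError(f"missing value in line: {line!r}")
    if rest.startswith("`"):
        return Event(time, "discrete", track, rest[1:])
    return Event(time, "value", track, rest)
```

For instance, parse_line("2009-10-23 21:03:56 >Users") yields an 'up' event on the track Users with the time field left as a string.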


The format of the time field is customizable: a time may be either an integer in the range 0..2^31-1 or a date formatted according to a strptime format string.
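The two time formats can be illustrated with Python's datetime.strptime, which accepts the same style of % directives as the C strptime function. This is only a sketch of the idea: the function name make_time_parser is invented here, and the 'num' and 'date PATTERN' spellings are taken from the -tf option in the help text below.

```python
from datetime import datetime


def make_time_parser(tf="date %Y-%m-%d %H:%M:%S"):
    """Return a function that parses a time field according to a -tf style spec."""
    if tf == "num":
        def parse_num(text):
            value = int(text)
            if not 0 <= value < 2**31:
                raise ValueError(f"time out of range 0..2^31-1: {value}")
            return value
        return parse_num
    if tf.startswith("date "):
        pattern = tf[len("date "):]
        # datetime.strptime understands strptime-style directives such as %Y and %H
        return lambda text: datetime.strptime(text, pattern)
    raise ValueError(f"unknown time format: {tf!r}")
```

With the default specification, make_time_parser()("2009-10-20 16:52:43") gives a datetime value, while make_time_parser("num")("1234") gives the integer 1234.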

Output

The output of timeplot is a vertical stack of charts of the kinds specified below, with their time axes aligned horizontally so that one may observe the interaction of events in different tracks.

timeplot currently supports the following kinds of charts:

  • Event plots ('event')

http://www.haskell.org/sitewiki/images/1/10/Event.png
A fictitious event plot.

  • Line plots ('value')

http://www.haskell.org/sitewiki/images/b/b8/Lineplot.png
A fictitious line plot.

  • Counter histograms ('hist N')

http://www.haskell.org/sitewiki/images/4/45/Errors.png
How many errors occurred in a program, in 15-minute intervals.
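According to the tool's help text, the 'hist N' bin for [t..t+N) holds the maximal number of active events in that interval. A rough Python sketch of that computation follows. It is not timeplot's code: it assumes integer ('num') times, represents a bump up as +1 and a bump down as -1, and leaves pulse events out because this page does not say how they are counted.

```python
def counter_histogram(events, n):
    """Peak number of active events per time bin.

    events: list of (time, delta) pairs sorted by time, with integer times,
            delta = +1 for a bump up ('>') and -1 for a bump down ('<').
    n:      bin width in time units.
    Returns {bin_start: peak} for every bin from the first event to the last.
    """
    if not events:
        return {}
    result = {}
    active = 0
    i = 0
    first_bin = events[0][0] // n * n
    last_bin = events[-1][0] // n * n
    for start in range(first_bin, last_bin + n, n):
        peak = active  # events still active from earlier bins count too
        while i < len(events) and events[i][0] < start + n:
            active += events[i][1]
            peak = max(peak, active)
            i += 1
        result[start] = peak
    return result
```

Note that a bin without any input lines still reports the count carried over from the previous bin, since those events remain active.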

  • Absolute and relative frequency histograms ('count N' and 'freq N')

http://www.haskell.org/sitewiki/images/1/16/Request-types-count-clustered.png
http://www.haskell.org/sitewiki/images/5/57/Request-types-count-stacked.png
http://www.haskell.org/sitewiki/images/e/e9/Request-types-freq-clustered.png
http://www.haskell.org/sitewiki/images/5/5f/Request-types-freq-stacked.png

The distribution of request types over 15-minute intervals in a program with just 2 request types, shown as absolute counts and as relative frequencies, in both clustered and stacked form.
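The difference between 'count N' and 'freq N' is only a normalization step, which the following Python sketch tries to make explicit. The function names are invented for this example and the sketch assumes integer ('num') times; it says nothing about how timeplot itself stores or draws the data.

```python
from collections import Counter, defaultdict


def count_histogram(observations, n):
    """Absolute counts of each discrete value per time bin of width n.

    observations: iterable of (time, value) pairs with integer times.
    """
    bins = defaultdict(Counter)
    for time, value in observations:
        bins[time // n * n][value] += 1
    return {start: dict(counts) for start, counts in sorted(bins.items())}


def freq_histogram(observations, n):
    """Relative frequencies: each bin's counts divided by that bin's total."""
    result = {}
    for start, counts in count_histogram(observations, n).items():
        total = sum(counts.values())
        result[start] = {value: c / total for value, c in counts.items()}
    return result
```

In each bin the relative frequencies sum to 1, which is why the stacked 'freq' chart always fills the full height while the stacked 'count' chart varies with load.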

  • Quantile histograms ('quantile N q1,q2,..')

http://www.haskell.org/sitewiki/images/9/90/Quantile.png

The same line plot, but now the minimum, median and maximum values (the 0%, 50% and 100% quantiles respectively) are shown for each day.
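A quantile histogram can be sketched as follows in Python. The quantile rule used here (take the element at index floor(q * (k - 1)) of the k sorted values in a bin) is an assumption chosen so that 0 gives the minimum and 1 gives the maximum; this page does not state which rule timeplot uses. Integer ('num') times are assumed as well.

```python
from collections import defaultdict


def quantile_histogram(observations, n, quantiles):
    """For each time bin of width n, pick the requested quantiles of the values.

    observations: iterable of (time, value) pairs with integer times.
    quantiles:    numbers between 0 and 1, e.g. [0, 0.5, 1].
    """
    bins = defaultdict(list)
    for time, value in observations:
        bins[time // n * n].append(value)
    result = {}
    for start in sorted(bins):
        ordered = sorted(bins[start])
        # Assumed rule: index floor(q * (k - 1)) into the sorted values
        result[start] = [ordered[int(q * (len(ordered) - 1))] for q in quantiles]
    return result
```

With quantiles [0, 0.5, 1] this reproduces the minimum, median and maximum of a bin holding an odd number of values.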

  • Absolute and relative interval frequency histograms ('binc N v1,v2,..', 'binf N v1,v2,..')

http://www.haskell.org/sitewiki/images/a/a1/Search-count.png
http://www.haskell.org/sitewiki/images/d/d4/Search-freq.png

Distribution of response times of a search program into bins of 0..100ms, 100..500ms, 500..1000ms, 1000..5000ms and >5000ms.
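The interval histograms can be sketched in Python with the standard bisect module. The function names mirror the 'binc' and 'binf' kinds but the code is only an illustration. It assumes integer ('num') times and assumes that a value equal to a boundary falls into the upper interval, which this page does not specify.

```python
from bisect import bisect_right
from collections import defaultdict


def binc(observations, n, edges):
    """Counts of values per interval, for each time bin of width n.

    edges: sorted boundaries v1, v2, ..; the intervals are
           below v1, v1..v2, .., and the last boundary and above.
    """
    bins = defaultdict(lambda: [0] * (len(edges) + 1))
    for time, value in observations:
        # bisect_right puts a value equal to a boundary into the upper interval
        bins[time // n * n][bisect_right(edges, value)] += 1
    return dict(sorted(bins.items()))


def binf(observations, n, edges):
    """Same as binc, but each time bin is normalized to relative frequencies."""
    return {
        start: [c / sum(counts) for c in counts]
        for start, counts in binc(observations, n, edges).items()
    }
```

With edges [100, 500, 1000, 5000] and times in seconds, a bin width of 900 corresponds to the 15-minute response time distribution described above.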


Help

jkff@jkff-laptop:~/projects/hackage/timeplot$ tplot
tplot - a tool for drawing timing diagrams. See http://www.haskell.org/haskellwiki/Timeplot
Usage: tplot [-o OFILE] [-of {png|pdf|ps|svg|x}] [-or 640x480] -if IFILE [-tf TF] 
             [-k Pat1 Kind1 -k Pat2 Kind2 ...] [-dk KindN]
  -o  OFILE - output file (required if -of is not x)
  -of       - output format (x means draw result in a window, default: extension of -o)
  -or       - output resolution (default 640x480)
  -if IFILE - input file
  -tf TF    - time format: 'num' means that times are integer numbers less than 2^31
              (for instance, line numbers); 'date PATTERN' means that times are dates
              in the format specified by PATTERN - see http://linux.die.net/man/3/strptime,
              for example, [%Y-%m-%d %H:%M:%S] parses dates like [2009-10-20 16:52:43]. 
              Default: 'date %Y-%m-%d %H:%M:%S'
  -k P K    - set diagram kind for tracks matching pattern P to K 
              (-k clauses are matched till first success)
  -dk       - set default diagram kind
Input format: lines of the following form:
1234 >A - at time 1234, during event A has begun
1234 <A - at time 1234, during event A has ended
1234 !B - at time 1234, pulse event B has occured
1234 =C VAL - at time 1234, parameter C had numeric value VAL (for example, HTTP response time)
1234 =D `EVENT - at time 1234, event EVENT occured in process D (for example, HTTP response code)
It is assumed that many events of the same kind may occur at once.
Diagram kinds:
  'event' is for event diagrams: during events are drawn like --[===]--- , pulse events like --|--
  'hist N' is for histograms: a histogram is drawn with granularity of N time units, where
     the bin corresponding to [t..t+N) has value 'what was the maximal number of active events
     in that interval'.
  'freq N [TYPE]' is for event frequency histograms: a histogram of type TYPE (stacked or 
     clustered, default clustered) is drawn for each time bin of size N, about the distribution 
     of various ` events
  'count N [TYPE]' is for event frequency histograms: a histogram of type TYPE (stacked or 
     clustered, default clustered) is drawn for each time bin of size N, about the counts of 
     various ` events
  'quantile N q1,q2,..' (example: quantile 100 0.25,0.5,0.75) - a bar chart of corresponding
     quantiles in time bins of size N
  'binf N v1,v2,..' (example: binf 100 1,2,5,10) - a bar chart of frequency of values falling
     into bins min..v1, v1..v2, .., v2..max in time bins of size N
  'binc N v1,v2,..' (example: binf 100 1,2,5,10) - a bar chart of counts of values falling
     into bins min..v1, v1..v2, .., v2..max in time bins of size N
  'value' - a simple line plot of numeric values
N is measured in units or in seconds.
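The help text says that -k clauses are matched until the first success, with -dk supplying the default kind. The Python sketch below illustrates that selection rule only. Treating the pattern as a regular expression searched anywhere in the track name is an assumption, since the help text does not say what pattern syntax tplot uses; the function name choose_kind is likewise invented.

```python
import re


def choose_kind(track, clauses, default_kind):
    """Pick the diagram kind for a track.

    clauses:      list of (pattern, kind) pairs in command-line order.
    default_kind: the kind to use when no clause matches.
    """
    for pattern, kind in clauses:
        # Assumption: patterns are regular expressions; the first match wins
        if re.search(pattern, track):
            return kind
    return default_kind
```

Because the first matching clause wins, more specific patterns have to come before more general ones on the command line.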