webalizer [ option ... ] [ log-file ]
webazolver [ option ... ] [ log-file ]
The Webalizer is a web server log file analysis program which produces
usage statistics in HTML format for viewing with a browser. The
results are presented in both columnar and graphical format, which
facilitates interpretation. Yearly, monthly, daily and hourly usage
statistics are presented, along with the ability to display usage by
site, URL, referrer, user agent (browser), username, search strings,
entry/exit pages, and country (some information may not be available
if not present in the log file being processed).
The Webalizer supports CLF (common log format) log files, as well as
Combined log formats as defined by NCSA and others, and variations of
these which it attempts to handle intelligently. In addition, the
Webalizer also supports wu-ftpd xferlog formatted log files, allowing
analysis of ftp servers, and squid proxy logs. Logs may also be com-
pressed, via gzip. If a compressed log file is detected, it will be
automatically uncompressed while it is read. Compressed logs must have
the standard gzip extension of .gz.
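The extension rule above can be pictured with a small sketch. This is an illustration only, not The Webalizer's own code: the function name open_log and the latin-1 decoding are assumptions made for the example.

```python
import gzip


def open_log(path):
    """Open a log file for reading, decompressing it if it ends in .gz.

    Hypothetical helper: it mirrors the documented rule that compressed
    logs are recognised by the standard .gz extension.
    """
    if path.endswith(".gz"):
        # Decompress transparently while the file is read.
        return gzip.open(path, "rt", encoding="latin-1", errors="replace")
    # Anything else is treated as a plain text log.
    return open(path, "r", encoding="latin-1", errors="replace")
```

A caller iterates over the returned stream line by line in either case, so the rest of the processing does not need to know whether the log was compressed.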
webazolver is normally just a symbolic link to the webalizer. When run
as webazolver, only DNS file creation/updates are performed, and the
program will exit once complete. All normal options and configuration
directives are available, however many will not be used. In addition,
a DNS cache file must be specified. If the number of DNS child
processes to use is not specified, the webazolver will default to 5.
This documentation applies to The Webalizer Version 2.01
RUNNING THE WEBALIZER
The Webalizer was designed to be run from a Unix command line prompt or
as a crond(8) job. Once executed, the general flow of the program is:
o A default configuration file is scanned for. A file named
webalizer.conf is searched for in the current directory, and if
found, its configuration data is parsed. If the file is not
present in the current directory, the file /etc/webalizer.conf
is searched for and, if found, is used instead.
o Any command line arguments given to the program are parsed.
This may include the specification of a configuration file,
which is processed at the time it is encountered.
o If a log file was specified, it is opened and made ready for
processing. If no log file was given, STDIN is used for input.
If the log filename '-' is specified, STDIN will be forced.
o If an output directory was specified, the program does a
chdir(2) to that directory in preparation for generating output.
o A history file is searched for (in the output directory) and
read if found. This file keeps totals for previous months, which
are used in the main index.html HTML document. Note: The file
location can now be specified with the HistoryName configuration
option.
o If incremental processing was specified, a data file is
searched for and loaded if found, containing the 'internal
state' data of the program at the end of a previous run. Note:
The file location can now be specified with the IncrementalName
configuration option.
o Main processing begins on the log file. If the log spans multiple
months, a separate HTML document is created for each month.
o After main processing, the main index.html page is created,
which has totals by month and links to each month's HTML document.
o A new history file is saved to disk, which includes totals gen-
erated by The Webalizer during the current run.
o If incremental processing was specified, a data file is written
that contains the 'internal state' data at the end of this run.
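The first step of the flow above, locating a default configuration file, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name and its parameters are hypothetical, and only the search order (current directory first, then /etc/webalizer.conf) comes from the text.

```python
import os


def find_default_config(cwd=".", system_path="/etc/webalizer.conf"):
    """Return the default configuration file to use, or None.

    Hypothetical helper illustrating the documented search order.
    """
    local = os.path.join(cwd, "webalizer.conf")
    if os.path.isfile(local):
        # A webalizer.conf in the current directory wins.
        return local
    if os.path.isfile(system_path):
        # Otherwise fall back to the system-wide file.
        return system_path
    # No default configuration file was found.
    return None
```

A configuration file named on the command line is a separate matter: as described above, it is processed at the point where it is encountered among the arguments.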
Version 1.2x of The Webalizer adds incremental run capability. Simply
put, this allows processing large log files by breaking them up into
smaller pieces, and processing these pieces instead. What this means
in real terms is that you can now rotate your log files as often as you
want, and still be able to produce monthly usage statistics without the
loss of any detail. Basically, The Webalizer saves and restores all
internal data in a file named webalizer.current. This allows the pro-
gram to 'start where it left off' so to speak, and allows the preserva-
tion of detail from one run to the next. The data file is placed in
the current output directory, and is a plain ASCII text file that can
be viewed with any standard text editor. Its location and name may be
changed using the IncrementalName configuration keyword.
Some special precautions need to be taken when using the incremental
run capability of The Webalizer. Configuration options should not be
changed between runs, as that could cause corruption of the internal
data stored. For example, changing the MangleAgents level will cause
different representations of user agents to be stored, producing
invalid results in the user agents section of the report. If you need
to change configuration options, do it at the end of the month after
normal processing of the previous month and before processing the current
month. You may also want to delete the webalizer.current file as well.
The Webalizer also attempts to prevent data duplication by keeping
track of the timestamp of the last record processed. This timestamp is
then compared to current records being processed, and any records that
were logged previous to that timestamp are ignored. This, in theory,
REVERSE DNS LOOKUPS
The Webalizer supports reverse DNS lookups through a DNS cache file
that is either created/updated at run-time, or has been previously cre-
ated, either by a previous run of the webalizer, or by running the
stand-alone version, webazolver. In order to perform reverse DNS
lookups, a DNSCache filename must be specified. In order to cre-
ate/update the cache file at run-time, the DNSChildren number must be
non-zero. The DNSChildren value specifies the number of children pro-
cesses to fork, each of which will perform reverse DNS lookups in order
to create/update the DNS cache file. See the file DNS.README for
additional information.
COMMAND LINE OPTIONS
The Webalizer supports many different configuration options that will
alter the way the program behaves and generates output. Most of these
can be specified on the command line, while some can only be specified
in a configuration file. The command line options are listed below,
with references to the corresponding configuration file keywords.
-h Display all available command line options and exit program.
-V Display program version and exit program.
-d Debug. Display debugging information for errors and warnings.
-i IgnoreHist. Ignore history. USE WITH CAUTION. This will cause
The Webalizer to ignore any previous monthly history file only.
Incremental data (if present) is still processed.
-p Incremental. Preserve internal data between runs.
-q Quiet. Suppress informational messages. Does not suppress
warnings or errors.
-Q ReallyQuiet. Suppress all messages including warnings and
errors.
-T TimeMe. Force display of timing information at end of
processing.
-c file Use configuration file file.
-n name HostName. Use the hostname name.
-o dir OutputDir. Use output directory dir.
-t name ReportTitle. Use name for report title.
-F ( clf | ftp | squid )
LogType. Specify log type to be processed. Value can be
either clf, ftp or squid format. If not specified, will
default to CLF format. FTP logs must be in standard wu-ftpd
xferlog format.
Defines the HTML file extension to use. If not
specified, defaults to html. Do not include the leading
period.
-H HourlyStats. Suppress hourly statistics.
-L GraphLegend. Suppress color coded graph legends.
-l num GraphLines. Specify number of background lines. Default is 2.
Use zero ('0') to disable the lines.
-P name PageType. Specify file extensions that are considered pages.
Sometimes referred to as pageviews.
-m num VisitTimeout. Specify the Visit timeout period. Specified in
number of seconds. Default is 1800 seconds (30 minutes).
-I name IndexAlias. Use the filename name as an additional alias for
index.*.
-M num MangleAgents. Mangle user agent names according to the mangle
level specified by num. Mangle levels are:
5 Browser name and major version.
4 Browser name, major and minor version.
3 Browser name, major version, minor version to two decimal
places.
2 Browser name, major and minor versions and sub-version.
1 Browser name, version and machine type if possible.
0 All information (left unchanged).
-g num GroupDomains. Automatically group sites by domain. The group-
ing level specified by num can be thought of as 'the number of
dots' to display in the grouping. The default value of 0 dis-
ables any domain grouping.
-D name DNSCache. Use the DNS cache file name.
-N num DNSChildren. Use num DNS children processes to perform DNS
lookups, either creating or updating the DNS cache file. Spec-
ify zero (0) to disable cache file creation/updates. If given,
a DNS cache filename must be specified.
-a name HideAgent. Hide user agents matching name.
-r name HideReferrer. Hide referrer matching name.
-s name HideSite. Hide site matching name.
-U num TopURLs. Display the top num URL's table.
-C num TopCountries. Display the top num countries table.
-e num TopEntry. Display the top num entry pages table.
-E num TopExit. Display the top num exit pages table.
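The 'number of dots' description given for -g (GroupDomains) can be read as in the sketch below. This is one interpretation of that sentence, not the program's actual algorithm: the function name is hypothetical, and special cases such as numeric IP addresses are not modelled.

```python
def group_domain(hostname, level):
    """Return the domain group for hostname, or None if grouping is off.

    Assumption: a grouping level of N keeps the rightmost N dots of the
    name, i.e. the last N + 1 labels.
    """
    if level <= 0:
        # The default value of 0 disables domain grouping.
        return None
    labels = hostname.strip(".").split(".")
    if len(labels) <= level + 1:
        # The name is already no longer than the requested grouping.
        return ".".join(labels)
    return ".".join(labels[-(level + 1):])
```

Under this reading, www.cs.example.edu would be grouped as example.edu at level 1 and as cs.example.edu at level 2.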
Configuration files are standard ascii(7) text files that may be cre-
ated or edited using any standard editor. Blank lines and lines that
begin with a pound sign ('#') are ignored. Any other lines are consid-
ered to be configuration lines, and have the form "Keyword Value",
where the 'Keyword' is one of the currently available configuration
keywords defined below, and 'Value' is the value to assign to that par-
ticular option. Any text found after the keyword up to the end of the
line is considered the keyword's value, so you should not include any-
thing after the actual value on the line that is not actually part of
the value being assigned. The file sample.conf provided with the dis-
tribution contains lots of useful documentation and examples as well.
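The line format described above can be sketched as a small parser. It is an illustration only: the function name is hypothetical, and tolerating leading whitespace before a keyword or '#' is an assumption not stated in the text.

```python
def parse_config(text):
    """Parse "Keyword Value" lines into a list of (keyword, value) pairs."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()  # assumption: surrounding whitespace is ignored
        if not line or line.startswith("#"):
            # Blank lines and comment lines are skipped.
            continue
        parts = line.split(None, 1)
        keyword = parts[0]
        # Everything after the keyword, up to the end of the line, is the
        # value, including any '#' characters that appear later on.
        value = parts[1] if len(parts) > 1 else ""
        entries.append((keyword, value))
    return entries
```

Note the consequence for trailing text: a line such as "HostName www.example.com # main site" would yield the whole remainder, comment included, as the value, which is why nothing should follow the actual value on a line.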
General Configuration Keywords
Use log file named name. If none specified, STDIN will be
used.
LogType name
Specify log file type as name. Values can be either web, squid
or ftp, with the default being web.
OutputDir dir
Create output in the directory dir. If none specified, the
current directory will be used.
HistoryName name
Filename to use for history file. Relative to output directory
unless absolute name is given (ie: starts with '/'). Defaults
to 'webalizer.hist' in the standard output directory.
ReportTitle name
Use the title string name for the report title. If none specified,
use the default of (in English) "Usage Statistics for ".
HostName name
Set the hostname for the report as name. If none specified, an
attempt will be made to gather the hostname via a uname(2) system
call. If that fails, localhost will be used.
UseHTTPS ( yes | no )
Use https:// on links to URLS, instead of the default http://,
in the 'Top URL's' table.
GMTTime ( yes | no )
Use GMT (UTC) time instead of local timezone for reports.
IgnoreHist ( yes | no )
Ignore previous monthly history file. USE WITH CAUTION. Does
not prevent Incremental file processing.
FoldSeqErr ( yes | no )
Fold out of sequence log records back into analysis by treating
them as if they had the same date/time as the last good record.
Normally, out of sequence log records are ignored.
CountryGraph ( yes | no )
Display Country Usage Graph in output report.
DailyGraph ( yes | no )
Display Daily Graph in output report.
DailyStats ( yes | no )
Display Daily Statistics in output report.
HourlyGraph ( yes | no )
Display Hourly Graph in output report.
HourlyStats ( yes | no )
Display Hourly Statistics in output report.
PageType name
Define the file extensions to consider as a page. If a file is
found to have the same extension as name, it will be counted as
a page (sometimes called a pageview).
GraphLegend ( yes | no )
Allows the color coded graph legends to be enabled/disabled.
GraphLines num
Specify the number of background reference lines displayed on
the graphs produced. Disable by using zero ('0'), default is
2.
VisitTimeout num
Specifies the visit timeout value. Default is 1800 seconds (30
minutes). A visit is determined by looking at the difference
in time between the current and last request from a specific
site. If the difference is greater than or equal to the timeout
value, the request is counted as a new visit. Specified in
seconds.
IndexAlias name
Use name as an additional alias for index.*.
MangleAgents num
Mangle user agent names based on mangle level num. See the -M
command line option for the mangle levels.
IncrementalName name
Filename to use for incremental data. Relative to output
directory unless an absolute name is given (ie: starts with
'/'). Defaults to 'webalizer.current' in the standard output
directory.
DNSCache name
Filename to use for the DNS cache. Relative to output directory
unless an absolute name is given (ie: starts with '/').
DNSChildren num
Number of children DNS processes to run in order to create/update
the DNS cache file. Specify zero (0) to disable.
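The visit timeout rule described in this section (a request starts a new visit when it arrives at least the timeout after the previous request from the same site) can be sketched as below. The function name and the input shape are hypothetical, and counting a site's first request as a visit is an assumption.

```python
def count_visits(requests, timeout=1800):
    """Count visits in an iterable of (site, timestamp_in_seconds) pairs.

    The pairs are assumed to be in chronological order.
    """
    last_seen = {}
    visits = 0
    for site, timestamp in requests:
        previous = last_seen.get(site)
        # A first request, or a gap greater than or equal to the timeout,
        # counts as a new visit.
        if previous is None or timestamp - previous >= timeout:
            visits += 1
        last_seen[site] = timestamp
    return visits
```

With the default of 1800 seconds, two requests from one site that are 1799 seconds apart belong to a single visit, while a gap of exactly 1800 seconds starts a second one.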
Top Table Keywords
Display the top num User Agents table. Use zero to disable.
AllAgents ( yes | no )
Create separate HTML page with All User Agents.
Display the top num Referrers table. Use zero to disable.
AllReferrers ( yes | no )
Create separate HTML page with All Referrers.
Display the top num Sites table. Use zero to disable.
Display the top num Sites (by KByte) table. Use zero to disable.
AllSites ( yes | no )
Create separate HTML page with All Sites.
Display the top num URLs table. Use zero to disable.
Display the top num URLs (by KByte) table. Use zero to disable.
AllURLs ( yes | no )
Create separate HTML page with All URLs.
Display the top num Countries in the table. Use zero to disable.
Create separate HTML page with All Search Strings.
Display the top num Usernames in the table. Use zero to disable.
Usernames are only available if using http based authentication.
AllUsers ( yes | no )
Create separate HTML page with All Usernames.
Hide User Agents that match name.
Hide Referrers that match name.
Hide Sites that match name.
HideAllSites ( yes | no )
Hide all individual sites. This causes only grouped sites to
be displayed.
Hide URL's that match name.
Hide Usernames that match name.
Ignore User Agents that match name.
Ignore Referrers that match name.
Ignore Sites that match name.
Ignore URL's that match name.
Ignore Usernames that match name.
GroupAgent name [Label]
Group User Agents that match name. Display Label in 'Top
Agent' table if given (instead of name).
GroupReferrer name [Label]
Group Referrers that match name. Display Label in 'Top Refer-
rer' table if given (instead of name).
if given (instead of name).
GroupUser name [Label]
Group Usernames that match name. Display Label in 'Top User-
names' table if given (instead of name).
Force inclusion of sites that match name. Takes precedence
over Ignore* keywords.
Force inclusion of URL's that match name. Takes precedence
over Ignore* keywords.
Force inclusion of Referrers that match name. Takes precedence
over Ignore* keywords.
Force inclusion of User Agents that match name. Takes prece-
dence over Ignore* keywords.
Force inclusion of Usernames that match name. Takes precedence
over Ignore* keywords.
HTML Generation Keywords
Defines the HTML file extension to use. Default is html. Do
not include the leading period!
Insert text at the very beginning of the generated HTML file.
Defaults to a standard html 3.2 DOCTYPE record.
Insert text within the <HEAD></HEAD> block of the HTML file.
Insert text in HTML page, starting with the <BODY> tag. If
used, the first line must be a <BODY ...> tag. Multiple lines
may be specified.
Insert text at top (before horiz. rule) of HTML pages. Multi-
ple lines may be specified.
Insert text at bottom of the HTML page. The text is top and
right aligned within a table column at the end of the report.
Insert text at the very end of the HTML page. If not speci-
ing slash (/).
Use name as the filename extension for dump files. If not
given, the default of tab will be used.
DumpHeader ( yes | no )
Print a column header as the first record of the file.
DumpSites ( yes | no )
Dump the sites data to a tab delimited file.
DumpURLs ( yes | no )
Dump the url data to a tab delimited file.
DumpReferrers ( yes | no )
Dump the referrer data to a tab delimited file. This data is
only available if using a log that contains referrer informa-
tion (ie: a combined format web log).
DumpAgents ( yes | no )
Dump the user agent data to a tab delimited file. This data is
only available if using a log that contains user agent informa-
tion (ie: a combined format web log).
DumpUsers ( yes | no )
Dump the username data to a tab delimited file. This data is
only available if processing a wu-ftpd xferlog or a web log
that contains http authentication information.
DumpSearchStr ( yes | no )
Dump the search string data to a tab delimited file. This data
is only available if processing a web log that contains refer-
rer information and had search string information present.
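Because the dump files are tab delimited text, optionally with a column header as the first record (DumpHeader), they can be read with ordinary tools. The sketch below is an illustration only: the function name is hypothetical, and the column names in the usage note are made up, since the actual columns are not listed here.

```python
import csv
import io


def read_dump(text, has_header=True):
    """Parse tab delimited dump text.

    Returns a list of dicts keyed by column name when has_header is true,
    otherwise a list of plain rows.
    """
    # QUOTE_NONE keeps quote characters inside fields (for example in
    # URLs or user agent strings) as literal data.
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    rows = [row for row in reader if row]
    if has_header and rows:
        header, data = rows[0], rows[1:]
        return [dict(zip(header, row)) for row in data]
    return rows
```

For example, text with a hypothetical header line "Hits<TAB>URL" followed by "10<TAB>/index.html" would yield one record mapping Hits to "10" and URL to "/index.html". All values are returned as strings.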
ColorHit ( rrggbb | 00805c )
Sets the graph's hit-color to the specified html color (no
'#').
ColorFile ( rrggbb | 0000ff )
Sets the graph's file-color to the specified html color (no
'#').
ColorSite ( rrggbb | ff8000 )
Sets the graph's site-color to the specified html color (no
'#').
ColorKbyte ( rrggbb | ff0000 )
Sets the graph's kilobyte-color to the specified html color (no
'#').
ColorPage ( rrggbb | 00c0ff )
Sets the graph's page-color to the specified html color (no
'#').
PieColor3 ( rrggbb | ff00ff )
Sets the pie's third optional color to the specified html color
(no '#').
PieColor4 ( rrggbb | ffc480 )
Sets the pie's fourth optional color to the specified html
color (no '#').
webalizer.conf Default configuration file. Is searched for in the
current directory and if not found, in the /etc/ directory.
webalizer.hist Monthly history file for previous 12 months. (can
be changed)
webalizer.current Current state data file (Incremental processing).
(can be changed)
xxxxx_YYYYMM.html Various monthly HTML output files produced. (extension
can be changed)
xxxxx_YYYYMM.png Various monthly image files used in the reports.
xxxxx_YYYYMM.tab Monthly tab delimited text files. (extension can
be changed)
Report bugs to firstname.lastname@example.org.
Copyright (C) 1997-2000 by Bradford L. Barrett. Distributed under the
GNU GPL. See the files "COPYING" and "Copyright", supplied with all
distributions for additional information.
Bradford L. Barrett <email@example.com>
Version 2.01 22-Oct-2001 webalizer(1)