PMLOGREWRITE(1)
NAME
pmlogrewrite - rewrite Performance Co-Pilot archives
SYNOPSIS
$PCP_BINADM_DIR/pmlogrewrite [-Cdiqsvw] [-c config] inlog [outlog]
DESCRIPTION
pmlogrewrite reads a Performance Co-Pilot (PCP) archive log identified
by inlog and creates a PCP archive log in outlog. Under normal usage,
the -c option will be used to nominate a configuration file or files
that contain specifications (see the REWRITING RULES SYNTAX section
below) that describe how the data and metadata from inlog should be
transformed to produce outlog.
The typical uses for pmlogrewrite would be to accommodate the evolution
of Performance Metric Domain Agents (PMDAs) where the names, metadata
and semantics of metrics and their associated instance domains may
change over time, e.g. promoting the type of a metric from a 32-bit to
a 64-bit integer, or renaming a group of metrics. Refer to the EXAM‐
PLES section for some additional use cases.
pmlogrewrite is most useful where PMDA changes, or errors in the pro‐
duction environment, result in archives that cannot be combined with
pmlogextract(1). By pre-processing the archives with pmlogrewrite the
resulting archives may be able to be merged with pmlogextract(1).
The input inlog must be a PCP archive log created by pmlogger(1), or
possibly one of the tools that read and create PCP archives, e.g.
pmlogextract(1) and pmlogreduce(1).
If no -c option is specified, then the default behavior simply creates
outlog as a copy of inlog. This is a little more complicated than
cat(1), as each PCP archive is made up of several physical files.
While pmlogrewrite may be used to repair some data consistency issues
in PCP archives, there is also a class of repair tasks that it cannot
handle; pmloglabel(1) may be a useful tool in these cases.
COMMAND LINE OPTIONS
The command line options for pmlogrewrite are as follows:
-C Parse the rewriting rules and quit. outlog is not created.
When -C is specified, this also sets -v and -w so that all warn‐
ings and verbose messages are displayed as config is parsed.
-c config
If config is a file or symbolic link, read and parse rewriting
rules from there. If config is a directory, then all of the
files or symbolic links in that directory (excluding those
beginning with a period ``.'') will be used to provide the
rewriting rules. Multiple -c options are allowed.
-d Desperate mode. Normally if a fatal error occurs, all trace of
the partially written PCP archive outlog is removed. With the
-d option, the partially created outlog archive log is not
removed.
-i Rather than creating outlog, inlog is rewritten in place when
the -i option is used. A new archive is created using temporary
file names and then renamed to inlog in such a way that if any
errors (not warnings) are encountered, inlog remains unaltered.
-q Quick mode, where if there are no rewriting actions to be per‐
formed (none of the global data, instance domains or metrics
from inlog will be changed), then pmlogrewrite will exit (with
status 0, so success) immediately after parsing the configura‐
tion file(s) and outlog is not created.
-s When the ``units'' of a metric are changed, if the dimension in
terms of space, time and count is unaltered, then the scaling
factor is being changed, e.g. BYTE to KBYTE, or MSEC^-1 to
USEC^-1, or the composite MBYTE.SEC^-1 to KBYTE.USEC^-1. The
motivation may be (a) that the original metadata was wrong but the
values in inlog are correct, or (b) the metadata is changing so
the values need to change as well. The default pmlogrewrite be‐
haviour matches case (a). If case (b) applies, then use the -s
option and the values of all the metrics with a scale factor
change in each result will be rescaled. For finer control over
value rescaling refer to the RESCALE option for the UNITS clause
of the metric rewriting rule described below.
-v Increase verbosity of diagnostic output.
-w Emit warnings. Normally pmlogrewrite remains silent for any
warning that is not fatal and it is expected that for a particu‐
lar archive, some (or indeed, all) of the rewriting specifica‐
tions may not apply. For example, changes to a PMDA may be cap‐
tured in a set of rewriting rules, but a single archive may not
contain all of the modified metrics nor all of the modified
instance domains and/or instances. Because these cases are
expected, they do not prevent pmlogrewrite executing, and rules
that do not apply to inlog are silently ignored by default.
Similarly, some rewriting rules may involve no change because
the metadata in inlog already matches the intent of the rewrit‐
ing rule to correct data from a previous version of a PMDA. The
-w flag forces warnings to be emitted for all of these cases.
The argument outlog is required in all cases, except when -i is speci‐
fied.
REWRITING RULES SYNTAX
A configuration file contains zero or more rewriting rules as defined
below.
Keywords and special punctuation characters are shown below in
bold italic font and are case-insensitive, so METRIC, metric and Metric
are all equivalent in rewriting rules.
The character ``#'' introduces a comment and the remainder of the line
is ignored. Otherwise the input is relatively free format with
optional white space (spaces, tabs or newlines) between lexical items
in the rules.
A global rewriting rule has the form:
GLOBAL { globalspec ... }
where globalspec is zero or more of the following clauses:
HOSTNAME -> hostname
Modifies the label records in the outlog PCP archive, so that
the metrics will appear to have been collected from the host
hostname.
TIME -> delta
Both metric values and the instance domain metadata in a PCP
archive carry timestamps. This clause forces all the time‐
stamps to be adjusted by delta, where delta is an optional sign
``+'' (the default) or ``-'', an optional number of hours fol‐
lowed by a colon ``:'', an optional number of minutes followed
by a colon ``:'', a number of seconds, an optional fraction of
seconds following a period ``.''. The simplest example would
be ``30'' to increase the timestamps by 30 seconds. A more
complex example would be ``-23:59:59.999'' to move the time‐
stamps backwards by one millisecond less than one day.
TZ -> "timezone"
Modifies the label records in the outlog PCP archive, so that
the metrics will appear to have been collected from a host with
a local timezone of timezone. timezone must be enclosed in
quotes, and should conform to the valid timezone syntax rules
for the local platform.
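As a minimal sketch of the clauses above, the following global rule
combines all three. The hostname, the one-hour shift and the timezone
string are illustrative values only, and the timezone is assumed to be
one that the local platform accepts.

```
# illustrative values only, not taken from any real archive
global {
    hostname -> newhost.example.com
    # move every timestamp forward by one hour
    time -> +1:00:00
    tz -> "UTC"
}
```

If these rules were saved in a file called rules (a hypothetical
name), they could be applied with: pmlogrewrite -c rules inlog outlog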
An indom rewriting rule modifies an instance domain and has the form:
INDOM domain.serial { indomspec ... }
where domain and serial identify one or more existing instance domains
from inlog - typically domain would be an integer in the range 1 to 510
and serial would be an integer in the range 0 to 4194304.
As a special case serial could be an asterisk ``*'' which means the
rule applies to every instance domain with a domain number of domain.
If a designated instance domain is not in inlog the rule has no effect.
The indomspec is zero or more of the following clauses:
INAME "oldname" -> "newname"
The instance identified by the external instance name oldname
is renamed to newname. Both oldname and newname must be
enclosed in quotes.
As a special case, the new name may be the keyword DELETE (with
no quotes), and then the instance oldname will be expunged from
outlog which removes it from the instance domain metadata and
removes all values of this instance for all the associated met‐
rics.
If the instance names contain any embedded spaces then special
care needs to be taken in respect of the PCP instance naming
rule that treats the leading non-space part of the instance
name as the unique portion of the name for the purposes of
matching and ensuring uniqueness within an instance domain,
refer to pmdaInstance(3) for a discussion of this issue.
As an illustration, consider the hypothetical instance domain
for a metric which contains 2 instances with the following
names:
red
eek urk
Then some possible INAME clauses might be:
"eek" -> "yellow like a flower"
Acceptable, oldname "eek" matches the "eek urk"
instance.
"red" -> "eek"
Error, newname "eek" matches the existing "eek urk"
instance.
"eek urk" -> "red of another hue"
Error, newname "red of another hue" matches the
existing "red" instance.
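Placed in a complete indom rule, the acceptable case above might look
like the sketch below, which also shows the DELETE form. The instance
domain 29.99 is a hypothetical identifier chosen for illustration and
is assumed to hold the two instances "red" and "eek urk".

```
# 29.99 is a hypothetical instance domain holding "red" and "eek urk"
indom 29.99 {
    # "eek" matches the "eek urk" instance on its leading non-space part
    iname "eek" -> "yellow like a flower"
    # expunge the "red" instance and all of its values
    iname "red" -> delete
}
```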
INDOM -> newdomain.newserial
Modifies the metadata for the instance domain and every metric
associated with the instance domain. As a special case, newse‐
rial could be an asterisk ``*'' which means use serial from the
indom rewriting rule, although this is most useful when serial
is also an asterisk. So for example:
indom 29.* { indom -> 109.* }
will move all instance domains from domain 29 to domain 109.
INST oldid -> newid
The instance identified by the internal instance identifier
oldid is renumbered to newid. Both oldid and newid are integers
in the range 0 to 2^31-1.
As a special case, newid may be the keyword DELETE and then the
instance oldid will be expunged from outlog which removes it
from the instance domain metadata and removes all values of
this instance for all the associated metrics.
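A short sketch of both forms of the INST clause follows. The instance
domain 29.98 and the instance identifiers are hypothetical and only
serve to show the shape of the rule.

```
# 29.98 and the instance identifiers below are hypothetical
indom 29.98 {
    # renumber internal instance 1 to 100
    inst 1 -> 100
    # remove internal instance 2 and all of its values
    inst 2 -> delete
}
```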
A metric rewriting rule has the form:
METRIC metricid { metricspec ... }
where metricid identifies one or more existing metrics from inlog using
either a metric name, or the internal encoding for a metric's PMID as
domain.cluster.item. In the latter case, typically domain would be an
integer in the range 1 to 510, cluster would be an integer in the range
0 to 4095, and item would be an integer in the range 0 to 1023.
As special cases item could be an asterisk ``*'' which means the rule
applies to every metric with a domain number of domain and a cluster
number of cluster, or cluster could be an asterisk which means the rule
applies to every metric with a domain number of domain and an item
number of item, or both cluster and item could be asterisks, in which
case the rule applies to every metric with a domain number of domain.
If a designated metric is not in inlog the rule has no effect.
The metricspec is zero or more of the following clauses:
DELETE
The metric is completely removed from outlog, both the metadata
and all values in results are expunged.
INDOM -> newdomain.newserial [ pick ]
Modifies the metadata to change the instance domain for this
metric. The new instance domain must exist in outlog.
The optional pick clause may be used to select one input value,
or compute an aggregate value from the instances in an input
result, or assign an internal instance identifier to a single
output value. If no pick clause is specified, the default be‐
haviour is to copy all input values from each input result to
an output result, however if the input instance domain is sin‐
gular (indom PM_INDOM_NULL) then the one output value must be
assigned an internal instance identifier, which is 0 by
default, unless overridden by an INST or INAME clause as
defined below.
The choices for pick are as follows:
OUTPUT FIRST
choose the value of the first instance from each
input result
OUTPUT LAST choose the value of the last instance from each
input result
OUTPUT INST instid
choose the value of the instance with internal
instance identifier instid from each result; the
sequence of rewriting rules ensures the OUTPUT pro‐
cessing happens before instance identifier renum‐
bering from any associated indom rule, so instid
should be one of the internal instance identifiers
that appears in inlog
OUTPUT INAME "name"
choose the value of the instance with name for its
external instance name from each result; the
sequence of rewriting rules ensures the OUTPUT pro‐
cessing happens before instance renaming from any
associated indom rule, so name should be one of the
external instance names that appears in inlog
OUTPUT MIN choose the smallest value in each result (metric
type must be numeric and output instance will be 0
for a non-singular instance domain)
OUTPUT MAX choose the largest value in each result (metric
type must be numeric and output instance will be 0
for a non-singular instance domain)
OUTPUT SUM choose the sum of all values in each result (metric
type must be numeric and output instance will be 0
for a non-singular instance domain)
OUTPUT AVG choose the average of all values in each result
(metric type must be numeric and output instance
will be 0 for a non-singular instance domain)
If the input instance domain is singular (indom PM_INDOM_NULL)
then independent of any pick specifications, there is at most
one value in each input result and so FIRST, LAST, MIN, MAX,
SUM and AVG are all equivalent and the output instance identi‐
fier will be 0.
In general it is an error to specify a rewriting action for the
same metadata or result values more than once, e.g. more than
one INDOM clause for the same instance domain. The one excep‐
tion is the possible interaction between the INDOM clauses in
the indom and metric rules. For example the metric sample.bin
is defined over the instance domain 29.2 in inlog and the fol‐
lowing is acceptable (albeit redundant):
indom 29.* { indom -> 109.* }
metric sample.bin { indom -> 109.2 }
However the following is an error, because the instance domain
for sample.bin has two conflicting definitions:
indom 29.* { indom -> 109.* }
metric sample.bin { indom -> 123.2 }
INDOM -> NULL [ pick ]
The metric (which must have been previously defined over an
instance domain) is being modified to be a singular metric.
This involves a metadata change and collapsing all results for
this metric so that multiple values become one value.
The optional pick part of the clause defines how the one value
for each result should be calculated and follows the same rules
as described for the non-NULL INDOM case above.
In the absence of pick, the default is OUTPUT FIRST.
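As a sketch of the NULL case, the rule below would collapse the
multiple values of sample.bin (the metric used in the earlier
illustration) into one value per result, here the sum across all
instances. Whether a sum is meaningful for a particular metric is an
assumption of the example, not something pmlogrewrite checks beyond
the metric type being numeric.

```
# collapse every result for sample.bin to a single summed value
metric sample.bin { indom -> NULL output sum }
```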
NAME -> newname
Renames the metric in the PCP archive's metadata that supports
the Performance Metrics Name Space (PMNS). newname should not
match any existing name in the archive's PMNS and must follow
the syntactic rules for valid metric names as outlined in
pmns(4).
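For illustration, a rename might be written as below. The new name
sample.bucket is hypothetical and is assumed not to exist already in
the archive's PMNS.

```
# sample.bucket is a hypothetical new name for this sketch
metric sample.bin { name -> sample.bucket }
```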
PMID -> newdomain.newcluster.newitem
Modifies the metadata and results to renumber the metric's
PMID. As special cases, newcluster could be an asterisk ``*''
which means use cluster from the metric rewriting rule and/or
newitem could be an asterisk which means use item from the metric
rewriting rule. This is most useful when cluster and/or item
is also an asterisk. So for example:
metric 30.*.* { pmid -> 123.*.* }
will move all metrics from domain 30 to domain 123.
SEM -> newsem
Change the semantics of the metric. newsem should be the XXX
part of the name of one of the PM_SEM_XXX macros defined in
<pcp/pmapi.h> or pmLookupDesc(3), e.g. COUNTER for
PM_SEM_COUNTER.
No data value rewriting is performed as a result of the SEM
clause, so the usefulness is limited to cases where a version
of the associated PMDA was exporting incorrect semantics for
the metric. pmlogreduce(1) may provide an alternative in cases
where re-computation of result values is desired.
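A sketch of the metadata-only correction described above follows. The
metric example.requests is hypothetical; it stands for a metric whose
PMDA is assumed to have exported the wrong semantics for values that
are in fact monotonically increasing.

```
# example.requests is a hypothetical metric name
# only the metadata changes, the stored values are not recomputed
metric example.requests { sem -> COUNTER }
```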
TYPE -> newtype
Change the type of the metric which alters the metadata and may
change the encoding of values in results. newtype should be
the XXX part of the name of one of the PM_TYPE_XXX macros
defined in <pcp/pmapi.h> or pmLookupDesc(3), e.g. FLOAT for
PM_TYPE_FLOAT.
Type conversion is only supported for cases where the old and
new metric types are both numeric, so PM_TYPE_STRING,
PM_TYPE_AGGREGATE and PM_TYPE_EVENT are not allowed. Even for the
numeric cases, some conversions may produce run-time errors, e.g.
integer overflow, or attempting to rewrite a negative value into an
unsigned type.
UNITS -> newunits [ RESCALE ]
newunits is six values separated by commas. The first 3 values
describe the dimension of the metric along the dimensions of
space, time and count; these are integer values, usually 0, 1
or -1. The remaining 3 values describe the scale of the met‐
ric's values in the dimensions of space, time and count. Space
scale values should be 0 (if the space dimension is 0), else
the XXX part of the name of one of the PM_SPACE_XXX macros,
e.g. KBYTE for PM_SPACE_KBYTE. Time scale values should be 0
(if the time dimension is 0), else the XXX part of the name of
one of the PM_TIME_XXX macros, e.g. SEC for PM_TIME_SEC.
Count scale values should be 0 (if the count dimension is 0),
else ONE for PM_COUNT_ONE.
The PM_SPACE_XXX, PM_TIME_XXX and PM_COUNT_XXX macros are
defined in <pcp/pmapi.h> or pmLookupDesc(3).
When the scale is changed (but the dimension is unaltered) the
optional keyword RESCALE may be used to choose value rescaling
as per the -s command line option, but applied to just this
metric.
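A sketch of a scale-only change follows. The metric example.xfer is
hypothetical and is assumed to be recorded with units of bytes, so the
dimension stays at 1,0,0 while the space scale moves from BYTE to
KBYTE.

```
# example.xfer is a hypothetical metric assumed to be in bytes
# metadata only: values in the archive are left untouched
metric example.xfer { units -> 1,0,0,KBYTE,0,0 }

# alternative form: also rescale the stored values to match the new units
# metric example.xfer { units -> 1,0,0,KBYTE,0,0 rescale }
```

The first form corresponds to case (a) described for the -s option,
and the commented-out alternative to case (b), limited to this one
metric.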
When changing the domain number for a metric or instance domain,
the new domain number will usually match an existing PMDA's domain
number. If this is not the case, then the new domain number should
not be randomly chosen; consult $PCP_VAR_DIR/pmns/stdpmid for
domain numbers that are already assigned to PMDAs.
EXAMPLES
The rules below promote the values of the per-disk IOPS metrics to
64-bit, either to allow aggregation over a long time period for
capacity planning, or because the PMDA has changed to export 64-bit
counters and old archives need to be converted so they can be
processed alongside new archives.
metric disk.dev.read { type -> U64 }
metric disk.dev.write { type -> U64 }
metric disk.dev.total { type -> U64 }
The instances associated with the load average metric kernel.all.load
could be renamed and renumbered by the rules below.
# for the Linux PMDA, the kernel.all.load metric is defined
# over instance domain 60.2
indom 60.2 {
inst 1 -> 60 iname "1 minute" -> "60 second"
inst 5 -> 300 iname "5 minute" -> "300 second"
inst 15 -> 900 iname "15 minute" -> "900 second"
}
If we decide to split the ``proc'' metrics out of the Linux PMDA, this
will involve changing the domain number for the PMID of these metrics
and the associated instance domains. The rules below would rewrite an
old archive to match the changes after the PMDA split.
# all Linux proc metrics are in 7 clusters
metric 60.8.* { pmid -> 123.*.* }
metric 60.9.* { pmid -> 123.*.* }
metric 60.13.* { pmid -> 123.*.* }
metric 60.24.* { pmid -> 123.*.* }
metric 60.31.* { pmid -> 123.*.* }
metric 60.32.* { pmid -> 123.*.* }
metric 60.51.* { pmid -> 123.*.* }
# only one instance domain for Linux proc metrics
indom 60.9 { indom -> 123.0 }
FILES
For each of the inlog and outlog archive logs, several physical files
are used.
archive.meta
metadata (metric descriptions, instance domains, etc.) for
the archive log
archive.0 initial volume of metrics values (subsequent volumes have
suffixes 1, 2, ...).
archive.index
temporal index to support rapid random access to the other
files in the archive log.
PCP ENVIRONMENT
Environment variables with the prefix PCP_ are used to parameterize the
file and directory names used by PCP. On each installation, the file
/etc/pcp.conf contains the local values for these variables. The
$PCP_CONF variable may be used to specify an alternative configuration
file, as described in pcp.conf(4).
SEE ALSO
PCPIntro(1), pmdaInstance(3), pmdumplog(1), pmlogger(1),
pmlogextract(1), pmloglabel(1), pmlogreduce(1), pmLookupDesc(3), pmns(4),
pcp.conf(4) and pcp.env(4).
DIAGNOSTICS
All error conditions detected by pmlogrewrite are reported on stderr
with textual (if sometimes terse) explanation.
Should the input archive log be corrupted (this can happen if the
pmlogger instance writing the log suddenly dies), then pmlogrewrite
will detect and report the position of the corruption in the file, and
any subsequent information from that archive log will not be processed.
If any error is detected, pmlogrewrite will exit with a non-zero sta‐
tus.
Performance Co-Pilot PMLOGREWRITE(1)