NAME

liboml2.conf - OML2 client library configuration file format

SYNOPSIS

<oml2-enabled-app> [APP-OPTIONS] --oml-config <config-file>

DESCRIPTION

liboml2 is the client library for OML2. It provides an API for application writers to collect measurements from their applications via user-defined Measurement Points (MPs). It also provides a flexible filtering and collection mechanism that allows application users to customize how measurements are processed and stored by an OML-enabled application.

This man page documents the format of the configuration file that liboml2 uses to configure itself at runtime. Application writers who want to learn how to write an application using the liboml2 API should consult liboml2(3). OML also supports simpler but less powerful configuration via the command line, for testing and debugging. It is documented in liboml2(1).

When an OML application starts up, it passes the command line arguments to liboml2, which scans them for options that it understands, and uses those to configure itself. liboml2 then removes its own options from the command line so that the application proper does not get confused by them. All OML options start with the --oml- prefix. For more information, see liboml2(1).

The --oml-config command line option accepts the name of a configuration file. If specified, this file is used to configure the library internally. All other --oml-* command line options are ignored when it is specified. The file is an XML document. The configuration file serves two main purposes: configuring the destinations for measurement outputs, and configuring the filtering of the MP inputs prior to sending them to their destinations. It is therefore important to understand how OML filters work.

The data path between a measurement point and the measurement destination looks like this:

        ---> F1 ---
       /           \
MP *--- ---> F2 --- ---> [ Collector ]---> [ File/Network ]
       \           /
        |   ...   |
        |         |
         --> FN --

The samples injected into the MP are sent to N filters. A "collector" takes the measurement streams output by the filters, multiplexes them into a single combined stream, and sends that stream on to the destination, which is either a file on the local disk or a remote OML server (see oml2-server(1)). Each collector sends measurements to just one destination, but the configuration can specify multiple collectors, each with a different destination.

Note that although the diagram above shows the outputs of the filters attached to one MP being sent to one collector, that same collector may also accept the output streams of filters attached to several other MPs. Further, the samples injected into an MP can be sent to a second (or third, etc.) set of filters and then on to another collector. This will become clear in the description of the configuration file format below.

Note that there is no round-robin scheme operating among the filters; all injected samples are sent to all filters attached to the MP.

There are two mechanisms available for generating outputs from filters. The first is the "samples" method. It causes a filter output sample to be generated for every n-th sample injected into the filter. The second is the "interval" method. It causes a filter output sample to be generated every t seconds. The parameters t and n can be specified in the configuration file. The output mechanism can be selected independently for each MP (or more accurately, for each <stream/> element in the configuration file).

An OML filter is a processing block that accepts a stream of typed tuples from an MP on its input, computes a function of the input tuples, and outputs the result as another stream of tuples. The schema of the input tuples can be different from the schema of the output tuples, i.e. they can differ in number, name and type. Further, the output samples can be generated at a different rate to the input samples. For instance, an output might be generated only on every tenth input sample. For more on MP and filter schemas and how measurement outputs are generated, see the section MEASUREMENT OUTPUT AND SCHEMAS below.

CONFIGURATION FILE FORMAT

The configuration file must be an XML file. The root element must be omlc, and its children must be collect elements. The following example shows the skeleton of an OML config file:

<omlc domain="my_experiment" id="my_source_id">
  <collect url="..." encoding="binary">
     ...
  </collect>
  ...
  <collect url="...">
     ...
  </collect>
</omlc>

The omlc element recognizes two attributes: domain and id. The domain attribute names the experimental domain that the client wants to join. If one of the destinations is a server, it also names the database in which oml2-server(1) will store the measurements. This is the same as the --oml-domain flag on the command line. The obsolescent experiment attribute can still be used for this purpose.

The id attribute identifies the source of these measurements. Typically it is used to identify the machine, but it is up to the experimenter what to put in the id field. This is the same as the --oml-id flag on the command line.

The encoding attribute of the collect element can be used to specify which protocol mode to use. binary selects the binary marshalling mechanism, which is the default, while text switches to text mode.
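As a sketch of how the encoding attribute might be used, the following configuration requests text mode for a single collector. The server address, MP name and field name are borrowed from the example later in this section; the domain and id values are placeholders.

```xml
<omlc domain="my_experiment" id="my_source_id">
  <!-- Request text mode instead of the default binary marshalling -->
  <collect url="tcp:192.0.2.200" encoding="text">
    <stream mp="udp" samples="10">
      <filter field="udp_len" />
    </stream>
  </collect>
</omlc>
```

Omitting the encoding attribute, or setting it to binary, should leave the collector in the default binary mode.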

The collect elements identify separate destinations for the measurements generated by the client programme. The url attribute identifies the destination. It can be either a file, or the IP address (or hostname) and port of an oml2-server(1). See URI FORMAT in liboml2(1). Within the collect element should be a sequence of stream elements. Each stream element defines the sampling policy and filtering to apply to a measurement stream derived from a particular named MP. Here is an example of a configuration with two stream elements:

<omlc domain="my_experiment" id="my_source_id">
  <collect url="tcp:192.0.2.200">
     <stream mp="radiotap" interval="2">
      <filter field="sig_strength_dBm" />
      <filter field="noise_strength_dBm" />
      <filter field="power" />
     </stream>
     <stream mp="udp" samples="10">
      <filter field="udp_len" />
     </stream>
  </collect>
</omlc>

This example connects to a server on the default TCP port (3003) at IP address 192.0.2.200, and extracts measurements from two of the application’s defined measurement points, named radiotap and udp, respectively. A radiotap output is generated every two seconds, and a udp output is generated for every 10 injected samples. For the radiotap measurements, three filters are defined: one for the sig_strength_dBm field, one for the noise_strength_dBm field, and one for the power field. For the udp measurements, a single filter is established on the udp_len field.
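Because each collect element is a separate destination, a configuration can split its streams across several collectors. The following is a minimal sketch that reuses the MPs and fields of the example above. Sending the udp stream to file:- rather than to the server is an illustrative choice, not a recommendation.

```xml
<omlc domain="my_experiment" id="my_source_id">
  <!-- Collector 1: radiotap measurements go to the server -->
  <collect url="tcp:192.0.2.200">
    <stream mp="radiotap" interval="2">
      <filter field="sig_strength_dBm" />
      <filter field="noise_strength_dBm" />
    </stream>
  </collect>
  <!-- Collector 2: udp measurements go to a second destination -->
  <collect url="file:-">
    <stream mp="udp" samples="10">
      <filter field="udp_len" />
    </stream>
  </collect>
</omlc>
```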

The stream element also accepts a name attribute that allows the stream name to be controlled by the user. If name is omitted then the stream is named according to <app-name>_<mp-name>, where <app-name> is the application name as set by the application developer, and <mp-name> is the name of the MP. If name is used then its value is used to construct the stream name as <app-name>_<stream-name>. A stream must have a unique name, so if the user wants to create two streams from the same MP then at least one of them must have a name attribute. Omitting it, or having two streams with identical name attributes, will result in the liboml2 configuration process aborting with an error message about the duplicate stream in the OML client log file.
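As an illustration of the naming rule, the sketch below derives two streams from the same udp MP and gives each an explicit name so that the stream names stay unique. The names udp_by_count and udp_by_time are hypothetical and chosen only for this example.

```xml
<omlc domain="my_experiment" id="my_source_id">
  <collect url="tcp:192.0.2.200">
    <!-- Stream name becomes <app-name>_udp_by_count -->
    <stream mp="udp" name="udp_by_count" samples="10">
      <filter field="udp_len" />
    </stream>
    <!-- Stream name becomes <app-name>_udp_by_time -->
    <stream mp="udp" name="udp_by_time" interval="2">
      <filter field="udp_len" />
    </stream>
  </collect>
</omlc>
```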

Filters operate on a single scalar input value. The filter element establishes a filter and the field attribute selects the field of the MP that should form the input for the filter. The field attribute is mandatory.

Without any further attributes, the filter element establishes a default filter. The default filter type is avg for numeric values and first for non-numeric values such as strings.

The filter element also recognizes the operation attribute, which allows the user to select what type of filter to apply. For instance, the following selects a standard deviation filter, stddev:

<filter field="udp_len" operation="stddev"/>

The rename attribute allows the user to name the stream output from this filter:

<filter field="udp_len" operation="stddev" rename="udp_measurements"/>
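Put together, default and explicit filters can be mixed within one stream. The following sketch reuses the radiotap MP from the earlier example. It assumes that all three fields are numeric, and the output name tx_power is hypothetical.

```xml
<omlc domain="my_experiment" id="my_source_id">
  <collect url="tcp:192.0.2.200">
    <stream mp="radiotap" interval="2">
      <!-- No operation attribute: default filter (avg for a numeric field) -->
      <filter field="sig_strength_dBm" />
      <!-- Explicit filter type -->
      <filter field="noise_strength_dBm" operation="stddev" />
      <!-- Explicit filter type plus a user-chosen output name -->
      <filter field="power" operation="avg" rename="tx_power" />
    </stream>
  </collect>
</omlc>
```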

It is possible to include several stream elements using the same mp attribute value. In that case, to avoid ambiguity the second will be internally renamed to "<name>_2", the third to "<name>_3", etc. This renaming will appear in the schema of the measurement outputs (either in the local file or the database on the server end). This behaviour may be augmented in a future version to give more control of the renaming to the user.

CONFIGURATION WITHOUT XML FILE

When no configuration file is given, liboml2 provides a basic set of filters for each measurement point, and sends measurements to just one collection URI (given by the --oml-collect command line option). For each measurement point, each element of the measurement point’s injected tuple is given its own filter. The filter created depends on the type of the element and the current sampling policy.

For instance, suppose a measurement point is defined with a measurement tuple as follows:

("source"      : OML_UINT64_VALUE,
 "destination" : OML_UINT64_VALUE,
 "length"      : OML_INT32_VALUE,
 "snr"         : OML_DOUBLE_VALUE,
 "name"        : OML_STRING_VALUE)

Then liboml2 will create a separate filter for each of "source", "destination", "length", "snr", and "name". The four numeric elements will each be given an averaging filter (filter type avg), and the final string element will be given a first filter. The first filter keeps the first injected value in the current sampling period and throws away all others, passing the first value on to the measurement output stage.
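For comparison, roughly the same set of filters could be requested explicitly in a configuration file. This is only a sketch: the MP name packet is hypothetical (the tuple above is not given a name), and samples="10" and the file:- destination stand in for whatever sampling policy and collection URI would be selected on the command line.

```xml
<omlc domain="my_experiment" id="my_source_id">
  <collect url="file:-">
    <!-- "packet" is a hypothetical MP name; samples="10" is an arbitrary policy -->
    <stream mp="packet" samples="10">
      <filter field="source" operation="avg" />
      <filter field="destination" operation="avg" />
      <filter field="length" operation="avg" />
      <filter field="snr" operation="avg" />
      <filter field="name" operation="first" />
    </stream>
  </collect>
</omlc>
```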

For more information on measurement points and how they are defined, see liboml2(3) and omlc_add_mp(3).

MEASUREMENT OUTPUT AND SCHEMAS

The measurement output of an OML program goes either to an SQL database (if using a network address in the collect element’s url attribute) or a file (if using the file:// url protocol). Measurement points are created with a schema, as above, a schema being an ordered list of (name, type) pairs.

OML filters also generate output with a declared schema. For each measurement stream, liboml2 generates a single output measurement that is the union of the outputs of all filters attached to the MP. The names of the fields (or columns) of the schema are derived from the names of the original MP fields, and the output schemas of the filters. The schemas can be observed directly in the file output (identical schemas are sent to the server when a server is used). For instance, here is the output schema for a stream that takes its inputs from a simple example MP that measures a string ("label") and an integer ("seq_no"):

schema: 1 generator_lin label:string seq_no:uint32

The schema name is "generator_lin", a combination of the application name ("generator") and the stream name ("lin"). (The number 1 on this line is an index used in the output columns to identify a line of measurement with the schema to which it conforms.) This output can be generated using a stream element with samples="1" and no explicit filter:

<omlc domain="my_experiment" id="my_source_id">
 <collect url="file:-">
   <stream mp="lin" samples="1" />
 </collect>
</omlc>

This creates a first filter for both of the fields of the measurement point. The first filter outputs a single value that has the same type as the filter’s input.

If we change the configuration file to use samples="2", then an averaging filter is used for the numeric "seq_no" field ("label" is unchanged). The schema therefore changes as well:

schema: 1 generator_lin label:string seq_no_avg:double seq_no_min:double seq_no_max:double
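For reference, a configuration that should produce this second schema differs from the previous one only in the samples attribute:

```xml
<omlc domain="my_experiment" id="my_source_id">
  <collect url="file:-">
    <!-- samples="2": numeric fields get the default avg filter -->
    <stream mp="lin" samples="2" />
  </collect>
</omlc>
```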

An avg filter picks one field of the MP to filter (in this case "seq_no") and then produces a 3-tuple as output (avg, min, max). Therefore liboml2 creates a schema for this filter output that looks like:

("seq_no_avg" : OML_DOUBLE_VALUE,
 "seq_no_min" : OML_DOUBLE_VALUE,
 "seq_no_max" : OML_DOUBLE_VALUE)

This is the general pattern for filters: the names in their output schemas are formed by appending the name of the filter output field to the name of the source MP field, separated by an underscore. (The first filter is an exception in that it just takes the name of the input field and uses that as the output field name.)

When output is sent to a server, a database table is created for each measurement stream, using the combined OML output schema as the schema for the table. For instance, the above example would translate to an SQL CREATE statement like:

CREATE TABLE generator_lin (label TEXT, seq_no_avg REAL, seq_no_min REAL, seq_no_max REAL);

Note that even though an MP field may have an integral type, it may be represented as a floating point type in the output because the filter may output floating point values. For instance, the average of a set of integers is real valued because of the division in the averaging operation.

If we use the name attribute of the <stream/> element, the name of the schema will follow the name attribute rather than the name of the source MP. For instance consider this configuration file:

<omlc domain="my_experiment" id="my_source_id">
  <collect url="file:-">
     <stream mp="lin" samples="1" name="foo"/>
  </collect>
</omlc>

It will generate the following schema declaration:

schema: 1 generator_foo label:string seq_no:uint32

AVAILABLE FILTERS

The following lists the filters that are available in OML, and describes how they should be used. We plan to add more filters with each release of OML.

First Filter (first)

This filter saves the first sample in a sample set and throws away all the rest, outputting just the first sample. It accepts any type of value as its input. It outputs a single value:

("first" : OML_INPUT_VALUE)

The pseudo-type OML_INPUT_VALUE indicates that this filter’s output has the same type as its input.

To use this filter, use operation="first" in the filter element.

Last Filter (last)

This filter saves the last sample in a sample set and throws away all the rest, outputting just the last sample. It accepts any type of value as its input. It outputs a single value:

("last" : OML_INPUT_VALUE)

The pseudo-type OML_INPUT_VALUE indicates that this filter’s output has the same type as its input.

To use this filter, use operation="last" in the filter element.

Averaging Filter (avg)

This filter computes the average of its input samples. It accepts numeric inputs only (one of the OML integer types or OML_DOUBLE_VALUE). It outputs three values, namely:

("avg" : OML_DOUBLE_VALUE,
 "min" : OML_DOUBLE_VALUE,
 "max" : OML_DOUBLE_VALUE)

where avg is the average over the current sample set, min is the minimum value of the current sample set, and max is the maximum value of the current sample set.

To use this filter, use operation="avg" in the filter element.

Standard Deviation Filter (stddev)

This filter computes the standard deviation and variance of its input samples. It accepts numeric inputs only (one of the OML integer types or OML_DOUBLE_VALUE). It outputs a pair of values, namely:

("stddev"   : OML_DOUBLE_VALUE,
 "variance" : OML_DOUBLE_VALUE)

where stddev is the standard deviation over the current sample set and variance is the variance (i.e. the square of stddev).

To use this filter, put operation="stddev" in the filter element.

Sum Filter (sum)

This filter computes the sum of its input samples. It accepts numeric inputs only (one of the OML integer types or OML_DOUBLE_VALUE). It outputs a single value, namely:

("sum" : OML_DOUBLE_VALUE)

where sum is the sum of all the sample values in the current sample set.

To use this filter, use operation="sum" in the filter element.

Delta Filter (delta)

This filter computes the change in its input value between the end of the previous sample set and the end of the current one. If the value at the end of the previous sample set was last and the value at the end of the current sample set is current, then the filter computes delta=current-last. It accepts numeric inputs only (one of the OML integer types or OML_DOUBLE_VALUE). It outputs a pair of values, namely:

("delta" : OML_DOUBLE_VALUE,
 "last"  : OML_DOUBLE_VALUE)

where delta is the change in the input over the current sample set and last is the value that the input had at the end of the current sample set.

The value of delta in the first sample set is computed as current-first, where first is the first value in the sample set and current is the final value.

To use this filter, use operation="delta" in the filter element.
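The sketch below shows several of these filters selected in one stream. It reuses the example tuple from CONFIGURATION WITHOUT XML FILE with a hypothetical MP name, packet. The output field names in the comments are what the naming pattern described in MEASUREMENT OUTPUT AND SCHEMAS suggests, and should be treated as expectations rather than verified output.

```xml
<omlc domain="my_experiment" id="my_source_id">
  <collect url="file:-">
    <!-- "packet" is a hypothetical MP name -->
    <stream mp="packet" interval="2">
      <!-- Keeps the first value; output field expected to be "name" -->
      <filter field="name" operation="first" />
      <!-- Keeps the last value of each sample set -->
      <filter field="source" operation="last" />
      <!-- Expected output field: length_sum -->
      <filter field="length" operation="sum" />
      <!-- Expected output fields: snr_stddev, snr_variance -->
      <filter field="snr" operation="stddev" />
      <!-- Expected output fields: destination_delta, destination_last -->
      <filter field="destination" operation="delta" />
    </stream>
  </collect>
</omlc>
```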

NOTES

Prior to OML 2.6, the configuration file format was identical in structure but used a different set of names for the XML attributes and elements. These old names were very confusing, so they have been replaced by names that better reflect the underlying concepts. The old names are still supported so old configuration files and tools will not break. Here is a summary of the old names and how they relate to the new ones:

OLD                              NEW
<omlc exp_id="abc" …>            <omlc domain="abc" … >
<mp name="def" rename="bar"…>    <stream mp="def" name="bar"…>
<f fname="avg" …>                <filter operation="avg" …>
<f pname="foo" …>                <filter field="foo" …>
<f sname="bar" …>                <filter rename="bar" …>
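To illustrate the mapping, here is a sketch of a stream definition written with the current names, with the corresponding pre-2.6 spelling shown in a comment. The names def, bar and foo are the placeholders used in the summary above. The old-style fragment is inferred from that summary rather than taken from a pre-2.6 reference, and the samples attribute is assumed to be unchanged because the summary does not mention it.

```xml
<!-- Pre-2.6 names (inferred from the summary above):
     <mp name="def" rename="bar" samples="1">
       <f pname="foo" fname="avg" />
     </mp>
-->
<stream mp="def" name="bar" samples="1">
  <filter field="foo" operation="avg" />
</stream>
```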

BUGS

The selection of the first filter when samples=1 is used can be confusing for numeric MP fields because it results in a different schema in the measurement output compared to other possible configurations available from the command line, which use the avg filter. It is not clear whether this is a feature or a bug.

If a problem you are experiencing is neither addressed in the FAQ (http://oml.mytestbed.net/projects/oml/wiki/FAQ_and_Support) nor already present in the list of known bugs (http://oml.mytestbed.net/projects/oml/issues), you can discuss it on the mailing list (details and archives at http://oml.mytestbed.net/tab/show?id=oml).

It is however advisable to open a ticket on our issue tracker at http://oml.mytestbed.net/projects/oml/issues/new. Don’t forget to include details such as client and server logs (at [--oml-log-level|-d] 2). It also helps if you can share the source code of a (minimal, if possible) example reliably triggering the problem.

SECURITY CONSIDERATIONS

oml2-server does not use any authentication, and should thus be considered insecure. It is intended to be deployed behind firewalls on a dedicated testbed network. It should not be run as a daemon on an open network. Future versions of OML may be re-designed to be suitable for use in insecure environments.