Doxygen Trac – oml

Context Navigation

NAME

liboml2 - OML2 client library

SYNOPSIS

<oml2-enabled-app> [APP-OPTIONS]
            [--oml-id OML_NAME --oml-domain OML_DOMAIN
              [--oml-collect OML_COLLECT] | --oml-noop]
            [--oml-instr-interval SECONDS]
            [--oml-interval SECONDS | --oml-samples COUNT]
            [--oml-log-level -2..4] [--oml-log-file]
            [--oml-config liboml2.conf]
            [--oml-bufsize BYTES]
            [--oml-text|--oml-binary]
            [--oml-help] [--oml-list-filters]
            [--oml-…]

DESCRIPTION

liboml2 is the client library for OML2. It provides an API for application writers to collect measurements from their applications via user-defined Measurement Points (MPs). It also provides a flexible filtering and collection mechanism that allows application users to customize how measurements are processed and stored by an OML-enabled application. Data injected into an MP by the application is serialised into a Measurement Stream (MS), and sent to a Collection Point (e.g., an oml2-server(1) or a file). Measurement samples can also be filtered, pre-processed or aggregated using a set of built-in filters.

This man page documents the command-line user configuration of OML-enabled applications. Application writers who want to learn how to write an application using the liboml2 API should consult liboml2(3). OML also supports an external configuration file, documented in liboml2.conf(5).

When an OML application starts up, it passes the command line arguments to liboml2, which scans them for options that it understands, and uses those to configure itself. liboml2 then removes its own options from the command line so that the application proper does not get confused by them. All OML options start with the prefix --oml-. The --oml-help option lists all the known OML command line options.

The command line options recognized by liboml2 allow for a basic configuration of the measurement library. They are adequate for testing and debugging, but do not offer as much flexibility as the external configuration file format supports. You can pass a configuration file to liboml2 using the --oml-config option. For details, see liboml2.conf(5).

By default, any instrumentation starts with two OML-specific MPs. The first one, _experiment_metadata, contains information about the experiment currently running, as well as application specific details that would not justify a full MP on their own such as version, command line invokation, or unit or precision of some fields. The second automatic MP is _client_instrumentation. When enabled (see option --oml-instr-interval in OML OPTIONS below), the application will periodically inject status report into that stream. Status reports contain the following information

MEASUREMENT FILTERING

The liboml2 can process measurement tuples in-line before they are sent to a collection point. The filtering mechanism allows to select a subset of the fields of an MP which need to be sent along, apply aggregation or summary functions on them rather than sending them verbatim, and set the periodicity (time- or sample- based) at which aggregate reports are sent (see --oml-interval or --oml-samples in OML OPTIONS below).

When no configuration file is given, liboml2 provides a basic set of filters for each MP, and sends measurements to just one location (using the --oml-collect option). For each measurement point, each element of the measurement point’s injected tuple is given its own filter. The filter created depends on the type of the element and the current sampling policy.

For instance, suppose a measurement point defined with a measurement tuple as follows:

MeasurementPoint (name = "link_properties")
("source"      : OML_UINT64_VALUE,
 "destination" : OML_UINT64_VALUE,
 "length"      : OML_INT32_VALUE,
 "snr"         : OML_DOUBLE_VALUE,
 "name"        : OML_STRING_VALUE)

Then liboml2 will create a separate filter for each of source, destination, length, snr, and name. The filters for the first four numeric elements will be an averaging filter (filter type avg), and the last string element will be given a first filter. The first filter keeps the first injected value in the current sampling period and throws away all others, passing the first value on to the measurement output stage.

The exception to this rule is that if --oml-samples was given, and if the argument is 1, then numeric elements are filtered using the first filter, not the averaging filter.

For more information on OML filters and how they are configured, see the documentation for the configuration file format, liboml2.conf(5). For more information on measurement points and how they are defined, see liboml2(3) and omlc_add_mp(3).

MEASUREMENT STREAMS AND SCHEMAS

Measurement data is serialised using the OML Measurement Stream Protocol (OMSP). OMSP has two variant: binary marshalling, and text mode. The former is presumably more compact, but the latter is easier to read and generate.

Where MSs generated by an OML instrumentation are sent depends on the collection URI (see URI FORMAT below). If using the tcp scheme, it is sent to the named oml2-server(1), for storage in an SQL database, while using the file or flush schemes will create a local file with the given name containing OML text mode protocol which can later on be streamed (e.g., using nc(1)) to an oml2-server. Measurement points are created with a schema, as above, a schema being an ordered list of (name, type) pairs.

OML filters also generate output with a declared schema. For each measurement point, liboml2 generates a single output measurement that is the union of the outputs of all filters attached to the MP. The names of the fields (or columns) of the schema are derived from the names of the original MP fields, and the output schemas of the filters. The schemas can be observed directly with the file collection URI schemes (e.g., file:-) output (identical schemas are sent to the server when the tcp scheme is used). For instance, here is the output schema for a simple example MP that measures a string (label) and an integer (seq_no):

schema: 1 generator_lin label:string seq_no:uint32

The schema name is generator_lin — a combination of the application name (generator) and the measurement point name (lin). (The number 1 on this line is an index used in the output columns to identify a line of measurement with the schema to which it conforms.) This output was made using --oml-samples 1 on the command line. This creates a first filter for both of the fields of the measurement point. The first filter outputs a single value that has the same type as the filter’s input.

If we change the command line to use --oml-samples 2, then an averaging filter is used for the numeric seq_no field (label is unchanged). The schema therefore changes as well:

schema: 1 generator_lin label:string \
                        seq_no_avg:double \
                        seq_no_min:double \
                        seq_no_max:double

(liboml2 always emits schemas on a single line, but this schema is split into several for readability.)

An avg filter picks one field of the MP to filter (in this case seq_no) and then produces a 3-tuple as output (avg, min, max). Therefore liboml2 creates a schema for this filter output that looks like:

("seq_no_avg" : OML_DOUBLE_VALUE,
 "seq_no_min" : OML_DOUBLE_VALUE,
 "seq_no_max" : OML_DOUBLE_VALUE)

This is the general pattern for filters: their output schemas are formed by appending the name of the source MP with the name of the filter output field. (The first filter is an exception in that it just takes the name of the input field and uses that as the output field name.)

When a tcp collection URI is specified, the received oml2-server creates a database table for each measurement point using the combined OML output schema as schema for the table. For instance, the above example would translate to an SQL CREATE TABLE statement like:

CREATE TABLE generator_lin (label TEXT, seq_no_avg REAL, seq_no_min REAL, seq_no_max REAL);

Note that even though an MP field may have an integral type, it may be represented as a floating point type in the output because the filter may output floating point values. For instance, the average of a set of integers is real valued because of the division in the averaging operation.

Another situation may arise when a table by the same name, but with a different schema, already exists in the target database. In this case, the oml2-server(1) tries to incrementally number the new table’s name (e.g., mymp_2, mymp_3,… for stream mymp). It makes 9 attempts at renaming the table before failing. Indeed, it would be unwise for experiments that use a large number of instances of the same application with different filter configuration to not explicitly name each of them. This can be done through the use of a configuration file, with these rename attribute of the relevant mp element, see liboml2.conf(5).

STREAM COMPRESSION

The OML client library can be instructed (see URI FORMAT below) to compress its data streams (using the GZip algorithm). In this case, all input data, be it text or binary OMSP.

Currently, OML leaves it up to the zlib(3) deflate() algorithm to decide when to output compressed data, affording a good compression ratio, particularly for the text mode OMSP. However, this means that some delay will generally be introduced between the time when a sample is injected, and that when a compressed block containing it is output towards the collection point. It is therefore not recommended to use compression for applications where (near) real-time collection of data is required.

Note: for compressed streams to be successfully received by the collection point, it needs to have support for this feature. In the case of the oml2-server(1), this means that it should have been compiled and linked against the Zlib.

OML OPTIONS

--oml-id OML_NAME: Set the OML client’s identity to OML_NAME. This is used by the oml2-server(1) to distinguish measurements of the same type from different sources (for example, the same application running on a different machine). If --oml-id is not given, an error is printed and measurement collection will not occur (although the application may still run).
--oml-domain OML_DOMAIN
--oml-exp-id OML_DOMAIN: Set the OML client’s experimental domain to OML_DOMAIN. This is used by the oml2-server(1) to group measurements from different clients that logically belong to the same group. Multiple applications running on different machines can contribute measurements to the same experiment. If --oml-domain is not given, an error is written to the log file and measurement collection will not occur (although the application may still run). The --oml-exp-id option is obsolescent, and --oml-domain should be preferred.
--oml-collect OML_COLLECT
--oml-server OML_COLLECT: Send measurements to the OML server specified by the OML_COLLECT URI. Only one of --oml-collect or --oml-server can be active at a time. The format of OML_COLLECT is described in URI FORMAT below. The OML_COLLECT can specify the address and port at which either an oml2-server or an oml2-proxy-server is listening. The --oml-server option is obsolescent, and --oml-collect should be preferred.
--oml-file FILENAME: Write measurements to file FILENAME, in a human readable text format. This is an exact equivalent to --oml-collect file:FILENAME, see below. Only one of --oml-collect (or --oml-server) or*--oml-file* can be specified at a time. This command line option is obsolescent, and --oml-collect with the file scheme should be preferred.
--oml-noop: If this option is given, no measurements are collected, and the application does not attempt to connect to an OML server or write measurements to file.
--oml-instr-interval SECONDS: This option allows to set the periodicity of self-instrumentation reports into the _client_instrumentation MS. The defaults is 1000ms, and the feature can be disabled altogether by setting it to 0.
--oml-interval SECONDS: Make all measurement point filters produce an output periodically with a time period of SECONDS. Only one of --oml-interval and --oml-samples can be given on the same command line.
--oml-samples COUNT: Make all measurement point filters produce an output after every COUNT samples have been injected into them. Only one of --oml-samples and --oml-interval can be given on the same command line. If neither --oml-samples nor --oml-interval are given, then liboml2 behaves as if --oml-samples was given with an argument of COUNT=1.
--oml-config FILE: Read the contents of FILE and use them to configure the liboml2 client. See liboml2.conf(5) for details of the configuration file format. Generally, the configuration taken from FILE overrides any equivalents from the command line. Command line options that cannot be set using the configuration file are --oml-noop, --oml-instr-interval, --oml-bufsize, --oml-log-level, and --oml-log-file.
--oml-log-level n: Record logging information at a level of detail given by n, which should be an integer from 0 to 4. The default level of logging is 0, which prints ERROR, WARNING, and INFO messages. Levels 1 to 4 add gradually larger amounts of debug logging (DEBUG, DEBUG2, DEBUG3, DEBUG4). It is possible to set n to -1 for only ERROR and WARNING logging, or -2 for only ERROR logging.
--oml-log-file file: Write OML logging information to file. The amount of logging information recorded is controlled by --oml-log-level.

By default, the OML library logs messages to the application’s stderr, prefixed with the level of the messages. If logging to a file, messages are also prefixed with a timestamp. The application writer can override this behaviour by providing a custom logging function when calling omlc_init(3).
--oml-bufsize size (bytes): Set the total target size of internal buffers used by liboml2. The library maintains a chain of output buffers to accomodate for mismatches in the speed of sample inputs and OMSP outputs. Each of these buffers is allowed to grow to about 1024B in size before the client starts writing into the next one. The --oml-bufsize option specifies a lower bound on the maximum total buffer size. The default is 2048B (i.e., 2 buffers in the chain). If the number of buffers, calculated from these sizes, is exceeded, liboml2 will start dropping measurement data (with a message in the client log file), by overwriting data in an existing buffer rather than allocating a new one. Increasing the buffer size may prevent this from happening, depending on the application design.
--oml-text: Encode measurements using text OMSP when writing to either a local file or a remote server. Text format is easy for scripts to parse, with one measurement per line, and is the default when a file URI scheme is used. Only one of --oml-text and --oml-binary should be used on the same command line.
--oml-binary: Encode measurements using binary OMSP when writing to either a local file or a remote server. The binary format is the default when tcp URI scheme is given, and provides better performance. Only one of --oml-text and --oml-binary should be used on the same command line.
--oml-help: Prints a summary of the available OML options.
--oml-list-filters: This option prints the available filters to the console and then quits the application.

URI FORMAT

liboml2 accepts a uri argument for the --oml-collect option that is similar to an IETF URI (see e.g. RFC3986). The OML URI consists of an optional network protocol, a host identifier, and an optional port number, or a mandatory file (or flush )scheme and a local filesystem path. The format of the network server version is:

[tcp://]<host>[:<port>]

The formats for the local file version is:

(file|flush):<local-path>

For instance, tcp://collect.example.net:3003 will send measurements to an oml2-server listening on port 3003 on host collect.example.net, using TCP. The // is recommended for URIs with a host part, but not mandatory to preserve backward compatibility. Either a hostname or an IP address can be used as the <host> specifier. Both IPv4 and IPv6 addresses are supported, but it is mandatory to put the latter between square brackets; this is optional for the former. tcp://[2001:db8::1]:3003, tcp://[192.0.2.200]:3003 and tcp://192.0.2.200:3003 are all valid forms. The <port> specifier is optional, defaulting to port 3003. The tcp scheme is the default if this part is omitted.

Alternatively, file:/tmp/myfile.txt writes to the /tmp/myfile.txt file in the local filesystem. Relative paths are also accepted. There should be no double-slash after the colon: file://myfile.txt will try to write the output in the root directory. The special file URL file:- will write output to the standard output. The flush scheme behaves exactly in the same way as file except the file descriptor is flushed after each a measurement line is written out. This allows to prevent any size-based buffering in the C library from delaying the recording of samples. This is useful in case of, e.g., real time graphing of the data based on the contents of the file.

Both URI schemes can be prefixed by gzip+ to enable compression (see STREAM COMPRESSION above), e.g., gzip+tcp://collect.example.net:3003 or gzip+file:/tmp/myfile.txt. Note however that the main scheme is not optional in this case, and both gzip://collect.example.net:3003 and gzip:/tmp/myfile.txt are invalid.

ENVIRONMENT VARIABLES

liboml2 recognizes the following environment variables. Note that the equivalent command line options override any value set in an environment variable.

OML_NAME

The name to identify this client to the OML server, equivalent to the --oml-id command line option.

OML_DOMAIN

OML_EXP_ID

The name of the experiment to which this client’s measurements belong. Used for grouping measurements into the same database on the server. Equivalent to the --oml-domain command line option. OML_EXP_ID is obsolescent and kept for backward compatibility.

OML_CONFIG

The path to the configuration file to use to configure the client. Equivalent to the --oml-config command line option.

OML_COLLECT

OML_SERVER

The URI of the server to which the measurements should be sent. The value of this environment variable should be a URI in the same format specified above (URI FORMAT). If the URI specifies the file scheme, the measurements are written to the local text file specified in the URI, rather than being sent to an OML server. Equivalent to the --oml-collect command line option. OML_SERVER is obsolescent and kept for backward compatibility.

OML_FEATURES

A comma-separated list of run-time features to enable. Currently the following are recognized:

default-log-simple (default): All log output will be sent to stderr if no log file was selected using --oml-log-file, or if the argument of --oml-log-file was -, and the logging will use new-style logging to stderr. This means all log messages are prefixed with a label indicating the severity, i.e., ERROR, WARN, INFO, DEBUG, DEBUG2, etc.

BUGS

The selection of the first filter when --oml-samples 1 is used can be confusing for numeric MP fields because it results in a different schema in the measurement output compared to other possible configurations available from the command line, which use the avg filter. It is not clear whether this is a feature or a bug.

If a problem you are experiencing is not addressed in the FAQ (http://oml.mytestbed.net/projects/oml/wiki/FAQ_and_Support) nor already present in the list of know bugs (http://oml.mytestbed.net/projects/oml/issues). You could discuss it on the mailing list (details and archives at http://oml.mytestbed.net/tab/show?id=oml).

It is however advisable to open a ticket on our issue tracker at http://oml.mytestbed.net/projects/oml/issues/new. Don’t forget to include details such as client and server logs (at [--oml-log-level|-d] 2). It also helps if you can share the source code of a (minimal, if possible) example reliably triggering the problem.

SECURITY CONSIDERATIONS

oml2-server does not use any authentication, and should thus be considered insecure. It is intended to be deployed behind firewalls on a dedicated testbed network. It should not be run as a daemon on an open network. Future versions of OML may be re-designed to be suitable for use in insecure environments.