Plugins

Plugins connect data-producing applications to DSI core functionality. Each Plugin provides reader or writer functions: a reader works with existing data files or input streams, and a writer generates new data. Plugins are modular so that users can contribute their own.

Plugin contributors are encouraged to offer custom Plugin abstract classes and Plugin implementations. A contributed Plugin abstract class may extend another Plugin to inherit the properties of the parent. To be compatible with DSI core, Plugins should produce data in Python built-in data structures or in data structures from the Python collections library.
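As a rough illustration of the compatibility requirement above, a plugin's output could be held in an OrderedDict that maps column names to lists of values. The column names and values below are hypothetical, and the layout is an assumption rather than a statement of how DSI core stores data internally.

```python
from collections import OrderedDict

# Hypothetical column names and values, for illustration only.
output = OrderedDict()
output["hostname"] = ["node01", "node02"]
output["uid"] = [1000, 1000]

# Appending one more row means appending one value to every column.
new_row = ["node03", 1001]
for column, value in zip(output, new_row):
    output[column].append(value)

row_count = len(output["hostname"])
```

Only standard-library types are used here, so the result can be passed around without any plugin-specific classes.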

Note that any contributed Plugin or extension should include unit tests in plugins/tests to demonstrate the new Plugin capability.

Figure depicts the current DSI plugin class hierarchy.

class dsi.plugins.plugin.Plugin(path)

Plugin abstract class for DSI core product.

A Plugin connects a data reader or writer to a compatible middleware data structure.

abstract add_to_output(path)

Initialize Plugin setup.

Read a Plugin file. Return a Plugin object.

class dsi.plugins.metadata.StructuredMetadata(**kwargs)

Plugin superclass that provides helper methods for structured data.

add_to_output(row: list) → None

Adds a row of data to the output_collector and guarantees good structure. Useful in a plugin’s add_rows method.

schema_is_set() → bool

Helper method that reports whether the schema has been set.

set_schema(column_names: list, validation_model=None) → None

Initializes columns in the output_collector and column_cnt. Useful in a plugin’s pack_header method.
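The interplay of the three methods above can be sketched with a small stand-in class. StructuredSketch is hypothetical and is not the DSI implementation: the validation_model parameter is omitted, and the exact checks and exception types are assumptions made for illustration.

```python
from collections import OrderedDict


class StructuredSketch:
    """Hypothetical stand-in for StructuredMetadata; not the DSI class."""

    def __init__(self):
        self.output_collector = OrderedDict()
        self.column_cnt = None  # stays None until a schema is set

    def set_schema(self, column_names):
        # Create one empty column per name and remember the column count.
        for name in column_names:
            self.output_collector[name] = []
        self.column_cnt = len(column_names)

    def schema_is_set(self):
        return self.column_cnt is not None

    def add_to_output(self, row):
        # Reject rows that arrive before the schema or do not match its width.
        if not self.schema_is_set():
            raise RuntimeError("set_schema must be called before add_to_output")
        if len(row) != self.column_cnt:
            raise ValueError("row length does not match the number of columns")
        for name, value in zip(self.output_collector, row):
            self.output_collector[name].append(value)


plugin = StructuredSketch()
plugin.set_schema(["a", "b"])
plugin.add_to_output([1, 2])
```

Checking the row width against column_cnt is one way to read "guarantees good structure": every column ends up with the same number of entries.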

class dsi.plugins.env.Environment

Environment Plugins inspect the calling process’ context.

Environments assume a POSIX-compliant filesystem and always collect UID/GID information.
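A minimal sketch of collecting UID/GID information on a POSIX system follows. The function name posix_info and the dict keys are hypothetical; the Environment base class may gather and name these values differently.

```python
import os


def posix_info():
    # os.getuid and os.getgid are only available on POSIX platforms.
    return {"uid": os.getuid(), "gid": os.getgid()}


info = posix_info()
```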

class dsi.plugins.env.GitInfo(git_repo_path='./')

A Plugin to capture Git information.

Adds the current git remote and git commit to metadata.

add_rows() → None

Adds a row to the output with POSIX info, git remote, and git commit.

pack_header() → None

Sets the schema with POSIX and Git columns.

class dsi.plugins.env.Hostname(**kwargs)

An example Environment implementation.

This plugin collects the hostname of the machine, and couples this with the POSIX information gathered by the Environment base class.

add_rows() → None

Parses environment provenance data and adds the row.

pack_header() → None

Set schema with keys of prov_info.
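The pack_header and add_rows pattern described above can be sketched as follows: the collected values live in one dict, its keys become the schema, and its values become the row. The helper collect_prov_info and the "hostname" key are hypothetical, and the POSIX fields contributed by the base class are left out for brevity.

```python
import socket


def collect_prov_info():
    # Hypothetical helper: gather the values once, as a dict.
    return {"hostname": socket.gethostname()}


prov_info = collect_prov_info()

# Roughly what pack_header would hand to set_schema.
header = list(prov_info.keys())

# Roughly what add_rows would hand to add_to_output.
row = list(prov_info.values())
```

Deriving both the header and the row from the same dict keeps the column order and the value order aligned.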

class dsi.plugins.env.SystemKernel

Plugin for reading environment provenance data.

An environment provenance plugin which collects the following:

1. System kernel version
2. Kernel compile-time config
3. Kernel boot config
4. Kernel runtime config
5. Kernel modules and module config
6. Container information, if containerized

add_rows() → None

Parses environment provenance data and adds the row.

static get_cmd_output(cmd: list, ignore_stderr=False) → str

Runs a given command and returns the stdout if successful.

If stderr is not empty, an exception is raised with the stderr text.
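The behavior described above could be approximated with subprocess.run, as in this sketch. It is not the DSI source: the exception type and the exact meaning of ignore_stderr (here, suppressing the exception when stderr is non-empty) are assumptions.

```python
import subprocess
import sys


def get_cmd_output(cmd, ignore_stderr=False):
    # Run the command without a shell and capture both streams as text.
    proc = subprocess.run(cmd, capture_output=True, text=True)
    # Assumed semantics: any stderr text is an error unless explicitly ignored.
    if proc.stderr and not ignore_stderr:
        raise RuntimeError(proc.stderr)
    return proc.stdout


# Use the current Python interpreter as a portable example command.
out = get_cmd_output([sys.executable, "-c", "print('ok')"])
```

Passing the command as a list avoids shell interpretation of its arguments.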

get_kernel_bt_config() → dict

Kernel boot-time configuration is collected by looking at /proc/cmdline.

The output of this command is one string of boot-time parameters. This string is returned in a dict.

get_kernel_ct_config() → dict

Kernel compile-time configuration is collected by looking at /boot/config-(kernel version) and removing comments and empty lines.

The remaining lines are newline-delimited option=value pairs.
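Removing comments and empty lines and then splitting option=value pairs can be sketched as below. The function name parse_kernel_config and the sample text are illustrative assumptions; the actual method reads the file itself and may structure its dict differently.

```python
def parse_kernel_config(text):
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and comments such as "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        # Split on the first "=" only, so values may contain "=" themselves.
        key, _, value = line.partition("=")
        options[key] = value
    return options


# Illustrative sample, not taken from a real kernel config.
sample = "# comment\n\nCONFIG_SMP=y\nCONFIG_HZ=250\n"
config = parse_kernel_config(sample)
```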

get_kernel_mod_config() → dict

Kernel module configuration is collected with the “lsmod” and “modinfo” commands.

Each module name and its modinfo output are stored as a key-value pair in the returned dict.

get_kernel_rt_config() → dict

Kernel run-time configuration is collected with the “sysctl -a” command.

Each line of output takes one of two forms: option = value (note the spaces), or sysctl: permission denied … Only the option = value pairs are added to the output dict.
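Keeping the option = value lines and skipping everything else can be sketched as follows. The function name parse_sysctl and the sample text are hypothetical, and the wording of the permission-denied line in the sample is only a placeholder for whatever sysctl actually prints.

```python
def parse_sysctl(text):
    options = {}
    for line in text.splitlines():
        # Lines that start with "sysctl:" are diagnostics, not settings.
        if line.startswith("sysctl:"):
            continue
        # Split on the first " = " (with spaces); lines without it are skipped.
        key, sep, value = line.partition(" = ")
        if sep:
            options[key.strip()] = value.strip()
    return options


# Illustrative sample; the second line stands in for a permission error.
sample = (
    "kernel.ostype = Linux\n"
    "sysctl: permission denied (placeholder)\n"
    "vm.swappiness = 60\n"
)
runtime_config = parse_sysctl(sample)
```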

get_kernel_version() → dict

The kernel version is obtained with the “uname -r” command and returned in a dict.

get_prov_info() → str

Collect and return the different categories of provenance info.

pack_header() → None

Set schema with keys of prov_info.