Syntax: Program Matching Rules
To write program matching rules, which Pepperdata compares against every running process, you need to know the required YAML elements. It also helps to know which pid files the programs that you want to monitor generate, and the process names of other programs running on your hosts, so that you can tell what distinguishes their names from those of the custom programs you want to monitor.
We recommend that you study the examples and use them as models for your program matching rules.
Example: Annotated Resource Manager Matching Rule
This example shows a typical program matching rule, for the Resource Manager. The rule matches any program pid whose launch command matches any (one or more) of the following conditions:
- Contains substring “-Dproc_resourcemanager” AND whose pid matches the process id string in the /var/run/hadoop-yarn/yarn-yarn-resourcemanager.pid file
- Contains substring “ResourceManager”
- Contains regex “Dproc_resourcemanager.*”
When Pepperdata finds a process that matches this program matching rule, PepAgent monitors the process throughout its lifetime and displays it in the Pepperdata dashboard with the label “ResourceManager”.
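The rule described above might be written roughly as follows. This is a sketch reconstructed from the three conditions listed in this section and from the key names documented on this page; it is not copied from a shipped configuration file. In particular, the list layout under pid-locations and command-match, and the split of the three conditions into three separate rules, are assumptions.

```yaml
# Sketch only: the layout is inferred from the key descriptions on this page.
programs:
  ResourceManager:            # label shown in the Pepperdata dashboard
    active: yes               # optional; yes is the default
    rules:
      # Condition 1: the launch command contains the substring AND the
      # pid matches the pid recorded in the pid file.
      - pid-locations:
          - /var/run/hadoop-yarn/yarn-yarn-resourcemanager.pid
        command-match:
          - substring: "-Dproc_resourcemanager"
      # Condition 2: the launch command contains the substring.
      - command-match:
          - substring: "ResourceManager"
      # Condition 3: the launch command matches the Java regex.
      - command-match:
          - regex: "Dproc_resourcemanager.*"
```

Because a process is monitored when any one of the conditions matches, the broad second condition would also catch other processes whose launch command happens to contain “ResourceManager”.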
Override Preconfigured Program Monitoring
By default, Pepperdata software is preconfigured to monitor Impala®, Apache Spark History Server, and MapReduce Job History Server processes. If you do not want to monitor these programs, you can override their program matching rules in your own YAML file for program matching rules.
The program matching rules for the preconfigured program monitoring are in the /opt/pepperdata/supervisor/lib/pepagent-program-monitor-config-default.yaml file. Do not edit this file, but use it as a reference to find the applicable configuration labels. For this file’s listing, see Preconfigured Custom Program Monitoring.
To configure your overrides, create rules in your custom rules file (the file specified by the pepperdata.agent.program.monitor.configPath property) as follows:
- To deactivate program matching for a given label, add the label to your custom rules file, and assign the active key a value of “no”.
- To replace the default matching rules with custom rules, add the label and a rules dictionary to your custom rules file. Your rules completely replace the default rules so that the default rules are not applied.
- To add rules for a preconfigured program label, add the label to your custom rules file, and add the new rules to an add-rules key. The syntax for the add-rules key is identical to that of the rules key. Your rules are added to the default rules.
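The three kinds of override can be sketched in a single custom rules file. All three labels below (SparkHistoryServer, JobHistoryServer, Impalad) and all match strings are hypothetical placeholders; the real labels must be looked up in the default configuration file named above. The list layout under command-match is also an assumption.

```yaml
# Sketch of a custom rules file; labels and match strings are hypothetical.
programs:
  # 1. Deactivate a preconfigured program: set active to no.
  SparkHistoryServer:
    active: no

  # 2. Replace the default rules: supplying a rules key means the
  #    default rules for this label are no longer applied.
  JobHistoryServer:
    rules:
      - command-match:
          - substring: "JobHistoryServer"

  # 3. Extend the default rules: add-rules uses the same syntax as
  #    rules, and its entries are added to the defaults.
  Impalad:
    add-rules:
      - command-match:
          - regex: "impalad.*"
```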
YAML Sections: Program Matching Rules
Program matching rules in the YAML file have the following structure (all keys are required unless labeled optional):
- programs: Top-level dictionary; “programs” specifies Pepperdata custom program matching.
- Each entry in programs is a program monitoring definition. A program monitoring definition is identified by a label of the form \\w++ (from Java Regex definition constructs). The label must be unique within the programs dictionary. For example, a label might be “NodeManager”.
- Each label has the following keys:
  - (optional; default=yes) active: yes|no
  - rules: <list of program matching rules>
- Each program matching rule is a dictionary with the following keys:
  - (optional) pid-locations: One or more pid files that might contain the pid to monitor. If the program daemon does not generate pid files, do not include this key.
  - command-match: One or more dictionaries that contain a substring or regex to match.
  - (optional) ignore-match: One or more dictionaries that contain a substring or regex that, if matched, removes the child (forked process) match from the result set.
- Each command-match key is a dictionary with the following keys:
  - [ regex: a regex pattern in Java regex syntax | substring: case-sensitive character string ]
  - (optional) ignore-match: One or more dictionaries that contain a substring or regex to match.
- Each ignore-match key has one of the following keys to specify a substring or regex that, if matched, removes the child match from the result set:
  - regex: a regex pattern in Java regex syntax
  - substring: case-sensitive character string
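The structure described above can be sketched as a skeleton for a “NodeManager” label. The pid file path and all match strings are hypothetical, modeled by analogy on the Resource Manager example earlier on this page. The exact nesting of lists, and the placement of ignore-match at the rule level, are assumptions drawn from the key descriptions.

```yaml
# Skeleton only: paths and match strings are hypothetical placeholders.
programs:                     # top-level dictionary
  NodeManager:                # label; must be unique within programs
    active: yes               # optional; default is yes
    rules:                    # required: list of program matching rules
      - pid-locations:        # optional; omit if the daemon writes no pid file
          - /var/run/hadoop-yarn/yarn-yarn-nodemanager.pid
        command-match:        # required: one or more substring/regex entries
          - substring: "-Dproc_nodemanager"   # case-sensitive
          - regex: "Dproc_nodemanager.*"      # Java regex syntax
        ignore-match:         # optional: drops matching child (forked) processes
          - substring: "container-executor"
```

Running the rules linter mentioned under Best Practices is the way to confirm whether a given layout is actually accepted.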
Best Practices: Writing Program Matching Rules
There is so much flexibility when writing program matching rules that it’s easy to end up monitoring more programs than you intended or with data from so many programs that it’s difficult to find what you’re looking for on the Pepperdata dashboard. But by following a few best practices around naming, command matching, and referencing appropriate pid files, you can ensure an easy-to-use result.
- Choose program label names that are easy to understand and unambiguous. Remember that the labels appear in the Pepperdata dashboard.
- If there is a program pid file for the program that you want to monitor, specify it in the rule. This enables PepAgent to take an optimized code path for faster matching as it scans the process tree.
- Use as specific a matching rule as possible for your program of interest to ensure that your matching rules do not match processes that you do not want to monitor.
- Use the custom program matching rules linter in the Pepperdata package to ensure that your rules are valid; see Verify and Validate Program Matching Rules.