Configure Pepperdata Logs Retention and Disk Usage (RPM/DEB)
By default, Pepperdata sets a disk usage cap (PD_MAX_LOG_DIR_SIZE) of 5 GB as the maximum size for its accumulated metrics and message log files.
So long as this cap is not reached, Pepperdata retains log files for seven (7) days (PD_MAX_LOG_AGE_DAYS) before deleting them, and the Pepperdata Collector (the pepcollectd daemon) uploads data that is up to seven (7) days old (PD_LOG_PROC_MAX_AGE_DAYS).
When the disk usage cap is reached, Pepperdata deletes enough log files, starting with the oldest ones, to reduce the disk usage to less than the cap.
Although the age caps (limits on how long log files are eligible for uploading and when they are ready for deletion) can be important for business requirements such as retaining sensitive files for a given amount of time or for custom processing, the PD_MAX_LOG_DIR_SIZE size cap is the appropriate focus for controlling disk usage.
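To see how close a host is to either limit, a read-only check along the following lines may help. This is a minimal sketch, not a Pepperdata tool: the function names log_dir_usage_bytes, is_under_cap, and files_older_than are hypothetical, and it assumes that usage can be approximated with du and file age with modification time, which may differ from how Pepperdata itself measures them.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting a log directory; they only read, never delete.

# Print the approximate disk usage of a directory, in bytes.
# du -sk reports 1024-byte units, so the result is a multiple of 1024.
log_dir_usage_bytes() {
    kb=$(du -sk "$1" | awk '{print $1}')
    echo $((kb * 1024))
}

# Succeed (exit status 0) if the directory's usage is below the given cap in bytes.
is_under_cap() {
    [ "$(log_dir_usage_bytes "$1")" -lt "$2" ]
}

# List regular files whose modification time is older than the given number of days.
# find rounds age down to whole days, so +7 matches files at least 8 days old.
files_older_than() {
    find "$1" -type f -mtime +"$2"
}
```

With the defaults described above, is_under_cap /var/log/pepperdata 5368709120 would indicate whether the directory is still below the 5 GB cap, and files_older_than /var/log/pepperdata 7 would list files past the seven-day age.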
To override the default disk usage cap and/or log retention policies, you can add any of the following environment variables to the Pepperdata configuration.
For RPM/DEB-based installations, add the environment variables to the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh.
For Parcel-based installations that are managed through Cloudera Manager, add the environment variables to the appropriate Cloudera Manager template.
See the procedure for details.
- PD_LOG_DIR: (default=/var/log/pepperdata) Directory to which Pepperdata writes its log files.
- PD_MAX_LOG_DIR_SIZE: (default=5368709120, which is 5 GB) Size cap (maximum total size), in bytes, of all the log files in the directory specified by the PD_LOG_DIR environment variable (default=/var/log/pepperdata).
  When the PepAgent (the pepagentd daemon) starts, it verifies that there is sufficient capacity on the partition where PD_LOG_DIR is located. If the capacity is less than PD_MAX_LOG_DIR_SIZE, the PepAgent will not start.
- PD_MAX_LOG_AGE_DAYS: (default=the value of PD_LOG_PROC_MAX_AGE_DAYS) Number of days a log file is retained before Pepperdata deletes it.
- PD_LOG_PROC_MAX_AGE_DAYS: (default=7) Maximum age, in days, of a log file that the Pepperdata Collector (the pepcollectd daemon) will upload to the Pepperdata dashboard.
  If you lose connectivity to Pepperdata for longer than the PD_LOG_PROC_MAX_AGE_DAYS value, pepcollectd will be unable to upload the log file before it exceeds PD_LOG_PROC_MAX_AGE_DAYS, and the log file's data will be lost.
- PD_ARCHIVE_DIR: (no default) Directory in which to archive old log files instead of deleting them when they exceed the maximum age (the PD_LOG_PROC_MAX_AGE_DAYS environment variable value). Not applicable unless the PD_CLEAN_LOG_DIR environment variable is enabled (its value set to 1).
- PD_CLEAN_LOG_DIR: (default=1/enabled) Enables or disables Pepperdata's cleaning (deleting or archiving) of its log files.
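Because PD_MAX_LOG_DIR_SIZE is given in bytes, the value is easy to mistype. The following sketch shows the conversion, assuming binary gigabytes (1 GB = 1024 × 1024 × 1024 bytes), which is consistent with the documented default of 5368709120 for 5 GB. The helper name gb_to_bytes and the 10 GB and 14-day values are illustrative assumptions only, not recommendations.

```shell
#!/bin/sh
# Hypothetical helper: convert a whole number of GB (1 GB = 1024^3 bytes) to bytes.
gb_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Print export lines that could be pasted into /etc/pepperdata/pepperdata-config.sh.
# The values are examples only; choose ones that suit your hosts.
echo "export PD_MAX_LOG_DIR_SIZE=$(gb_to_bytes 10)"
echo "export PD_LOG_PROC_MAX_AGE_DAYS=14"
echo "export PD_MAX_LOG_AGE_DAYS=14"
```

Running the sketch prints the size line as export PD_MAX_LOG_DIR_SIZE=10737418240. Printing the lines, rather than computing the value inside the configuration file, keeps a plain literal in pepperdata-config.sh that matches the format shown in the procedure.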
Procedure
- Add the environment variables that you want to configure.
  - On any host in the cluster, open the Pepperdata configuration file, /etc/pepperdata/pepperdata-config.sh, for editing.
  - Add any of the disk usage environment variables, in the following format. Be sure to replace THE-VARIABLE-NAME and the-variable-value with the actual environment variable's name and value.
      export THE-VARIABLE-NAME=the-variable-value
  - Save your changes and close the file.
  - (Only for the PD_LOG_DIR environment variable) Add the associated property, pepperdata.log.baseDir, to the Pepperdata site file.
    By default, the Pepperdata site file, pepperdata-site.xml, is located in /etc/pepperdata. If you customized the location, it is specified by the PD_CONF_DIR environment variable. See Change the Location of pepperdata-site.xml for details.
    Be sure that you set the logging directory environment variable and property to the same location. If the locations do not match, not all metrics are sent to Pepperdata, and not all metric log files are deleted or archived.
      <property>
        <name>pepperdata.log.baseDir</name>
        <value>your/pepperdata/log/dir</value>
      </property>
  - (Only for the PD_LOG_DIR environment variable) Restart the ResourceManagers and NodeManagers.
- On every host in the cluster, restart the PepCollector and PepAgent services.
  Although restarting the PepAgent is optional, we recommend restarting it to prevent:
    - Subsequent, difficult-to-diagnose Pepperdata startup failure. When the pepagentd daemon starts, it verifies that there is sufficient capacity on the partition where PD_LOG_DIR is located. If the capacity is less than the PD_MAX_LOG_DIR_SIZE size cap, the PepAgent will not start.
    - Disk write errors. If you lowered the PD_MAX_LOG_DIR_SIZE size cap because the associated partition's capacity was reduced for any reason, but did not restart the PepAgent, the PepAgent could attempt to write a log file when there is insufficient capacity. The result would be a runtime disk write error.
  - Restart the Pepperdata Collector.
    You can use either the service (if provided by your OS) or systemctl command:
      sudo service pepcollectd restart
      sudo systemctl restart pepcollectd
  - Restart the PepAgent.
    You can use either the service (if provided by your OS) or systemctl command:
      sudo service pepagentd restart
      sudo systemctl restart pepagentd
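Before restarting, it may be worth confirming that the partition holding PD_LOG_DIR is at least as large as the size cap, since the PepAgent will not start otherwise. The sketch below approximates that check with df; it is not the PepAgent's actual logic. It assumes that "capacity" means the total size of the filesystem rather than its free space, and the function names partition_capacity_bytes and capacity_covers_cap are hypothetical.

```shell
#!/bin/sh
# Hypothetical pre-restart check; approximates the PepAgent's startup verification.

# Print the total size, in bytes, of the filesystem that contains the given path.
# df -P gives the portable one-line-per-filesystem format; -k reports 1024-byte blocks.
partition_capacity_bytes() {
    kb=$(df -Pk "$1" | awk 'NR==2 {print $2}')
    echo $((kb * 1024))
}

# Succeed (exit status 0) if the partition is at least as large as the cap in bytes.
capacity_covers_cap() {
    [ "$(partition_capacity_bytes "$1")" -ge "$2" ]
}
```

For example, capacity_covers_cap /var/log/pepperdata 5368709120 || echo "size cap exceeds partition capacity" would flag a host where the default cap is too large for the partition. To compare against free space instead, column 4 of the df -Pk output (available blocks) could be used in place of column 2.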