Obtain Hadoop Container Logs
The Pepperdata dashboard can tell you a lot about how your application is performing, but sometimes you need to answer a “why” question such as “Why did my reducers start seeing heavy GC pauses?” or “Why are my mappers taking an hour to complete?” The best way to answer these sorts of questions is to go to the source: the Hadoop container logs.
Unfortunately, the container logs are typically discarded as soon as an application finishes. However, you can configure YARN to retain the logs for a specified length of time.
Configure Log Retention
1. Starting with any NodeManager, configure the yarn.nodemanager.delete.debug-delay-sec property.
   - For manually configured clusters, add the property to the host's /etc/hadoop/conf/yarn-default.xml file. Malformed XML files can cause operational errors that can be difficult to debug. To prevent such errors, we recommend that you use a linter, such as xmllint, after you edit any .xml configuration file. (A sketch of the edit and the lint check appears after this procedure.)
   - If you are using Cloudera Manager, you can set the property value individually for each NodeManager or set the same value for all NodeManagers by using the service configuration page.
     - For an individual NodeManager, navigate to YARN (MR2 INCLUDED) > the-NodeManager-node > Configuration > Localized Dir Deletion Delay.
     - For the service configuration page, navigate to YARN (MR2 INCLUDED) > Configuration > Localized Dir Deletion Delay.
   - If you are using Ambari, navigate to YARN > Configs > Advanced > Advanced yarn-site > yarn.nodemanager.delete.debug-delay-sec.
   Set the property's value as described in the YARN r2.7.0 documentation:
   "Number of seconds after an application finishes before the NodeManager's DeletionService will delete the application's localized file directory and log directory. To diagnose Yarn application problems, set this property's value large enough (for example, to 600 = 10 minutes) to permit examination of these directories. After changing the property's value, you must restart the NodeManager in order for it to have an effect. The roots of Yarn applications' work directories is configurable with the yarn.nodemanager.local-dirs property … and the roots of the Yarn applications' log directories is configurable with the yarn.nodemanager.log-dirs property."
   Important: In a busy production environment, the Hadoop container logs can consume a large amount of local disk space. Be sure to plan accordingly, and remember to delete the property when you are done debugging.
2. Repeat step 1 on every NodeManager in your cluster.
3. Restart all NodeManagers in the cluster. (A sketch of a manual restart appears after this procedure.)
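For manually configured clusters, the edit and the recommended lint check might look like the following sketch. The 600-second (10-minute) value is only an example taken from the YARN documentation excerpt above, and the file path assumes the standard /etc/hadoop/conf location; adjust both to match your environment.

# Add the retention property to the NodeManager's configuration file
# (shown here as an XML fragment inside comments):
#
#   <property>
#     <name>yarn.nodemanager.delete.debug-delay-sec</name>
#     <value>600</value>
#   </property>
#
# Then confirm the edited file is still well-formed XML:
xmllint --noout /etc/hadoop/conf/yarn-default.xml && echo "XML is well formed"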
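How you restart the NodeManagers depends on how the cluster is managed: in Cloudera Manager or Ambari, restart the NodeManager roles from the management UI. On a manually configured Hadoop 2.x cluster, the restart might look like the following sketch; the yarn service account and the presence of yarn-daemon.sh on the PATH are assumptions.

# Run on each NodeManager host so the new property value takes effect.
sudo -u yarn yarn-daemon.sh stop nodemanager
sudo -u yarn yarn-daemon.sh start nodemanager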
Log Files and Locations
The Hadoop container logs directory is configured by the yarn.nodemanager.log-dirs property in the yarn-default.xml file. The directory naming format is ${yarn.nodemanager.log-dirs}/<applicationID>/<containerID>/, with files for syslog (the Hadoop log4j-style output for the container), stdout, and stderr.
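For example, on a cluster where yarn.nodemanager.log-dirs resolves to /var/log/hadoop-yarn/containers (an assumption; check your configuration), and with illustrative application and container IDs, you could inspect a finished container's logs like this:

# List the retained log files for one container on the NodeManager that ran it.
ls /var/log/hadoop-yarn/containers/application_1432041223735_0001/container_1432041223735_0001_01_000002/
# stderr  stdout  syslog

# Read the container's syslog, for example to look for GC pauses or task timeouts.
less /var/log/hadoop-yarn/containers/application_1432041223735_0001/container_1432041223735_0001_01_000002/syslog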
In the local-dirs directory, the top level looks as follows:
${yarn.nodemanager.local-dirs}
|--filecache
|--nmPrivate
|--registeredExecutors.ldb
|--usercache
The usercache directory is our primary interest. An application's localized file directory is in ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}, which includes subdirectories for individual containers' work directories, container_${contid}.
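As a sketch, assuming yarn.nodemanager.local-dirs resolves to /yarn/nm on this cluster and using an illustrative user name and IDs, you could examine a container's work directory like this:

# Inspect a container's localized work directory on the NodeManager that ran it.
# Contents typically include the container launch script and localized job resources.
ls /yarn/nm/usercache/alice/appcache/application_1432041223735_0001/container_1432041223735_0001_01_000002/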