Upgrade the Hadoop Distribution
To upgrade the Hadoop distribution on a cluster, start with any host and perform the following steps in order: stop the Pepperdata agents, remove the Pepperdata software, upgrade the Hadoop distribution, reinstall the Pepperdata software, and restart the Pepperdata agents. Repeat this process on every ResourceManager host and NodeManager host in your cluster.
If you want to perform an upgrade in a CDP Public Cloud environment, you must create a new environment and Data Hub cluster, and install the Pepperdata Supervisor version that you want; see Installing Pepperdata (CDP Public Cloud).
Run all commands as the root user.
Prerequisites
- Ensure that the Hadoop distro to which you’re upgrading is supported by the currently installed version of the Pepperdata Supervisor (see Pepperdata-Platform Support). If the new distro is not supported, do not use this procedure. You must instead upgrade both the distro and Pepperdata; see Upgrade Hadoop Distribution and Pepperdata.
Task 1: Stop the Pepperdata Agents
Procedure
- In Cloudera Manager, select the Stop action for the Pepperdata service.
Task 2: Remove/Deactivate the Old Pepperdata Supervisor
Procedure
- In Cloudera Manager, deactivate the existing (old) Pepperdata Supervisor parcel. (For details, see the Cloudera documentation for your version of Cloudera Manager.)
Task 3: Upgrade Your Hadoop Distribution
Procedure
- Upgrade the Hadoop distribution according to the distribution’s instructions.
- Verify that the snippet below is still included in the appropriate template(s), based on which services are configured to run on the host.
  - YARN ResourceManager/NodeManager: YARN > Configs > Advanced > Advanced yarn-env > yarn-env template
  - HBase Master or HBase RegionServer: HBase > Configs > Advanced > Advanced hbase-env > hbase-env template
  - Apache Spark: Spark > Configs > Advanced > Advanced spark-env > spark-env template
  - Apache Spark 2: Spark 2 > Configuration > Gateway > Spark Client Advanced Configuration Snippet (Safety Valve) for spark-conf/spark-env.sh

  Important: Add the snippet to the end of the template(s). This ensures that the activation script’s variable appends (YARN_NODEMANAGER_OPTS, YARN_RESOURCEMANAGER_OPTS, HBASE_REGIONSERVER_OPTS, and SPARK_SUBMIT_OPTS) are not overwritten by other assignments in the template(s).

      PD_HOME=/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR
      PEPPERDATA_ACTIVATE_SCRIPT_PATH="/opt/cloudera/parcels/PEPPERDATA_SUPERVISOR/supervisor/lib/pepperdata-activate.sh"
      if [ -e $PEPPERDATA_ACTIVATE_SCRIPT_PATH ]; then
        . $PEPPERDATA_ACTIVATE_SCRIPT_PATH
      fi
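The placement requirement for the snippet can be checked mechanically. The following is a minimal sketch, not part of the Pepperdata tooling: it assumes you have saved a copy of the template contents to a local text file, and the function name check_snippet_last is hypothetical. It reports whether the activation snippet is present and whether any assignment to one of the four variables appears after it.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Pepperdata): inspects a saved copy of a
# template and prints one of: "ok", "snippet missing", "assignment after snippet".
check_snippet_last() {
    template_file=$1
    awk '
        # The snippet is recognized by its reference to the activation script.
        /pepperdata-activate\.sh/ { seen = 1; bad = 0; next }
        # Any later assignment to one of the four variables is flagged for review.
        seen && /^[ \t]*(export[ \t]+)?(YARN_NODEMANAGER_OPTS|YARN_RESOURCEMANAGER_OPTS|HBASE_REGIONSERVER_OPTS|SPARK_SUBMIT_OPTS)=/ { bad = 1 }
        END {
            if (!seen) { print "snippet missing"; exit 2 }
            if (bad)   { print "assignment after snippet"; exit 1 }
            print "ok"
        }
    ' "$template_file"
}
```

For example, after pasting the yarn-env template into a file named yarn-env.txt (an assumed name), run check_snippet_last yarn-env.txt. The check is deliberately conservative: it flags every later assignment, including ones that append to the existing value, so treat a flagged result as a prompt to review the template by hand.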
Task 4: Reinstall the Pepperdata Software
Prerequisite
Creating the pepperdata user and the pepperdata log directories relies on the Cloudera Manager (CM) Agent, a CM component, both when the parcel is activated and when the Pepperdata service is added. Each of these operations requires the CM Agent to run as the root user, which in turn requires one of the following permissions during the initial CM installation:
- Access to the root user account using a password or SSH key file.
- Passwordless sudo access for a specific user.
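If you are unsure whether a given account has passwordless sudo, a quick check along the following lines may help. This is a sketch under assumptions: it relies on the standard sudo -n (non-interactive) option, the function name has_passwordless_sudo is hypothetical, and the SUDO_CMD variable exists only so the check can be exercised without real sudo access. It says nothing about how CM itself was installed.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) only if the current user can run
# sudo without being prompted for a password.
# SUDO_CMD is an illustrative override for testing; it defaults to the real sudo.
has_passwordless_sudo() {
    # -n makes sudo fail instead of prompting when a password would be required.
    "${SUDO_CMD:-sudo}" -n true >/dev/null 2>&1
}
```

Run it as the user in question, for example: has_passwordless_sudo && echo "passwordless sudo available".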
Procedure
- Download the following artifacts from the Downloads page to any local directory, and copy them to the Cloudera Manager Server.
  - The appropriate PepperdataSupervisor parcel for your distro; see Downloads: CDP Private Cloud Base and CDP Public Cloud or Downloads: CDH.
  - The latest pepperdata-csd-X.Y.Z.tgz CSD (custom service descriptor) for Supervisor 8.1; see Downloads: CDP Private Cloud Base and CDP Public Cloud or Downloads: CDH.
- Extract the contents of the TGZ archives and move the files as follows:
  - Move the parcel (the *.parcel file) and corresponding SHA checksum file (*.parcel.sha) to the /opt/cloudera/parcel-repo directory.
  - Move the CSD JAR file to the /opt/cloudera/csd directory.
- Restart the Cloudera Service and Configuration Manager (SCM) server (service: cloudera-scm-server).

  Note: Restarting the SCM server is not the same as restarting the Cloudera Management Service by using the Cloudera Manager interface. Unless you use the command line to explicitly restart the SCM server (the cloudera-scm-server service), you will be unable to use Cloudera Manager to add the Pepperdata service.

      service cloudera-scm-server restart

  After the restart, the new parcels and the Pepperdata service (in the CSD JAR file) are available for activation.
- In Cloudera Manager, distribute and activate the Pepperdata Supervisor parcel (the *.parcel file).
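The file-staging part of this task can be sketched as a small shell function. This is an illustration only, under the assumption that the archives have already been extracted into a single directory; the function name stage_pepperdata_artifacts and its optional second and third arguments are hypothetical, and the default destinations are the directories named in the procedure.

```shell
#!/bin/sh
# Hypothetical helper: moves the extracted parcel, its checksum file, and the
# CSD JAR from src_dir into the Cloudera Manager directories.
# Usage: stage_pepperdata_artifacts SRC_DIR [PARCEL_REPO] [CSD_DIR]
stage_pepperdata_artifacts() {
    src_dir=$1
    parcel_repo=${2:-/opt/cloudera/parcel-repo}
    csd_dir=${3:-/opt/cloudera/csd}

    # Fail early if any expected file type is missing from the source directory.
    for pattern in '*.parcel' '*.parcel.sha' '*.jar'; do
        set -- "$src_dir"/$pattern
        [ -e "$1" ] || { echo "missing $pattern in $src_dir" >&2; return 1; }
    done

    # Parcel and checksum go to the parcel repository; the CSD JAR goes to csd.
    mv "$src_dir"/*.parcel "$src_dir"/*.parcel.sha "$parcel_repo"/ || return 1
    mv "$src_dir"/*.jar "$csd_dir"/ || return 1
}
```

A typical sequence, run as root on the Cloudera Manager Server, would be to call stage_pepperdata_artifacts with the extraction directory and then run service cloudera-scm-server restart, as described in the procedure. Distributing and activating the parcel still happens in Cloudera Manager.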
Task 5: Restart Hadoop YARN Services
Procedure
- In Cloudera Manager, navigate to your cluster’s YARN (MR2 Included) service > Instances, select all ResourceManager and NodeManager hosts, and from the Actions for Selected menu, select Restart.
- (If using HBase) Navigate back to the cluster view, and for the HBase service, select the Restart action.
Task 6: Restart the Pepperdata Agents
Procedure
- In Cloudera Manager, select the Start action for the Pepperdata service.