Monitoring Apache® Impala Query Metrics (YARN)
Supported Versions of Impala: Versions supported by any Cloudera CDH/CDP distribution that Pepperdata supports; see Pepperdata-Platform Support
Cluster administrators often need precise resource usage information, even down to the query level of detail, in order to create accurate chargeback reports. If you’re using Apache Impala, you can enable Pepperdata to collect Impala query metrics for CPU and memory usage. When the queries are finished, Pepperdata reads the Impala query profiles to calculate the resource usage.
Typical Use Cases
The Pepperdata dashboard shows charts and tables of data about your Impala queries. You can:
- Determine which Impala queries are resource hogs.
- Learn how a query impacts the system by filtering the metrics by query, database, connection user, fragment, and so on.
- Determine which database or connection user has the biggest impact on the system as a whole over time by monitoring the Chargeback metrics (aggregate CPU and memory usage numbers per database or connection user).
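The chargeback roll-up in the last item can be pictured as a simple group-by. The following Python sketch is not Pepperdata's implementation; the record layout (`db`, `conn_user`, `cpu_secs`, `mem_mb_secs`) and the sample values are hypothetical and exist only to illustrate the idea of aggregating usage per database or connection user.

```python
from collections import defaultdict

# Hypothetical per-query usage records; field names and values are
# illustrative only and are not Pepperdata's schema.
QUERIES = [
    {"db": "sales", "conn_user": "hue", "cpu_secs": 120.0, "mem_mb_secs": 4000.0},
    {"db": "sales", "conn_user": "etl", "cpu_secs": 30.0, "mem_mb_secs": 1000.0},
    {"db": "logs", "conn_user": "hue", "cpu_secs": 50.0, "mem_mb_secs": 2500.0},
]


def chargeback(records, key):
    """Sum CPU and memory usage per distinct value of `key` ("db" or "conn_user")."""
    totals = defaultdict(lambda: {"cpu_secs": 0.0, "mem_mb_secs": 0.0})
    for rec in records:
        bucket = totals[rec[key]]
        bucket["cpu_secs"] += rec["cpu_secs"]
        bucket["mem_mb_secs"] += rec["mem_mb_secs"]
    return dict(totals)


def biggest_consumer(records, key, metric="cpu_secs"):
    """Return the value of `key` with the largest aggregate for `metric`."""
    totals = chargeback(records, key)
    return max(totals, key=lambda name: totals[name][metric])
```

Calling `chargeback(QUERIES, "db")` groups the sample records by database, and `biggest_consumer(QUERIES, "conn_user", "mem_mb_secs")` names the connection user with the largest aggregate memory figure.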
Metrics for Impala Queries In Flight
Pepperdata collects metrics for Impala queries in flight, which provide information about the query’s state (CREATED, INITIALIZED, COMPILED, RUNNING, FINISHED, and EXCEPTION).
In flight queries are the currently running queries, as reported in the queries page of the Impala impalad daemon’s debug web UI at http://impalaserverhostname:25000/queries.
These metrics enable you to create alarms and alerts for conditions such as queries that remain in the RUNNING state for more than a given amount of time, or too many queries in the EXCEPTION state during the last 10 minutes.
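The two example conditions can be expressed as small predicates over sampled query states. This is a minimal sketch in Python, assuming a hypothetical `QuerySample` record; it does not use Pepperdata's alarm syntax, and the thresholds are placeholders. Duration is computed as the sample time minus the query's start time, so for a RUNNING query it approximates how long the query has been active.

```python
from dataclasses import dataclass


@dataclass
class QuerySample:
    """Hypothetical sample of one in flight query (not a Pepperdata type)."""
    query_id: str
    state: str        # CREATED, INITIALIZED, COMPILED, RUNNING, FINISHED, or EXCEPTION
    start_ts: float   # epoch seconds when the query began
    sample_ts: float  # epoch seconds when the sample was taken


def duration_secs(sample):
    """Difference between the sample time and when the query began."""
    return sample.sample_ts - sample.start_ts


def long_running(samples, max_secs):
    """Sorted IDs of queries sampled in RUNNING with a duration above max_secs."""
    return sorted(
        {s.query_id for s in samples
         if s.state == "RUNNING" and duration_secs(s) > max_secs}
    )


def too_many_exceptions(samples, now, window_secs=600, limit=5):
    """True if more than `limit` distinct queries were sampled in EXCEPTION
    within the last `window_secs` seconds (600 seconds = 10 minutes)."""
    failed = {
        s.query_id for s in samples
        if s.state == "EXCEPTION" and now - window_secs <= s.sample_ts <= now
    }
    return len(failed) > limit
```

The default `limit=5` is arbitrary; an appropriate value depends on the normal failure rate of the cluster.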
The table describes the queries in flight metrics.
Metric Name | Description |
---|---|
impala.in_flight_queries.duration-secs | Current duration: difference between now and when the query began. |
impala.in_flight_queries.progress-percent | Progress through the Impala SQL statement. For SQL queries, this represents how many of the target rows of the table(s) have already been processed. |
impala.in_flight_queries.rows-fetched | Number of rows in the query result set. |
impala.in_flight_queries.state | Enumeration of possible states of a query: CREATED, INITIALIZED, COMPILED, RUNNING, FINISHED, and EXCEPTION. |
impala.in_flight_queries.waiting | User-controlled flag to indicate that a query's execution is finished and is waiting for manual inspection and resource cleanup. |
impala.in_flight_queries.waiting-time-secs | Length of time that the query's impala.in_flight_queries.waiting flag has been true. |
Related Information
- For information about creating alarms from the applicable metrics’ charts, see Create Alarms From a Chart View.
- For information about the Impala impalad daemon’s debug web UI, refer to Queries Page or the comparable page for your version of Impala.
- For one approach for using the impala.in_flight_queries.waiting and impala.in_flight_queries.waiting-time-secs metrics, refer to the impala-user mailing list archives message, queries “waiting to be closed”.
Show Chart View of Impala Query Metrics Data
To display a group of Impala query metrics, navigate to the dashboard’s Charts page, and use the Metrics filter bar to search for “Impala”. To show a single metric, select it. To show all the metrics in a group, select the All… checkbox for the group. After you show the metrics, you can proceed as usual to optionally select breakdowns and apply filters.
Procedure
- In the left-nav menu, select Charts.
- Choose the metric(s) that you want.
  - In the filter bar, click Metrics.
  - In the search box, clear any previously selected metrics, and enter the search term, “impala”.
  - Select the metric(s) that you want to see.
- Select the Impala container type.
  - In the filter bar, click Breakdown By.
  - Select the Container Type series breakdown, and filter it for Impala by clicking the drop-down list and clearing all the container types except Impala.
- (Optional) Select additional breakdowns and apply additional filters.
- Click Apply.
Show Tabular View of Impala Query Metrics Data
To show tabular data of Impala metrics that are grouped by Impala database or Impala query, first show the charts, and then switch to the table view. To highlight issues, such as queries that consumed a lot of resources, sort the tables by memory usage or CPU runtime. You can also sort the tables by database or query, which lets you focus on specific databases or queries of interest.
Procedure
- Show the charts of Impala query metrics, and filter as you want; see Show Chart View of Impala Query Metrics Data.
- In the upper-right, click View as Table.
Filters and Breakdowns for Impala Query Charts
As with other metrics charts that you view on the Pepperdata dashboard, you can filter Impala query charts for specific series of interest, such as host or user; exclude a series you’re not interested in; or explicitly include a series that is filtered out by default. Likewise, you can filter Impala query charts by Impala-specific criteria, such as query, query state, Impala database on which the query was run, and so on. You can specify regular text for exact matching or use regular expressions to match patterns.
For detailed instructions for applying filters, see Filter the Charts & Tables by Dimensions: Hosts, Users, Etc.
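The difference between exact matching and pattern matching can be illustrated outside the dashboard. The following is a rough sketch using Python's `re` module; the series names are hypothetical, and the regular-expression dialect and anchoring behavior that the dashboard applies are not stated here, so whole-name matching is an assumption of this example.

```python
import re

# Hypothetical Impala DB series names, for illustration only.
SERIES = ["sales", "sales_eu", "sales_us", "logs"]


def filter_exact(series, text):
    """Keep only the series whose name equals `text` exactly."""
    return [name for name in series if name == text]


def filter_regex(series, pattern):
    """Keep the series whose whole name matches the regular expression.

    Whole-name matching (fullmatch) is an assumption of this sketch.
    """
    compiled = re.compile(pattern)
    return [name for name in series if compiled.fullmatch(name)]
```

With these sample names, the plain text `sales` selects a single series, whereas the pattern `sales_.*` selects only the regional variants.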
The table describes the Impala-specific series breakdowns.
Breakdown | Description |
---|---|
Query State | Final state of the query (when it finishes): FINISHED, UNKNOWN, or EXCEPTION |
Impala DB | Database on which the query was run |
Impala ConnUser | Connected user; if the query is run from an external client (for example, Apache Hue), the connected user could be different from the user who submitted the query |
Impala Query | ID of the query |
Impala Fragment | ID of the fragment, which is a smaller unit of work that is distributed across the cluster |
Impala Instance | ID of the subtask of a fragment |