Presto/Trino Recommendations
Pepperdata recommendations for Presto/Trino queries are generated by the Query Profiler, which you must enable for Presto/Trino monitoring when you configure Query Spotlight. The Presto tile in the Recommendations section of the Pepperdata dashboard shows how many recommendations were made during the last 24 hours, along with their severity levels.
Recommendation information appears in several places in the Pepperdata dashboard:
- To see a table of all the Presto/Trino queries that received recommendations at a given severity level, click the linked severity text in the Presto tile.
- To see the recommendations’ severity levels for all recently run queries, show the Queries Overview page by using the left-nav menu to select Query Spotlight > Queries.
- To view the Query Profiler report, click the title of the Presto tile, or use the left-nav menu to select Query Spotlight > Query Profiler.
The following table describes the Pepperdata recommendations for Presto/Trino queries: each recommendation’s name, its type (general guidance or specific tuning values to change), what triggered the recommendation (the cause), the text of the actual recommendation, and notes that provide additional information.
For details about how the recommendations appear in an application’s detail page, see Recommendations Tab.
| Name | Type: Guidance | Type: Tuning | Cause | Recommendation | Notes |
| --- | --- | --- | --- | --- | --- |
| Too large a result set from Presto Cross join | | | The result set of the query’s Cross join is greater than or equal to <N>. | Rewrite the query to add join conditions and eliminate Cross joins. | Cross join of large tables can be resource-intensive. |
| Too many joins and too much data processed by joins | | | The query has more than 5 join operations. The data processed by the join operations exceeds <N SIZE>. | Denormalize tables to reduce or eliminate the need for joins. | A lot of joins that process a lot of data can be resource-intensive. |
| Query missing | | | | | Sorting a large data set can be resource-intensive. |
| Query selecting all columns | | | All columns are selected in the query. | When running queries, limit the final | Selecting all columns can be resource-intensive. |
| Hive tables missing statistics | | | The following tables are missing relevant table and/or column statistics: <table1>, <table2>. | Run compute stats on the following tables: <table1>, <table2>. | The Hive query planner uses table and column statistics to generate effective plans. Missing statistics can result in inaccurate plans and poor query performance. |
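
The cross-join and column-selection recommendations can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for Presto/Trino, so it shows only the effect on row and column counts, not Presto/Trino behavior or how the Query Profiler detects these patterns. The `orders` and `customers` tables, their columns, and the row counts are hypothetical.

```python
import sqlite3

# Assumption: SQLite stands in for Presto/Trino; tables and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER, name TEXT, region TEXT);
    CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, total REAL);
""")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(i, f"customer-{i}", "west") for i in range(100)],
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(i, i % 100, 10.0) for i in range(1000)],
)


def row_count(sql):
    """Return the number of rows that a query produces."""
    return conn.execute(f"SELECT COUNT(*) FROM ({sql})").fetchone()[0]


# Cross join: every order is paired with every customer (1000 * 100 rows).
cross_rows = row_count(
    "SELECT o.order_id, c.name FROM orders o CROSS JOIN customers c"
)

# Join condition: each order is paired only with its own customer.
joined_rows = row_count(
    "SELECT o.order_id, c.name FROM orders o "
    "JOIN customers c ON o.customer_id = c.customer_id"
)

# Selecting all columns versus naming only the columns that are needed.
all_cols = len(conn.execute("SELECT * FROM orders").description)
needed_cols = len(conn.execute("SELECT order_id, total FROM orders").description)

print(cross_rows, joined_rows)  # 100000 1000
print(all_cols, needed_cols)    # 3 2
```

In this sketch the join condition shrinks the result from 100,000 rows to 1,000, and naming columns returns two of the three columns. On real Presto/Trino tables the savings depend on table sizes and storage format.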