Interpreting Hardware Issues

When the event-based sampling analysis is complete, start exploring the collected data with the Hardware Issues viewpoint. This viewpoint displays data per hardware-related performance metric. Each metric is an event ratio defined by Intel architects and has its own predefined threshold. Intel® VTune™ Amplifier analyzes the ratio value for each aggregated program unit (for example, a function). When this value exceeds the threshold and the program unit accounts for more than 5% of the total CPU time collected, VTune Amplifier treats it as a potential performance problem and highlights the value in pink.
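The two-part highlighting rule described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not VTune Amplifier's implementation; the UnitMetric type, its field names, and the threshold values used in the usage note are hypothetical.

```python
from dataclasses import dataclass

# "More than 5% of CPU time", taken from the text above.
HOTSPOT_SHARE = 0.05


@dataclass
class UnitMetric:
    """Hypothetical record for one program unit and one metric."""
    name: str         # program unit, for example a function name
    cpu_time: float   # CPU time attributed to this unit
    value: float      # metric (event ratio) value for this unit
    threshold: float  # predefined threshold for this metric (assumed value)


def is_highlighted(unit: UnitMetric, total_cpu_time: float) -> bool:
    """Return True only when the unit is a hotspot AND the metric exceeds its threshold."""
    if total_cpu_time <= 0:
        return False
    is_hotspot = unit.cpu_time / total_cpu_time > HOTSPOT_SHARE
    return is_hotspot and unit.value > unit.threshold
```

For instance, with an assumed threshold of 0.1, `is_highlighted(UnitMetric("execute", 40.0, 0.234, 0.1), 100.0)` is true, while a unit with the same metric value but only 2% of the CPU time is not highlighted.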

To interpret the performance data provided during the Hardware event-based sampling analysis, you may follow the steps below:

  1. Learn performance metrics and define a performance baseline.

  2. Identify hardware issues.

  3. Analyze source.

  4. Explore other analysis types.

Learn Performance Metrics and Define a Performance Baseline

In the Hardware Issues viewpoint, click the Summary tab to switch to the Summary window. The first section displays summary statistics on the overall application execution for each hardware-related metric. To view a metric description, hover over the question mark icon:

Hardware Issues Viewpoint: Summary Window

In the example above, hovering over the Contested Accesses metric displays the metric description in the tooltip. Values for the CPI Rate and Contested Accesses metrics are highlighted in pink, which signals a performance issue for the whole application execution. VTune Amplifier describes each detected performance issue under the corresponding metric entry. If the description is lengthy, it may be truncated. Hover over the truncated text to see the full description.

You may use the performance issues identified by the VTune Amplifier as a baseline for comparison of versions before and after optimization. Your primary performance indicator is the Elapsed time value.

Identify Hardware Issues

To view hardware issues per program unit, switch to the Bottom-up pane. Each row represents a program unit and the percentage of CPU cycles used by this unit. Program units that take more than 5% of the CPU time are considered hotspots. If you apply a filter, the 5% rule is evaluated against only the CPU time of the data that remains after filtering. For example, if a function takes 2.5% of the CPU time but you filter out modules that account for half of the CPU time, the function is highlighted because it now accounts for 5% of what is left.
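The filtering arithmetic in the example above can be checked with a short Python sketch. The function name, the module names, and the CPU time figures are hypothetical; the only thing taken from the text is that the share is computed against the CPU time that remains after filtering.

```python
from typing import Iterable, Mapping


def share_after_filter(unit_time: float,
                       time_by_module: Mapping[str, float],
                       kept_modules: Iterable[str]) -> float:
    """Share of CPU time a unit takes relative to the modules kept by the filter."""
    kept = set(kept_modules)
    remaining = sum(t for module, t in time_by_module.items() if module in kept)
    if remaining <= 0:
        raise ValueError("no CPU time remains after filtering")
    return unit_time / remaining


# Hypothetical data: two modules, each with half of the CPU time.
times = {"app.exe": 50.0, "libother.so": 50.0}
unfiltered = share_after_filter(2.5, times, times.keys())  # 2.5 of 100 units
filtered = share_after_filter(2.5, times, ["app.exe"])     # 2.5 of 50 units
```

Here `unfiltered` is 2.5% and `filtered` is 5%, which matches the example in the text: the function's own CPU time does not change, only the total it is compared against.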

By default, the VTune Amplifier sorts the data in descending order, so the hotspots appear at the top of the list.

Each column in the Bottom-up pane represents a hardware performance metric. VTune Amplifier calculates each metric using the formulas provided by Intel architects and checks the value against the threshold defined for that metric. If the metric value exceeds the threshold and the program unit is a hotspot, the VTune Amplifier highlights this value in pink as performance-critical. Hover over a pink cell to read a description of the issue and the recommended solution, and to view the formula used to calculate the threshold for this issue.

Hardware Issues Viewpoint: Bottom-up Window

In the example above, the VTune Amplifier identified the execute function as the major hotspot of your application, the one that took the most CPU time. VTune Amplifier detected three types of hardware issues that impact the performance of this function: Clockticks per Instructions Retired (CPI), Last-level Cache Miss (LLC Miss), and Contested Accesses. For example, because of LLC misses, about 23% (0.234) of the CPU cycles spent in the execute function were waiting for data to arrive. This means that if you focus on this hotspot and optimize its memory access, you can potentially gain up to a 23% performance boost for this function.
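As a rough illustration of the numbers above, the Python sketch below computes CPI as clockticks divided by instructions retired, which is what the metric name states, and applies the 0.234 stall fraction to a cycle count. The function names and the event counts are made up for the example, and the estimate is an upper bound that assumes every stalled cycle could be eliminated.

```python
def cpi(clockticks: int, instructions_retired: int) -> float:
    """Clockticks per instruction retired."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return clockticks / instructions_retired


def cycles_without_stalls(total_cycles: float, stall_fraction: float) -> float:
    """Cycles left if all cycles spent waiting on the issue were removed (best case)."""
    if not 0.0 <= stall_fraction < 1.0:
        raise ValueError("stall_fraction must be in [0, 1)")
    return total_cycles * (1.0 - stall_fraction)


# Hypothetical counts for the execute function.
example_cpi = cpi(2_000_000, 1_000_000)                  # 2 clockticks per instruction
best_case = cycles_without_stalls(2_000_000, 0.234)      # about 1,532,000 cycles
```

In practice the gain is usually smaller than this bound, because memory access optimizations rarely remove every miss.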

Analyze Source

Once you have identified a critical function, double-click it to open the Source/Assembly window and analyze the source code. The Source/Assembly window displays event data. Focus on the events that make up the hardware metric identified as performance-critical in the Bottom-up pane. You may sort the columns to move the required event data to the leftmost position, or set the required event column as the Data of Interest. VTune Amplifier remembers your settings and restores them each time you open your result.

Explore Other Analysis Types

  • Typically, for hardware event-based analysis, it is recommended that you start with the General Exploration analysis type to collect the broadest set of data and identify where you have hardware issues. For example, if you see LLC Miss issues during the General Exploration analysis, you may then run the Memory Access analysis type to get more detailed information on memory issues.

  • Run the comparison analysis to understand the performance gain you obtain after your optimization.
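To put a number on the gain between two runs, you can compare the Elapsed time values, which the text above names as the primary performance indicator. The Python sketch below is a minimal example of that comparison; the function name and the timings are hypothetical, and it works on values you copy from the two results rather than on any VTune Amplifier API.

```python
from typing import Tuple


def elapsed_time_gain(baseline_s: float, optimized_s: float) -> Tuple[float, float]:
    """Return (speedup factor, fractional reduction in elapsed time)."""
    if baseline_s <= 0 or optimized_s <= 0:
        raise ValueError("elapsed times must be positive")
    speedup = baseline_s / optimized_s
    reduction = (baseline_s - optimized_s) / baseline_s
    return speedup, reduction


# Hypothetical Elapsed time values, in seconds, before and after optimization.
speedup, reduction = elapsed_time_gain(10.0, 8.0)
```

With these sample values the optimized run is 1.25 times faster, a 20% reduction in elapsed time.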
