General Exploration Analysis

The General Exploration analysis type uses event-based sampling collection and targets the Intel® Xeon Phi™ coprocessor.

This analysis is a good way to triage hardware issues in programs running on the Intel Xeon Phi coprocessor. Once you have used Hotspots analysis to determine hotspots in your code, you can perform General Exploration analysis to understand how efficiently your code is utilizing the Intel Xeon Phi coprocessor architecture.

The Intel Xeon Phi coprocessor is ideally suited for highly parallel applications that feature a high ratio of computation to data access. It is composed of up to 61 CPU cores connected on-die via a bi-directional ring bus. Each core is capable of switching between up to 4 hardware threads in a round-robin manner, resulting in a total of up to 244 hardware threads available. Each core consists of an in-order, dual-issue x86 pipeline, a local L1 and L2 cache, and a separate vector processing unit (VPU). Being an in-order machine, the coprocessor can be sensitive to stalls on memory access so the round-robin scheduling of the threads and aggressive compiler-generated software prefetching are used to mitigate that. It is also important that each hardware thread uses the available vectorization width as much as possible.

To help you dig into possible issues, the General Exploration analysis type can collect the following groups of metrics:

L1 cache usage, and estimated maximum latency for L1 cache misses
L2 cache usage. The data for additional L2 cache events should be used with caution since they include cache misses from software prefetch instructions.
Vectorization usage
TLB usage efficiency

All of the metrics in this analysis type measure activity within one Intel Xeon Phi coprocessor.

The metrics that you get after collecting one or several groups have programmed thresholds. When the value of a metric is outside its threshold, the cell for the corresponding hotspot is highlighted in pink as a hint that more investigation may be warranted. Memory bandwidth can be calculated using another profile as an additional performance metric. For details on tuning methodology and metrics, see the Optimization and Performance Tuning for Intel® Xeon Phi™ Coprocessors article.
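
As a rough illustration of the threshold check described above, the following Python sketch flags metric values that fall outside a threshold. The metric names, threshold values, and comparison directions are hypothetical placeholders chosen for the example. They are not the thresholds programmed into the product.

```python
# Hypothetical thresholds: (limit, direction).
# "min" flags values below the limit, "max" flags values above it.
# These names and numbers are illustrative assumptions only.
THRESHOLDS = {
    "l1_hit_ratio": (0.95, "min"),
    "vectorization_intensity": (8.0, "min"),
    "l1_tlb_miss_ratio": (0.01, "max"),
}


def flag_metrics(metrics, thresholds=THRESHOLDS):
    """Return the sorted names of metrics whose values fall outside their threshold."""
    flagged = []
    for name, value in metrics.items():
        if name not in thresholds:
            continue  # no threshold defined for this metric
        limit, direction = thresholds[name]
        outside = value < limit if direction == "min" else value > limit
        if outside:
            flagged.append(name)
    return sorted(flagged)


# Made-up metric values for one hotspot function.
hotspot = {
    "l1_hit_ratio": 0.91,
    "vectorization_intensity": 12.0,
    "l1_tlb_miss_ratio": 0.02,
}
print(flag_metrics(hotspot))
```

In this sketch, the flagged names play the role of the pink cells in the grid: they mark where to look first, not a confirmed problem.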

To see the full list of events used for General Exploration analysis type:

Click the New Analysis toolbar button in the standalone GUI or the Visual Studio IDE.
The Analysis Type window opens.
From the left pane, select Microarchitecture Analysis > General Exploration.
The General Exploration configuration pane opens on the right. The Details section provides a table with the processor events used for this analysis type.

Note

Analysis on the Intel Xeon Phi coprocessor is supported with the VTune Amplifier XE only. The list of analysis types applicable to coprocessor analysis appears only when you specify the Intel Xeon Phi coprocessor (native) or Intel Xeon Phi coprocessor (host launch) target system type in the Project Properties: Target tab.

You can choose to view General Exploration analysis results in any of the following viewpoints:

Viewpoint	Description
General Exploration	Helps identify where the application is not making the best use of available hardware resources. This viewpoint displays metrics derived from hardware events. The Summary window reports the overall metrics for the entire execution along with explanations of the metrics. From the Bottom-up and Top-down Tree windows you can locate the hardware issues in your application. Cells are highlighted when potential opportunities to improve performance are detected. Hover over the highlighted metrics in the grid to see explanations of the issues.
Hardware Event Counts	Displays the event count for all collected processor events. While the Hardware Event Sample Counts viewpoint provides the actual number of samples collected for an event, the Hardware Event Counts viewpoint estimates the number of times this event occurred during the collection.
Hardware Event Sample Counts	Displays the sample count for all collected processor events. While the Hardware Event Counts viewpoint estimates the number of times an event occurred during the collection, the Hardware Event Sample Counts viewpoint provides the actual number of samples collected for this event.
Hardware Issues	Helps identify where the application is not making the best use of available hardware resources. This viewpoint displays metrics derived from hardware performance counters. Hover over the highlighted metric values in the grid to read why an extreme value might represent a performance problem.
Hotspots	Helps identify hotspots - code regions in the application that consume a lot of CPU time.
Bandwidth	Helps identify where the application is generating significant bandwidth to DRAM. Memory bandwidth, in GB/sec, is plotted in the timeline, while events often associated with DRAM requests are shown in the grid. In the timeline, select a region of high bandwidth, and filter that region in. Use the grid to discover where in the code DRAM accesses are being generated.
Task Time	Visualizes tasks, logical units of work on specific threads, based on ITT API annotations. Identify tasks with the highest execution time and analyze threads responsible for a particular task.

Note

The Bandwidth viewpoint is available only if you enable the Analyze memory bandwidth option in the General Exploration configuration pane.
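
The relationship between the Hardware Event Counts and Hardware Event Sample Counts viewpoints can be sketched as follows. With event-based sampling, a sample is recorded each time an event counter reaches a configured sampling interval (often called the sample-after value), so the event count is estimated as the number of samples multiplied by that interval. The function name and the numbers below are illustrative assumptions, not values taken from the product.

```python
def estimate_event_count(sample_count, sample_after_value):
    """Estimate how many times an event occurred from its sample count.

    sample_count: number of samples actually collected for the event.
    sample_after_value: number of event occurrences between two samples
        (an assumed, illustrative sampling interval).
    """
    if sample_count < 0 or sample_after_value <= 0:
        raise ValueError("sample_count must be >= 0 and sample_after_value > 0")
    return sample_count * sample_after_value


# Made-up example: 1500 samples with one sample taken every 2,000,003 events.
print(estimate_event_count(1500, 2_000_003))
```

Because the result is an estimate, events with very few samples carry a large relative error and should be read with caution.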

These viewpoints may include the following windows:

Summary window displays statistics on the overall application execution.
Bottom-up window displays performance data per metric (event ratio/event count/sample count) for each hotspot function.
Top-down Tree window displays hotspot functions in the call tree, performance metrics for a function only (Self value) and for a function and its children together (Total value).
PMU Events window displays a count of PMU events selected for the analysis.
Uncore Events window displays a count of uncore events selected for the analysis. If there are no uncore events, the upper pane of the window is empty.
Tasks, Tasks over Time, and Tasks by Threads windows provide details on tasks specified in your code with the Task API.
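
The difference between the Self and Total values shown in the Top-down Tree window can be illustrated with a small sketch. The `CallNode` type, the function names, and the counts are hypothetical. The sketch only shows that a Total value is a function's own Self value plus the Total values of its children.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CallNode:
    """One function in a call tree (hypothetical structure for illustration)."""
    name: str
    self_value: int  # metric attributed to the function itself
    children: List["CallNode"] = field(default_factory=list)


def total_value(node):
    """Total value = Self value of the node plus Total values of all its children."""
    return node.self_value + sum(total_value(child) for child in node.children)


# Made-up call tree: main calls compute and io, and compute calls kernel.
kernel = CallNode("kernel", 600)
compute = CallNode("compute", 150, [kernel])
io = CallNode("io", 200)
main = CallNode("main", 50, [compute, io])

for node in (main, compute, kernel, io):
    print(node.name, "self:", node.self_value, "total:", total_value(node))
```

A function with a small Self value and a large Total value, such as `main` here, is usually not the place to optimize. The cost sits in its callees.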

Optimization Notice
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel. Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice. Notice revision #20110804

Parent topic: Intel® Xeon Phi™ Coprocessor (Code Name: Knights Corner) Analysis
