GPU Analysis

Use the Intel® VTune™ Amplifier to profile applications that use a Graphics Processing Unit (GPU) for rendering, video processing, and computations. VTune Amplifier can monitor, analyze, and correlate activities on both the CPU and GPU.

Track the GPU activity to:

  • Identify code regions where your application is CPU or GPU bound

  • Analyze hot OpenCL™ kernels running on the GPU

  • Estimate how effectively the Intel Integrated Graphics is used

To enable GPU analysis, configure a predefined or custom analysis type to collect DirectX* pipeline and Processor Graphics events and, optionally, to trace OpenCL kernel execution on the GPU (if your application uses OpenCL on a GPU). VTune Amplifier then runs the analysis and presents the collected GPU performance data in all available viewpoints.

Use This / To Analyze This

Graphics window:

  • GPU usage by GPU engines over time

  • GPU performance over time by metrics selected from the Analyze Processor Graphics events drop-down menu in the Analysis Type window: Overview or Global/local memory accesses

  • OpenCL kernel execution on Processor Graphics

Bottom-up window/PMU Events window:

  • GPU usage by GPU engines over time

  • GPU Computing Tasks corresponding to OpenCL kernels submitted and executed on Processor Graphics

GPU and CPU Usage Correlation

Configure the VTune Amplifier to analyze DirectX* pipeline events to explore GPU busyness over time and understand whether your application is CPU or GPU bound.

In theory, if the Timeline pane shows that the GPU is busy most of the time, with only small idle gaps between busy intervals, and the GPU software queue rarely drops to zero, your application is GPU bound. If the gaps between busy intervals are large and the CPU is busy during these gaps, your application is CPU bound. Such clear-cut situations are rare, however, and you usually need a detailed analysis to understand all the dependencies. For example, an application may be mistakenly considered GPU bound when GPU engine usage is serialized (for example, when the GPU engines responsible for video processing and for rendering are loaded in turns). In this case, the ineffective scheduling on the GPU is caused by the application code running on the CPU.
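The rule of thumb above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not something VTune Amplifier exposes: the `Sample` record, the `classify` function, and both thresholds are hypothetical, and the timeline is assumed to have been reduced to evenly spaced samples.

```python
from dataclasses import dataclass


@dataclass
class Sample:
    """One evenly spaced time slice of a timeline (hypothetical structure)."""
    gpu_busy: bool    # was any GPU engine busy in this slice?
    cpu_busy: bool    # was the CPU busy in this slice?
    queue_depth: int  # GPU software queue depth in this slice


def classify(samples, busy_threshold=0.9, empty_queue_threshold=0.1):
    """Return 'GPU bound', 'CPU bound', or 'inconclusive'.

    Both thresholds are illustrative assumptions, not values used by any tool.
    """
    if not samples:
        return "inconclusive"

    n = len(samples)
    gpu_busy_frac = sum(s.gpu_busy for s in samples) / n
    empty_queue_frac = sum(s.queue_depth == 0 for s in samples) / n

    # Look at CPU activity only during the gaps in which the GPU is idle.
    gaps = [s for s in samples if not s.gpu_busy]
    cpu_busy_in_gaps = sum(s.cpu_busy for s in gaps) / len(gaps) if gaps else 0.0

    if gpu_busy_frac >= busy_threshold and empty_queue_frac <= empty_queue_threshold:
        return "GPU bound"
    if gpu_busy_frac < busy_threshold and cpu_busy_in_gaps >= busy_threshold:
        return "CPU bound"
    return "inconclusive"
```

A "GPU bound" result from such a check is still only a first guess: it looks at aggregate busyness and cannot detect serialized use of separate GPU engines, which requires inspecting the per-engine timeline.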

When the GPU is heavily loaded over time, you can look deeper to understand whether it is used effectively and whether there is room for improvement. This analysis is possible with the hardware metrics that VTune Amplifier collects for the Render and GPGPU engine of Intel HD Graphics.

Intel HD Graphics Render Engine and Hardware Metrics

A GPU is a highly parallel machine where graphical or computational work is done by an array of small cores, or execution units (EUs). Each EU runs several lightweight threads simultaneously. When one of those threads is picked for execution, it can hide stalls in the other threads that are waiting for data from memory or from other units.

To use the full potential of the GPU, applications should enable the scheduling of as many threads as possible and minimize idle cycles. Minimizing stalls is also very important for graphics and general purpose computing GPU applications.
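To see why scheduling more threads helps, consider a deliberately simplified model, which is an assumption for illustration and not a description of the actual EU scheduler: every thread alternates between a fixed number of compute cycles and a fixed number of stall cycles, and the EU can switch to any ready thread at no cost. The function name and parameters below are hypothetical.

```python
def eu_utilization(threads, compute_cycles, stall_cycles):
    """Estimate the fraction of cycles an EU spends executing (toy model).

    Each thread needs the EU for `compute_cycles` out of every
    `compute_cycles + stall_cycles` cycles, so `threads` of them can keep
    the EU busy for at most threads * compute_cycles of that period.
    """
    if threads < 0 or compute_cycles <= 0 or stall_cycles < 0:
        raise ValueError("expected threads >= 0, compute_cycles > 0, stall_cycles >= 0")
    period = compute_cycles + stall_cycles
    return min(1.0, threads * compute_cycles / period)


# With 10 compute cycles per 70 stall cycles, one thread leaves the EU mostly idle,
# while eight threads are enough (in this model) to hide the stalls completely.
for n in (1, 2, 4, 8):
    print(n, eu_utilization(n, compute_cycles=10, stall_cycles=70))
```

In this model, utilization grows linearly with the number of threads until the stalls are fully covered, which is the intuition behind scheduling as many threads as possible. Real hardware has limits on resident threads and shared resources that the model ignores.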

VTune Amplifier provides an option to monitor Intel GPU hardware events and display metrics on overall GPU resource usage over a sampled period, for example, the ratio of cycles when EUs were idle, stalled, or active, as well as statistics on memory accesses and other functional units. If VTune Amplifier traces OpenCL kernel execution on the GPU, it annotates each kernel with GPU metrics.

The scheme below displays metrics collected by the VTune Amplifier across different parts of the Intel HD Graphics:

GPU metrics help identify how efficiently GPU hardware resources are used and whether any performance improvements are possible. Many metrics are represented as the ratio of cycles when a GPU functional unit is in a specific state to all the cycles available in a sampling period. To see the formula used to calculate a metric, hover over the corresponding column name in the grid. For example, VTune Amplifier collects data for the following basic GPU hardware metrics:

Metric

Formula

EU Array Active

EU Array Stalled

EU Array Idle
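As a rough illustration of the "ratio of cycles" idea described above, the three EU Array metrics can be computed from raw cycle counts as shown below. This is a sketch under stated assumptions, not the tool's actual formulas: it assumes the active, stalled, and idle states together account for every cycle in the sampling period, and the function and parameter names are hypothetical. The authoritative formulas are the ones shown when hovering over the column names in the grid.

```python
def eu_array_ratios(active_cycles, stalled_cycles, idle_cycles):
    """Convert raw EU cycle counts for one sampling period into ratios.

    Assumption: active + stalled + idle covers all cycles in the period.
    """
    total = active_cycles + stalled_cycles + idle_cycles
    if total <= 0:
        raise ValueError("sampling period contains no cycles")
    return {
        "EU Array Active": active_cycles / total,
        "EU Array Stalled": stalled_cycles / total,
        "EU Array Idle": idle_cycles / total,
    }


# Example with made-up counts for a single sampling period.
ratios = eu_array_ratios(active_cycles=600, stalled_cycles=300, idle_cycles=100)
for name, value in ratios.items():
    print(f"{name}: {value:.1%}")
```

Read this way, a high stalled ratio points at threads waiting on memory or other units, while a high idle ratio suggests that too few threads are being scheduled.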
