Intel® VTune™ Amplifier can process and integrate performance statistics collected externally with a custom collector or with your target application in parallel with the native VTune Amplifier analysis. To achieve this, provide the collected custom data as a csv file with a predefined structure and save this file to the VTune Amplifier result directory.
VTune Amplifier can load and process the following data types:
Interval data with start time and end time
Samples with a set of counters
Data may optionally be bound to a thread ID. VTune Amplifier represents data not bound to a particular thread (there are no TID values in the csv file) as frames. Data bound to a thread (there are TID values in the csv file) is represented as tasks.
To make VTune Amplifier interpret the custom statistics from the csv file, make sure the file format meets the following requirements:
File Name
The csv filename should specify the hostname where your custom collector gathered the data, following these format requirements:
Filename format: [user-defined]-hostname-<hostname-of-system>.csv
Where:
[user-defined] is an optional string, for example, describing the type of data collected
-hostname- is required text that must be specified verbatim
<hostname-of-system> is the name of the system where the data is collected. If you use a custom collector, you can retrieve the hostname by using the AMPLXE_HOSTNAME environment variable. If you create a CSV file to import into an existing result, you can either refer to the Summary window, which provides the required hostname in the Collection and Platform Info section > Computer name, or check the corresponding amplxe-cl summary report: amplxe-cl -r <result> -R summary
Example: phases-hostname-octagon53.csv
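The naming rule above can be sketched in Python. This is a minimal illustration, not part of VTune Amplifier: the helper name build_csv_filename is hypothetical, the sketch always requires a prefix even though [user-defined] is optional, and falling back to the local hostname when AMPLXE_HOSTNAME is not set is an assumption of this sketch.

```python
import os
import socket


def build_csv_filename(prefix, hostname=None):
    """Return '<prefix>-hostname-<hostname>.csv' (hypothetical helper)."""
    if hostname is None:
        # AMPLXE_HOSTNAME is the variable named in the text for custom collectors.
        # Falling back to socket.gethostname() is an assumption of this sketch.
        hostname = os.environ.get("AMPLXE_HOSTNAME") or socket.gethostname()
    # The literal "-hostname-" must appear verbatim between the two parts.
    return f"{prefix}-hostname-{hostname}.csv"


print(build_csv_filename("phases", "octagon53"))
```

With the arguments shown, the helper reproduces the example filename phases-hostname-octagon53.csv.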
Format for Interval Values
For imported interval values, use 5 columns, where the order of columns is important:
name,start_tsc.[QPC|CLOCK_MONOTONIC_RAW|RDTSC|UTC],end_tsc,[pid],[tid]
Example with the performance counter timestamp:
name,start_tsc.QPC,end_tsc,pid,tid
frame1,2,30,,
frame1,33,59,,
taskType1,3,43,1,1
taskType2,5,33,1,1
taskType1,46,59,1,1
taskType2,45,54,1,1
Example with the system counter timestamp:
name,start_tsc.UTC,end_tsc,pid,tid
Frame1,2013-08-28 01:02:03.0004,2013-08-28 01:02:03.0005,,
Task,2013-08-28 01:02:03.0004,2013-08-28 01:02:03.0005,1234,1235
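A custom collector could emit the interval format with a few lines of Python. The following is a sketch under stated assumptions: the function name write_intervals and the tuple layout are hypothetical, and None is used here to mean "leave pid/tid empty" so that the row is treated as a frame.

```python
import csv
import io


def write_intervals(out, intervals, clock="QPC"):
    """Write interval rows in the five-column order described above.

    Each item is (name, start, end, pid, tid); pass None for pid/tid
    to leave the field empty (data not bound to a thread).
    """
    writer = csv.writer(out, lineterminator="\n")
    # Header: the clock suffix is attached to the start_tsc column only.
    writer.writerow(["name", f"start_tsc.{clock}", "end_tsc", "pid", "tid"])
    for name, start, end, pid, tid in intervals:
        writer.writerow([
            name, start, end,
            "" if pid is None else pid,
            "" if tid is None else tid,
        ])


buf = io.StringIO()
write_intervals(buf, [
    ("frame1", 2, 30, None, None),   # no pid/tid: empty trailing fields
    ("taskType1", 3, 43, 1, 1),      # bound to pid 1, tid 1
])
print(buf.getvalue())
```

Writing to a file-like object keeps the sketch easy to test; in a real collector the same function could receive a file opened with the required filename.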
Format for Discrete Values
You can import discrete values, such as samples of continuously incrementing counters, optionally bound to a PID/TID.
The following format is required:
tsc.[QPC|CLOCK_MONOTONIC_RAW|RDTSC|UTC],CounterName1.COUNT[,CounterName2.COUNT],[pid],[tid]
Example with the performance counter timestamp bound to a particular process/thread:
tsc.QPC,MyCounter1.COUNT,MyCounter2.COUNT,pid,tid
2,1,3,1,1
5,2,5,1,1
10,3,9,1,1
23,10,23,1,1
Example with the performance counter timestamp not bound to a particular process/thread:
tsc.QPC,MyCounter1.COUNT,MyCounter2.COUNT,pid,tid
2,1,3,,
5,2,5,,
10,3,9,,
23,10,23,,
In the examples above, the first line is a header and other lines are samples with a set of two counters.
Example with the system counter timestamp:
tsc.UTC,MyCounter1.COUNT,MyCounter2.COUNT,pid,tid
2013-08-28 01:02:03.0004,1234,,1234,1235
2013-08-28 01:02:03.0005,1234,,1234,1235
2013-08-28 01:02:03.0006,,1000234,,
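The discrete format can be generated in the same way. This sketch assumes a hypothetical helper write_counter_samples that takes the counter names once and then one (timestamp, values, pid, tid) tuple per sample; None stands for an empty field, as in the UTC example above where some counters have no value in a given sample.

```python
import csv
import io


def write_counter_samples(out, counter_names, samples, clock="QPC"):
    """Write counter samples: timestamp, one column per counter, then pid, tid."""
    writer = csv.writer(out, lineterminator="\n")
    header = [f"tsc.{clock}"]
    header += [f"{name}.COUNT" for name in counter_names]
    header += ["pid", "tid"]
    writer.writerow(header)
    for timestamp, values, pid, tid in samples:
        if len(values) != len(counter_names):
            raise ValueError("each sample needs one value per counter")
        row = [timestamp]
        row += ["" if v is None else v for v in values]
        row += ["" if pid is None else pid, "" if tid is None else tid]
        writer.writerow(row)


buf = io.StringIO()
write_counter_samples(
    buf,
    ["MyCounter1", "MyCounter2"],
    [(2, (1, 3), 1, 1), (5, (2, 5), None, None)],
)
print(buf.getvalue())
```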
Additional Requirements
Make sure each csv file contains only one table. If you need to load several tables, create several csv files with one table per file.
Use commas as value separators.
Use RDTSC, UTC, or the performance counter (QueryPerformanceCounter on Windows OS and CLOCK_MONOTONIC_RAW on Linux OS) to specify event timestamps.
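Before copying a file into the result directory, it may help to check its header against the two formats described above. The following is a minimal sketch, not a VTune Amplifier feature: classify_header is a hypothetical name, and the check assumes that the pid and tid columns are always present in the header, as they are in every example in this topic.

```python
import re

# Timestamp sources listed in the format descriptions above.
CLOCKS = ("QPC", "CLOCK_MONOTONIC_RAW", "RDTSC", "UTC")
_CLOCK_RE = "|".join(CLOCKS)


def classify_header(header_line):
    """Return 'interval', 'discrete', or None for an unrecognized header."""
    cols = [c.strip() for c in header_line.strip().split(",")]
    # Interval format: name,start_tsc.<clock>,end_tsc,pid,tid
    if (len(cols) == 5
            and cols[0] == "name"
            and re.fullmatch(rf"start_tsc\.({_CLOCK_RE})", cols[1])
            and cols[2:] == ["end_tsc", "pid", "tid"]):
        return "interval"
    # Discrete format: tsc.<clock>,<Counter>.COUNT[,...],pid,tid
    if (len(cols) >= 4
            and re.fullmatch(rf"tsc\.({_CLOCK_RE})", cols[0])
            and cols[-2:] == ["pid", "tid"]
            and all(c.endswith(".COUNT") and len(c) > len(".COUNT")
                    for c in cols[1:-2])):
        return "discrete"
    return None


print(classify_header("name,start_tsc.QPC,end_tsc,pid,tid"))
print(classify_header("tsc.UTC,MyCounter1.COUNT,MyCounter2.COUNT,pid,tid"))
print(classify_header("name;start_tsc.QPC;end_tsc"))
```

The last call uses semicolons instead of commas, so the header is not recognized.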
Example 1: Integrating Interval Data Not Bound to a Particular Process
You have a csv file with the following data:
name,start_tsc.QPC,end_tsc,pid,tid
one,3264639089043,3264641632738,,
one,3264635198786,3264712364569,,
two,3265157481653,3265244163253,,
VTune Amplifier processes this data as frames (there are no TID and PID values specified) and displays the frames in the analysis result.
With VTune Amplifier, you can correlate the frame data in the Timeline pane and grid view. You see that frame 4 took longer to process than the subsequent frames 5 and 6 due to the poll_idle() call.
Example 2: Integrating Interval Data Bound to a Process
You have a csv file with the following data:
name,start_tsc.QPC,end_tsc,pid,tid
poll,587837823903,587837823903,6062
read,961264123487437,961264123489641,6062,6062
read,961264123494333,961264123494942,6062,6062
read,961264123517261,961264123518420,6062,6062
poll,961264123522190,961264761068013,6062,6062
read,961264761075744,961264761078562,6062,6062
read,961264761083972,961264761084888,6062,6062
poll,961264761113162,961264761115356,6062,6062
VTune Amplifier processes this data as tasks (TID and PID values are specified) and displays the result in the Tasks and Frames window.
When you analyze thread activity and per-hardware-event performance metrics in this sample, you see that most of the application threads were idle while these tasks executed.
Example 3: Viewing Integrated Hardware Event Data from Command Line
This example provides the hw-events report with external discrete data (counters) integrated into a VTune Amplifier hardware event-based sampling analysis result cl_result.amplxe:
amplxe-cl -R hw-events -group-by=process -r <path>
amplxe: Using result path '<path>'
amplxe: Executing actions 50 % Generating a report
Process          Counter:victim_counter:Self  Counter:victim_counter_x2:Self
---------------  ---------------------------  ------------------------------
itt_and_csv.exe  2                            4
amplxe: Executing actions 100 % done
Example 4: Viewing Integrated Hotspots Data from Command Line
In this example, the hotspots report shows counters bound to a specific process/thread grouped by tasks:
amplxe-cl -R hotspots -group-by=task -r <path>
amplxe: Using result path '<path>'
amplxe: Executing actions 50 % Generating a report
Task Type           CPU Time:Self  Task Time:Self  Overhead Time:Self  Spin Time:Self  Thread Counter:victim_counter:Self  Thread Counter:victim_counter_x2:Self
------------------  -------------  --------------  ------------------  --------------  ---------------------------         ---------------------------------
[Outside any task]  0              0               0                   0               0                                   2
ITT Task            0              0.009           0                   0               2                                   6
victim_task         0              0.000           0                   0               0                                   0
amplxe: Executing actions 100 % done
In this example, the hotspots report shows counters not bound to a specific thread/process grouped by frame domain:
amplxe-cl -R hotspots -group-by=frame-domain -r <path>
amplxe: Using result path '<path>'
amplxe: Executing actions 50 % Generating a report
Frame Domain      Frame Time:Self  Counter:global_counter:Self  Counter:global_counter_x2:Self
------------      ---------------  ---------------------------  -----------------------------
cuscol_frame      0.126            4                            8
cuscol_utc_frame  0.126            4                            8
amplxe: Executing actions 100 % done
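If you generate these reports from a script, the command lines shown in Examples 3 and 4 can be assembled programmatically. The sketch below only builds the argument list using the options that appear above (-R, -group-by, -r); the helper name build_report_command and the result directory name r000hs are hypothetical, and running the command requires amplxe-cl to be installed and on the PATH.

```python
import shlex


def build_report_command(report, group_by, result_dir):
    """Build an amplxe-cl report command as an argument list (hypothetical helper)."""
    return [
        "amplxe-cl",
        "-R", report,                # report type, e.g. hotspots or hw-events
        f"-group-by={group_by}",     # grouping, e.g. task, process, frame-domain
        "-r", result_dir,            # path to the result directory
    ]


cmd = build_report_command("hotspots", "task", "r000hs")
print(shlex.join(cmd))
# To run it (assumes amplxe-cl is available on this system):
# import subprocess; subprocess.run(cmd, check=True)
```

Passing the arguments as a list rather than a single string avoids shell quoting problems when the result path contains spaces.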