Intel Advisor offers two tools: Vectorization Advisor, a vectorization optimization tool, and Threading Advisor, a threading design and prototyping tool. Together they help ensure your Fortran, C, and C++ native/managed applications realize their full performance potential on modern processors, such as Intel® Xeon Phi™ processors. This topic is part of a tutorial that shows how to use the Vectorization Advisor on a Windows* platform to add efficient SIMD parallelism to a C++ sample application.
There are many ways to take advantage of the power and flexibility of the Vectorization Advisor. The following workflows (usage scenarios) help you maximize your productivity as quickly as possible.
Note
This tutorial demonstrates the Dig Deeper workflow.
Survey Workflow
In the Survey workflow, the Survey analysis is required and the Trip Counts analysis is optional.
Survey Analysis - Offers integrated compiler report data and performance data all in one place. Use the Survey Report to help identify:
Where vectorization will pay off the most
Whether vectorized loops are providing benefit and, if not, why not
Un-vectorized and under-vectorized loops, and the estimated performance gain of vectorization or better vectorization
How data accessed by vectorized loops is organized, and the estimated performance gain of reorganization
Trip Counts Analysis - Dynamically identifies how many times each loop is invoked and how many iterations it executes (sometimes called call count or loop count, and iteration count, respectively). Use this added information in the Survey Report to make better decisions about your vectorization strategy for particular loops, and to optimize loops that are already parallel.
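The difference between call count and iteration count can be illustrated by instrumenting a loop by hand. This is a minimal sketch: the names LoopStats and scale_rows are hypothetical and are not part of Intel Advisor or the tutorial sample. The counters also add work inside the loop body, so they are for illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical hand-rolled counters, for illustration only.
struct LoopStats {
    std::size_t call_count = 0;        // times the inner loop was entered
    std::size_t total_iterations = 0;  // iterations summed over all entries

    // Average trip count per invocation; 0 if the loop never ran.
    double average_trip_count() const {
        return call_count == 0
                   ? 0.0
                   : static_cast<double>(total_iterations) /
                         static_cast<double>(call_count);
    }
};

// Scales every element of a row-major matrix and records statistics
// for the inner (per-row) loop.
LoopStats scale_rows(std::vector<float>& data, std::size_t rows,
                     std::size_t cols, float factor) {
    assert(data.size() == rows * cols);
    LoopStats stats;
    for (std::size_t r = 0; r < rows; ++r) {
        ++stats.call_count;  // the inner loop is invoked once per row
        for (std::size_t c = 0; c < cols; ++c) {
            data[r * cols + c] *= factor;
            ++stats.total_iterations;  // one inner-loop iteration executed
        }
    }
    return stats;
}
```

For a matrix with 1000 rows and 3 columns, the inner loop is invoked 1000 times but executes only 3 iterations per invocation. A trip count of 3 is shorter than the four float lanes of a 128-bit SIMD register, which is the kind of fact that can change the vectorization strategy for that loop.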
If investigation of the Survey Report shows all loops are vectorizing properly and performance is satisfactory: You are done! Congratulations!
If investigation shows one or more loops are not vectorizing properly or performance is unsatisfactory:
Improve application performance using Recommendations and Compiler Diagnostic Details information to guide your efforts.
Rebuild your modified code.
Run another Survey analysis to verify all loops are vectorizing properly and performance is satisfactory.
Dig Deeper Workflow
In the Dig Deeper workflow, the Survey analysis is required and the Trip Counts analysis is optional (just as in the Survey workflow). The Dig Deeper workflow also offers optional Refinement analyses.
Dependencies analysis - For safety purposes, the compiler is often conservative when assuming data dependencies. Use a Dependencies-focused Refinement Report to check for real data dependencies in loops the compiler did not vectorize because of assumed dependencies. If real dependencies are detected, the analysis can provide additional details to help resolve the dependencies. Your objective: Identify and better characterize real data dependencies that could make forced vectorization unsafe.
Memory Access Patterns (MAP) analysis - Use a MAP-focused Refinement Report to check for various memory issues, such as non-contiguous memory accesses and unit stride vs. non-unit stride accesses. Your objective: Eliminate issues that could lead to significant vector code execution slowdown or block automatic vectorization by the compiler.
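The loop shapes these two Refinement analyses look at can be sketched in a few lines of C++. The functions below are hypothetical illustrations, not code from the tutorial sample, and whether a given compiler vectorizes each one depends on the compiler and its options.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Real loop-carried dependency: iteration i reads the value written by
// iteration i - 1, so forcing straightforward vectorization would be unsafe.
void prefix_sum_in_place(std::vector<float>& a) {
    for (std::size_t i = 1; i < a.size(); ++i) {
        a[i] += a[i - 1];
    }
}

// Assumed dependency: with raw pointers the compiler cannot prove that
// `out` does not overlap `x` or `y`, so it may be conservative (or add a
// run-time overlap check). If the caller never passes overlapping arrays,
// there is no real dependency between iterations.
void add_arrays(const float* x, const float* y, float* out, std::size_t n) {
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = x[i] + y[i];
    }
}

// Unit-stride access: consecutive iterations read adjacent elements.
float sum_unit_stride(const std::vector<float>& v) {
    float total = 0.0f;
    for (std::size_t i = 0; i < v.size(); ++i) {
        total += v[i];
    }
    return total;
}

// Non-unit (constant) stride: consecutive iterations read elements that
// are `stride` positions apart, so the accesses are not contiguous.
float sum_strided(const std::vector<float>& v, std::size_t stride) {
    assert(stride > 0);
    float total = 0.0f;
    for (std::size_t i = 0; i < v.size(); i += stride) {
        total += v[i];
    }
    return total;
}
```

In this sketch, prefix_sum_in_place is the case where a Dependencies-focused report would be expected to confirm a real dependency, while add_arrays is the case where the dependency is only assumed. The two sum functions differ only in stride, which is the property a MAP-focused report characterizes.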
If investigation of the Survey Report shows you need more information (because, for example, there is an assumed dependency compiler diagnostic, or there are expensive memory instructions like gathers, inserts, or shuffles), continue your investigation by:
Marking one or more loops for deeper analysis
Defining the appropriate Project Properties for the Refinement analysis you plan to run
Running one or more Refinement analyses
If this further investigation shows there is room for improvement:
Make the improvements.
Rebuild your modified code.
Run another Survey analysis to verify your application still runs correctly and all test cases pass, all loops are vectorizing properly, and performance is satisfactory.
Otherwise, you are done!
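When verifying that the application still runs correctly after a change, keep in mind that a vectorized floating-point reduction may add values in a different order than the scalar loop, so results can differ in the last bits. A minimal sketch of a tolerance-based check follows. The functions dot_reference, dot_modified, and nearly_equal are hypothetical helpers, and the four partial sums only mimic the reordering a vectorized reduction might perform.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Reference version: plain sequential accumulation in double precision.
double dot_reference(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    double total = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        total += static_cast<double>(a[i]) * static_cast<double>(b[i]);
    }
    return total;
}

// Modified version: four partial sums combined at the end, which changes
// the order of the floating-point additions.
float dot_modified(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float partial[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    const std::size_t n = a.size();
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        for (std::size_t lane = 0; lane < 4; ++lane) {
            partial[lane] += a[i + lane] * b[i + lane];
        }
    }
    float total = (partial[0] + partial[1]) + (partial[2] + partial[3]);
    for (; i < n; ++i) {
        total += a[i] * b[i];  // remainder iterations
    }
    return total;
}

// Relative comparison with a floor of 1.0 on the scale, so that values
// near zero are compared with an absolute tolerance of rel_tol.
bool nearly_equal(double expected, double actual, double rel_tol) {
    const double scale = std::fmax(1.0, std::fabs(expected));
    return std::fabs(expected - actual) <= rel_tol * scale;
}
```

A test case would then compare dot_modified against dot_reference with nearly_equal rather than with ==. The tolerance itself is an assumption that has to be chosen per application.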