Intel® Xeon Phi™ Advanced Modeling Options

When you select a Target System of Intel Xeon Phi or Offload to Intel Xeon Phi coprocessor, additional modeling parameters appear below the Runtime Modeling area, under Intel Xeon Phi Advanced Modeling:

Select Consider Code Vectorization if you plan to modify your parallel code later to improve vector parallel execution. When it is checked, you can specify:
- Reference CPU Vectorization Speedup: the speedup multiplier you expect vectorization to achieve for the current site on the reference CPU (dual-socket 8-core Intel Xeon processor E5-26xx product family at 2.7 GHz, 16 cores total). Base this estimate on the target device characteristics and on your knowledge of how much of this code can be vectorized, and how well.
- Intel Xeon Phi Vectorization Speedup: the speedup multiplier you expect vectorization to achieve for the current site on an Intel Xeon Phi processor. Base this estimate on the target device characteristics and on your knowledge of how much of this code can be vectorized, and how well.

When you choose a Target System of Offload to Intel Xeon Phi, you can also set Offload Transfer Data Size to specify the amount of data, in KB, that you expect the offload to transfer.
Click Apply after modifying any of these values.

In some cases, you can restructure your code to enable more efficient vector operations. Loop vectorization lets the hardware apply one operation to several independent data elements at once, in fixed-size units (usually 64 bytes), as in element-wise operations on data arrays.

One way to enable more efficient vector operations is to split a single loop into a new outer loop and an inner loop that together cover the same iteration space. This technique, called strip-mining, lets the innermost loop use vector operations on small chunks.
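
The strip-mining transformation can be sketched in C as follows. This is a minimal illustration, not code from the product: the function names and the STRIP value are hypothetical, and STRIP is set to 16 on the assumption that one 64-byte vector holds 16 single-precision floats. Both functions compute the same result; the second only restructures the loop so that the inner loop runs over one short, fixed-size chunk at a time.

```c
#include <stddef.h>

/* Hypothetical strip length: 16 floats x 4 bytes = one 64-byte vector. */
#define STRIP 16

/* Original form: a single loop over the whole iteration space. */
void scale_add_simple(size_t n, float a, const float *x, float *y)
{
    for (size_t i = 0; i < n; ++i)
        y[i] += a * x[i];
}

/* Strip-mined form: the outer loop steps from chunk to chunk and the
 * inner loop covers one chunk. Together they visit exactly the same
 * indices, in the same order, as the single loop above. */
void scale_add_strip_mined(size_t n, float a, const float *x, float *y)
{
    for (size_t start = 0; start < n; start += STRIP) {
        /* Clamp the last chunk when n is not a multiple of STRIP. */
        size_t end = (n - start < STRIP) ? n : start + STRIP;
        for (size_t i = start; i < end; ++i)
            y[i] += a * x[i];
    }
}
```

Whether the compiler actually vectorizes the inner loop depends on the compiler, its options, and the target; the compiler vectorization report is the place to check.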

Other ways to enable more efficient vector operations include examining outermost loops where threading parallelism might already be used, and considering whether to vectorize their innermost loops and/or callee functions:

- Certain innermost loops may benefit from OpenMP 4 constructs. That is, under certain conditions you can use both an omp parallel for threading pragma and an omp simd (or similar) vectorization pragma (see the compiler vectorization report and the descriptions at http://openmp.org).
- Certain innermost loops may benefit from the Intel Cilk Plus cilk_for keyword. With Intel Cilk Plus, under certain conditions a single cilk_for can result in threading parallelism of outer loops and vector parallelism of inner loops (see the compiler vectorization report).
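
As a sketch of the OpenMP case, the following C code threads an outer loop with omp parallel for and asks for vectorization of the inner loop, which lives in a callee function, with omp simd. The function names and the matrix-vector workload are hypothetical, chosen only for illustration. The pragmas take effect only when OpenMP support is enabled in the compiler; without it they are ignored and the code runs serially with the same result for this kind of input.

```c
#include <stddef.h>

/* Callee with the innermost loop: dot product of one matrix row and v.
 * The simd construct (OpenMP 4.0) requests vectorization; the reduction
 * clause tells the compiler that sum is accumulated across vector lanes. */
static float row_dot(const float *row, const float *v, size_t cols)
{
    float sum = 0.0f;
    #pragma omp simd reduction(+:sum)
    for (size_t j = 0; j < cols; ++j)
        sum += row[j] * v[j];
    return sum;
}

/* Outermost loop: rows are independent, so they are divided among
 * threads. m is a rows x cols matrix stored in row-major order. */
void matvec(size_t rows, size_t cols, const float *m, const float *v,
            float *out)
{
    #pragma omp parallel for
    for (size_t i = 0; i < rows; ++i)
        out[i] = row_dot(&m[i * cols], v, cols);
}
```

Note that a vectorized floating-point reduction may add the terms in a different order than the serial loop, so results can differ in the last bits for general inputs.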

The processor microarchitecture determines the type of vector instructions that will be supported and thus the size of data the hardware can process efficiently (see http://en.wikipedia.org/wiki/List_of_Intel_CPU_microarchitectures).

For a description of the Intel® Xeon Phi™ coprocessor architecture, visit the Intel® Developer Zone and read such articles as http://software.intel.com/en-us/articles/intel-xeon-phi-coprocessor-codename-knights-corner.
