Created February 23, 2019 04:57
Throughput analysis dump from IACA 2.3
Intel(R) Architecture Code Analyzer Version - 2.3 build:c151d5a (Thu, 6 Jul 2017 09:41:36 +0300)
Analyzed File - aosoa_packet.obj
Binary Format - 64Bit
Architecture - HSW
Analysis Type - Throughput
*******************************************************************
Intel(R) Architecture Code Analyzer Mark Number 1
*******************************************************************
Throughput Analysis Report
--------------------------
Block Throughput: 48.00 Cycles Throughput Bottleneck: Backend. Port5
Port Binding In Cycles Per Iteration:
---------------------------------------------------------------------------------------
| Port | 0 - DV | 1 | 2 - D | 3 - D | 4 | 5 | 6 | 7 |
---------------------------------------------------------------------------------------
| Cycles | 36.0 0.0 | 36.0 | 27.0 27.0 | 27.0 27.0 | 0.0 | 48.0 | 1.0 | 0.0 |
---------------------------------------------------------------------------------------
N - port number or number of cycles resource conflict caused delay, DV - Divider pipe (on port 0)
D - Data fetch pipe (on ports 2 and 3), CP - on a critical path
F - Macro Fusion with the previous instruction occurred
* - instruction micro-ops not bound to a port
^ - Micro Fusion happened
# - ESP Tracking sync uop was issued
@ - SSE instruction followed an AVX256/AVX512 instruction, dozens of cycles penalty is expected
X - instruction not supported, was not accounted in Analysis
| Num Of | Ports pressure in cycles | |
| Uops | 0 - DV | 1 | 2 - D | 3 - D | 4 | 5 | 6 | 7 | |
---------------------------------------------------------------------------------
| 1 | | | | | | 1.0 | | | CP | vshufps xmm3, xmm15, xmm15, 0x0
| 1 | | | | | | 1.0 | | | CP | vshufps xmm5, xmm15, xmm15, 0x55
| 1 | | | | | | 1.0 | | | CP | vshufps xmm7, xmm15, xmm15, 0xaa
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm2, xmm3, xmmword ptr [rip]
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm4, xmm5, xmmword ptr [rip]
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm6, xmm7, xmmword ptr [rip]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm2, xmm2, xmmword ptr [rsi+rcx*1]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x20]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x40]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm2, xmm2, xmmword ptr [rsi+rcx*1+0x10]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x30]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x50]
| 1 | 1.0 | | | | | | | | | vmulps xmm2, xmm2, xmm3
| 1 | 1.0 | | | | | | | | | vmulps xmm4, xmm4, xmm5
| 1 | 1.0 | | | | | | | | | vmulps xmm6, xmm6, xmm7
| 1 | | | | | | 1.0 | | | CP | vshufps xmm8, xmm15, xmm15, 0xff
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm4
| 1 | | 1.0 | | | | | | | | vaddps xmm6, xmm6, xmm8
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm6
| 1 | | | | | | 1.0 | | | CP | vxorps xmm9, xmm9, xmm2
| 1 | | | | | | 1.0 | | | CP | vshufps xmm3, xmm14, xmm14, 0x0
| 1 | | | | | | 1.0 | | | CP | vshufps xmm5, xmm14, xmm14, 0x55
| 1 | | | | | | 1.0 | | | CP | vshufps xmm7, xmm14, xmm14, 0xaa
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm2, xmm3, xmmword ptr [rip]
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm4, xmm5, xmmword ptr [rip]
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm6, xmm7, xmmword ptr [rip]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm2, xmm2, xmmword ptr [rsi+rcx*1]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x20]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x40]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm2, xmm2, xmmword ptr [rsi+rcx*1+0x10]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x30]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x50]
| 1 | 1.0 | | | | | | | | | vmulps xmm2, xmm2, xmm3
| 1 | 1.0 | | | | | | | | | vmulps xmm4, xmm4, xmm5
| 1 | 1.0 | | | | | | | | | vmulps xmm6, xmm6, xmm7
| 1 | | | | | | 1.0 | | | CP | vshufps xmm8, xmm14, xmm14, 0xff
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm4
| 1 | | 1.0 | | | | | | | | vaddps xmm6, xmm6, xmm8
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm6
| 1 | | | | | | 1.0 | | | CP | vxorps xmm9, xmm9, xmm2
| 1 | | | | | | 1.0 | | | CP | vshufps xmm3, xmm13, xmm13, 0x0
| 1 | | | | | | 1.0 | | | CP | vshufps xmm5, xmm13, xmm13, 0x55
| 1 | | | | | | 1.0 | | | CP | vshufps xmm7, xmm13, xmm13, 0xaa
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm2, xmm3, xmmword ptr [rip]
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm4, xmm5, xmmword ptr [rip]
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm6, xmm7, xmmword ptr [rip]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm2, xmm2, xmmword ptr [rsi+rcx*1]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x20]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x40]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm2, xmm2, xmmword ptr [rsi+rcx*1+0x10]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x30]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x50]
| 1 | 1.0 | | | | | | | | | vmulps xmm2, xmm2, xmm3
| 1 | 1.0 | | | | | | | | | vmulps xmm4, xmm4, xmm5
| 1 | 1.0 | | | | | | | | | vmulps xmm6, xmm6, xmm7
| 1 | | | | | | 1.0 | | | CP | vshufps xmm8, xmm13, xmm13, 0xff
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm4
| 1 | | 1.0 | | | | | | | | vaddps xmm6, xmm6, xmm8
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm6
| 1 | | | | | | 1.0 | | | CP | vxorps xmm9, xmm9, xmm2
| 1 | | | | | | 1.0 | | | CP | vshufps xmm3, xmm12, xmm12, 0x0
| 1 | | | | | | 1.0 | | | CP | vshufps xmm5, xmm12, xmm12, 0x55
| 1 | | | | | | 1.0 | | | CP | vshufps xmm7, xmm12, xmm12, 0xaa
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm2, xmm3, xmmword ptr [rip]
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm4, xmm5, xmmword ptr [rip]
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm6, xmm7, xmmword ptr [rip]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm2, xmm2, xmmword ptr [rsi+rcx*1]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x20]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x40]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm2, xmm2, xmmword ptr [rsi+rcx*1+0x10]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x30]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x50]
| 1 | 1.0 | | | | | | | | | vmulps xmm2, xmm2, xmm3
| 1 | 1.0 | | | | | | | | | vmulps xmm4, xmm4, xmm5
| 1 | 1.0 | | | | | | | | | vmulps xmm6, xmm6, xmm7
| 1 | | | | | | 1.0 | | | CP | vshufps xmm8, xmm12, xmm12, 0xff
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm4
| 1 | | 1.0 | | | | | | | | vaddps xmm6, xmm6, xmm8
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm6
| 1 | | | | | | 1.0 | | | CP | vxorps xmm9, xmm9, xmm2
| 1 | | | | | | 1.0 | | | CP | vshufps xmm3, xmm11, xmm11, 0x0
| 1 | | | | | | 1.0 | | | CP | vshufps xmm5, xmm11, xmm11, 0x55
| 1 | | | | | | 1.0 | | | CP | vshufps xmm7, xmm11, xmm11, 0xaa
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm2, xmm3, xmmword ptr [rip]
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm4, xmm5, xmmword ptr [rip]
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm6, xmm7, xmmword ptr [rip]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm2, xmm2, xmmword ptr [rsi+rcx*1]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x20]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x40]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm2, xmm2, xmmword ptr [rsi+rcx*1+0x10]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x30]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x50]
| 1 | 1.0 | | | | | | | | | vmulps xmm2, xmm2, xmm3
| 1 | 1.0 | | | | | | | | | vmulps xmm4, xmm4, xmm5
| 1 | 1.0 | | | | | | | | | vmulps xmm6, xmm6, xmm7
| 1 | | | | | | 1.0 | | | CP | vshufps xmm8, xmm11, xmm11, 0xff
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm4
| 1 | | 1.0 | | | | | | | | vaddps xmm6, xmm6, xmm8
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm6
| 1 | | | | | | 1.0 | | | CP | vxorps xmm9, xmm9, xmm2
| 1 | | | | | | 1.0 | | | CP | vshufps xmm3, xmm10, xmm10, 0x0
| 1 | | | | | | 1.0 | | | CP | vshufps xmm5, xmm10, xmm10, 0x55
| 1 | | | | | | 1.0 | | | CP | vshufps xmm7, xmm10, xmm10, 0xaa
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm2, xmm3, xmmword ptr [rip]
| 2^ | 1.0 | | 1.0 1.0 | | | | | | | vpand xmm4, xmm5, xmmword ptr [rip]
| 2^ | 1.0 | | | 1.0 1.0 | | | | | | vpand xmm6, xmm7, xmmword ptr [rip]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm2, xmm2, xmmword ptr [rsi+rcx*1]
| 2 | | | | 1.0 1.0 | | 1.0 | | | CP | vxorps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x20]
| 2 | | | 1.0 1.0 | | | 1.0 | | | CP | vxorps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x40]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm2, xmm2, xmmword ptr [rsi+rcx*1+0x10]
| 2 | | 1.0 | 1.0 1.0 | | | | | | | vaddps xmm4, xmm4, xmmword ptr [rsi+rcx*1+0x30]
| 2 | | 1.0 | | 1.0 1.0 | | | | | | vaddps xmm6, xmm6, xmmword ptr [rsi+rcx*1+0x50]
| 1 | 1.0 | | | | | | | | | vmulps xmm2, xmm2, xmm3
| 1 | 1.0 | | | | | | | | | vmulps xmm4, xmm4, xmm5
| 1 | 1.0 | | | | | | | | | vmulps xmm6, xmm6, xmm7
| 1 | | | | | | 1.0 | | | CP | vshufps xmm8, xmm10, xmm10, 0xff
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm4
| 1 | | 1.0 | | | | | | | | vaddps xmm6, xmm6, xmm8
| 1 | | 1.0 | | | | | | | | vaddps xmm2, xmm2, xmm6
| 1 | | | | | | 1.0 | | | CP | vxorps xmm9, xmm9, xmm2
| 1 | | | | | | | 1.0 | | | add rcx, 0x60
| 0F | | | | | | | | | | jnz 0xfffffffffffffd54
Total Num Of Uops: 175
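
The summary figures can be cross-checked by hand from the listing. The Python sketch below is not part of the IACA output. It tallies one unrolled block as read off the table, repeats it for the six source registers (xmm15 down to xmm10), and assumes each load splits evenly between ports 2 and 3, which matches the 27.0/27.0 the report shows. The names BLOCK, UNROLL, LOOP_TAIL and summarize are illustrative only.

```python
from collections import Counter

# One unrolled block, read off the listing: (instruction kind, count, uops, ports).
# "load" marks a memory operand; it is assumed to split evenly over ports 2 and 3.
BLOCK = [
    ("vshufps reg", 4, 1, ["5"]),
    ("vpand mem",   3, 2, ["0", "load"]),
    ("vxorps mem",  3, 2, ["5", "load"]),
    ("vaddps mem",  3, 2, ["1", "load"]),
    ("vmulps reg",  3, 1, ["0"]),
    ("vaddps reg",  3, 1, ["1"]),
    ("vxorps reg",  1, 1, ["5"]),
]
UNROLL = 6  # one block per source register, xmm15 down to xmm10

# Loop tail: add goes to port 6, the macro-fused jnz is listed with 0 uops.
LOOP_TAIL = [("add rcx", 1, 1, ["6"]), ("jnz fused", 1, 0, [])]


def summarize():
    """Return (per-port cycle counts, total uops) for one loop iteration."""
    pressure = Counter()
    uops = 0
    for items, repeat in ((BLOCK, UNROLL), (LOOP_TAIL, 1)):
        for _kind, count, n_uops, ports in items:
            uops += repeat * count * n_uops
            for port in ports:
                if port == "load":
                    pressure["2"] += 0.5 * repeat * count
                    pressure["3"] += 0.5 * repeat * count
                else:
                    pressure[port] += repeat * count
    return pressure, uops


pressure, total_uops = summarize()
bottleneck = max(pressure, key=pressure.get)
print(dict(pressure), total_uops, "bottleneck: port", bottleneck)
```

Under these assumptions the tally gives 48 cycles on port 5, 36 each on ports 0 and 1, 27 each on ports 2 and 3, 1 on port 6, and 175 uops, in line with the report. Port 5 carries every vshufps and every vxorps, which is why it sets the block throughput.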