Created November 23, 2022 23:58
always [madvise] never
mode: always
[always] madvise never
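The three lines above look like the transparent hugepage setting before and after the switch, with the active mode shown in square brackets. The script that drives the runs is not included here, so the following is only a minimal sketch of how such a line could be read. The helper names `active_thp_mode` and `read_thp_mode` are hypothetical; the sysfs path is the standard Linux location for this setting.

```python
from pathlib import Path

# Standard Linux sysfs file that holds the THP policy.
THP_ENABLED = Path("/sys/kernel/mm/transparent_hugepage/enabled")


def active_thp_mode(sysfs_text: str) -> str:
    """Return the bracketed (active) mode from text like 'always [madvise] never'."""
    for token in sysfs_text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    raise ValueError(f"no bracketed mode in {sysfs_text!r}")


def read_thp_mode(path: Path = THP_ENABLED) -> str:
    """Read the sysfs file and return the active mode (Linux only)."""
    return active_thp_mode(path.read_text())
```

For the lines in this log, `active_thp_mode("always [madvise] never")` gives `madvise` and `active_thp_mode("[always] madvise never")` gives `always`.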
Initializing array made of 33554432 64-bit words (256.00 MiB).
Source pages allocated with transparent hugepages: 100.0% (65536 pages, 100.0% flagged)
Legend:
BandW: Implied bandwidth (assuming 64-byte cache line) in MB/s
% Eff: Effectiness of this lane count compared to the prior, as a % of ideal
Speedup: Speedup factor for this many lanes versus one lane
Applying Sattolo's algorithm... chain total: 33554432
Time to sum up the array (linear scan) 0.029 s (x 8 = 0.229 s), bandwidth = 8962.3 MB/s
Size: 33554432 (262144.00 KiB, 256.00 MiB)
---------------------------------------------------------------------
- # of lanes --- time (s) ---- BandW -- ns/hit -- % Eff -- Speedup --
---------------------------------------------------------------------
1 2.630047 779 78.4 0% 1.0
2 1.330493 1539 39.7 99% 2.0
3 0.906941 2258 27.0 96% 2.9
4 0.692072 2959 20.6 95% 3.8
5 0.564003 3631 16.8 93% 4.7
6 0.483775 4233 14.4 85% 5.4
7 0.432110 4740 12.9 75% 6.1
8 0.377198 5430 11.2 102% 7.0
9 0.345835 5922 10.3 75% 7.6
10 0.328327 6238 9.8 51% 8.0
11 0.308601 6636 9.2 66% 8.5
12 0.297754 6878 8.9 42% 8.8
13 0.291655 7022 8.7 27% 9.0
14 0.282194 7257 8.4 45% 9.3
15 0.288255 7105 8.6 -32% 9.1
16 0.287719 7118 8.6 3% 9.1
17 0.286870 7139 8.5 5% 9.2
18 0.284358 7202 8.5 16% 9.2
19 0.289436 7076 8.6 -34% 9.1
20 0.295760 6925 8.8 -44% 8.9
21 0.289971 7063 8.6 41% 9.1
22 0.291660 7022 8.7 -13% 9.0
23 0.284008 7211 8.5 60% 9.3
24 0.289926 7064 8.6 -50% 9.1
25 0.292180 7009 8.7 -19% 9.0
26 0.288586 7097 8.6 32% 9.1
27 0.288376 7102 8.6 2% 9.1
28 0.297003 6896 8.9 -84% 8.9
29 0.292822 6994 8.7 41% 9.0
30 0.293671 6974 8.8 -9% 9.0
31 0.289456 7075 8.6 44% 9.1
32 0.287310 7128 8.6 24% 9.2
33 0.295048 6941 8.8 -89% 8.9
34 0.295525 6930 8.8 -6% 8.9
35 0.296414 6909 8.8 -11% 8.9
36 0.287624 7120 8.6 107% 9.1
37 0.295471 6931 8.8 -101% 8.9
38 0.298230 6867 8.9 -35% 8.8
39 0.294439 6956 8.8 50% 8.9
40 0.294616 6951 8.8 -2% 8.9
Maybe you have about 14 parallel paths?
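The legend names the derived columns but does not spell out the arithmetic. The sketch below recomputes them from the `time (s)` column alone. The formulas are inferred by checking them against the first rows of the table and are not taken from the benchmark's source, so treat them as an assumption. In particular, the BandW figures only match when "MB" is read as 2^20 bytes, and `% Eff` appears to be the time actually saved relative to the previous lane count, divided by the time that perfect scaling from that lane count would have saved. The function name `derived_columns` is hypothetical.

```python
CACHE_LINE_BYTES = 64   # stated in the legend
MIB = 1 << 20           # assumption: the log's "MB" matches 2**20 bytes


def derived_columns(times, hits):
    """Recompute BandW, ns/hit, % Eff and Speedup from per-lane-count timings.

    times[i] is the wall time in seconds for i + 1 lanes;
    hits is the number of chained loads (the array length in the log).
    """
    rows = []
    for i, t in enumerate(times):
        lanes = i + 1
        if i == 0:
            eff = 0.0  # no prior lane count to compare against
        else:
            prev = times[i - 1]
            ideal = prev * (lanes - 1) / lanes   # perfect scaling from lanes - 1
            eff = 100.0 * (prev - t) / (prev - ideal)
        rows.append({
            "lanes": lanes,
            "bandwidth": hits * CACHE_LINE_BYTES / t / MIB,
            "ns_per_hit": t * 1e9 / hits,
            "eff_percent": eff,
            "speedup": times[0] / t,
        })
    return rows


if __name__ == "__main__":
    # First three timings from the hugepage run above.
    for row in derived_columns([2.630047, 1.330493, 0.906941], 33554432):
        print(row["lanes"], round(row["bandwidth"]), round(row["ns_per_hit"], 1),
              f'{round(row["eff_percent"])}%', round(row["speedup"], 1))
```

Under these assumptions a negative `% Eff` simply means that adding a lane made the run slower than the previous lane count, which is why the column swings widely once the timings flatten out and differ only by noise.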
mode: never
always madvise [never]
Initializing array made of 33554432 64-bit words (256.00 MiB).
Source pages allocated with transparent hugepages: 0.0% (65536 pages, 100.0% flagged)
Legend:
BandW: Implied bandwidth (assuming 64-byte cache line) in MB/s
% Eff: Effectiness of this lane count compared to the prior, as a % of ideal
Speedup: Speedup factor for this many lanes versus one lane
Applying Sattolo's algorithm... chain total: 33554432
Time to sum up the array (linear scan) 0.029 s (x 8 = 0.232 s), bandwidth = 8837.3 MB/s
Size: 33554432 (262144.00 KiB, 256.00 MiB)
---------------------------------------------------------------------
- # of lanes --- time (s) ---- BandW -- ns/hit -- % Eff -- Speedup --
---------------------------------------------------------------------
1 2.980930 687 88.8 0% 1.0
2 1.505097 1361 44.9 99% 2.0
3 1.023700 2001 30.5 96% 2.9
4 0.783129 2615 23.3 94% 3.8
5 0.649609 3153 19.4 85% 4.6
6 0.556561 3680 16.6 86% 5.4
7 0.490677 4174 14.6 83% 6.1
8 0.442962 4623 13.2 78% 6.7
9 0.401537 5100 12.0 84% 7.4
10 0.371809 5508 11.1 74% 8.0
11 0.349054 5867 10.4 67% 8.5
12 0.334601 6121 10.0 50% 8.9
13 0.323253 6336 9.6 44% 9.2
14 0.319347 6413 9.5 17% 9.3
15 0.307541 6659 9.2 55% 9.7
16 0.305902 6695 9.1 9% 9.7
17 0.311355 6578 9.3 -30% 9.6
18 0.314503 6512 9.4 -18% 9.5
19 0.314078 6521 9.4 3% 9.5
20 0.312483 6554 9.3 10% 9.5
21 0.315485 6492 9.4 -20% 9.4
22 0.308035 6649 9.2 52% 9.7
23 0.310508 6596 9.3 -18% 9.6
24 0.315087 6500 9.4 -35% 9.5
25 0.314735 6507 9.4 3% 9.5
26 0.317841 6443 9.5 -26% 9.4
27 0.318662 6427 9.5 -7% 9.4
28 0.316672 6467 9.4 17% 9.4
29 0.317025 6460 9.4 -3% 9.4
30 0.315062 6500 9.4 19% 9.5
31 0.316171 6478 9.4 -11% 9.4
32 0.319314 6414 9.5 -32% 9.3
33 0.320902 6382 9.6 -16% 9.3
34 0.321286 6374 9.6 -4% 9.3
35 0.323953 6322 9.7 -29% 9.2
36 0.326638 6270 9.7 -30% 9.1
37 0.324422 6313 9.7 25% 9.2
38 0.321375 6373 9.6 36% 9.3
39 0.320386 6392 9.5 12% 9.3
40 0.323924 6322 9.7 -44% 9.2
Maybe you have about 15 parallel paths?
Done.
Restauring hugepages to madvise
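Both runs report "Applying Sattolo's algorithm... chain total: 33554432", and the chain total equals the array length, which is consistent with the array being turned into one single cycle that visits every slot. The benchmark's source is not part of this log, so the following is only a small sketch of the textbook Sattolo shuffle and of walking several independent cursors ("lanes") through the resulting chain. The names `sattolo_cycle`, `cycle_length` and `chase` are hypothetical, and Python cannot expose the memory-level parallelism the benchmark measures; the code shows the structure only.

```python
import random


def sattolo_cycle(n, rng=None):
    """Return a permutation of range(n) that forms a single cycle of length n."""
    rng = rng or random.Random()
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randrange(i)          # 0 <= j < i, never i itself
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def cycle_length(perm, start=0):
    """Follow perm from start and count the steps needed to return to start."""
    steps, pos = 0, start
    while True:
        pos = perm[pos]
        steps += 1
        if pos == start:
            return steps


def chase(perm, starts, steps):
    """Advance one cursor per lane through the chain, all lanes in lockstep."""
    cursors = list(starts)
    for _ in range(steps):
        cursors = [perm[c] for c in cursors]
    return cursors


if __name__ == "__main__":
    perm = sattolo_cycle(1 << 16, random.Random(1))
    print("chain total:", cycle_length(perm))
```

Excluding `i` itself from the swap candidates is what separates Sattolo's algorithm from a Fisher-Yates shuffle. It guarantees one full-length cycle, so a pointer chase cannot get trapped in a short loop that fits in cache.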