Tune Histogram on H100 #266

Merged: 4 commits, Jul 26, 2023

Conversation

gevtushenko
Collaborator

Description

closes #246

This PR provides histogram tuning for H100. It also extracts the sample vector size into a tuning policy and fixes the histogram agent so that it works with the stride load algorithm. Only U8 and U16 sample types are tuned. Tuning other sample types gives mixed results and is blocked by #912.
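To make the idea of "extracting the sample vector size into a tuning policy" concrete, here is a minimal sketch. It is not the code from this PR: the names `HistogramPolicySketch`, `PolicyForSampleSize`, `policy_t`, `samples_per_tile`, and `vector_loads_per_thread` are hypothetical, and every numeric value is a placeholder rather than a tuned H100 parameter. It only illustrates how a vector size can become a compile-time policy field that is selected by sample size instead of being hard-coded in the agent.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical policy type: the vector size is a policy parameter
// alongside the block size and the per-thread workload.
template <int BlockThreads, int PixelsPerThread, int VecSize>
struct HistogramPolicySketch
{
  // Assumption of this sketch: a thread issues whole vector loads only.
  static_assert(PixelsPerThread % VecSize == 0,
                "pixels per thread must be a multiple of the vector size");

  static constexpr int block_threads     = BlockThreads;
  static constexpr int pixels_per_thread = PixelsPerThread;
  static constexpr int vec_size          = VecSize; // samples per vectorized load
};

// Fallback for sample sizes without a dedicated tuning (placeholder values).
template <std::size_t SampleSize>
struct PolicyForSampleSize
{
  using type = HistogramPolicySketch<384, 16, 4>;
};

// 1-byte samples (placeholder values, not the tuned ones).
template <>
struct PolicyForSampleSize<1>
{
  using type = HistogramPolicySketch<128, 12, 2>;
};

// 2-byte samples (placeholder values, not the tuned ones).
template <>
struct PolicyForSampleSize<2>
{
  using type = HistogramPolicySketch<160, 10, 2>;
};

// Select a policy from the size of the sample type.
template <class SampleT>
using policy_t = typename PolicyForSampleSize<sizeof(SampleT)>::type;

// Number of samples one thread block consumes per tile.
template <class SampleT>
constexpr int samples_per_tile()
{
  return policy_t<SampleT>::block_threads * policy_t<SampleT>::pixels_per_thread;
}

// Number of vectorized loads each thread issues per tile.
template <class SampleT>
constexpr int vector_loads_per_thread()
{
  return policy_t<SampleT>::pixels_per_thread / policy_t<SampleT>::vec_size;
}
```

With this layout, adding a tuning for another sample size would amount to adding one more specialization, and the agent would read `vec_size` from the policy rather than from a constant of its own.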

HBM3

Even

Bins Entropy I8 I16
32 0.201 -36.74% -12.29%
32 0.544 -53.25% -10.78%
32 1 -28.91% -12.41%
64 0.201 -44.52% -11.12%
64 0.544 -59.84% -6.29%
64 1 -28.23% -12.33%
128 0.201 -50.49% -9.99%
128 0.544 -63.60% -3.24%
128 1 -27.56% -12.38%
2048 0.201 -51.74% 10.63%
2048 0.544 -63.69% 6.39%
2048 1 -26.99% 2.97%
2097152 0.201 -51.05% -34.60%
2097152 0.544 -62.97% -34.59%
2097152 1 -26.47% -35.09%

SampleT{ct} BinT{ct} OffsetT{ct} Elements{io} Bins Entropy Cmp Noise %Diff
I8 I32 I32 2^16 32 0.201 1.73% -6.94%
I8 I32 I32 2^20 32 0.201 2.09% -6.89%
I8 I32 I32 2^24 32 0.201 1.52% -19.45%
I8 I32 I32 2^28 32 0.201 0.61% -36.74%
I8 I32 I32 2^16 64 0.201 1.93% -2.87%
I8 I32 I32 2^20 64 0.201 1.79% -9.01%
I8 I32 I32 2^24 64 0.201 1.56% -27.52%
I8 I32 I32 2^28 64 0.201 0.63% -44.52%
I8 I32 I32 2^16 128 0.201 1.81% -6.54%
I8 I32 I32 2^20 128 0.201 2.79% -7.30%
I8 I32 I32 2^24 128 0.201 1.63% -32.38%
I8 I32 I32 2^28 128 0.201 0.67% -50.49%
I8 I32 I32 2^16 2048 0.201 1.99% -5.60%
I8 I32 I32 2^20 2048 0.201 1.65% -12.38%
I8 I32 I32 2^24 2048 0.201 0.70% -34.54%
I8 I32 I32 2^28 2048 0.201 0.76% -51.74%
I8 I32 I32 2^16 2097152 0.201 1.55% -4.73%
I8 I32 I32 2^20 2097152 0.201 1.29% -8.04%
I8 I32 I32 2^24 2097152 0.201 0.77% -31.17%
I8 I32 I32 2^28 2097152 0.201 0.66% -51.05%
I8 I32 I32 2^16 32 0.544 2.07% -6.18%
I8 I32 I32 2^20 32 0.544 2.35% -8.67%
I8 I32 I32 2^24 32 0.544 1.60% -34.85%
I8 I32 I32 2^28 32 0.544 0.63% -53.25%
I8 I32 I32 2^16 64 0.544 2.16% -5.64%
I8 I32 I32 2^20 64 0.544 2.16% -12.50%
I8 I32 I32 2^24 64 0.544 1.48% -40.67%
I8 I32 I32 2^28 64 0.544 0.62% -59.84%
I8 I32 I32 2^16 128 0.544 2.20% -7.30%
I8 I32 I32 2^20 128 0.544 1.72% -13.73%
I8 I32 I32 2^24 128 0.544 1.61% -46.13%
I8 I32 I32 2^28 128 0.544 0.58% -63.60%
I8 I32 I32 2^16 2048 0.544 3.39% -5.41%
I8 I32 I32 2^20 2048 0.544 1.31% -17.11%
I8 I32 I32 2^24 2048 0.544 0.62% -46.74%
I8 I32 I32 2^28 2048 0.544 0.67% -63.69%
I8 I32 I32 2^16 2097152 0.544 1.70% -3.74%
I8 I32 I32 2^20 2097152 0.544 0.76% -12.85%
I8 I32 I32 2^24 2097152 0.544 0.66% -43.32%
I8 I32 I32 2^28 2097152 0.544 0.55% -62.97%
I8 I32 I32 2^16 32 1 1.56% -1.29%
I8 I32 I32 2^20 32 1 2.07% -4.45%
I8 I32 I32 2^24 32 1 1.58% -13.77%
I8 I32 I32 2^28 32 1 0.68% -28.91%
I8 I32 I32 2^16 64 1 2.31% -2.80%
I8 I32 I32 2^20 64 1 2.99% -5.83%
I8 I32 I32 2^24 64 1 1.46% -12.41%
I8 I32 I32 2^28 64 1 0.65% -28.23%
I8 I32 I32 2^16 128 1 2.42% -3.11%
I8 I32 I32 2^20 128 1 1.59% -4.36%
I8 I32 I32 2^24 128 1 1.61% -12.51%
I8 I32 I32 2^28 128 1 0.65% -27.56%
I8 I32 I32 2^16 2048 1 1.92% -5.45%
I8 I32 I32 2^20 2048 1 0.74% -21.43%
I8 I32 I32 2^24 2048 1 0.53% -37.47%
I8 I32 I32 2^28 2048 1 0.67% -26.99%
I8 I32 I32 2^16 2097152 1 1.29% -3.18%
I8 I32 I32 2^20 2097152 1 0.78% -18.91%
I8 I32 I32 2^24 2097152 1 0.44% -35.26%
I8 I32 I32 2^28 2097152 1 0.57% -26.47%
I16 I32 I32 2^16 32 0.201 1.57% -8.00%
I16 I32 I32 2^20 32 0.201 1.43% -10.21%
I16 I32 I32 2^24 32 0.201 0.82% -7.27%
I16 I32 I32 2^28 32 0.201 0.18% -12.29%
I16 I32 I32 2^16 64 0.201 1.56% -6.50%
I16 I32 I32 2^20 64 0.201 1.66% -8.71%
I16 I32 I32 2^24 64 0.201 0.64% -7.19%
I16 I32 I32 2^28 64 0.201 0.18% -11.12%
I16 I32 I32 2^16 128 0.201 1.93% -5.55%
I16 I32 I32 2^20 128 0.201 1.58% -10.05%
I16 I32 I32 2^24 128 0.201 0.71% -6.56%
I16 I32 I32 2^28 128 0.201 0.19% -9.99%
I16 I32 I32 2^16 2048 0.201 1.19% 3.55%
I16 I32 I32 2^20 2048 0.201 6.33% 2.35%
I16 I32 I32 2^24 2048 0.201 2.40% 23.51%
I16 I32 I32 2^28 2048 0.201 4.29% 10.63%
I16 I32 I32 2^16 2097152 0.201 0.21% -58.20%
I16 I32 I32 2^20 2097152 0.201 0.19% -54.90%
I16 I32 I32 2^24 2097152 0.201 0.24% -47.20%
I16 I32 I32 2^28 2097152 0.201 2.99% -34.60%
I16 I32 I32 2^16 32 0.544 1.62% -0.26%
I16 I32 I32 2^20 32 0.544 1.64% -7.49%
I16 I32 I32 2^24 32 0.544 0.64% -6.37%
I16 I32 I32 2^28 32 0.544 0.19% -10.78%
I16 I32 I32 2^16 64 0.544 1.53% -0.60%
I16 I32 I32 2^20 64 0.544 1.88% -6.61%
I16 I32 I32 2^24 64 0.544 0.70% -4.86%
I16 I32 I32 2^28 64 0.544 0.20% -6.29%
I16 I32 I32 2^16 128 0.544 2.07% 1.79%
I16 I32 I32 2^20 128 0.544 1.55% -7.68%
I16 I32 I32 2^24 128 0.544 0.61% -3.70%
I16 I32 I32 2^28 128 0.544 0.23% -3.24%
I16 I32 I32 2^16 2048 0.544 0.95% 9.01%
I16 I32 I32 2^20 2048 0.544 3.50% -2.87%
I16 I32 I32 2^24 2048 0.544 2.70% 10.00%
I16 I32 I32 2^28 2048 0.544 3.50% 6.39%
I16 I32 I32 2^16 2097152 0.544 0.21% -58.18%
I16 I32 I32 2^20 2097152 0.544 0.19% -54.89%
I16 I32 I32 2^24 2097152 0.544 0.26% -47.20%
I16 I32 I32 2^28 2097152 0.544 2.76% -34.59%
I16 I32 I32 2^16 32 1 2.55% -3.99%
I16 I32 I32 2^20 32 1 1.53% -7.74%
I16 I32 I32 2^24 32 1 0.74% -6.62%
I16 I32 I32 2^28 32 1 0.17% -12.41%
I16 I32 I32 2^16 64 1 1.79% -3.41%
I16 I32 I32 2^20 64 1 2.18% -7.55%
I16 I32 I32 2^24 64 1 0.86% -7.09%
I16 I32 I32 2^28 64 1 0.17% -12.33%
I16 I32 I32 2^16 128 1 2.27% -4.80%
I16 I32 I32 2^20 128 1 1.65% -7.64%
I16 I32 I32 2^24 128 1 0.77% -7.30%
I16 I32 I32 2^28 128 1 0.17% -12.38%
I16 I32 I32 2^16 2048 1 0.92% 7.77%
I16 I32 I32 2^20 2048 1 3.37% -8.55%
I16 I32 I32 2^24 2048 1 4.40% 4.04%
I16 I32 I32 2^28 2048 1 3.64% 2.97%
I16 I32 I32 2^16 2097152 1 0.22% -58.19%
I16 I32 I32 2^20 2097152 1 0.19% -54.89%
I16 I32 I32 2^24 2097152 1 0.26% -47.21%
I16 I32 I32 2^28 2097152 1 2.91% -35.09%

Range

Bins Entropy I8 I16
32 0.201 -36.66% -11.37%
32 0.544 -53.05% -11.37%
32 1 -28.94% -11.44%
64 0.201 -44.38% -11.04%
64 0.544 -59.66% -11.06%
64 1 -28.44% -11.14%
128 0.201 -50.37% -10.77%
128 0.544 -63.53% -10.93%
128 1 -27.75% -10.96%
2048 0.201 -50.36% 2.51%
2048 0.544 -63.50% 0.19%
2048 1 -27.80% -4.37%
2097152 0.201 -49.50% -31.98%
2097152 0.544 -62.78% -31.98%
2097152 1 -27.15% -32.02%

SampleT{ct} BinT{ct} OffsetT{ct} Elements{io} Bins Entropy Cmp Noise %Diff
I8 I32 I32 2^16 32 0.201 1.93% -3.45%
I8 I32 I32 2^20 32 0.201 1.87% -6.20%
I8 I32 I32 2^24 32 0.201 1.54% -18.69%
I8 I32 I32 2^28 32 0.201 0.67% -36.66%
I8 I32 I32 2^16 64 0.201 2.10% -3.27%
I8 I32 I32 2^20 64 0.201 1.83% -5.96%
I8 I32 I32 2^24 64 0.201 1.63% -25.59%
I8 I32 I32 2^28 64 0.201 0.69% -44.38%
I8 I32 I32 2^16 128 0.201 1.65% -3.22%
I8 I32 I32 2^20 128 0.201 1.77% -5.14%
I8 I32 I32 2^24 128 0.201 1.57% -30.80%
I8 I32 I32 2^28 128 0.201 0.65% -50.37%
I8 I32 I32 2^16 2048 0.201 1.22% 0.57%
I8 I32 I32 2^20 2048 0.201 1.84% -3.97%
I8 I32 I32 2^24 2048 0.201 1.25% -29.75%
I8 I32 I32 2^28 2048 0.201 0.68% -50.36%
I8 I32 I32 2^16 2097152 0.201 1.37% -0.84%
I8 I32 I32 2^20 2097152 0.201 1.21% -0.45%
I8 I32 I32 2^24 2097152 0.201 1.17% -21.40%
I8 I32 I32 2^28 2097152 0.201 0.54% -49.50%
I8 I32 I32 2^16 32 0.544 2.12% -2.68%
I8 I32 I32 2^20 32 0.544 1.85% -4.25%
I8 I32 I32 2^24 32 0.544 1.58% -32.22%
I8 I32 I32 2^28 32 0.544 0.66% -53.05%
I8 I32 I32 2^16 64 0.544 1.92% -1.23%
I8 I32 I32 2^20 64 0.544 2.14% -7.52%
I8 I32 I32 2^24 64 0.544 1.49% -38.83%
I8 I32 I32 2^28 64 0.544 0.65% -59.66%
I8 I32 I32 2^16 128 0.544 1.38% -2.13%
I8 I32 I32 2^20 128 0.544 2.36% -10.19%
I8 I32 I32 2^24 128 0.544 1.90% -44.55%
I8 I32 I32 2^28 128 0.544 0.62% -63.53%
I8 I32 I32 2^16 2048 0.544 1.92% -0.80%
I8 I32 I32 2^20 2048 0.544 1.78% -3.82%
I8 I32 I32 2^24 2048 0.544 1.29% -43.74%
I8 I32 I32 2^28 2048 0.544 0.59% -63.50%
I8 I32 I32 2^16 2097152 0.544 1.34% -3.11%
I8 I32 I32 2^20 2097152 0.544 1.29% -2.95%
I8 I32 I32 2^24 2097152 0.544 1.11% -34.80%
I8 I32 I32 2^28 2097152 0.544 0.51% -62.78%
I8 I32 I32 2^16 32 1 1.51% -6.42%
I8 I32 I32 2^20 32 1 1.93% -6.47%
I8 I32 I32 2^24 32 1 1.51% -14.20%
I8 I32 I32 2^28 32 1 0.68% -28.94%
I8 I32 I32 2^16 64 1 2.11% -3.89%
I8 I32 I32 2^20 64 1 2.06% -3.47%
I8 I32 I32 2^24 64 1 1.74% -12.47%
I8 I32 I32 2^28 64 1 0.65% -28.44%
I8 I32 I32 2^16 128 1 2.23% -4.12%
I8 I32 I32 2^20 128 1 1.59% -4.87%
I8 I32 I32 2^24 128 1 1.60% -13.08%
I8 I32 I32 2^28 128 1 0.64% -27.75%
I8 I32 I32 2^16 2048 1 1.96% -0.94%
I8 I32 I32 2^20 2048 1 1.74% -3.11%
I8 I32 I32 2^24 2048 1 1.46% -11.10%
I8 I32 I32 2^28 2048 1 0.64% -27.80%
I8 I32 I32 2^16 2097152 1 1.46% -2.20%
I8 I32 I32 2^20 2097152 1 1.39% 0.67%
I8 I32 I32 2^24 2097152 1 1.06% -1.52%
I8 I32 I32 2^28 2097152 1 0.56% -27.15%
I16 I32 I32 2^16 32 0.201 1.22% -9.01%
I16 I32 I32 2^20 32 0.201 1.62% -13.81%
I16 I32 I32 2^24 32 0.201 0.39% -7.78%
I16 I32 I32 2^28 32 0.201 0.05% -11.37%
I16 I32 I32 2^16 64 0.201 1.25% -6.63%
I16 I32 I32 2^20 64 0.201 1.43% -13.70%
I16 I32 I32 2^24 64 0.201 0.28% -8.22%
I16 I32 I32 2^28 64 0.201 0.04% -11.04%
I16 I32 I32 2^16 128 0.201 1.43% -7.08%
I16 I32 I32 2^20 128 0.201 1.38% -13.79%
I16 I32 I32 2^24 128 0.201 0.27% -7.85%
I16 I32 I32 2^28 128 0.201 0.03% -10.77%
I16 I32 I32 2^16 2048 0.201 1.11% -8.63%
I16 I32 I32 2^20 2048 0.201 2.00% -7.93%
I16 I32 I32 2^24 2048 0.201 1.92% 0.72%
I16 I32 I32 2^28 2048 0.201 6.18% 2.51%
I16 I32 I32 2^16 2097152 0.201 0.15% -57.82%
I16 I32 I32 2^20 2097152 0.201 0.19% -54.63%
I16 I32 I32 2^24 2097152 0.201 0.35% -52.68%
I16 I32 I32 2^28 2097152 0.201 0.07% -31.98%
I16 I32 I32 2^16 32 0.544 1.18% -10.87%
I16 I32 I32 2^20 32 0.544 1.03% -14.66%
I16 I32 I32 2^24 32 0.544 0.33% -8.06%
I16 I32 I32 2^28 32 0.544 0.05% -11.37%
I16 I32 I32 2^16 64 0.544 1.11% -5.96%
I16 I32 I32 2^20 64 0.544 1.28% -15.21%
I16 I32 I32 2^24 64 0.544 0.30% -8.07%
I16 I32 I32 2^28 64 0.544 0.04% -11.06%
I16 I32 I32 2^16 128 0.544 1.18% -7.84%
I16 I32 I32 2^20 128 0.544 1.44% -15.92%
I16 I32 I32 2^24 128 0.544 0.34% -7.99%
I16 I32 I32 2^28 128 0.544 0.04% -10.93%
I16 I32 I32 2^16 2048 0.544 0.90% -2.38%
I16 I32 I32 2^20 2048 0.544 2.59% -9.70%
I16 I32 I32 2^24 2048 0.544 2.14% -0.07%
I16 I32 I32 2^28 2048 0.544 3.69% 0.19%
I16 I32 I32 2^16 2097152 0.544 0.15% -57.81%
I16 I32 I32 2^20 2097152 0.544 0.18% -54.64%
I16 I32 I32 2^24 2097152 0.544 0.36% -52.70%
I16 I32 I32 2^28 2097152 0.544 0.07% -31.98%
I16 I32 I32 2^16 32 1 1.42% -9.57%
I16 I32 I32 2^20 32 1 1.43% -15.18%
I16 I32 I32 2^24 32 1 0.31% -8.50%
I16 I32 I32 2^28 32 1 0.12% -11.44%
I16 I32 I32 2^16 64 1 1.18% -9.27%
I16 I32 I32 2^20 64 1 1.41% -15.31%
I16 I32 I32 2^24 64 1 0.31% -8.79%
I16 I32 I32 2^28 64 1 0.12% -11.14%
I16 I32 I32 2^16 128 1 1.17% -9.61%
I16 I32 I32 2^20 128 1 1.41% -14.88%
I16 I32 I32 2^24 128 1 0.27% -8.89%
I16 I32 I32 2^28 128 1 0.05% -10.96%
I16 I32 I32 2^16 2048 1 0.92% 9.05%
I16 I32 I32 2^20 2048 1 2.41% -5.44%
I16 I32 I32 2^24 2048 1 1.36% -7.39%
I16 I32 I32 2^28 2048 1 0.78% -4.37%
I16 I32 I32 2^16 2097152 1 0.15% -57.80%
I16 I32 I32 2^20 2097152 1 0.17% -54.66%
I16 I32 I32 2^24 2097152 1 0.38% -52.79%
I16 I32 I32 2^28 2097152 1 0.06% -32.02%

HBM2e

Even

Bins Entropy I8 I16
32 0.201 -46.36% -13.28%
32 0.544 -60.92% -11.59%
32 1 -39.97% -14.03%
64 0.201 -53.02% -12.14%
64 0.544 -66.90% -6.89%
64 1 -39.54% -13.98%
128 0.201 -58.49% -10.41%
128 0.544 -68.70% -1.68%
128 1 -38.03% -13.87%
2048 0.201 -58.37% 7.71%
2048 0.544 -68.43% 1.77%
2048 1 -37.87% 1.69%
2097152 0.201 -57.84% -33.94%
2097152 0.544 -68.01% -34.04%
2097152 1 -37.35% -33.40%

SampleT{ct} BinT{ct} OffsetT{ct} Elements{io} Bins Entropy Cmp Noise %Diff
I8 I32 I32 2^16 32 0.201 1.63% -6.37%
I8 I32 I32 2^20 32 0.201 1.43% -12.76%
I8 I32 I32 2^24 32 0.201 1.74% -38.68%
I8 I32 I32 2^28 32 0.201 0.80% -46.36%
I8 I32 I32 2^16 64 0.201 1.77% -6.28%
I8 I32 I32 2^20 64 0.201 1.57% -13.00%
I8 I32 I32 2^24 64 0.201 1.89% -44.44%
I8 I32 I32 2^28 64 0.201 0.87% -53.02%
I8 I32 I32 2^16 128 0.201 1.53% -6.49%
I8 I32 I32 2^20 128 0.201 1.56% -15.21%
I8 I32 I32 2^24 128 0.201 1.72% -50.16%
I8 I32 I32 2^28 128 0.201 0.93% -58.49%
I8 I32 I32 2^16 2048 0.201 1.61% -6.64%
I8 I32 I32 2^20 2048 0.201 1.54% -15.65%
I8 I32 I32 2^24 2048 0.201 1.92% -50.04%
I8 I32 I32 2^28 2048 0.201 0.90% -58.37%
I8 I32 I32 2^16 2097152 0.201 1.21% -3.98%
I8 I32 I32 2^20 2097152 0.201 1.21% -10.00%
I8 I32 I32 2^24 2097152 0.201 1.67% -44.55%
I8 I32 I32 2^28 2097152 0.201 0.90% -57.84%
I8 I32 I32 2^16 32 0.544 1.37% -6.02%
I8 I32 I32 2^20 32 0.544 1.54% -16.98%
I8 I32 I32 2^24 32 0.544 1.74% -52.88%
I8 I32 I32 2^28 32 0.544 1.05% -60.92%
I8 I32 I32 2^16 64 0.544 1.47% -6.15%
I8 I32 I32 2^20 64 0.544 1.39% -22.46%
I8 I32 I32 2^24 64 0.544 1.68% -59.29%
I8 I32 I32 2^28 64 0.544 1.01% -66.90%
I8 I32 I32 2^16 128 0.544 1.41% -5.78%
I8 I32 I32 2^20 128 0.544 1.33% -26.45%
I8 I32 I32 2^24 128 0.544 1.63% -62.52%
I8 I32 I32 2^28 128 0.544 1.12% -68.70%
I8 I32 I32 2^16 2048 0.544 1.55% -6.13%
I8 I32 I32 2^20 2048 0.544 0.94% -19.80%
I8 I32 I32 2^24 2048 0.544 1.07% -59.86%
I8 I32 I32 2^28 2048 0.544 1.10% -68.43%
I8 I32 I32 2^16 2097152 0.544 1.28% -3.58%
I8 I32 I32 2^20 2097152 0.544 0.99% -13.65%
I8 I32 I32 2^24 2097152 0.544 1.06% -55.64%
I8 I32 I32 2^28 2097152 0.544 1.11% -68.01%
I8 I32 I32 2^16 32 1 1.50% -4.19%
I8 I32 I32 2^20 32 1 1.39% -12.21%
I8 I32 I32 2^24 32 1 1.71% -32.71%
I8 I32 I32 2^28 32 1 0.55% -39.97%
I8 I32 I32 2^16 64 1 1.40% -3.40%
I8 I32 I32 2^20 64 1 1.37% -11.76%
I8 I32 I32 2^24 64 1 1.78% -31.98%
I8 I32 I32 2^28 64 1 0.52% -39.54%
I8 I32 I32 2^16 128 1 1.43% -2.81%
I8 I32 I32 2^20 128 1 1.52% -11.45%
I8 I32 I32 2^24 128 1 1.82% -30.57%
I8 I32 I32 2^28 128 1 0.50% -38.03%
I8 I32 I32 2^16 2048 1 1.45% -2.95%
I8 I32 I32 2^20 2048 1 0.73% -21.77%
I8 I32 I32 2^24 2048 1 1.02% -32.95%
I8 I32 I32 2^28 2048 1 0.48% -37.87%
I8 I32 I32 2^16 2097152 1 1.23% -2.00%
I8 I32 I32 2^20 2097152 1 0.86% -15.82%
I8 I32 I32 2^24 2097152 1 0.94% -29.29%
I8 I32 I32 2^28 2097152 1 0.47% -37.35%
I16 I32 I32 2^16 32 0.201 1.92% -11.40%
I16 I32 I32 2^20 32 0.201 1.62% -19.11%
I16 I32 I32 2^24 32 0.201 1.17% -7.48%
I16 I32 I32 2^28 32 0.201 0.35% -13.28%
I16 I32 I32 2^16 64 0.201 1.88% -9.87%
I16 I32 I32 2^20 64 0.201 1.59% -17.79%
I16 I32 I32 2^24 64 0.201 1.22% -5.94%
I16 I32 I32 2^28 64 0.201 0.37% -12.14%
I16 I32 I32 2^16 128 0.201 1.83% -7.31%
I16 I32 I32 2^20 128 0.201 1.58% -15.51%
I16 I32 I32 2^24 128 0.201 1.17% -3.92%
I16 I32 I32 2^28 128 0.201 0.38% -10.41%
I16 I32 I32 2^16 2048 0.201 1.52% -1.66%
I16 I32 I32 2^20 2048 0.201 4.82% 0.56%
I16 I32 I32 2^24 2048 0.201 8.18% 8.16%
I16 I32 I32 2^28 2048 0.201 6.16% 7.71%
I16 I32 I32 2^16 2097152 0.201 3.54% -52.42%
I16 I32 I32 2^20 2097152 0.201 0.32% -55.70%
I16 I32 I32 2^24 2097152 0.201 0.40% -47.74%
I16 I32 I32 2^28 2097152 0.201 1.41% -33.94%
I16 I32 I32 2^16 32 0.544 1.80% -7.92%
I16 I32 I32 2^20 32 0.544 1.62% -15.99%
I16 I32 I32 2^24 32 0.544 1.23% -5.26%
I16 I32 I32 2^28 32 0.544 0.35% -11.59%
I16 I32 I32 2^16 64 0.544 1.76% -4.69%
I16 I32 I32 2^20 64 0.544 1.64% -14.25%
I16 I32 I32 2^24 64 0.544 1.12% -3.93%
I16 I32 I32 2^28 64 0.544 0.47% -6.89%
I16 I32 I32 2^16 128 0.544 1.75% -0.90%
I16 I32 I32 2^20 128 0.544 1.72% -13.26%
I16 I32 I32 2^24 128 0.544 1.11% -1.45%
I16 I32 I32 2^28 128 0.544 0.47% -1.68%
I16 I32 I32 2^16 2048 0.544 1.10% 7.01%
I16 I32 I32 2^20 2048 0.544 3.43% -4.98%
I16 I32 I32 2^24 2048 0.544 3.86% 5.50%
I16 I32 I32 2^28 2048 0.544 2.41% 1.77%
I16 I32 I32 2^16 2097152 0.544 0.72% -52.18%
I16 I32 I32 2^20 2097152 0.544 0.32% -55.72%
I16 I32 I32 2^24 2097152 0.544 0.39% -47.77%
I16 I32 I32 2^28 2097152 0.544 1.31% -34.04%
I16 I32 I32 2^16 32 1 1.98% -11.78%
I16 I32 I32 2^20 32 1 1.60% -20.05%
I16 I32 I32 2^24 32 1 1.02% -9.29%
I16 I32 I32 2^28 32 1 0.28% -14.03%
I16 I32 I32 2^16 64 1 2.02% -11.63%
I16 I32 I32 2^20 64 1 1.49% -20.10%
I16 I32 I32 2^24 64 1 1.10% -9.52%
I16 I32 I32 2^28 64 1 0.32% -13.98%
I16 I32 I32 2^16 128 1 2.03% -11.54%
I16 I32 I32 2^20 128 1 1.45% -20.18%
I16 I32 I32 2^24 128 1 0.93% -9.63%
I16 I32 I32 2^28 128 1 0.34% -13.87%
I16 I32 I32 2^16 2048 1 1.71% 20.22%
I16 I32 I32 2^20 2048 1 1.77% -9.18%
I16 I32 I32 2^24 2048 1 0.94% 4.06%
I16 I32 I32 2^28 2048 1 0.83% 1.69%
I16 I32 I32 2^16 2097152 1 4.73% -52.49%
I16 I32 I32 2^20 2097152 1 0.31% -55.68%
I16 I32 I32 2^24 2097152 1 0.39% -47.76%
I16 I32 I32 2^28 2097152 1 1.69% -33.40%

Range

Bins Entropy I8 I16
32 0.201 -46.40% -11.85%
32 0.544 -60.65% -11.93%
32 1 -40.14% -12.07%
64 0.201 -53.01% -11.47%
64 0.544 -66.66% -11.74%
64 1 -39.63% -11.60%
128 0.201 -58.36% -11.14%
128 0.544 -68.50% -12.12%
128 1 -38.26% -11.31%
2048 0.201 -58.20% -10.11%
2048 0.544 -68.42% -2.99%
2048 1 -38.17% -2.29%
2097152 0.201 -57.34% -22.66%
2097152 0.544 -67.84% -22.64%
2097152 1 -37.48% -22.58%

SampleT{ct} BinT{ct} OffsetT{ct} Elements{io} Bins Entropy Cmp Noise %Diff
I8 I32 I32 2^16 32 0.201 1.33% -7.61%
I8 I32 I32 2^20 32 0.201 1.49% -10.98%
I8 I32 I32 2^24 32 0.201 1.68% -38.61%
I8 I32 I32 2^28 32 0.201 0.78% -46.40%
I8 I32 I32 2^16 64 0.201 1.58% -7.05%
I8 I32 I32 2^20 64 0.201 1.40% -11.75%
I8 I32 I32 2^24 64 0.201 1.73% -44.15%
I8 I32 I32 2^28 64 0.201 0.88% -53.01%
I8 I32 I32 2^16 128 0.201 1.32% -6.98%
I8 I32 I32 2^20 128 0.201 1.67% -13.77%
I8 I32 I32 2^24 128 0.201 1.76% -49.85%
I8 I32 I32 2^28 128 0.201 0.88% -58.36%
I8 I32 I32 2^16 2048 0.201 1.56% -6.97%
I8 I32 I32 2^20 2048 0.201 1.78% -13.13%
I8 I32 I32 2^24 2048 0.201 1.70% -49.23%
I8 I32 I32 2^28 2048 0.201 0.88% -58.20%
I8 I32 I32 2^16 2097152 0.201 1.32% -4.85%
I8 I32 I32 2^20 2097152 0.201 1.10% -6.37%
I8 I32 I32 2^24 2097152 0.201 1.47% -43.26%
I8 I32 I32 2^28 2097152 0.201 0.82% -57.34%
I8 I32 I32 2^16 32 0.544 1.53% -5.19%
I8 I32 I32 2^20 32 0.544 1.69% -14.09%
I8 I32 I32 2^24 32 0.544 1.76% -52.10%
I8 I32 I32 2^28 32 0.544 1.13% -60.65%
I8 I32 I32 2^16 64 0.544 1.54% -4.79%
I8 I32 I32 2^20 64 0.544 1.57% -18.74%
I8 I32 I32 2^24 64 0.544 1.68% -58.53%
I8 I32 I32 2^28 64 0.544 1.11% -66.66%
I8 I32 I32 2^16 128 0.544 1.49% -3.96%
I8 I32 I32 2^20 128 0.544 1.57% -22.80%
I8 I32 I32 2^24 128 0.544 1.65% -61.76%
I8 I32 I32 2^28 128 0.544 1.10% -68.50%
I8 I32 I32 2^16 2048 0.544 1.75% -3.97%
I8 I32 I32 2^20 2048 0.544 1.85% -20.85%
I8 I32 I32 2^24 2048 0.544 1.62% -61.20%
I8 I32 I32 2^28 2048 0.544 1.12% -68.42%
I8 I32 I32 2^16 2097152 0.544 1.36% -5.94%
I8 I32 I32 2^20 2097152 0.544 1.15% -13.49%
I8 I32 I32 2^24 2097152 0.544 1.42% -55.66%
I8 I32 I32 2^28 2097152 0.544 1.06% -67.84%
I8 I32 I32 2^16 32 1 1.47% -3.82%
I8 I32 I32 2^20 32 1 1.42% -6.49%
I8 I32 I32 2^24 32 1 1.70% -32.33%
I8 I32 I32 2^28 32 1 0.55% -40.14%
I8 I32 I32 2^16 64 1 1.43% -3.61%
I8 I32 I32 2^20 64 1 1.45% -6.88%
I8 I32 I32 2^24 64 1 1.75% -31.82%
I8 I32 I32 2^28 64 1 0.55% -39.63%
I8 I32 I32 2^16 128 1 1.64% -3.04%
I8 I32 I32 2^20 128 1 1.40% -4.35%
I8 I32 I32 2^24 128 1 1.80% -30.39%
I8 I32 I32 2^28 128 1 0.51% -38.26%
I8 I32 I32 2^16 2048 1 1.68% -3.43%
I8 I32 I32 2^20 2048 1 1.53% -4.01%
I8 I32 I32 2^24 2048 1 1.74% -29.98%
I8 I32 I32 2^28 2048 1 0.51% -38.17%
I8 I32 I32 2^16 2097152 1 1.48% -3.98%
I8 I32 I32 2^20 2097152 1 1.09% -3.61%
I8 I32 I32 2^24 2097152 1 1.52% -24.97%
I8 I32 I32 2^28 2097152 1 0.49% -37.48%
I16 I32 I32 2^16 32 0.201 1.56% -17.11%
I16 I32 I32 2^20 32 0.201 2.01% -25.63%
I16 I32 I32 2^24 32 0.201 0.68% -10.09%
I16 I32 I32 2^28 32 0.201 0.16% -11.85%
I16 I32 I32 2^16 64 0.201 1.16% -16.06%
I16 I32 I32 2^20 64 0.201 1.88% -25.04%
I16 I32 I32 2^24 64 0.201 0.68% -9.76%
I16 I32 I32 2^28 64 0.201 0.18% -11.47%
I16 I32 I32 2^16 128 0.201 0.97% -15.43%
I16 I32 I32 2^20 128 0.201 1.82% -24.64%
I16 I32 I32 2^24 128 0.201 0.69% -9.27%
I16 I32 I32 2^28 128 0.201 0.22% -11.14%
I16 I32 I32 2^16 2048 0.201 0.86% -11.34%
I16 I32 I32 2^20 2048 0.201 2.27% -15.21%
I16 I32 I32 2^24 2048 0.201 1.92% -7.81%
I16 I32 I32 2^28 2048 0.201 0.23% -10.11%
I16 I32 I32 2^16 2097152 0.201 0.51% -51.32%
I16 I32 I32 2^20 2097152 0.201 0.30% -54.91%
I16 I32 I32 2^24 2097152 0.201 0.23% -47.15%
I16 I32 I32 2^28 2097152 0.201 0.11% -22.66%
I16 I32 I32 2^16 32 0.544 1.18% -15.91%
I16 I32 I32 2^20 32 0.544 1.91% -25.17%
I16 I32 I32 2^24 32 0.544 0.60% -9.96%
I16 I32 I32 2^28 32 0.544 0.15% -11.93%
I16 I32 I32 2^16 64 0.544 1.12% -14.69%
I16 I32 I32 2^20 64 0.544 1.93% -24.15%
I16 I32 I32 2^24 64 0.544 0.61% -9.54%
I16 I32 I32 2^28 64 0.544 0.12% -11.74%
I16 I32 I32 2^16 128 0.544 0.99% -13.73%
I16 I32 I32 2^20 128 0.544 1.82% -24.20%
I16 I32 I32 2^24 128 0.544 0.63% -9.42%
I16 I32 I32 2^28 128 0.544 0.18% -12.12%
I16 I32 I32 2^16 2048 0.544 0.68% -2.58%
I16 I32 I32 2^20 2048 0.544 2.60% -14.39%
I16 I32 I32 2^24 2048 0.544 2.83% 2.30%
I16 I32 I32 2^28 2048 0.544 0.38% -2.99%
I16 I32 I32 2^16 2097152 0.544 1.68% -51.51%
I16 I32 I32 2^20 2097152 0.544 0.32% -54.91%
I16 I32 I32 2^24 2097152 0.544 0.26% -47.20%
I16 I32 I32 2^28 2097152 0.544 0.11% -22.64%
I16 I32 I32 2^16 32 1 1.23% -17.43%
I16 I32 I32 2^20 32 1 1.88% -25.66%
I16 I32 I32 2^24 32 1 0.57% -10.79%
I16 I32 I32 2^28 32 1 0.12% -12.07%
I16 I32 I32 2^16 64 1 1.12% -17.44%
I16 I32 I32 2^20 64 1 1.95% -25.06%
I16 I32 I32 2^24 64 1 0.57% -10.54%
I16 I32 I32 2^28 64 1 0.12% -11.60%
I16 I32 I32 2^16 128 1 1.06% -17.47%
I16 I32 I32 2^20 128 1 1.95% -24.52%
I16 I32 I32 2^24 128 1 0.54% -10.39%
I16 I32 I32 2^28 128 1 0.13% -11.31%
I16 I32 I32 2^16 2048 1 1.45% 13.64%
I16 I32 I32 2^20 2048 1 1.25% -10.87%
I16 I32 I32 2^24 2048 1 0.52% -2.82%
I16 I32 I32 2^28 2048 1 0.15% -2.29%
I16 I32 I32 2^16 2097152 1 1.00% -51.47%
I16 I32 I32 2^20 2097152 1 0.35% -54.92%
I16 I32 I32 2^24 2097152 1 0.25% -47.18%
I16 I32 I32 2^28 2097152 1 0.09% -22.58%

Checklist

  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@gevtushenko gevtushenko requested review from a team as code owners July 25, 2023 08:38
@gevtushenko gevtushenko requested review from elstehle and miscco and removed request for a team July 25, 2023 08:38
Review comment on cub/cub/agent/agent_histogram.cuh (outdated, resolved)
Review comment on cub/cub/agent/agent_histogram.cuh (resolved)
@gevtushenko gevtushenko merged commit 5531a47 into NVIDIA:main Jul 26, 2023
368 checks passed
Linked issue: [FEA]: Tune Histogram on H100