Discussion on tuning machinery #805

Open
alazzaro opened this issue Jun 14, 2024 · 2 comments

Comments

@alazzaro
Member

Follow-up to #804.

@RMeli
Member

RMeli commented Jun 14, 2024

  • The performance gain with the tuned A100 kernels is minor compared to using the P100 kernels, just as the tuned P100 kernels work reasonably well for V100.

  • It is better to use the full set of autotuned and predicted kernels from the previous GPU generation than to use only a relatively small set of autotuned kernels.

From the comments above (see #804 (comment)), it looks like

Then, the strategy will be to rename the file/parameters in "AMD" and "NVIDIA" and drop the specific GPU version. As I said, I will add a generic kernel which will be good enough for all cases we don't cover with autotuning.

is a good compromise to move forward. But I'm no expert on this, so it's good to hear what people think about this issue.
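
To make the proposal concrete, here is a minimal sketch of how a vendor-level lookup with a generic fallback could behave. It is only an illustration, not the actual implementation: the names `KernelParams`, `TUNED`, `GENERIC`, and `select_kernel`, as well as all parameter values, are hypothetical.

```python
from typing import Dict, NamedTuple, Tuple


class KernelParams(NamedTuple):
    """Hypothetical parameter set describing one small-matrix-multiply kernel."""
    algorithm: str
    tile_m: int
    tile_n: int
    threads: int


# Hypothetical generic kernel: used whenever no tuned entry exists.
GENERIC = KernelParams(algorithm="generic", tile_m=1, tile_n=1, threads=128)

# Hypothetical tuned tables, keyed by vendor only (no GPU model),
# then by the (m, n, k) block sizes. The values are made up.
TUNED: Dict[str, Dict[Tuple[int, int, int], KernelParams]] = {
    "NVIDIA": {(23, 23, 23): KernelParams("medium", 2, 2, 96)},
    "AMD": {(23, 23, 23): KernelParams("small", 1, 2, 64)},
}


def select_kernel(vendor: str, m: int, n: int, k: int) -> KernelParams:
    """Return the tuned parameters for (m, n, k), or the generic kernel."""
    vendor_table = TUNED.get(vendor.upper(), {})
    return vendor_table.get((m, n, k), GENERIC)
```

With such a layout, refreshing the tuning for a newer GPU would only replace entries in the vendor table, while block sizes that were never tuned would keep running through the generic kernel.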

@hfp
Member

hfp commented Jun 14, 2024

Then, the strategy will be to rename the file/parameters in "AMD" and "NVIDIA" and drop the specific GPU version.

Good idea, in particular since a specific tuning may also need maintenance, given that the underlying runtime version can change over time (e.g., a new CUDA version). This also opens a reasonable option to tune or refresh for the latest/deployed GPU, and to naturally phase out some tuning for older GPUs (which is not to say that they would no longer run).
