Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider to drop -Werror for tests/configs pulling external APIs/frameworks #783

Open
hfp opened this issue Apr 22, 2024 · 12 comments
Open

Comments

@hfp
Copy link
Member

hfp commented Apr 22, 2024

As we pull latest bits/dependencies into tests, our CI is prone to break automatically. For example, some recent ROCm deprecation seems to cause warnings (turned into errors). This may be wanted but if not, -Werror may be disabled for certain configs.

In file included from /__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:13:
/__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp: In function ‘int c_dbcsr_acc_init()’:
/__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:30:40: error: ‘hipError_t hipDevicePrimaryCtxRetain(ihipCtx_t**, hipDevice_t)’ is deprecated: This API is marked as deprecated and may not be supported in future releases. For more details please refer https://github.com/ROCm/HIP/blob/develop/docs/reference/deprecated_api_list.md [-Werror=deprecated-declarations]
30 | ACC_DRV_CALL(DevicePrimaryCtxRetain, (&ctx, acc_device));
/__w/dbcsr/dbcsr/src/acc/cuda_hip/../hip/acc_hip.h:33:35: note: in definition of macro ‘HIP_API_CALL’
33 | hipError_t result = ACC(func) args;
| ^~~~
/__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:30:3: note: in expansion of macro ‘ACC_DRV_CALL’
30 | ACC_DRV_CALL(DevicePrimaryCtxRetain, (&ctx, acc_device));
| ^~~~~~~~~~~~
In file included from /opt/rocm-6.1.0/include/hip/hip_runtime.h:70,
from /__w/dbcsr/dbcsr/src/acc/cuda_hip/../hip/acc_hip.h:13,
from /__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:13:
/opt/rocm-6.1.0/include/hip/hip_runtime_api.h:5128:12: note: declared here
5128 | hipError_t hipDevicePrimaryCtxRetain(hipCtx_t* pctx, hipDevice_t dev);
| ^~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:13:
/__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp: In function ‘int c_dbcsr_acc_finalize()’:
/__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:44:41: error: ‘hipError_t hipDevicePrimaryCtxRelease(hipDevice_t)’ is deprecated: This API is marked as deprecated and may not be supported in future releases. For more details please refer https://github.com/ROCm/HIP/blob/develop/docs/reference/deprecated_api_list.md [-Werror=deprecated-declarations]
44 | ACC_DRV_CALL(DevicePrimaryCtxRelease, (acc_device));
/__w/dbcsr/dbcsr/src/acc/cuda_hip/../hip/acc_hip.h:33:35: note: in definition of macro ‘HIP_API_CALL’
33 | hipError_t result = ACC(func) args;
| ^~~~
/__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:44:3: note: in expansion of macro ‘ACC_DRV_CALL’
44 | ACC_DRV_CALL(DevicePrimaryCtxRelease, (acc_device));
| ^~~~~~~~~~~~
In file included from /opt/rocm-6.1.0/include/hip/hip_runtime.h:70,
from /__w/dbcsr/dbcsr/src/acc/cuda_hip/../hip/acc_hip.h:13,
from /__w/dbcsr/dbcsr/src/acc/cuda_hip/acc_init.cpp:13:
/opt/rocm-6.1.0/include/hip/hip_runtime_api.h:5112:12: note: declared here
5112 | hipError_t hipDevicePrimaryCtxRelease(hipDevice_t dev);
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
cc1plus: all warnings being treated as errors

@alazzaro
Copy link
Member

So, the problem here is that we use FROM rocm/dev-ubuntu-22.04:latest which moved from ROCM 6.0.1 to 6.1.0 and now some HIP functions are deprecated...
I wonder if we should set a tag for ROCM (up to 6.0.2 for example). I don't think we should remove -Werror...
ROCM 6.1 introduced a long list of deprecated functions, however I'm not sure how we should update them... Maybe @gsitaram can help?

@hfp I would say, for the moment, please ignore this problem.

@alazzaro
Copy link
Member

alazzaro commented Apr 22, 2024

@hfp , you can also add -Wno-error=deprecated-declarations for the moment...

@gsitaram
Copy link
Contributor

I'll take a look and get back.

@gsitaram
Copy link
Contributor

gsitaram commented Jun 24, 2024

Hi @alazzaro, @hfp, I see the following message in the ROCm docs:

On the AMD platform, context management APIs are deprecated as there are better alternate interfaces, such as using hipSetDevice and stream APIs to achieve the required functionality.

On the NVIDIA platform, CUDA supports the driver API that defines "Context" and "Devices" as separate entities. Each context contains a single device, which can theoretically have multiple contexts. HIP initially added limited support for these APIs to facilitate easy porting from existing driver codes.

These APIs are only for equivalent driver APIs on the NVIDIA platform.

Are these functions necessary in DBCSR? Or could we replace them with equivalent hipDevice* functionality?

@alazzaro
Copy link
Member

Thanks @gsitaram I have no idea how to answer to your question...
I spent some time to understand how to replace them, but I cannot get it from the AMD documentation...

@gsitaram
Copy link
Contributor

@alazzaro , could you point me to all the functions that DBCSR is using but is deprecated now? Is there an error report from CI that I could refer to?

@alazzaro
Copy link
Member

@alazzaro , could you point me to all the functions that DBCSR is using but is deprecated now? Is there an error report from CI that I could refer to?

For that, please see @hfp issue of today (#810)

@alazzaro
Copy link
Member

Oh, oh, I didn't pay much attention... #810 is related to CUDA...
Then, for rocm, the deprecated functions are:

https://github.com/cp2k/dbcsr/actions/runs/9636368793/job/26574254231

Search for "deprecated and may"...

@dev-zero
Copy link
Contributor

in principle CMake should be adding header files from external packages with -isystem, on the other hand it will then just start failing hard when things are finally being removed

@dev-zero
Copy link
Contributor

moving our docker image to use a tagged upstream and then use something like the pre-commit bot to scan the images periodically might be a better method?

@hfp
Copy link
Member Author

hfp commented Jun 25, 2024

in principle CMake should be adding header files from external packages with -isystem, on the other hand it will then just start failing hard when things are finally being removed

moving our docker image to use a tagged upstream and then use something like the pre-commit bot to scan the images periodically might be a better method?

Are your comments meant for this issue?

@dev-zero
Copy link
Contributor

Are your comments meant for this issue?

yes, but it seems I omitted some part: by passing the include dirs for external API/frameworks with -isystem, at least gcc should not be emitting warnings coming from those headers.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants