JobTimeoutException Leaves Temp Files #221

Open
JVickery-TBS opened this issue Jul 24, 2024 · 4 comments

Comments

@JVickery-TBS
Contributor

Hitting rq's JobTimeoutException while XLoader is loading a resource seems to leave temp files behind in the tmp directory.

@JVickery-TBS
Contributor Author

Not even sure if this is possible to fix in Xloader here. I think you need to use rq's push_exc_handler inside of the ckan.cli.jobs.worker method. So I am doing some debugging to see if I can implement something in Core code to add exception handlers for the jobs worker.

And then it would be a matter of figuring out how to get the temp file path/name into the implemented exception handler in Xloader here.
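A rough sketch of what such a handler could look like, for illustration only. rq exception handlers take `(job, exc_type, exc_value, traceback)` and are registered with `Worker.push_exc_handler`. Everything else here is an assumption: the handler name, and the idea that the job would store the temp file path in `job.meta` under a hypothetical `tmp_file_path` key.

```python
import os

try:
    from rq.timeouts import JobTimeoutException
except ImportError:
    # Stand-in so the sketch can run without rq installed.
    class JobTimeoutException(Exception):
        pass


def cleanup_tmp_file_handler(job, exc_type, exc_value, traceback):
    """Remove the temp file of a job that hit the rq timeout."""
    if exc_type is None or not issubclass(exc_type, JobTimeoutException):
        return True  # not a timeout: fall through to the next handler

    # "tmp_file_path" is a hypothetical key the job would have to set itself.
    meta = getattr(job, "meta", None) or {}
    tmp_path = meta.get("tmp_file_path")
    if tmp_path and os.path.exists(tmp_path):
        os.remove(tmp_path)

    return True  # let any remaining handlers run as well


# Registration would happen wherever the worker is created, e.g.:
#     worker.push_exc_handler(cleanup_tmp_file_handler)
```

The open question from above still applies: the job has to record the path somewhere the handler can read it.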

@JVickery-TBS
Contributor Author

Sorry for the sporadic comments on this one. After more debugging, I found that we can catch JobTimeoutException while the file is being downloaded into the temp file, and clean the temp file up there.

The remaining problem is the step where the temp file contents are copied into the database in the loader.py script. If JobTimeoutException is raised at that point, the temp file remains.

Currently I am debugging all of this with load_table and not load_csv.
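For the download step, the idea is roughly the following. This is a minimal sketch and not the actual Xloader code: `download_to_tmp_file`, its `chunks` argument, and `tmp_dir` are hypothetical names, and it assumes the temp file is a `tempfile.NamedTemporaryFile`, which is deleted when closed.

```python
import tempfile

try:
    from rq.timeouts import JobTimeoutException
except ImportError:
    # Stand-in so the sketch can run without rq installed.
    class JobTimeoutException(Exception):
        pass


def download_to_tmp_file(chunks, tmp_dir=None):
    """Write an iterable of byte chunks into a temp file and return it.

    If the rq job timeout fires mid-download, close the temp file
    (which deletes it) and re-raise so the job still fails as usual.
    """
    tmp_file = tempfile.NamedTemporaryFile(suffix=".csv", dir=tmp_dir)
    try:
        for chunk in chunks:
            tmp_file.write(chunk)
        tmp_file.seek(0)
    except JobTimeoutException:
        tmp_file.close()  # NamedTemporaryFile removes the file on close
        raise
    return tmp_file
```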

@JVickery-TBS
Contributor Author

Okay yeah we can fix this in Xloader. (e.g. open-data@62ed5a0#diff-69e6ff3cab84fe327b715b7b1d65f7cb9660b09e076ec36dbdb5e12ffeebe3f6)

I will make a PR next week if I have time. We need to catch the rq timeout exception in a couple of places and then close the tmp file.
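Since the same catch-and-close is needed in more than one place, one option is a small context manager. This is only a sketch of the pattern, not what the linked commit does: `close_tmp_file_on_timeout` is a hypothetical helper, and it assumes the temp file object deletes itself when closed (as `tempfile.NamedTemporaryFile` does by default).

```python
import tempfile
from contextlib import contextmanager

try:
    from rq.timeouts import JobTimeoutException
except ImportError:
    # Stand-in so the sketch can run without rq installed.
    class JobTimeoutException(Exception):
        pass


@contextmanager
def close_tmp_file_on_timeout(tmp_file):
    """Close tmp_file if the wrapped block hits the rq job timeout."""
    try:
        yield tmp_file
    except JobTimeoutException:
        tmp_file.close()  # deletes the file for NamedTemporaryFile
        raise  # the job should still be reported as timed out
```

Each step that touches the temp file (the download, the copy into the database) could then be wrapped in `with close_tmp_file_on_timeout(tmp_file): ...`.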

@JVickery-TBS
Contributor Author

Let's see if it is fixed with this: #223

I have something similar in our fork and it is working on our staging branch. Generally I have only seen this issue with super large resources.
