Jobs becoming stuck in processing state #2413
Comments
Can you run the stdump utility when everything is stuck and post the results here? It will dump all the managed stack traces and we'll see the actual methods.
Thank you for the stack traces, but unfortunately the list is cut off at the end. Could you post the full stack trace? It shows something, but the information is incomplete.
Above should be the entire file.
Thanks for the details. I see that a lot of Hangfire's workers are free to receive new tasks, and only 3 of them are busy doing something on a thread pool (so they are probably async methods). What's strange is that [...] Unfortunately, it's really difficult to get information about queued tasks in the Thread Pool, since walking the GC heap is required, and this feature is not implemented yet in [...] It's possible to enable this feature by providing a .AddHangfireServer(x => x.TaskScheduler = null)
We are seeing an issue where jobs randomly become stuck in the Processing state. We are not seeing any exceptions, nor do we have any long-running query or process that would cause a job to remain in the Processing state for any considerable length of time, yet the job will sit in the Processing state until manually re-queued.
This issue brings Hangfire to a halt; once all workers have a job in this state, no more work can be done.
We are using Hangfire Pro and Redis. We updated to the latest Hangfire libraries and are running .NET Framework 4.6.1.
Upon re-queueing, the job completes without an issue. We run tens of thousands of jobs daily across a set of 6 servers.
We wanted to set the VisibilityTimeout, but we found that it applies only when the service is reporting an unhealthy state.
What is the debug path to see why Hangfire appears to not get the update for these? What options do we have?