Issue #9011
closed
batch_regenerate_applicability tasks are never assigned a worker and pulp is stuck until restart
Description
Ticket moved to GitHub: "pulp/pulpcore/2024":https://github.com/pulp/pulpcore/issues/2024
Some information: We have observed this on various systems using pulp via Katello.
I actually saw this in Pulp 2.21.4, but that version is not available on plan.io. We also observed the issue in systems using 2.21.5.
Symptoms from Katello:
This happens sporadically with various RPM-based repos, and usually not with the same repo twice in a row. However, it happens consistently enough that Katello systems with daily sync plans get stuck essentially every day. From the Katello side, the sync task simply stays stuck on the Actions::Pulp::Repository::RegenerateApplicability action forever. Once Pulp is restarted, things are unstuck and the next round of syncs succeeds.
Symptoms within Pulp:
It looks like the underlying batch_regenerate_applicability Pulp tasks are never assigned to a worker. We can find several instances of the following task within mongo:
> db.task_status.find({"group_id": {$ne: null},"state": {$ne: "finished"}}).pretty()
{
"_id" : ObjectId("60e136cf61272888f8460e49"),
"task_id" : "280fcb08-db6f-4a96-9f9a-4146f59eb77e",
"exception" : null,
"task_type" : "pulp.server.managers.consumer.applicability.batch_regenerate_applicability",
"tags" : [ ],
"progress_report" : {
},
"worker_name" : null,
"group_id" : BinData(3,"DaJey7wqRG2XSnLKM8WS3g=="),
"finish_time" : null,
"start_time" : null,
"traceback" : null,
"spawned_tasks" : [ ],
"state" : "waiting",
"result" : null,
"error" : null
}
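
To narrow the search to only the tasks that never got a worker, a filter along the following lines might help. This is a minimal sketch: the field names and the task type are taken from the task document shown above, while the helper name unassignedTaskFilter is hypothetical and not part of Pulp. It assumes that stuck tasks always look like that document (state "waiting", no worker_name, no start_time).

```javascript
// Hypothetical helper (not part of Pulp): builds a MongoDB filter for tasks
// of the given type that are still waiting and were never assigned a worker.
// Field names are copied from the task_status document in this report.
function unassignedTaskFilter(taskType) {
  return {
    task_type: taskType,
    state: "waiting",
    // In MongoDB, matching on null also matches documents where the field is missing.
    worker_name: null,
    start_time: null
  };
}

var filter = unassignedTaskFilter(
  "pulp.server.managers.consumer.applicability.batch_regenerate_applicability"
);

// In the mongo shell, where `db` is defined, the filter could be used like this:
//   db.task_status.find(filter).count()
//   db.task_status.find(filter).pretty()
```

A count that keeps growing between syncs, and drops only after a restart, would be consistent with the behaviour described here.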