Issue #7809
closed
timeout on `Pulp2to3MigrationClient::Pulp2RepositoriesApi list({"offset"=>0, "limit"=>2000})`
Description
My Pulpcore worker has been dying with `[CRITICAL] WORKER TIMEOUT (pid:15519)` when Katello tries to list 402 migrated repositories with the limit set to 2000. Reproducing the issue is a little strange: I can list with the 2000 limit in the Foreman Rails console, but doing the same thing during our ImportMigration task results in a killed worker and a 504 proxy error. I'm still trying to figure out the difference, but I do believe there's a Pulp bug in there somewhere, given the dead Pulpcore worker.
If it's necessary to reproduce with Katello:
- Create a Katello nightly production VM
- Sync the RHEL 7 Extras repo and put it in a content view
- Publish the resulting content view 400 times
- Run `foreman-rake katello:pulp3_migration`
Also I'll note that memory usage is almost definitely not the issue here. I haven't caught any instances of high memory usage, at least.
Updated by iballou about 4 years ago
(found a typo, I meant to say 502 instead of 504)
Updated by ttereshc about 4 years ago
I tried on a Katello box, but I used the Python bindings: `Pulp2RepositoriesApi(migration_client).list(offset=0, limit=2000)`.
It took a few seconds, but I didn't notice any issues, though it does create a short spike in CPU load. I have 659 repos to list.
I tried it also when I kicked off a migration task. Nothing is failing so far but I'll keep trying at different stages.
Updated by iballou about 4 years ago
I've discovered that I am only able to reproduce this on Katello nightly production boxes after migrating >= 400 repositories. The issue is not reproducible on a development box, however. I'm not sure yet what difference would cause this issue, but at least it's not the Pulp plugin versions.
Updated by iballou about 4 years ago
I've found it's only reproducible if you do the following:
foreman-rake console
def api_client
  Pulp2to3MigrationClient::ApiClient.new(SmartProxy.pulp_primary!.pulp3_configuration(Pulp2to3MigrationClient::Configuration))
end
def pulp2_repositories_api
  Pulp2to3MigrationClient::Pulp2RepositoriesApi.new(api_client)
end
imported = Katello::Pulp3::Api::Core.fetch_from_list { |opts| pulp2_repositories_api.list(opts) }
Still trying to figure out why fetching with the above code causes the pulpcore worker to die.
Updated by jsherril@redhat.com about 4 years ago
- File fetch_distributions.txt added
- File migrated_repos.txt added
- File plan.txt added
Attaching some files showing:
- a migration plan
- the resulting listing of 2 pulp2repositories. Notice that all distributions are listed for each one
- a fetching of the distributions showing that they are using the same publication
It appears that the API is showing all distributions associated with the pulp3 version, not just the ones that are part of that pulp2 repository. I suspect this is slowing down the API considerably.
Updated by jsherril@redhat.com about 4 years ago
- Subject changed from Worker dies on `Pulp2to3MigrationClient::Pulp2RepositoriesApi list({"offset"=>0, "limit"=>2000})` to timeout on `Pulp2to3MigrationClient::Pulp2RepositoriesApi list({"offset"=>0, "limit"=>2000})`
Updated by ttereshc about 4 years ago
Thanks, jsherrill. Agreed.
Just adding a bit more detail here.
The likely problem is serialization. It's very heavy in certain situations because of the following incorrect behaviour:
If Pulp 2 has N copies of the same repository, there will be one repo version and one publication in Pulp 3, which is good. It will create N distributions, which is correct. However, we serialize `pulp2repository` in a way that shows all the distributions for each publication, so there will be N distributions for each Pulp 2 repo, which is wrong.
The migration plugin needs to show only the distributions relevant to that Pulp 2 repo.
We might need to add a `pulp3_distributions` relation to the `pulp2repository` model to resolve that.
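To make the cost concrete, here is a small plain-Python sketch of the two behaviours. It does not use the plugin's actual models or serializers: the `Publication`, `Distribution`, and `Pulp2Repo` classes, the `pulp2_repo_id` back-reference, and both helper functions are hypothetical. The filter in `distributions_for_repo` only stands in for whatever relation the fix ends up using.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Publication:
    name: str
    # Every distribution that serves this publication.
    distributions: List["Distribution"] = field(default_factory=list)


@dataclass
class Distribution:
    base_path: str
    pulp2_repo_id: str  # hypothetical back-reference to the owning Pulp 2 repo
    publication: Publication


@dataclass
class Pulp2Repo:
    repo_id: str
    publication: Publication


def distributions_via_publication(repo: Pulp2Repo) -> List[str]:
    """Behaviour described as wrong: every distribution of the shared publication."""
    return [d.base_path for d in repo.publication.distributions]


def distributions_for_repo(repo: Pulp2Repo) -> List[str]:
    """Desired output: only the distributions that belong to this Pulp 2 repo."""
    return [
        d.base_path
        for d in repo.publication.distributions
        if d.pulp2_repo_id == repo.repo_id
    ]


def build(n: int) -> List[Pulp2Repo]:
    """Create n copies of one repository sharing a single publication."""
    pub = Publication("shared")
    repos = []
    for i in range(n):
        repo = Pulp2Repo(f"copy-{i}", pub)
        pub.distributions.append(Distribution(f"path/copy-{i}", repo.repo_id, pub))
        repos.append(repo)
    return repos


repos = build(400)
wrong_total = sum(len(distributions_via_publication(r)) for r in repos)
right_total = sum(len(distributions_for_repo(r)) for r in repos)
print(wrong_total, right_total)
```

For 400 copies this prints `160000 400`: the serialized output grows with N squared in the first case and with N in the second.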
Updated by ipanova@redhat.com about 4 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ipanova@redhat.com
- Sprint set to Sprint 85
Added by ipanova@redhat.com about 4 years ago
Revision 1533ea7d
Fix distribution serialization.
Updated by ipanova@redhat.com about 4 years ago
- Status changed from ASSIGNED to MODIFIED
Applied in changeset pulp:pulp-2to3-migration|1533ea7d8d87c6365d2816d130f2b638d3cbc5f2.
Updated by ttereshc about 4 years ago
- Status changed from MODIFIED to CLOSED - CURRENTRELEASE
Fix distribution serialization.
closes #7809 https://pulp.plan.io/issues/7809