Issue #7809

closed

timeout on `Pulp2to3MigrationClient::Pulp2RepositoriesApi list({"offset"=>0, "limit"=>2000})`

Added by iballou about 4 years ago. Updated about 4 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Sprint/Milestone:
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Platform Release:
OS:
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags: Katello
Sprint: Sprint 86
Quarter:

Description

My Pulpcore worker has been dying with [CRITICAL] WORKER TIMEOUT (pid:15519) when Katello tries to list 402 migrated repositories with the limit set to 2000. The issue is tricky to reproduce: listing with the 2000 limit works from the Foreman Rails console, but the same call during our ImportMigration task kills the worker and returns a 504 proxy error. I'm still trying to figure out the difference, but since the Pulpcore worker dies, I believe there is a Pulp bug in there somewhere.

If it's necessary to reproduce with Katello:

  1. Create a Katello nightly production VM
  2. Sync the RHEL 7 Extras repo and put it in a content view
  3. Publish the resulting content view 400 times
  4. Run foreman-rake katello:pulp3_migration

Also, I'll note that memory usage is almost certainly not the issue here; I haven't caught any instances of high memory usage, at least.


Files

fetch_distributions.txt (2.25 KB) jsherril@redhat.com, 11/10/2020 06:14 PM
migrated_repos.txt (133 KB) jsherril@redhat.com, 11/10/2020 06:14 PM
plan.txt (252 KB) jsherril@redhat.com, 11/10/2020 06:14 PM
Actions #1

Updated by iballou about 4 years ago

(found a typo, I meant to say 502 instead of 504)

Actions #2

Updated by ttereshc about 4 years ago

I tried on a Katello box, but using the Python bindings: `Pulp2RepositoriesApi(migration_client).list(offset=0, limit=2000)`. It took a few seconds and I didn't notice any issues, though it does cause a short spike in CPU load. I have 659 repos to list.

I also tried it after kicking off a migration task. Nothing has failed so far, but I'll keep trying at different stages.

Actions #3

Updated by iballou about 4 years ago

I've discovered that I can only reproduce this on Katello nightly production boxes after migrating >= 400 repositories; it is not reproducible on a development box. I'm not sure yet what difference causes this, but it's not the Pulp plugin versions at least.

Actions #4

Updated by iballou about 4 years ago

I've found it's only reproducible if you do the following:

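# From a shell, open the Foreman Rails console:
foreman-rake console

# Then, inside the console, build a migration API client for the primary Pulp smart proxy:
def api_client
  Pulp2to3MigrationClient::ApiClient.new(SmartProxy.pulp_primary!.pulp3_configuration(Pulp2to3MigrationClient::Configuration))
end

def pulp2_repositories_api
  Pulp2to3MigrationClient::Pulp2RepositoriesApi.new(api_client)
end

# List all migrated Pulp 2 repositories through Katello's paging helper:
imported = Katello::Pulp3::Api::Core.fetch_from_list { |opts| pulp2_repositories_api.list(opts) }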

Still trying to figure out why fetching with the above code causes the pulpcore worker to die.
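
For anyone following along, here is a rough sketch of how an offset/limit paging helper of this kind typically behaves. It is not Katello's actual `fetch_from_list` implementation; `fetch_all`, `fake_list` and `FakePage` are hypothetical names, and the list endpoint is replaced by an in-memory stand-in so the snippet runs without Pulp.

```ruby
# Hypothetical stand-in for a list endpoint: returns one page of results
# plus the total count, as offset/limit-paginated APIs commonly do.
FakePage = Struct.new(:count, :results)

def fake_list(all_items, opts)
  offset = opts.fetch("offset", 0)
  limit = opts.fetch("limit", 100)
  FakePage.new(all_items.size, all_items[offset, limit] || [])
end

# Hypothetical paging helper: request pages of `page_size` until the
# reported count is reached or an empty page comes back.
def fetch_all(page_size: 2000)
  items = []
  loop do
    page = yield("offset" => items.size, "limit" => page_size)
    items.concat(page.results)
    break if page.results.empty? || items.size >= page.count
  end
  items
end

repos = (1..402).map { |i| "repo-#{i}" }

requests = 0
fetched = fetch_all(page_size: 100) do |opts|
  requests += 1
  fake_list(repos, opts)
end

puts "fetched #{fetched.size} repos in #{requests} requests"
```

With 402 repositories and a limit of 2000, a helper like this would get everything back in a single response, so whatever the server spends serializing those repositories has to fit inside one request. A smaller page size spreads that cost over several requests, though it would only work around a slow endpoint rather than fix it.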

Actions #5

Updated by jsherril@redhat.com about 4 years ago

Attaching some files showing:

  1. a migration plan
  2. the resulting listing of 2 pulp2repositories. Notice that all distributions are listed for each one
  3. a fetching of the distributions showing that they are using the same publication

It appears that the API is showing all distributions associated with the pulp3 version, not just the ones that are part of that pulp2 repository. I suspect this is slowing down the API considerably.

Actions #6

Updated by jsherril@redhat.com about 4 years ago

  • Subject changed from Worker dies on `Pulp2to3MigrationClient::Pulp2RepositoriesApi list({"offset"=>0, "limit"=>2000})` to timeout on `Pulp2to3MigrationClient::Pulp2RepositoriesApi list({"offset"=>0, "limit"=>2000})`
Actions #7

Updated by ttereshc about 4 years ago

Thanks, jsherril. Agreed.

Just adding a bit more detail here. The likely problem is serialization, which becomes very heavy in certain situations because of the following incorrect behaviour:
If Pulp 2 has N copies of the same repository, they become one repo version and one publication in Pulp 3, which is good. N distributions are created, which is correct. However, we serialize pulp2repository so that it shows all the distributions for each publication, so each Pulp 2 repo lists N distributions, which is wrong.

The migration plugin needs to show only the distributions relevant to that Pulp 2 repo.

We might need to add a pulp3_distributions relation to the pulp2repository model to resolve that.
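
To put numbers on the behaviour described above, here is a small illustrative sketch. It uses plain Ruby structs rather than the plugin's real models; `Distribution`, `by_publication` and `by_repo` are hypothetical names, and the data is made up to mirror the "N copies of one repository" case with N = 400.

```ruby
# Hypothetical, simplified model: each distribution points at a publication
# and at the pulp 2 repo it was created for.
Distribution = Struct.new(:name, :publication, :pulp2_repo_id)

n = 400
distributions = (1..n).map do |i|
  # All N copies share one publication, but each has its own distribution.
  Distribution.new("dist-#{i}", "publication-1", "repo-copy-#{i}")
end
repo_ids = distributions.map(&:pulp2_repo_id)

# Behaviour described above: every distribution of the shared publication
# is listed for each pulp 2 repo.
def by_publication(distributions, publication)
  distributions.select { |d| d.publication == publication }
end

# Intended behaviour: only the distributions belonging to that pulp 2 repo.
def by_repo(distributions, repo_id)
  distributions.select { |d| d.pulp2_repo_id == repo_id }
end

listed_now = repo_ids.sum { |_id| by_publication(distributions, "publication-1").size }
listed_wanted = repo_ids.sum { |id| by_repo(distributions, id).size }

puts "per publication: #{listed_now} distribution entries"   # 160000
puts "per pulp 2 repo: #{listed_wanted} distribution entries" # 400
```

Under these assumptions the response grows with N squared instead of N, which would be consistent with a list call that is fine on a small setup and slow once a content view has been published a few hundred times.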

Actions #8

Updated by dalley about 4 years ago

  • Triaged changed from No to Yes
Actions #9

Updated by ipanova@redhat.com about 4 years ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to ipanova@redhat.com
  • Sprint set to Sprint 85
Actions #10

Updated by rchan about 4 years ago

  • Sprint changed from Sprint 85 to Sprint 86
Actions #11

Updated by ipanova@redhat.com about 4 years ago

  • Status changed from ASSIGNED to MODIFIED
Actions #12

Updated by ttereshc about 4 years ago

  • Sprint/Milestone set to 0.6.0
Actions #13

Updated by ttereshc about 4 years ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
