Issue #7842

'table db_info already exists' on consecutive migrations

Added by adam.winberg@smhi.se 6 months ago. Updated 4 months ago.

Status: CLOSED - DUPLICATE
Priority: Normal
Assignee:
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Platform Release:
OS:
Triaged: No
Groomed: No
Sprint Candidate: No
Tags:
Sprint:
Quarter:

Description

On a fresh pulp3 installation, running consecutive 2to3 migrations results in the following error (the first migration ran without error):

Nov 16 12:48:54 rq[657971]: Traceback (most recent call last): 
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/rq/worker.py", line 936, in perform_job
Nov 16 12:48:54 rq[657971]:     rv = job.perform() 
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/rq/job.py", line 684, in perform
Nov 16 12:48:54 rq[657971]:     self._result = self._execute()
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/rq/job.py", line 690, in _execute
Nov 16 12:48:54 rq[657971]:     return self.func(*self.args, **self.kwargs)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/tasks/migrate.py", line 140, in migrate_from_pulp2 
Nov 16 12:48:54 rq[657971]:     create_repoversions_publications_distributions(plan)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/migration.py", line 293, in create_repoversions_publications_distributions
Nov 16 12:48:54 rq[657971]:     task_func(*task_args)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/migration.py", line 187, in simple_plugin_migration
Nov 16 12:48:54 rq[657971]:     migrate_repo_distributor(dist_migrator, progress_dist, pulp2_dist)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/migration.py", line 391, in migrate_repo_distributor
Nov 16 12:48:54 rq[657971]:     pulp2dist, repo_version)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_2to3_migration/app/plugin/rpm/repository.py", line 74, in migrate_to_pulp3
Nov 16 12:48:54 rq[657971]:     publish(repo_version.pk, checksum_types=checksum_types)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/publishing.py", line 290, in publish
Nov 16 12:48:54 rq[657971]:     metadata_signing_service=metadata_signing_service
Nov 16 12:48:54 rq[657971]:   File "/usr/lib/python3.6/site-packages/pulp_rpm/app/tasks/publishing.py", line 343, in create_repomd_xml
Nov 16 12:48:54 rq[657971]:     pri_db = cr.PrimarySqlite(pri_db_path)
Nov 16 12:48:54 rq[657971]:   File "/usr/lib64/python3.6/site-packages/createrepo_c/__init__.py", line 202, in __init__
Nov 16 12:48:54 rq[657971]:     Sqlite.__init__(self, path, DB_PRIMARY)
Nov 16 12:48:54 rq[657971]: createrepo_c.CreaterepoCError: Can not create db_info table: table db_info already exists

This is while publishing a 'frozen' rpm repository, i.e. a repo without a feed which we manually copy content to when needed. The content in the repo had changed (content added) between migrations but I don't see how that could be a problem. The error appears at every migration attempt now, while trying to publish the same repo.

Running in an rpm-based installation on RHEL8:

python3-pulp-rpm-3.7.0-1.el8.noarch
python3-pulpcore-3.7.3-1.el8.noarch
python3-pulp-2to3-migration-0.5.1-1.el8.noarch


Related issues

Related to Migration Plugin - Issue #7851: don't generate sqlite db files for yum metadata if pulp2 exporter didn't use generate them (CLOSED - CURRENTRELEASE)

History

#1 Updated by dalley 6 months ago

If I had to guess, this is a problem with working directory management. We give createrepo_c a path to init a database file but the path already exists.

But beyond that, we probably shouldn't be generating the sqlite databases to begin with. https://pulp.plan.io/issues/7851 will likely fix the problem for your use case.
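
The guess above (an init call handed a path that already holds a database) can be illustrated without createrepo_c. The following is only a sketch using Python's standard sqlite3 module: `init_metadata_db` is a hypothetical stand-in for `cr.PrimarySqlite(path)`, and the `db_info` column layout is an assumption made for illustration. It shows that a plain `CREATE TABLE` against an already-initialised file fails with the same "table db_info already exists" text seen in the traceback.

```python
import os
import sqlite3
import tempfile


def init_metadata_db(path):
    """Hypothetical stand-in for cr.PrimarySqlite(path): create db_info in a new file."""
    conn = sqlite3.connect(path)
    try:
        # Plain CREATE TABLE (no IF NOT EXISTS): fails if the file was already
        # initialised. The column layout here is an assumption.
        conn.execute("CREATE TABLE db_info (dbversion INTEGER, checksum TEXT)")
        conn.commit()
    finally:
        conn.close()


with tempfile.TemporaryDirectory() as workdir:
    db_path = os.path.join(workdir, "primary.sqlite")
    init_metadata_db(db_path)  # first publish: the file is new, so this succeeds
    try:
        init_metadata_db(db_path)  # second publish re-using the same path
    except sqlite3.OperationalError as exc:
        error_message = str(exc)
        print(error_message)
```

If this is the mechanism, the failure depends only on a leftover file at the same path, not on the repository content itself.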

#2 Updated by dalley 6 months ago

  • Related to Issue #7851: don't generate sqlite db files for yum metadata if pulp2 exporter didn't use generate them added

#3 Updated by dalley 6 months ago

I attempted to make a reproducer script, but wasn't able to reproduce. Am I missing some steps?

export BASE_ADDR=http://localhost:24817

pulp-admin rpm repo create --download-policy=on_demand --repo-id zoo --feed https://fixtures.pulpproject.org/rpm-unsigned/
pulp-admin rpm repo sync run --repo-id zoo

pulp-admin rpm repo create --repo-id new
pulp-admin rpm repo copy rpm --from-repo-id zoo --to-repo-id new --str-eq name=dog

http POST :24817/pulp/api/v3/migration-plans/ plan='{"plugins": [{"type": "rpm"}]}'

export PLAN_HREF=$(http $BASE_ADDR/pulp/api/v3/migration-plans/ | jq -r '.results[0] | .pulp_href')

http POST :24817${PLAN_HREF}run/

pulp-admin rpm repo copy rpm --from-repo-id zoo --to-repo-id new --str-eq name=bear

http POST :24817${PLAN_HREF}run/

#4 Updated by adam.winberg@smhi.se 6 months ago

Possibly a 'publish' in pulp2 after your last 'pulp-admin' command?

I'm not sure what triggers this, but when I encountered it, the repo that pulp3 was trying to publish had been updated with new content and published in pulp2.

#5 Updated by adam.winberg@smhi.se 5 months ago

I've rerun migrations and have not been able to reproduce this again. I'm not sure what triggers it. Now I instead run into #7876, which happens before publishing, so it may be that the migration fails before it gets to this issue's stage.

#6 Updated by adam.winberg@smhi.se 5 months ago

nvm, now I got the error again. The big change this time compared to the successful migration I ran yesterday is that we added two new repos to pulp2 - one with a feed and one without a feed ('postgres13' and 'frozen-postgres13'). 'postgres13' was synced and content was copied to the frozen repo and both repos were published.

And this morning, when I tried a migration, I once again got: createrepo_c.CreaterepoCError: Can not create db_info table: table db_info already exists

#7 Updated by dalley 5 months ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to dalley

#8 Updated by dalley 4 months ago

Hey, sorry for the delay. I picked this up just before leaving for winter shutdown.

The second case seems like it might be simpler so let's focus on that one. Just to be sure I'm understanding the sequence of events correctly:

  1. You added a new repository with a feed, and synced it into Pulp 2
  2. You created a new repository without a feed, and copied content from the first repo.
  3. Both repositories were published (for the first time)
  4. Neither repository has ever been migrated at this point, they were both created after the last migration.
  5. They are migrated to Pulp 3, and the migration fails, and fails constantly ever after

I updated my script and unfortunately I'm still having trouble reproducing.

Questions:

  • Can you provide the postgresql 13 repo URL that you used?
  • Did you copy all content into the frozen repo, or just some of the content?
  • Does the migration plan just ask for all RPM repositories to be migrated, or does it explicitly list them?

Assumptions (let me know if they aren't valid):

  • Pulp 3 doesn't have any "virgin" repos that were not migrated from Pulp 2?
  • Pulp 2 has several repos besides these two in question, and Pulp 3 has migrated copies of those?

pulp-admin rpm repo create --download-policy=on_demand --repo-id postgresql13 --feed https://download.postgresql.org/pub/repos/yum/13/redhat/rhel-8.3-x86_64/
pulp-admin rpm repo sync run --repo-id postgresql13

pulp-admin rpm repo create --repo-id frozen-postgresql13
pulp-admin rpm repo copy rpm --from-repo-id postgresql13 --to-repo-id frozen-postgresql13
pulp-admin rpm repo publish run --repo-id frozen-postgresql13


http POST :24817/pulp/api/v3/migration-plans/ plan='{"plugins": [{"type": "rpm"}]}'


export PLAN_HREF=$(http $BASE_ADDR/pulp/api/v3/migration-plans/ | jq -r '.results[0] | .pulp_href')


http POST :24817${PLAN_HREF}run/

#9 Updated by adam.winberg@smhi.se 4 months ago

  1. You added a new repository with a feed, and synced it into Pulp 2
  2. You created a new repository without a feed, and copied content from the first repo.
  3. Both repositories were published (for the first time)
  4. Neither repository has ever been migrated at this point, they were both created after the last migration.
  5. They are migrated to Pulp 3, and the migration fails, and fails constantly ever after

Yes, this is correct.

Questions:

  • Can you provide the postgresql 13 repo URL that you used?

https://yum.postgresql.org/13/redhat/rhel-8-x86_64/

  • Did you copy all content into the frozen repo, or just some of the content?

All content.

  • Does the migration plan just ask for all RPM repositories to be migrated, or does it explicitly list them?

No explicit list in the migration plan, just everything from the rpm plugin.

Assumptions (let me know if they aren't valid):

  • Pulp 3 doesn't have any "virgin" repos that were not migrated from Pulp 2?

Correct, no actions made to the pulp3 installation besides the migrations.

  • Pulp 2 has several repos besides these two in question, and Pulp 3 has migrated copies of those?

Correct.

#10 Updated by dalley 4 months ago

So I still haven't reproduced it but I think I figured out what is going on anyways.

The publish task creates a temporary working directory for the metadata files it's constructing. The name of this directory is built from the hostname of the worker and the task ID. During normal publishes and complex migrations it will always be unique, because a new task is spawned for each individual publish operation. That is not the case for "simple" migrations, i.e. "just migrate everything": there, publish() runs repeatedly from the same task, so the working directory gets the same name each time, which is probably why we're getting file name collisions.

If this is what is happening, the next time it happens, try clearing /tmp/ and see if it works.

Refactoring the codepaths for "simple" migrations was planned anyways so we'll keep this in mind when doing so.
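
To make the suspected collision concrete, here is a minimal sketch of the naming scheme described above. Everything in it is an assumption for illustration: `working_dir_path`, this `publish`, and the `<base>/<hostname>/<task id>` layout are hypothetical and are not the actual pulpcore or pulp_rpm code. It only shows that separate tasks get separate directories, while repeated calls from one task land on the same path and find the previous run's file.

```python
import os
import socket
import tempfile


def working_dir_path(base, task_id):
    """Hypothetical naming scheme: <base>/<worker hostname>/<task id>."""
    return os.path.join(base, socket.gethostname(), str(task_id))


def publish(base, task_id):
    """Stand-in for one publish: touch a metadata file in the task's working dir."""
    path = working_dir_path(base, task_id)
    os.makedirs(path, exist_ok=True)  # silently re-uses an existing directory
    db_file = os.path.join(path, "primary.sqlite")
    already_there = os.path.exists(db_file)
    with open(db_file, "a"):
        pass  # create the file if it is missing
    return db_file, already_there


with tempfile.TemporaryDirectory() as base:
    # "Complex" migration: one task per publish, so the paths differ.
    first, _ = publish(base, task_id="task-1")
    second, _ = publish(base, task_id="task-2")
    distinct_tasks_collide = first == second

    # "Simple" migration: publish() is called twice from the same task.
    _, stale_first = publish(base, task_id="task-3")
    _, stale_second = publish(base, task_id="task-3")

print(distinct_tasks_collide, stale_first, stale_second)
```

Under these assumptions the second call within the same task sees a leftover file, which is the situation the db_info error would need.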

#11 Updated by dalley 4 months ago

Alternatively (or additionally), we should probably make sure that WorkingDirectory() doesn't silently re-use an existing directory.
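
One way to get the "don't silently re-use" behaviour is to create the directory with `exist_ok=False`, so a leftover directory raises instead of being picked up. This is a sketch only: `StrictWorkingDirectory` is a hypothetical class written for this example and is not pulpcore's `WorkingDirectory`.

```python
import os
import shutil
import tempfile


class StrictWorkingDirectory:
    """Hypothetical context manager; not pulpcore's WorkingDirectory."""

    def __init__(self, path):
        self.path = path

    def __enter__(self):
        # exist_ok=False raises FileExistsError instead of re-using leftovers.
        os.makedirs(self.path, exist_ok=False)
        return self.path

    def __exit__(self, exc_type, exc, tb):
        # Remove the directory on exit so the next use starts clean.
        shutil.rmtree(self.path, ignore_errors=True)
        return False  # do not swallow exceptions from the with-body


with tempfile.TemporaryDirectory() as base:
    target = os.path.join(base, "worker-host", "task-id")

    with StrictWorkingDirectory(target) as path:
        created = os.path.isdir(path)
        try:
            # A second user of the same path fails loudly while the first is active.
            with StrictWorkingDirectory(target):
                reused = True
        except FileExistsError:
            reused = False

    cleaned_up = not os.path.exists(target)

print(created, reused, cleaned_up)
```

A loud failure at directory creation would point at the collision directly, rather than surfacing later as a createrepo_c error.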

#12 Updated by dalley 4 months ago

Eh, there might be a little more to this. It looks like the directories should be cleaned up automatically. I'll keep investigating.

#13 Updated by dalley 4 months ago

  • Status changed from ASSIGNED to CLOSED - DUPLICATE

This PR should fix the problem if my analysis was correct. I'll close this issue for now but if you experience it again (once the new version lands and you upgrade) please re-open it.

https://github.com/pulp/pulp-2to3-migration/pull/290
