Issue #2278
closed
Remove checksum_type from the srpm and drpm collections
Description
[UPDATE]
We need to write a migration which removes the checksum_type field from the srpm and drpm collections. This is similar to the work done with this migration[0].
[0]: https://github.com/pulp/pulp_rpm/pull/821/files
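A minimal sketch of what such a migration could look like, not the actual changeset. It assumes a pymongo-style database object (pymongo 3 or later, for `update_many`) and that the unit collections are named `units_srpm` and `units_drpm`. The collection names and the `migrate` helper are assumptions for illustration, so verify them against your own database.

```python
# Assumed collection names; check them against your own database.
COLLECTIONS = ("units_srpm", "units_drpm")
FIELD = "checksum_type"


def migrate(db, collections=COLLECTIONS, field=FIELD):
    """Unset `field` on every document that still carries it.

    `db` is expected to behave like a pymongo Database: indexing it by
    name returns a collection with an `update_many(filter, update)` method.
    Returns a dict mapping collection name to the number of modified documents.
    """
    modified = {}
    for name in collections:
        result = db[name].update_many(
            {field: {"$exists": True}},  # only touch documents that have the field
            {"$unset": {field: ""}},     # the value given to $unset is ignored
        )
        modified[name] = result.modified_count
    return modified
```

The filter on `$exists` keeps the update from touching documents that are already clean, which makes the function safe to run more than once.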
I'm running the process defined here: https://raw.githubusercontent.com/pulp/pulp/pulp-2.8.0-1/playpen/mongoengine/README and failing with errors. I've attached the log file to this issue.
[root@pulp-p01 tmp]# cat /etc/system-release
Scientific Linux release 6.6 (Carbon)
[root@pulp-p01 tmp]# uname -a
Linux pulp-p01.xxx.xxx.xxx 2.6.32-504.8.1.el6.x86_64 #1 SMP Tue Jan 27 13:39:10 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
[root@pulp-p01 tmp]# rpm -qa | grep pulp
python-pulp-bindings-2.7.1-1.el6.noarch
pulp-puppet-consumer-extensions-2.7.1-1.el6.noarch
python-pulp-repoauth-2.7.1-1.el6.noarch
pulp-puppet-plugins-2.7.1-1.el6.noarch
pulp-consumer-client-2.7.1-1.el6.noarch
pulp-rpm-yumplugins-2.7.1-1.el6.noarch
python-pulp-common-2.7.1-1.el6.noarch
pulp-selinux-2.7.1-1.el6.noarch
pulp-puppet-handlers-2.7.1-1.el6.noarch
python-pulp-puppet-common-2.7.1-1.el6.noarch
python-pulp-agent-lib-2.7.1-1.el6.noarch
pulp-rpm-consumer-extensions-2.7.1-1.el6.noarch
m2crypto-0.21.1.pulp-8.el6.x86_64
python-pulp-oid_validation-2.7.1-1.el6.noarch
pulp-rpm-plugins-2.7.1-1.el6.noarch
python-isodate-0.5.0-4.pulp.el6.noarch
python-kombu-3.0.24-10.pulp.el6.noarch
pulp-agent-2.7.1-1.el6.noarch
mod_wsgi-3.4-2.pulp.el6.x86_64
python-pulp-rpm-common-2.7.1-1.el6.noarch
pulp-server-2.7.1-1.el6.noarch
python-pulp-client-lib-2.7.1-1.el6.noarch
pulp-rpm-handlers-2.7.1-1.el6.noarch
Any guidance would be greatly appreciated. Thanks!
Updated by bmbouter over 8 years ago
In your case there is a data quality problem in your database that needs to be corrected. Two of your repositories have a 'null' as their repo id. Pulp needs all repos to have a unique value for repo_id. This was likely created due to a Pulp bug, but to upgrade your database to newer Pulp code you'll need to resolve this.
The easy way is to remove one or more of the offending repositories and then re-run the validation tool. You may try to look at the detail view of a repository to determine if the repo_id field is missing or null.
Updated by dad264 about 8 years ago
I've taken a look at the repos collection inside pulp_database. None of the repositories have a field named repo_id. I checked with: db.repos.find( { repo_id: { $exists: true }} )
Example repository:
{
    "_id" : ObjectId("553145f6dfb891098394f162"),
    "_ns" : "repos",
    "content_unit_counts" : {
        "distribution" : 1,
        "erratum" : 4807,
        "package_category" : 15,
        "package_group" : 212,
        "rpm" : 5023
    },
    "description" : null,
    "display_name" : "SL66-os-i386",
    "id" : "SL66-os-i386",
    "last_unit_added" : ISODate("2015-06-13T18:18:54.064Z"),
    "last_unit_removed" : null,
    "notes" : {
        "_repo-type" : "rpm-repo"
    },
    "scratchpad" : {
        "checksum_type" : "sha256"
    }
}
I also took a look for duplicate values for _id or id and found none:
> db.repos.aggregate([ { $group: { _id: { _id: "$_id" }, uniqueIds: { $addToSet: "$_id" }, count: { $sum: 1 }}}, { $match: { count: { $gt: 1 }}} ])
{ "result" : [ ], "ok" : 1 }
> db.repos.aggregate([ { $group: { _id: { id: "$id" }, uniqueIds: { $addToSet: "$_id" }, count: { $sum: 1 }}}, { $match: { count: { $gt: 1 }}} ])
{ "result" : [ ], "ok" : 1 }
Do you think I should add the field "repo_id" to all repositories?
Thanks for the input!
Updated by amacdona@redhat.com about 8 years ago
It looks like the upgrade itself is failing before logging is initialized (line 4).
I am curious what version of docker you are running, and whether it came with Scientific Linux or you installed it yourself. I have some debugging ideas, and it might be easiest to talk this through on IRC (#pulp on freenode). Feel free to ping me (asmacdo).
Updated by bmbouter about 8 years ago
@asmacdo makes a great point. The logging is failing to be initialized. _logger.critical is None. Why is that?
Can you list your indexes on the repos collection and verify the name of the field we are looking for?
I think the field is named 'id', not to be confused with '_id'. Once the field name is confirmed from your index config, we need to find the duplicates. Your commands look better than what I could produce, but the index must have a duplicate. I think there are going to be two documents in the repos collection that have 'id' unset. Maybe search using not $exists?
Post more ideas or questions and I'll see them.
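A small sketch of the check described above, written against plain dictionaries so it can be run on documents exported from the repos collection. The helper name `find_id_problems` is hypothetical. It reports documents whose 'id' is missing or null, plus any 'id' value shared by more than one document.

```python
from collections import Counter


def find_id_problems(docs, field="id"):
    """Return (missing, duplicates) for a list of repo documents.

    missing:    `_id` values of documents where `field` is absent or None
    duplicates: values of `field` that occur in more than one document
    """
    missing = [d.get("_id") for d in docs if d.get(field) is None]
    counts = Counter(d[field] for d in docs if d.get(field) is not None)
    duplicates = sorted(v for v, n in counts.items() if n > 1)
    return missing, duplicates


# Rough MongoDB equivalent of the "missing" check: a filter of {field: None}
# matches documents where the field is null as well as documents that lack it.
MISSING_ID_FILTER = {"id": None}
```

Two documents with no 'id' would both index as null, which is one way a unique index on 'id' could report a duplicate even though no two visible values collide.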
Updated by dad264 about 8 years ago
This has been all my fault the entire time. I wasn't running docker on Fedora/CentOS, I was running it on OpenSUSE. I didn't realize what trouble that would cause.
After running the tool on CentOS7, I had a proper run. Log attached.
There are some errors: "The field 'checksum_type' does not exist on the document 'SRPM'"
Should I open a new Issue for this message? I found https://pulp.plan.io/issues/1754, but didn't see a solution that I should implement myself.
Thanks, and sorry for wasting your time on my mistake!
Updated by bmbouter about 8 years ago
- Subject changed from 2.7 -> 2.8 Validation to Remove checksum_type from the srpm and drpm collections
- Description updated (diff)
With some help from some other devs on IRC, we believe the issue is that your database contains 'checksum_type' fields on your SRPM documents.
You could manually delete the checksum_type field from all of your SRPM documents. Note that this is a different field from checksumtype, which should be left alone. Remember to back up your database first.
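Because the two field names differ by a single underscore, a dry run on exported documents may help before changing anything. The sketch below uses hypothetical helpers and operates on plain dictionaries only. It lists which documents still carry checksum_type and shows what a cleaned copy would look like, leaving checksumtype and any nested keys untouched.

```python
FIELD = "checksum_type"  # the field to drop; "checksumtype" must stay


def needs_cleanup(docs, field=FIELD):
    """Return the `_id` of every document with `field` as a top-level key."""
    return [d.get("_id") for d in docs if field in d]


def cleaned_copy(doc, field=FIELD):
    """Return a shallow copy of `doc` without the top-level `field`.

    Matching is on the exact key name, so "checksumtype" is kept, and
    nested occurrences (inside sub-documents) are not modified.
    """
    return {k: v for k, v in doc.items() if k != field}
```

An empty list from `needs_cleanup` after the manual deletion would suggest there is nothing left to remove in that export.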
Updated by bmbouter about 8 years ago
- Related to Issue #1754: Repo sync with fails on an upgraded pulp server added
Updated by bmbouter about 8 years ago
- Priority changed from Normal to High
- Severity changed from 2. Medium to 3. High
Updated by dad264 about 8 years ago
I've deleted the checksum_type fields and testing came back clean. I'll begin upgrading our pulp system.
Thank you for all of your help!
Updated by amacdona@redhat.com about 8 years ago
- Sprint/Milestone set to 27
- Triaged changed from No to Yes
Updated by ipanova@redhat.com about 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to ipanova@redhat.com
Updated by ipanova@redhat.com about 8 years ago
- Status changed from ASSIGNED to POST
Added by ipanova@redhat.com about 8 years ago
Updated by ipanova@redhat.com about 8 years ago
- Status changed from POST to MODIFIED
Applied in changeset pulp_rpm:23106f8dbc91545d1447656739964daea8d0b068.
Updated by semyers about 8 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Remove no longer used checksum_type field from srpm/drpm collections.
closes #2278 https://pulp.plan.io/issues/2278