Issue #2552
closed
updating ostree rpm gives error on sync: LibError: GLib.Error('No such file or directory', 'g-io-error-quark', 1)
Description
Running pulp-2.11.0, I had a celery worker crashing. A Bugzilla report indicated that updating the ostree rpm might correct this problem, so I updated to ostree-2016.15. A sync then generated this in /var/log/messages:
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) Exception caught from plugin during publish for repo [examplecorp-Red_Hat_Enterprise_Linux_Atomic_Host-Red_Hat_Enterprise_Linux_Atomic_Host_Trees]
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) Traceback (most recent call last):
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp/server/controllers/repository.py", line 1239, in _do_publish
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) publish_report = publish_repo(transfer_repo, conduit, call_config)
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 673, in wrap_f
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) return f(*args, **kwargs)
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp_ostree/plugins/distributors/web.py", line 87, in publish_repo
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) return self._publisher.process_lifecycle()
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 566, in process_lifecycle
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) super(PluginStep, self).process_lifecycle()
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 163, in process_lifecycle
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) step.process()
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 253, in process
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) self._process_block()
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp/plugins/util/publish_step.py", line 297, in _process_block
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) self.process_main()
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp_ostree/plugins/distributors/steps.py", line 84, in process_main
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) repository.pull_local(unit.storage_path, [unit.commit], self.depth)
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) File "/usr/lib/python2.7/site-packages/pulp_ostree/plugins/lib.py", line 30, in _fn
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) raise LibError(repr(ge))
Jan 30 19:22:05 devel pulp: pulp.server.controllers.repository:ERROR: (17294-89056) LibError: GLib.Error('No such file or directory', 'g-io-error-quark', 1)
Updated by bmbouter almost 8 years ago
- Description updated (diff)
The BZ number is in the Bugzilla field, so I'm removing the BZ link from the description.
Updated by ipanova@redhat.com almost 8 years ago
We had a similar issue some time ago; maybe they are related: https://pulp.plan.io/issues/1722
Updated by bizhang almost 8 years ago
- Sprint/Milestone set to 32
- Triaged changed from No to Yes
Updated by jortel@redhat.com almost 8 years ago
- Status changed from NEW to ASSIGNED
- Assignee set to jortel@redhat.com
- Sprint/Milestone deleted (32)
Updated by jortel@redhat.com almost 8 years ago
- Sprint/Milestone set to 32
Okay, no idea how I cleared the Sprint/Milestone in the previous edit. Putting it back.
Updated by jortel@redhat.com almost 8 years ago
The worker SEGFAULT, caused by a bug in libostree, is corrupting the pulp "backing" ostree repository, causing:
LibError: GLib.Error('No such file or directory', 'g-io-error-quark', 1)
to be raised during publishing.
Both the RHEL7 atomic channel and Satellite 6 channels have really old versions of ostree. The segfault can be reproduced using ostree 2016.5.3, but so far I have not been able to reproduce it with 2017.1. I discussed this with Colin Walters (atomic team) and he assessed the risk of satellite providing ostree 2017.1 on EL7 to be low. Given that this issue appears to be fixed in 2017.1, I will recommend this.
The next step is to reproduce the broken repository and try to fix it in ways less drastic than deleting the corrupted repository. I'm hoping that running something like:
ostree prune
may have a side effect of repairing the broken repository. If so, it would not be a bad idea to have the pulp ostree plugin run this before every sync for 2 reasons:
- The prune removes un-referenced objects, which will recover wasted disk space.
- It may repair a broken repository.
The 2nd has not yet been proven.
Stay tuned....
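The pre-sync prune idea above could be sketched roughly as follows. This is only an illustration, not the plugin's actual code: the helper names and the repository path are hypothetical, and whether a prune repairs anything is exactly the part that has not been proven.

```python
import subprocess


def build_prune_command(repo_path, refs_only=True):
    """Build the argv for an `ostree prune` run against one repository.

    repo_path is a hypothetical path to the pulp "backing" repository.
    """
    cmd = ["ostree", "prune", "--repo={}".format(repo_path)]
    if refs_only:
        # Only remove objects that are not reachable from any ref.
        cmd.append("--refs-only")
    return cmd


def prune_before_sync(repo_path):
    """Run the prune and return the exit status (0 on success)."""
    return subprocess.call(build_prune_command(repo_path))
```

Something like prune_before_sync("/path/to/backing/repo") would then be called by the importer before each pull, assuming the ostree command line tool is installed on the worker.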
Updated by jortel@redhat.com almost 8 years ago
No luck using "ostree prune". I sent an email to the atomic group asking for help. The main reason I don't want to delete and re-create the repository is that deleting it will create broken links in the published repositories. In any case, we'd need to log the path to the "backing" repository.
Updated by jortel@redhat.com almost 8 years ago
- Status changed from ASSIGNED to POST
Updated by jortel@redhat.com almost 8 years ago
The approach to resolve this includes both an attempt to prevent the local repository from being corrupted and providing safe tooling to repair the local repository after it has been corrupted.
I have seen the local repository get corrupted when a pull is interrupted by a connection reset by peer. Then, the subsequent publish (pull-local) crashes the celery worker.
- Recommend that users use libostree 2017.1 or newer.
- Use the --repair | -r sync option to fix corrupted repositories, or repair=True using the API.
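For the API route, a request might be assembled along these lines. This is a minimal sketch under assumptions: it assumes the repair flag travels in the override_config of the Pulp 2 repository sync action, and the function name and repo id are hypothetical. It only builds the path and JSON body; sending it (with authentication) is left to whatever HTTP client is in use.

```python
import json


def build_sync_request(repo_id, repair=True):
    """Return (path, json_body) for a sync call that asks for a repair.

    Assumption: `repair` is passed as a key in the sync override_config.
    """
    path = "/pulp/api/v2/repositories/{}/actions/sync/".format(repo_id)
    body = {"override_config": {"repair": repair}}
    return path, json.dumps(body)


if __name__ == "__main__":
    # "atomic-trees" is a made-up repository id for illustration.
    path, body = build_sync_request("atomic-trees")
    print("POST", path)
    print(body)
```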
Updated by jortel@redhat.com almost 8 years ago
I will continue to pursue this with the atomic team. I can provide them a corrupted repository in hopes they can prevent the corruption in libostree.
Added by jortel@redhat.com almost 8 years ago
Updated by jortel@redhat.com almost 8 years ago
- Status changed from POST to MODIFIED
Applied in changeset f7806457abcb020597d2df0ae20ac78bd87003d1.
Updated by bizhang almost 8 years ago
- Status changed from 5 to CLOSED - CURRENTRELEASE
Support local repository repair. closes #2552