Project

Profile

Help

Issue #2574

pulp-admin sometimes errors when watching tasks

Added by Ichimonji10 almost 5 years ago. Updated almost 3 years ago.

Status:
CLOSED - WONTFIX
Priority:
Low
Assignee:
-
Start date:
Due date:
Estimated time:
Severity:
1. Low
Version:
Platform Release:
OS:
Triaged:
Yes
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

Under both Pulp 2.11 and 2.12, and on all supported platforms, Puppet repository syncs sometimes fail. This failure can be triggered somewhat reliably by Pulp Smash, by executing the following command:

python -m unittest pulp_smash.tests.puppet.cli.test_sync

The test executed by Pulp Smash is pretty simple. It essentially does the following, which can also be used to trigger the failure:

#!/usr/bin/env bash

pulp-admin puppet repo create --repo-id foo --feed http://forge.puppetlabs.com --queries "pulp-pulp-1.0.0"
pulp-admin puppet repo sync run --repo-id foo
pulp-admin puppet repo delete --repo-id foo

For some reason, only the former command causes any errors to be logged. Here's an example of the error:

Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040) Task pulp.server.managers.repo.publish.publish[b4591031-d99b-4ba6-93dd-d04df64446a9] raised unexpected: MissingResource({'resource_id': {'repository': 
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040) Traceback (most recent call last):
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 240, in trace_task
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)     R = retval = fun(*args, **kwargs)
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 505, in __call__
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)     return super(Task, self).__call__(*args, **kwargs)
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)   File "/usr/lib/python2.7/site-packages/pulp/server/async/tasks.py", line 108, in __call__
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)     return super(PulpTask, self).__call__(*args, **kwargs)
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)   File "/usr/lib/python2.7/site-packages/celery/app/trace.py", line 438, in __protected_call__
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)     return self.run(*args, **kwargs)
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)   File "/usr/lib/python2.7/site-packages/pulp/server/controllers/repository.py", line 1086, in publish
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)     repo_obj = model.Repository.objects.get_repo_or_missing_resource(repo_id)
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)   File "/usr/lib/python2.7/site-packages/pulp/server/db/querysets.py", line 139, in get_repo_or_missing_resource
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040)     raise pulp_exceptions.MissingResource(repository=repo_id)
Feb 07 15:59:19 fedora-24-pulp-2-12 pulp[31854]: celery.worker.job:ERROR: (31854-47040) MissingResource: Missing resource(s): repository=704a2ec0-2946-46af-9b4c-c76a3558c214

As you can see, the target repository used for a publish apparently doesn't exist. Weird. This is so weird that I've taken two videos of the issue, in an attempt to better describe exactly what's going on. The video files are attached.

As mentioned, this issue occurs for at least Pulp 2.11 and 2.12, on a variety of platforms. For the record, here's the relevant packages installed on one of my test systems:

[root@fedora-24-pulp-2-12 ~]# rpm -qa | grep -i pulp | sort
pulp-admin-client-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
pulp-docker-admin-extensions-2.3.1-0.1.alpha.git.1.ad83c84.fc24.noarch
pulp-docker-plugins-2.3.1-0.1.alpha.git.1.ad83c84.fc24.noarch
pulp-ostree-admin-extensions-1.2.1-0.1.alpha.git.22.19760f7.fc24.noarch
pulp-ostree-plugins-1.2.1-0.1.alpha.git.22.19760f7.fc24.noarch
pulp-puppet-admin-extensions-2.12.1-0.1.alpha.git.1.d407541.fc24.noarch
pulp-puppet-plugins-2.12.1-0.1.alpha.git.1.d407541.fc24.noarch
pulp-python-admin-extensions-2.0.1-0.1.alpha.git.2.be7d11e.fc24.noarch
pulp-python-plugins-2.0.1-0.1.alpha.git.2.be7d11e.fc24.noarch
pulp-rpm-admin-extensions-2.12.1-0.1.alpha.git.11.bdbacef.fc24.noarch
pulp-rpm-plugins-2.12.1-0.1.alpha.git.11.bdbacef.fc24.noarch
pulp-selinux-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
pulp-server-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
python-kombu-3.0.33-6.pulp.fc24.noarch
python-pulp-bindings-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
python-pulp-client-lib-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
python-pulp-common-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
python-pulp-docker-common-2.3.1-0.1.alpha.git.1.ad83c84.fc24.noarch
python-pulp-oid_validation-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
python-pulp-ostree-common-1.2.1-0.1.alpha.git.22.19760f7.fc24.noarch
python-pulp-puppet-common-2.12.1-0.1.alpha.git.1.d407541.fc24.noarch
python-pulp-python-common-2.0.1-0.1.alpha.git.2.be7d11e.fc24.noarch
python-pulp-repoauth-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
python-pulp-rpm-common-2.12.1-0.1.alpha.git.11.bdbacef.fc24.noarch
python-pulp-streamer-2.12.1-0.1.alpha.git.46.47a425e.fc24.noarch
cli.ogv (18.8 MB) cli.ogv Ichimonji10, 02/07/2017 10:28 PM
pulp-smash.ogv (38.5 MB) pulp-smash.ogv Ichimonji10, 02/07/2017 10:31 PM
fedora-24-pulp-2-13.log (33.9 KB) fedora-24-pulp-2-13.log Ichimonji10, 02/08/2017 09:01 PM

History

#1 Updated by Ichimonji10 almost 5 years ago

  • File deleted (api.ogv)

#3 Updated by mhrivnak almost 5 years ago

Just to ask the silly question: is there any chance the call to delete the repo is happening before the task finishes?

#4 Updated by ipanova@redhat.com almost 5 years ago

The delete call is triggered before the spawned publish task. From the provided info i assume you got auto-publish set to true.

#5 Updated by Ichimonji10 almost 5 years ago

Just to ask the silly question: is there any chance the call to delete the repo is happening before the task finishes?

I don't think this is the case, based on how the test is written. I'll take another look at journalctl's output and make sure.

The delete call is triggered before the spawned publish task.

What makes you think this?

From the provided info i assume you got auto-publish set to true.

Yes. I'm letting pulp-admin do whatever it does by default, which includes publishing.

#6 Updated by Ichimonji10 almost 5 years ago

This issue definitely has nothing to do with repositories being deleted. I'm running tests with the following modification to Pulp Smash:

diff --git a/pulp_smash/tests/puppet/cli/test_sync.py b/pulp_smash/tests/puppet/cli/test_sync.py
index bc6bb9e..c22772e 100644
--- a/pulp_smash/tests/puppet/cli/test_sync.py
+++ b/pulp_smash/tests/puppet/cli/test_sync.py
@@ -74,9 +74,6 @@ class SyncDownloadedContentTestCase(unittest.TestCase):
                 'pulp-admin puppet repo create '
                 '--repo-id {} --feed {} --queries {}'
             ).format(repo_id, PUPPET_FEED, PUPPET_QUERY).split())
-            self.addCleanup(client.run, (
-                'pulp-admin puppet repo delete --repo-id {}'
-            ).format(repo_id).split())
             client.run((
                 'pulp-admin puppet repo sync run --repo-id {}'
             ).format(repo_id).split())
@@ -131,7 +128,6 @@ class SyncFromPuppetForgeTestCase(unittest.TestCase):
         ).format(repo_id, PUPPET_FEED, PUPPET_QUERY)
         client.run(cmd.split())
         cmd = 'pulp-admin puppet repo delete --repo-id {}'.format(repo_id)
-        self.addCleanup(client.run, cmd.split())

         # Sync the repository.
         cmd = 'pulp-admin puppet repo sync run --repo-id {}'.format(repo_id)

...and the publish error still shows up.

#7 Updated by Ichimonji10 almost 5 years ago

I think I can explain the test failure. Consider the test script posted in the issue:

#!/usr/bin/env bash

pulp-admin puppet repo create --repo-id foo --feed http://forge.puppetlabs.com --queries "pulp-pulp-1.0.0" 
pulp-admin puppet repo sync run --repo-id foo
pulp-admin puppet repo delete --repo-id foo

When pulp-admin puppet repo sync run ... is executed, pulp-admin performs both a sync and a publish. Furthermore, pulp-admin waits for each of these tasks to complete before returning (zero) to the user. But what would happen if the wait-and-poll logic in pulp-admin was broken and pulp-admin returned while a sync was occurring? In that case, the pulp-admin puppet repo delete ... would execute. The whole sequence of events would look like this:

1. pulp-admin issues a synchronous "create repository" command.
2. pulp-admin issues asynchronous "sync repository" and "publish repository" commands. pulp-admin starts polling those tasks, encounters an error while polling the "sync repository" tasks, and returns.
3. pulp-admin issues a synchronous "delete repository" command. As soon as the "sync repository" command completes, the repository is deleted.
4. Pulp attempts to service the "publish repository" command, but the target repository is gone.

Much circumstantial evidence fits with this theory. First, when the following diff is applied to Pulp Smash...

diff --git a/pulp_smash/tests/puppet/cli/test_sync.py b/pulp_smash/tests/puppet/cli/test_sync.py
index bc6bb9e..1f89728 100644
--- a/pulp_smash/tests/puppet/cli/test_sync.py
+++ b/pulp_smash/tests/puppet/cli/test_sync.py
@@ -78,7 +78,7 @@ class SyncDownloadedContentTestCase(unittest.TestCase):
                 'pulp-admin puppet repo delete --repo-id {}'
             ).format(repo_id).split())
             client.run((
-                'pulp-admin puppet repo sync run --repo-id {}'
+                'pulp-admin -v puppet repo sync run --repo-id {}'
             ).format(repo_id).split())

         # Verify the number of puppet modules in each repository.
@@ -134,7 +134,7 @@ class SyncFromPuppetForgeTestCase(unittest.TestCase):
         self.addCleanup(client.run, cmd.split())

         # Sync the repository.
-        cmd = 'pulp-admin puppet repo sync run --repo-id {}'.format(repo_id)
+        cmd = 'pulp-admin -v puppet repo sync run --repo-id {}'.format(repo_id)
         completed_proc = client.run(cmd.split())

         # Verify no errors were reported.

...I get the following output:

$ python -m unittest pulp_smash.tests.puppet.cli.test_sync
/home/ichimonji10/.venvs/pulp-smash/lib/python3.5/site-packages/requests/packages/urllib3/connectionpool.py:852: InsecureRequestWarning: Unverified HTTPS request is being made. Adding certificate verification is strongly advised. See: https://urllib3.readthedocs.io/en/latest/advanced-usage.html#ssl-warnings
  InsecureRequestWarning)
E.
======================================================================
ERROR: test_sync_downloaded_content (pulp_smash.tests.puppet.cli.test_sync.SyncDownloadedContentTestCase)
Create two repositories with the same feed, and sync them serially.
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/ichimonji10/code/pulp-smash/pulp_smash/tests/puppet/cli/test_sync.py", line 79, in test_sync_downloaded_content
    ).format(repo_id).split())
  File "/home/ichimonji10/code/pulp-smash/pulp_smash/cli.py", line 242, in run
    return self.response_handler(completed_process)
  File "/home/ichimonji10/code/pulp-smash/pulp_smash/cli.py", line 78, in code_handler
    completed_proc.check_returncode()
  File "/home/ichimonji10/code/pulp-smash/pulp_smash/cli.py", line 153, in check_returncode
    self.stderr,
pulp_smash.exceptions.CalledProcessError: Command ['pulp-admin', '-v', 'puppet', 'repo', 'sync', 'run', '--repo-id', 'f603e575-1165-4afb-9f2e-af84c6790160'] returned non-zero exit status 70.

stdout: +----------------------------------------------------------------------+
    Synchronizing Repository [f603e575-1165-4afb-9f2e-af84c6790160]
+----------------------------------------------------------------------+

This command may be exited via ctrl+c without affecting the request.

Downloading metadata...
An unexpected error has occurred.More information may be found using the -vv
flag.

stderr: 2017-02-08 14:18:34,134 - ERROR - Client-side exception occurred
Traceback (most recent call last):
  File "/usr/lib/python2.7/site-packages/pulp/client/extensions/core.py", line 474, in run
    exit_code = Cli.run(self, args)
  File "/usr/lib/python2.7/site-packages/okaara/cli.py", line 974, in run
    exit_code = command_or_section.execute(self.prompt, remaining_args)
  File "/usr/lib/python2.7/site-packages/pulp/client/extensions/extensions.py", line 210, in execute
    return self.method(*arg_list, **clean_kwargs)
  File "/usr/lib/python2.7/site-packages/pulp/client/commands/repo/sync_publish.py", line 203, in run
    self.poll([sync_task], kwargs)
  File "/usr/lib/python2.7/site-packages/pulp/client/commands/polling.py", line 120, in poll
    task = self._poll_task(task)
  File "/usr/lib/python2.7/site-packages/pulp/client/commands/polling.py", line 211, in _poll_task
    self.progress(task, running_spinner)
  File "/usr/lib/python2.7/site-packages/pulp/client/commands/repo/sync_publish.py", line 102, in progress
    self.renderer.display_report(task.progress_report)
  File "/usr/lib/python2.7/site-packages/pulp_puppet/extensions/admin/repo/status.py", line 38, in display_report
    self._display_sync_metadata_step(sync_report)
  File "/usr/lib/python2.7/site-packages/pulp_puppet/extensions/admin/repo/status.py", line 66, in _display_sync_metadata_step
    item_type, self.sync_metadata_bar, sync_report.metadata_state)
  File "/usr/lib/python2.7/site-packages/pulp_puppet/extensions/admin/repo/status.py", line 240, in _render_itemized_in_progress_state
    progress_bar.render(items_done, items_total, message=bar_message)
  File "/usr/lib/python2.7/site-packages/okaara/progress.py", line 100, in render
    percentage = float(step) / float(total)
TypeError: float() argument must be a string or a number

----------------------------------------------------------------------
Ran 2 tests in 18.491s

FAILED (errors=1)

Also, all of the API tests succeed. Only CLI tests fail.

#8 Updated by Ichimonji10 almost 5 years ago

Also, no errors are logged by journalctl. I've attached journalctl --follow output from a system on which the error occurred.

#9 Updated by Ichimonji10 almost 5 years ago

  • Subject changed from `pulp-admin puppet repo sync run ...` periodically fails to pulp-admin sometimes errors when watching tasks

#10 Updated by bizhang almost 5 years ago

  • Priority changed from Normal to Low
  • Severity changed from 2. Medium to 1. Low
  • Triaged changed from No to Yes

#11 Updated by bmbouter almost 3 years ago

  • Status changed from NEW to CLOSED - WONTFIX

Pulp 2 is approaching maintenance mode, and this Pulp 2 ticket is not being actively worked on. As such, it is being closed as WONTFIX. Pulp 2 is still accepting contributions though, so if you want to contribute a fix for this ticket, please reopen or comment on it. If you don't have permissions to reopen this ticket, or you want to discuss an issue, please reach out via the developer mailing list.

#12 Updated by bmbouter almost 3 years ago

  • Tags Pulp 2 added

Also available in: Atom PDF