Issue #4795
closedPulp2 task details information returns incorrect values when polled too quickly
Description
Problem¶
When attempting to sync a feed for a second time in automation/CI (not manually), the sync will return incorrect/invalid information for the task details failing the test "often".
Problem Breakout¶
There are two discrete issues here:
- `pulp-smash` is now/recently updated polling at a 0.3s rate that. For a test including a second repo sync, this will occasionally returning a task "COMPLETE" but the values for the task details had yet to be reset. The resulting return of non-zero values causes test failures. It appears that the ACTUAL change is none, however the task_details are reporting incorrectly. This was supposedly discovered long ago, but the legacy issue could not be found.
- Due to the above, the values of 'added_count', 'removed_count', 'updated_count' are being permanently updated to values in the db that are incorrect. Meaning, running `pulp-admin tasks details --task-id` of the automated task will result in the invalid values still being populated in the historic record. This appears to be related to legacy issue #4428
Recreation¶
The RCA seems to be from updating pulp-smash polling to a much quicker rate:
https://github.com/PulpQE/pulp-smash/blob/e385e6685a82709fd167b5374d19c4dfcfdb78c5/pulp_smash/api.py#L687
To test this, locally have pulp-smash installed and set to a VERY low polling rate. At the time of this writing, anything around `0.3` would fail 50-80% of the time. Adjusting to an even smaller value should increase the probabiltiy of failure on a non-patched system.
Note: This MUST be done in an automated fashion with pulp-smash due to the api polling. Testing this by hand or manually will NOT result in issue recreation.
Run this test through pytest with the following syntax in an appropriately prepared virtenv:
(pulp2) [herring@redherring api_v2]$ pwd
/home/herring/git/Pulp-2-Tests/pulp_2_tests/tests/rpm/api_v2
(pulp2) [herring@redherring api_v2]$ count=1; while [ $count -le 10 ]; do echo -e "Iteration: $count\n"; pytest -svv test_sync_publish.py::SyncRpmRepoTestCase::test_no_change_in_second_sync --disable-warnings; ((count++)); done
A loop value of 10 was chosen to ensure math was easy.
It is expected there will be a 100% pass rate. At this time, even with larger values around 2 seconds, there is still ~10% chance of failure.
Also, running the following on a passing Task ID should NOT result in incorrect values for the task:
[root@rhel76 ~]# pulp-admin tasks details --task-id d0140034-e1ca-4d1e-a34f-7fdf1d17a697
Where the `--task-id` is appropriate for the jobs being ran.
QE Workaround in place¶
For the time being, the values for pulp-smash when used on Pulp2 polling can be increase to keep these failures from happening.
Once this task issues are resolved, pulp-smash can have the values restored to lower polling values.
References¶
- Additional Reference needed here from @dkliban about how the values are calculated
https://pulp.plan.io/issues/4428#note-17 - Original investigation information moved down in other Notes to retain investigation history.
Related issues
Updated by bherring over 5 years ago
Additional References¶
There are two discrete issues here:
- `pulp-smash` is now polling at a 0.3s rate that is occasionally returning a task "COMPLETE" but the values for the task had yet to be reset
- The values of 'added_count', 'removed_count', 'updated_count' are being updated to values that aren't completed and set to the final values resulting in a `task_id` having invalid/incorrect values outlined in #4428
The RCA seems to be from updating pulp-smash polling to a much quicker rate:
For the time being, the values for pulp-smash when used on Pulp2 polling can be increase to keep these failures from happening.
Once this task issues are resolved, pulp-smash can have the values restored to lower polling values.
References¶
- Additional Reference needed here from @dkliban about how the values are calculated
- https://pulp.plan.io/issues/4428#note-17
Updated by bherring over 5 years ago
Problem¶
There is an inconsistent and relatively new regression on Pulp 2 Nightly CI:
- rpm.api_v2.test_sync_publish.SyncRpmRepoTestCase.test_no_change_in_second_sync
- https://github.com/PulpQE/Pulp-2-Tests/blob/e4699997370c734e94c43b956b44b088888b5d9e/pulp_2_tests/tests/rpm/api_v2/test_sync_publish.py#L144
- https://pulp-jenkins.rhev-ci-vms.eng.rdu2.redhat.com/view/Pulp%202%20-%20Master/job/pulp-2-master-dev-rhel7/441/testReport/
Failures can be seen in the second-sync Task Details. Historically, only a value of `0` are expected for the second-sync details for the verified fields:
- 'added_count'
- 'removed_count'
- 'updated_count'
The output of the test failure (including a negative value in this case for `removed_count` and non-zero for `added_count` as well as `task_id`) can be correlated with the CLI Task Output for the `task_id` showing non-zero values for the verified fields.
Test Failure¶
============================================================== FAILURES ===============================================================
________________________________________ SyncRpmRepoTestCase.test_02_no_change_in_second_sync _________________________________________
self = <pulp_2_tests.tests.rpm.api_v2.test_sync_publish.SyncRpmRepoTestCase testMethod=test_02_no_change_in_second_sync>
def test_02_no_change_in_second_sync(self):
"""Verify that syncing a second time has no changes.
If the repository have not changed then Pulp must state that anything
was changed when doing a second sync.
"""
report = sync_repo(self.cfg, self.repo)
tasks = tuple(api.poll_spawned_tasks(self.cfg, report.json()))
with self.subTest(comment='spawned tasks'):
self.assertEqual(len(tasks), 1)
for count_type in ('added_count', 'removed_count', 'updated_count'):
with self.subTest(comment=count_type):
> self.assertEqual(tasks[0]['result'][count_type], 0, tasks)
E AssertionError: 8 != 0 : ({'exception': None, 'task_type': 'pulp.server.managers.repo.sync.sync', '_href': '/pulp/api/v2/tasks/ab78afea-178b-4081-99a3-2e63d47915de/', 'task_id': 'ab78afea-178b-4081-99a3-2e63d47915de', 'tags': ['pulp:repository:9601ef5d-f90c-433f-b2e5-9e879ed0c199', 'pulp:action:sync'], 'finish_time': '2019-05-08T11:28:39Z', '_ns': 'task_status', 'start_time': '2019-05-08T11:28:32Z', 'traceback': None, 'spawned_tasks': [], 'progress_report': {'yum_importer': {'content': {'items_total': 0, 'state': 'FINISHED', 'error_details': [], 'details': {'rpm_total': 0, 'rpm_done': 0, 'drpm_total': 0, 'drpm_done': 0}, 'size_total': 0, 'size_left': 0, 'items_left': 0}, 'comps': {'state': 'FINISHED'}, 'purge_duplicates': {'state': 'FINISHED'}, 'distribution': {'items_total': 0, 'state': 'FINISHED', 'error_details': [], 'items_left': 0}, 'modules': {'state': 'FINISHED'}, 'errata': {'state': 'FINISHED'}, 'metadata': {'state': 'FINISHED'}}}, 'queue': 'reserved_resource_worker-0@rhel76-218.dq2', 'state': 'finished', 'worker_name': 'reserved_resource_worker-0@rhel76-218', 'result': {'result': 'success', 'importer_id': 'yum_importer', 'exception': None, 'repo_id': '9601ef5d-f90c-433f-b2e5-9e879ed0c199', 'traceback': None, 'started': '2019-05-08T11:28:32Z', '_ns': 'repo_sync_results', 'completed': '2019-05-08T11:28:39Z', 'importer_type_id': 'yum_importer', 'error_message': None, 'summary': {'modules': {'state': 'FINISHED'}, 'content': {'state': 'FINISHED'}, 'comps': {'state': 'FINISHED'}, 'purge_duplicates': {'state': 'FINISHED'}, 'distribution': {'state': 'FINISHED'}, 'errata': {'state': 'FINISHED'}, 'metadata': {'state': 'FINISHED'}}, 'added_count': 8, 'removed_count': -8, 'updated_count': 0, 'id': '5cd2bd676f8f3411133bf898', 'details': {'modules': {'state': 'FINISHED'}, 'content': {'size_total': 0, 'items_left': 0, 'items_total': 0, 'state': 'FINISHED', 'size_left': 0, 'details': {'rpm_total': 0, 'rpm_done': 0, 'drpm_total': 0, 'drpm_done': 0}, 'error_details': []}, 'comps': {'state': 'FINISHED'}, 'purge_duplicates': {'state': 'FINISHED'}, 'distribution': {'items_total': 0, 'state': 'FINISHED', 'error_details': [], 'items_left': 0}, 'errata': {'state': 'FINISHED'}, 'metadata': {'state': 'FINISHED'}}}, 'error': None, '_id': {'$oid': '5cd2bd60cfa8faeb309e9677'}, 'id': '5cd2bd60cfa8faeb309e9677'},)
test_sync_publish.py:156: AssertionError
CLI Task Failure Output from Second Sync¶
[root@rhel76 ~]# pulp-admin tasks details --task-id d0140034-e1ca-4d1e-a34f-7fdf1d17a697
+----------------------------------------------------------------------+
Task Details
+----------------------------------------------------------------------+
Operations: sync
Resources: 2cbd8fc6-4c37-445d-bfc3-d38a93cc98d1 (repository)
State: Successful
Start Time: 2019-05-07T19:15:43Z
Finish Time: 2019-05-07T19:15:49Z
Result:
Added Count: 8
Completed: 2019-05-07T19:15:49Z
Details:
Comps:
State: FINISHED
Content:
Details:
Drpm Done: 0
Drpm Total: 0
Rpm Done: 0
Rpm Total: 0
Error Details:
Items Left: 0
Items Total: 0
Size Left: 0
Size Total: 0
State: FINISHED
Distribution:
Error Details:
Items Left: 0
Items Total: 0
State: FINISHED
Errata:
State: FINISHED
Metadata:
State: FINISHED
Modules:
State: FINISHED
Purge Duplicates:
State: FINISHED
Error Message: None
Exception: None
Id: 5cd1d965719c8d17a0474cc4
Importer Id: yum_importer
Importer Type Id: yum_importer
Removed Count: -8
Repo Id: 2cbd8fc6-4c37-445d-bfc3-d38a93cc98d1
Result: success
Started: 2019-05-07T19:15:43Z
Summary:
Comps:
State: FINISHED
Content:
State: FINISHED
Distribution:
State: FINISHED
Errata:
State: FINISHED
Metadata:
State: FINISHED
Modules:
State: FINISHED
Purge Duplicates:
State: FINISHED
Traceback: None
Updated Count: 0
Task Id: d0140034-e1ca-4d1e-a34f-7fdf1d17a697
Worker Name: reserved_resource_worker-0@rhel76
Progress Report:
Yum Importer:
Comps:
State: FINISHED
Content:
Details:
Drpm Done: 0
Drpm Total: 0
Rpm Done: 0
Rpm Total: 0
Error Details:
Items Left: 0
Items Total: 0
Size Left: 0
Size Total: 0
State: FINISHED
Distribution:
Error Details:
Items Left: 0
Items Total: 0
State: FINISHED
Errata:
State: FINISHED
Metadata:
State: FINISHED
Modules:
State: FINISHED
Purge Duplicates:
State: FINISHED
Background¶
Test Design¶
The test is simply designed to:
- Sync a repo (immediate/no flags)
- Synch a repo again
- Verify there are no additional changes on the second sync (all artifacts should have been copied the first time) when checking the Task Details of the sync task.
Background¶
This logic was added in 2016 to ensure sync was fully syncing:
- https://github.com/PulpQE/pulp-smash/commit/0a51ef749faaa87afa311d665056de4270f9ec3c
- https://github.com/PulpQE/pulp-smash/issues/161
The failures are relatively recent.
Recreation¶
The easiest way to recreate is to run the following test against a pulp2 server. The test run-time is 16 seconds.
If the test needs to be abstracted to a set of manual steps, it can be. However, with the inconsistent behavior of this test, it is best to use this short python test for recreation in this instance.
The definition of `inconsistent behavior` is that the test will sometimes pass as expected and sometimes fail on any of these fields.
Possible Behavioral Change Commit¶
It was noted the following Commit has recently been introduced.
It may be part of the reason for the behavioral change for the test.
More investigation or a clear RCA is required still to verify the source.
Updated by bherring over 5 years ago
- Subject changed from Pulp 2 Nightly Regression in second-sync testing behavior to Pulp2 task details information returns incorrect values when polled too quickly
- Description updated (diff)
Updated by dkliban@redhat.com over 5 years ago
- Status changed from NEW to CLOSED - WONTFIX
Users polling less frequently is a reasonable work around for this problem.
Updated by bherring over 5 years ago
- Related to Issue #4428: Upsert query raises NotUniqueError added