Issue #4795
Updated by bherring over 5 years ago
h2. Problem
When a feed is synced a second time in automation/CI (not manually), the sync frequently returns incorrect/invalid values in the task details, which causes the test to fail.
h2. Problem Breakout
There are two discrete issues here:
* `pulp-smash` was recently updated to poll at a 0.3s rate. For a test that includes a second repo sync, polling will occasionally return a task marked "COMPLETE" before the values in the task details have been reset. The resulting non-zero values cause test failures. It appears that the ACTUAL change is none and only the task_details are reported incorrectly. This was supposedly discovered long ago, but the legacy issue could not be found.
* Due to the above, incorrect values for 'added_count', 'removed_count', and 'updated_count' are permanently written to the db. As a result, running `pulp-admin tasks details --task-id` against the automated task still shows the invalid values in the historic record. This appears to be related to legacy issue #4428.
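The first bullet describes a timing window: the task can be read as "COMPLETE" before its detail values have been reset. The following is a toy model of that window only, not Pulp or pulp-smash code. The timings, the `snapshot` and `poll` helpers, and the stale value of 1 are all assumptions made for illustration.

```python
# Toy model only: timings and field values are illustrative assumptions,
# not measurements taken from Pulp.
STALE = {"added_count": 1, "removed_count": 0, "updated_count": 0}  # hypothetical values not yet reset
FRESH = {"added_count": 0, "removed_count": 0, "updated_count": 0}  # what a no-change second sync should report

COMPLETE_AT_MS = 1000  # assumed moment the task is first reported as "COMPLETE"
RESET_AT_MS = 1500     # assumed moment the detail values are reset


def snapshot(t_ms):
    """Return (state, details) as a poller would see them at time t_ms."""
    state = "COMPLETE" if t_ms >= COMPLETE_AT_MS else "RUNNING"
    details = FRESH if t_ms >= RESET_AT_MS else STALE
    return state, details


def poll(interval_ms):
    """Poll at a fixed interval; return the details seen on the first "COMPLETE" read."""
    t_ms = 0
    while True:
        t_ms += interval_ms
        state, details = snapshot(t_ms)
        if state == "COMPLETE":
            return details
```

In this model `poll(300)` lands inside the window and returns the stale values, while `poll(2000)` lands after it and returns zeros. On a real system the window is not fixed, so a longer interval would only make the stale read less likely.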
h2. Recreation
The root cause appears to be the change that updated pulp-smash polling to a much quicker rate:
https://github.com/PulpQE/pulp-smash/blob/e385e6685a82709fd167b5374d19c4dfcfdb78c5/pulp_smash/api.py#L687
To test this, install pulp-smash locally and set it to a VERY low polling rate. At the time of this writing, anything around `0.3` fails 50-80% of the time. Adjusting to an even smaller value should increase the probability of failure on a non-patched system.
Run this test through pytest with the following syntax in an appropriately prepared virtualenv:
<pre>
(pulp2) [herring@redherring api_v2]$ pwd
/home/herring/git/Pulp-2-Tests/pulp_2_tests/tests/rpm/api_v2
(pulp2) [herring@redherring api_v2]$ count=1; while [ $count -le 10 ]; do echo -e "Iteration: $count\n"; pytest -svv test_sync_publish.py::SyncRpmRepoTestCase::test_no_change_in_second_sync --disable-warnings; ((count++)); done
</pre>
A loop count of 10 was chosen to keep the pass-rate math easy.
The expected result is a 100% pass rate. At this time, even with larger polling values around 2 seconds, there is still a ~10% chance of failure.
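The shell loop above leaves the pass/fail tally to be read off the pytest output by hand. A minimal sketch of computing the pass rate directly is shown here. The `run_once` and `pass_rate` helpers are hypothetical names introduced for this example, and the command list simply mirrors the pytest invocation used in the loop.

```python
import subprocess

# Same pytest invocation as the shell loop; run from the same directory.
PYTEST_CMD = [
    "pytest", "-svv",
    "test_sync_publish.py::SyncRpmRepoTestCase::test_no_change_in_second_sync",
    "--disable-warnings",
]


def run_once(cmd):
    """Run cmd once, discard its output, and report whether it exited with status 0."""
    result = subprocess.run(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0


def pass_rate(cmd, iterations=10, runner=run_once):
    """Run cmd `iterations` times and return the fraction of runs that passed."""
    passes = sum(1 for _ in range(iterations) if runner(cmd))
    return passes / iterations
```

Calling `pass_rate(PYTEST_CMD)` from the test directory would return a value between 0.0 and 1.0, with 1.0 being the expected result on a fixed system. The `runner` argument exists only so the tally logic can be exercised without a Pulp installation.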
Also, running the following on a passing Task ID should NOT result in incorrect values for the task:
<pre>
[root@rhel76 ~]# pulp-admin tasks details --task-id d0140034-e1ca-4d1e-a34f-7fdf1d17a697
</pre>
Here the `--task-id` value should be the ID of the task for the job that was run.
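The check on a passing task comes down to confirming that the three counters named earlier are all zero. A small sketch of that check follows, assuming the task details have already been parsed into a flat dict. That shape is an assumption for illustration, since the real `pulp-admin` or API output may nest these fields, and `nonzero_counts` is a hypothetical helper.

```python
# Counter names as given in the problem breakout.
COUNT_KEYS = ("added_count", "removed_count", "updated_count")


def nonzero_counts(details):
    """Return {key: value} for every counter that is present and not zero."""
    return {key: details[key] for key in COUNT_KEYS if details.get(key, 0) != 0}
```

An empty result means the record looks correct for a no-change second sync. Any key that comes back identifies a counter that was persisted with an unexpected value.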
h2. QE Workaround in place
For the time being, the polling values pulp-smash uses against Pulp2 can be increased to keep these failures from happening.
Once these task issues are resolved, pulp-smash can be restored to lower polling values.
h2. References
* Additional Reference needed here from @dkliban about how the values are calculated
https://pulp.plan.io/issues/4428#note-17
* Original investigation information moved down in other Notes to retain investigation history.