Project

Profile

Help

Issue #4795

closed

Pulp2 task details information returns incorrect values when polled too quickly

Added by bherring almost 5 years ago. Updated almost 4 years ago.

Status:
CLOSED - WONTFIX
Priority:
Normal
Assignee:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Pulp 2
Sprint:
Quarter:

Description

Problem

When attempting to sync a feed for a second time in automation/CI (not manually), the sync will return incorrect/invalid information for the task details failing the test "often".

Problem Breakout

There are two discrete issues here:

  • `pulp-smash` is now/recently updated polling at a 0.3s rate that. For a test including a second repo sync, this will occasionally returning a task "COMPLETE" but the values for the task details had yet to be reset. The resulting return of non-zero values causes test failures. It appears that the ACTUAL change is none, however the task_details are reporting incorrectly. This was supposedly discovered long ago, but the legacy issue could not be found.
  • Due to the above, the values of 'added_count', 'removed_count', 'updated_count' are being permanently updated to values in the db that are incorrect. Meaning, running `pulp-admin tasks details --task-id` of the automated task will result in the invalid values still being populated in the historic record. This appears to be related to legacy issue #4428

Recreation

The RCA seems to be from updating pulp-smash polling to a much quicker rate:
https://github.com/PulpQE/pulp-smash/blob/e385e6685a82709fd167b5374d19c4dfcfdb78c5/pulp_smash/api.py#L687

To test this, locally have pulp-smash installed and set to a VERY low polling rate. At the time of this writing, anything around `0.3` would fail 50-80% of the time. Adjusting to an even smaller value should increase the probabiltiy of failure on a non-patched system.

Note: This MUST be done in an automated fashion with pulp-smash due to the api polling. Testing this by hand or manually will NOT result in issue recreation.

Run this test through pytest with the following syntax in an appropriately prepared virtenv:

(pulp2) [herring@redherring api_v2]$ pwd
/home/herring/git/Pulp-2-Tests/pulp_2_tests/tests/rpm/api_v2
(pulp2) [herring@redherring api_v2]$ count=1; while [ $count -le 10 ]; do echo -e "Iteration: $count\n"; pytest -svv test_sync_publish.py::SyncRpmRepoTestCase::test_no_change_in_second_sync --disable-warnings; ((count++)); done

A loop value of 10 was chosen to ensure math was easy.

It is expected there will be a 100% pass rate. At this time, even with larger values around 2 seconds, there is still ~10% chance of failure.

Also, running the following on a passing Task ID should NOT result in incorrect values for the task:

[root@rhel76 ~]# pulp-admin tasks details --task-id d0140034-e1ca-4d1e-a34f-7fdf1d17a697

Where the `--task-id` is appropriate for the jobs being ran.

QE Workaround in place

For the time being, the values for pulp-smash when used on Pulp2 polling can be increase to keep these failures from happening.

Once this task issues are resolved, pulp-smash can have the values restored to lower polling values.

References

  • Additional Reference needed here from @dkliban about how the values are calculated
    https://pulp.plan.io/issues/4428#note-17
  • Original investigation information moved down in other Notes to retain investigation history.

Related issues

Related to Pulp - Issue #4428: Upsert query raises NotUniqueErrorCLOSED - CURRENTRELEASEdaviddavisActions
Actions #1

Updated by daviddavis almost 5 years ago

  • Tags Pulp 2 added
Actions #2

Updated by bherring almost 5 years ago

Additional References

There are two discrete issues here:

  • `pulp-smash` is now polling at a 0.3s rate that is occasionally returning a task "COMPLETE" but the values for the task had yet to be reset
  • The values of 'added_count', 'removed_count', 'updated_count' are being updated to values that aren't completed and set to the final values resulting in a `task_id` having invalid/incorrect values outlined in #4428

The RCA seems to be from updating pulp-smash polling to a much quicker rate:

For the time being, the values for pulp-smash when used on Pulp2 polling can be increase to keep these failures from happening.

Once this task issues are resolved, pulp-smash can have the values restored to lower polling values.

References

Actions #3

Updated by bherring almost 5 years ago

Problem

There is an inconsistent and relatively new regression on Pulp 2 Nightly CI:

Failures can be seen in the second-sync Task Details. Historically, only a value of `0` are expected for the second-sync details for the verified fields:

  • 'added_count'
  • 'removed_count'
  • 'updated_count'

The output of the test failure (including a negative value in this case for `removed_count` and non-zero for `added_count` as well as `task_id`) can be correlated with the CLI Task Output for the `task_id` showing non-zero values for the verified fields.

Test Failure

============================================================== FAILURES ===============================================================
________________________________________ SyncRpmRepoTestCase.test_02_no_change_in_second_sync _________________________________________

self = <pulp_2_tests.tests.rpm.api_v2.test_sync_publish.SyncRpmRepoTestCase testMethod=test_02_no_change_in_second_sync>

    def test_02_no_change_in_second_sync(self):
        """Verify that syncing a second time has no changes.

        If the repository have not changed then Pulp must state that anything
        was changed when doing a second sync.
        """
        report = sync_repo(self.cfg, self.repo)
        tasks = tuple(api.poll_spawned_tasks(self.cfg, report.json()))
        with self.subTest(comment='spawned tasks'):
            self.assertEqual(len(tasks), 1)
        for count_type in ('added_count', 'removed_count', 'updated_count'):
            with self.subTest(comment=count_type):
>               self.assertEqual(tasks[0]['result'][count_type], 0, tasks)
E               AssertionError: 8 != 0 : ({'exception': None, 'task_type': 'pulp.server.managers.repo.sync.sync', '_href': '/pulp/api/v2/tasks/ab78afea-178b-4081-99a3-2e63d47915de/', 'task_id': 'ab78afea-178b-4081-99a3-2e63d47915de', 'tags': ['pulp:repository:9601ef5d-f90c-433f-b2e5-9e879ed0c199', 'pulp:action:sync'], 'finish_time': '2019-05-08T11:28:39Z', '_ns': 'task_status', 'start_time': '2019-05-08T11:28:32Z', 'traceback': None, 'spawned_tasks': [], 'progress_report': {'yum_importer': {'content': {'items_total': 0, 'state': 'FINISHED', 'error_details': [], 'details': {'rpm_total': 0, 'rpm_done': 0, 'drpm_total': 0, 'drpm_done': 0}, 'size_total': 0, 'size_left': 0, 'items_left': 0}, 'comps': {'state': 'FINISHED'}, 'purge_duplicates': {'state': 'FINISHED'}, 'distribution': {'items_total': 0, 'state': 'FINISHED', 'error_details': [], 'items_left': 0}, 'modules': {'state': 'FINISHED'}, 'errata': {'state': 'FINISHED'}, 'metadata': {'state': 'FINISHED'}}}, 'queue': 'reserved_resource_worker-0@rhel76-218.dq2', 'state': 'finished', 'worker_name': 'reserved_resource_worker-0@rhel76-218', 'result': {'result': 'success', 'importer_id': 'yum_importer', 'exception': None, 'repo_id': '9601ef5d-f90c-433f-b2e5-9e879ed0c199', 'traceback': None, 'started': '2019-05-08T11:28:32Z', '_ns': 'repo_sync_results', 'completed': '2019-05-08T11:28:39Z', 'importer_type_id': 'yum_importer', 'error_message': None, 'summary': {'modules': {'state': 'FINISHED'}, 'content': {'state': 'FINISHED'}, 'comps': {'state': 'FINISHED'}, 'purge_duplicates': {'state': 'FINISHED'}, 'distribution': {'state': 'FINISHED'}, 'errata': {'state': 'FINISHED'}, 'metadata': {'state': 'FINISHED'}}, 'added_count': 8, 'removed_count': -8, 'updated_count': 0, 'id': '5cd2bd676f8f3411133bf898', 'details': {'modules': {'state': 'FINISHED'}, 'content': {'size_total': 0, 'items_left': 0, 'items_total': 0, 'state': 'FINISHED', 'size_left': 0, 'details': {'rpm_total': 0, 'rpm_done': 0, 'drpm_total': 0, 'drpm_done': 0}, 'error_details': []}, 'comps': {'state': 'FINISHED'}, 'purge_duplicates': {'state': 'FINISHED'}, 'distribution': {'items_total': 0, 'state': 'FINISHED', 'error_details': [], 'items_left': 0}, 'errata': {'state': 'FINISHED'}, 'metadata': {'state': 'FINISHED'}}}, 'error': None, '_id': {'$oid': '5cd2bd60cfa8faeb309e9677'}, 'id': '5cd2bd60cfa8faeb309e9677'},)

test_sync_publish.py:156: AssertionError

CLI Task Failure Output from Second Sync

[root@rhel76 ~]# pulp-admin tasks details --task-id d0140034-e1ca-4d1e-a34f-7fdf1d17a697
+----------------------------------------------------------------------+
                              Task Details
+----------------------------------------------------------------------+

Operations:       sync
Resources:        2cbd8fc6-4c37-445d-bfc3-d38a93cc98d1 (repository)
State:            Successful
Start Time:       2019-05-07T19:15:43Z
Finish Time:      2019-05-07T19:15:49Z
Result:           
  Added Count:      8
  Completed:        2019-05-07T19:15:49Z
  Details:          
    Comps:            
      State: FINISHED
    Content:          
      Details:       
        Drpm Done:  0
        Drpm Total: 0
        Rpm Done:   0
        Rpm Total:  0
      Error Details: 
      Items Left:    0
      Items Total:   0
      Size Left:     0
      Size Total:    0
      State:         FINISHED
    Distribution:     
      Error Details: 
      Items Left:    0
      Items Total:   0
      State:         FINISHED
    Errata:           
      State: FINISHED
    Metadata:         
      State: FINISHED
    Modules:          
      State: FINISHED
    Purge Duplicates: 
      State: FINISHED
  Error Message:    None
  Exception:        None
  Id:               5cd1d965719c8d17a0474cc4
  Importer Id:      yum_importer
  Importer Type Id: yum_importer
  Removed Count:    -8
  Repo Id:          2cbd8fc6-4c37-445d-bfc3-d38a93cc98d1
  Result:           success
  Started:          2019-05-07T19:15:43Z
  Summary:          
    Comps:            
      State: FINISHED
    Content:          
      State: FINISHED
    Distribution:     
      State: FINISHED
    Errata:           
      State: FINISHED
    Metadata:         
      State: FINISHED
    Modules:          
      State: FINISHED
    Purge Duplicates: 
      State: FINISHED
  Traceback:        None
  Updated Count:    0
Task Id:          d0140034-e1ca-4d1e-a34f-7fdf1d17a697
Worker Name:      reserved_resource_worker-0@rhel76
Progress Report:  
  Yum Importer: 
    Comps:            
      State: FINISHED
    Content:          
      Details:       
        Drpm Done:  0
        Drpm Total: 0
        Rpm Done:   0
        Rpm Total:  0
      Error Details: 
      Items Left:    0
      Items Total:   0
      Size Left:     0
      Size Total:    0
      State:         FINISHED
    Distribution:     
      Error Details: 
      Items Left:    0
      Items Total:   0
      State:         FINISHED
    Errata:           
      State: FINISHED
    Metadata:         
      State: FINISHED
    Modules:          
      State: FINISHED
    Purge Duplicates: 
      State: FINISHED

Background

Test Design

The test is simply designed to:

  • Sync a repo (immediate/no flags)
  • Synch a repo again
  • Verify there are no additional changes on the second sync (all artifacts should have been copied the first time) when checking the Task Details of the sync task.

Background

This logic was added in 2016 to ensure sync was fully syncing:

The failures are relatively recent.

Recreation

The easiest way to recreate is to run the following test against a pulp2 server. The test run-time is 16 seconds.

If the test needs to be abstracted to a set of manual steps, it can be. However, with the inconsistent behavior of this test, it is best to use this short python test for recreation in this instance.

The definition of `inconsistent behavior` is that the test will sometimes pass as expected and sometimes fail on any of these fields.

Possible Behavioral Change Commit

It was noted the following Commit has recently been introduced.

It may be part of the reason for the behavioral change for the test.

More investigation or a clear RCA is required still to verify the source.

Actions #4

Updated by bherring almost 5 years ago

  • Subject changed from Pulp 2 Nightly Regression in second-sync testing behavior to Pulp2 task details information returns incorrect values when polled too quickly
  • Description updated (diff)
Actions #5

Updated by bherring almost 5 years ago

  • Description updated (diff)
Actions #6

Updated by bherring almost 5 years ago

  • Description updated (diff)
Actions #7

Updated by dkliban@redhat.com almost 5 years ago

  • Status changed from NEW to CLOSED - WONTFIX

Users polling less frequently is a reasonable work around for this problem.

Actions #8

Updated by bherring almost 5 years ago

  • Related to Issue #4428: Upsert query raises NotUniqueError added

Also available in: Atom PDF