Issue #5964

pulpcore.plugin.repo_version_utils.remove_duplicates does not handle base_version != None correctly.

Added by gmbnomis 3 months ago. Updated about 1 month ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Sprint/Milestone:
Start date:
Due date:
Severity: 2. Medium
Version:
Platform Release:
Blocks Release:
OS:
Backwards Incompatible: No
Triaged: Yes
Groomed: No
Sprint Candidate: No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified: No
Verification Required: No
Sprint: Sprint 67

Description

Problem

pulpcore.plugin.repo_version_utils.remove_duplicates does not handle the case where base_version != None correctly, because it operates on the wrong content sets.

Example

Assume that there are two artifacts:

- a1 sha256: 4355...
- a2 sha256: 53c2...

and two file content units:

- c1 "relative_path": "test_upload.txt", artifact is a1
- c2 "relative_path": "test_upload.txt", artifact is a2

These two collide w.r.t. the repo_key, so pulpcore.plugin.repo_version_utils.remove_duplicates must drop the unit already present in a repository version when the other one is added to it.

Now create the following repo versions:

0: empty initial repo version

1: Post to /modify adding "c1". Expected content "c1"

2: Post to /modify adding "c2". Expected content "c2" (c1 has to be removed because we are adding newer conflicting content)

3: Post to /modify adding "c2" with base_version set to version 1. Note that, semantically, this is exactly the same operation as the one that created version 2, i.e. adding "c2" to a repo version containing "c1". Expected content "c2" (c1 has to be removed because we are adding newer conflicting content)

However, version 3 contains c1!
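
The expected behavior described above can be sketched as a small toy model. This is not pulpcore code: the function name apply_modify, the "name:relative_path" notation, and the assumption that the repo_key is just the relative_path are all made up for illustration.

```shell
#!/usr/bin/env bash
# Toy model only (hypothetical, not pulpcore): a content unit is written as
# "name:relative_path" and the repo_key is assumed to be the relative_path.

# apply_modify BASE ADDED
#   BASE  - space-separated units of the version used as the base
#   ADDED - space-separated units being added
# Prints the expected content of the new version: every base unit whose
# relative_path does not collide with an added unit, followed by the added units.
apply_modify() {
    local base="$1" added="$2" unit other keep result=""
    for unit in $base; do
        keep=yes
        for other in $added; do
            # Same relative_path but a different unit: the base unit is a duplicate.
            if [ "${unit#*:}" = "${other#*:}" ] && [ "$unit" != "$other" ]; then
                keep=no
            fi
        done
        if [ "$keep" = yes ]; then
            result="${result:+$result }$unit"
        fi
    done
    for other in $added; do
        case " $result " in
            *" $other "*) ;;                          # already carried over from the base
            *) result="${result:+$result }$other" ;;
        esac
    done
    echo "$result"
}

c1="c1:test_upload.txt"
c2="c2:test_upload.txt"

v0=""
v1=$(apply_modify "$v0" "$c1")   # version 1: add c1 to the empty version
v2=$(apply_modify "$v1" "$c2")   # version 2: add c2 on top of the latest version (1)
v3=$(apply_modify "$v1" "$c2")   # version 3: add c2 with base_version 1

echo "v1: $v1"
echo "v2: $v2"
echo "v3: $v3"
```

In this model v2 and v3 are computed from the same inputs, so both contain only c2:test_upload.txt. That is the outcome the issue expects from the real /modify endpoint.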

Reproducer

Run the following script on a Pulp3 install with an empty DB.

#!/usr/bin/env bash
set -e

echo "Setting environment variables for default hostname/port for the API and the Content app"
export BASE_ADDR=${BASE_ADDR:-http://localhost:24817}
export CONTENT_ADDR=${CONTENT_ADDR:-http://localhost:24816}

# Necessary for `django-admin`
export DJANGO_SETTINGS_MODULE=pulpcore.app.settings

# Poll a Pulp task until it is finished.
wait_until_task_finished() {
    echo "Polling the task until it has reached a final state."
    local task_url=$1
    while true
    do
        local response=$(http $task_url)
        local state=$(jq -r .state <<< ${response})
        jq . <<< "${response}"
        case ${state} in
            failed|canceled)
                echo "Task in final state: ${state}"
                exit 1
                ;;
            completed)
                echo "$task_url complete."
                break
                ;;
            *)
                echo "Still waiting..."
                sleep 1
                ;;
        esac
    done
}

echo "Creating test_upload.txt with content '1' to upload."
export FILE_CONTENT="1"
echo $FILE_CONTENT > test_upload.txt
DIGEST1=4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865

echo "Uploading the file to Pulp, creating an artifact, storing ARTIFACT1_HREF."
export ARTIFACT1_HREF=$(http --form POST $BASE_ADDR/pulp/api/v3/artifacts/ \
    file@./test_upload.txt \
    | jq -r '.pulp_href')

echo "Inspecting new artifact."
http $BASE_ADDR$ARTIFACT1_HREF

echo 'Create File Content from the artifact and save as environment variable'
export TASK_URL=$(http POST $BASE_ADDR/pulp/api/v3/content/file/files/ \
    relative_path="test_upload.txt" \
    artifact=$ARTIFACT1_HREF \
    | jq -r '.task')

wait_until_task_finished $BASE_ADDR$TASK_URL

export CONTENT1_HREF=$(http $BASE_ADDR$TASK_URL| jq -r '.created_resources | first')

echo "Inspecting new file content 1"
http $BASE_ADDR$CONTENT1_HREF

echo "Creating test_upload.txt with content '2' to upload."
export FILE_CONTENT="2"
echo $FILE_CONTENT > test_upload.txt

echo "Uploading the file to Pulp, creating an artifact, storing ARTIFACT2_HREF."
export ARTIFACT2_HREF=$(http --form POST $BASE_ADDR/pulp/api/v3/artifacts/ \
    file@./test_upload.txt \
    | jq -r '.pulp_href')

echo "Inspecting new artifact."
http $BASE_ADDR$ARTIFACT2_HREF

echo 'Create File Content from the artifact and save as environment variable'
export TASK_URL=$(http POST $BASE_ADDR/pulp/api/v3/content/file/files/ \
    relative_path="test_upload.txt" \
    artifact=$ARTIFACT2_HREF \
    | jq -r '.task')

wait_until_task_finished $BASE_ADDR$TASK_URL

export CONTENT2_HREF=$(http $BASE_ADDR$TASK_URL| jq -r '.created_resources | first')

echo "Inspecting new file content 2"
http $BASE_ADDR$CONTENT2_HREF

export REPO_NAME=dup

echo "Creating a new repository named $REPO_NAME."
export REPO_HREF=$(http POST $BASE_ADDR/pulp/api/v3/repositories/file/file/ name=$REPO_NAME \
  | jq -r '.pulp_href')

echo "Inspecting repository."
http $BASE_ADDR$REPO_HREF

echo "Kick off a task to add content 1 to a repository, storing TASK_URL env variable"
export TASK_URL=$(http POST $BASE_ADDR$REPO_HREF'modify/' \
    add_content_units:="[\"$CONTENT1_HREF\"]" \
    | jq -r '.task')

# Poll the task (using wait_until_task_finished defined above)
wait_until_task_finished $BASE_ADDR$TASK_URL

echo "Retrieving REPOVERSION_HREF from task"
export REPOVERSION_HREF=$(http $BASE_ADDR$TASK_URL| jq -r '.created_resources | first')
export REPOVERSION_WITH_1_HREF=$REPOVERSION_HREF

echo "Inspecting repository version."
http $BASE_ADDR$REPOVERSION_HREF

echo "Inspecting the content of the repository version"
export FILE_CONTENT_REPO_VERSION_HREF=$(http $BASE_ADDR$REPOVERSION_HREF | jq -r '.content_summary.present["file.file"].href')
http $BASE_ADDR$FILE_CONTENT_REPO_VERSION_HREF

echo "Kick off a task to add content 2 to a repository, storing TASK_URL env variable"
export TASK_URL=$(http POST $BASE_ADDR$REPO_HREF'modify/' \
    add_content_units:="[\"$CONTENT2_HREF\"]" \
    | jq -r '.task')

# Poll the task (using wait_until_task_finished defined above)
wait_until_task_finished $BASE_ADDR$TASK_URL

echo "Retrieving REPOVERSION_HREF from task"
export REPOVERSION_HREF=$(http $BASE_ADDR$TASK_URL| jq -r '.created_resources | first')

echo "Inspecting repository version."
http $BASE_ADDR$REPOVERSION_HREF

echo "Inspecting the content of the repository version"
export FILE_CONTENT_REPO_VERSION_HREF=$(http $BASE_ADDR$REPOVERSION_HREF | jq -r '.content_summary.present["file.file"].href')
http $BASE_ADDR$FILE_CONTENT_REPO_VERSION_HREF

echo "Kick off a task to add content 2 to the version containing 1, storing TASK_URL env variable"
export TASK_URL=$(http POST $BASE_ADDR$REPO_HREF'modify/' \
    add_content_units:="[\"$CONTENT2_HREF\"]" \
    base_version=$REPOVERSION_WITH_1_HREF \
    | jq -r '.task')

# Poll the task (using wait_until_task_finished defined above)
wait_until_task_finished $BASE_ADDR$TASK_URL

echo "Retrieving REPOVERSION_HREF from task"
export REPOVERSION_HREF=$(http $BASE_ADDR$TASK_URL| jq -r '.created_resources | first')

echo "Inspecting repository version."
http $BASE_ADDR$REPOVERSION_HREF

echo "Inspecting the content of the repository version"
export FILE_CONTENT_REPO_VERSION_HREF=$(http $BASE_ADDR$REPOVERSION_HREF | jq -r '.content_summary.present["file.file"].href')
http $BASE_ADDR$FILE_CONTENT_REPO_VERSION_HREF
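
The script above only prints the content of version 3 for manual inspection. A possible automated check, appended to the end of the script, might look like the sketch below. The helper names listing_has_href and check_version3 are hypothetical, and the check assumes that every unit in the listing appears with its href as a quoted JSON string (the script already reads .pulp_href from other responses).

```shell
# Hypothetical follow-up check, not part of the original reproducer.

# listing_has_href LISTING_JSON HREF
# Succeeds if HREF appears as a quoted JSON string anywhere in LISTING_JSON.
listing_has_href() {
    case "$1" in
        *"\"$2\""*) return 0 ;;
        *) return 1 ;;
    esac
}

# check_version3 LISTING_JSON CONTENT1_HREF CONTENT2_HREF
# Fails if the listing still contains content 1 or is missing content 2.
check_version3() {
    if listing_has_href "$1" "$2"; then
        echo "BUG: version 3 still contains content 1"
        return 1
    fi
    if ! listing_has_href "$1" "$3"; then
        echo "UNEXPECTED: version 3 does not contain content 2"
        return 1
    fi
    echo "OK: version 3 contains content 2 but not content 1"
}

# Only query Pulp when the reproducer's variables are set in this shell.
if [ -n "${FILE_CONTENT_REPO_VERSION_HREF:-}" ]; then
    check_version3 "$(http "$BASE_ADDR$FILE_CONTENT_REPO_VERSION_HREF")" \
        "$CONTENT1_HREF" "$CONTENT2_HREF"
fi
```

Because the reproducer runs with set -e, a failing check would make the script exit with a non-zero status on an affected install.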

Associated revisions

Revision 3943e76a
Added by Fabricio Aguiar about 1 month ago

Considering base version when removing duplicates

https://pulp.plan.io/issues/5964 closes #5964

Revision d7d3cacd
Added by daviddavis about 1 month ago

Revert the behavior change in previous()

Instead handle base_version in remove_duplicates().

ref #5964 https://pulp.plan.io/issues/5964

History

#1 Updated by gmbnomis 3 months ago

  • Description updated (diff)

#2 Updated by fabricio.aguiar 3 months ago

  • Triaged changed from No to Yes

#3 Updated by bmbouter 3 months ago

  • Sprint/Milestone set to 3.1.0

#4 Updated by daviddavis about 2 months ago

  • Sprint/Milestone changed from 3.1.0 to 3.2.0

#5 Updated by fabricio.aguiar about 2 months ago

  • Description updated (diff)
  • Status changed from NEW to ASSIGNED
  • Assignee set to fabricio.aguiar

#6 Updated by daviddavis about 2 months ago

  • Sprint set to Sprint 66

#7 Updated by fabricio.aguiar about 2 months ago

  • Status changed from ASSIGNED to POST

#8 Updated by rchan about 1 month ago

  • Sprint changed from Sprint 66 to Sprint 67

#9 Updated by Anonymous about 1 month ago

  • Status changed from POST to MODIFIED

#10 Updated by daviddavis about 1 month ago

  • Status changed from MODIFIED to CLOSED - CURRENTRELEASE
