Project

Profile

Help

Issue #9669

closed

S3 URL generation for content artifacts is broken for apt clients when using Minio behind a load-balancer

Added by jlsm-se 11 months ago. Updated 11 months ago.

Status:
CLOSED - DUPLICATE
Priority:
Normal
Assignee:
-
Category:
-
Sprint/Milestone:
-
Start date:
Due date:
Estimated time:
Severity:
2. Medium
Version:
Platform Release:
OS:
Triaged:
No
Groomed:
No
Sprint Candidate:
No
Tags:
Sprint:
Quarter:

Description

Ticket moved to GitHub: "pulp/pulpcore/2075":https://github.com/pulp/pulpcore/issues/2075


Here's the simplified setup I've used to reproduce this issue:

Pulp server:
CentOS 8
pulp_installer/pulpcore 3.17.2
pulp_deb 2.16

Minio server:
Ubuntu 18.04.6
minio 2022-01-08T03:11:54Z

HTTP load-balancer in front of Minio:
CentOS 8
haproxy 1.8.27 (also tried with nginx and sidekick)

apt client:
Ubuntu 18.04.6
apt 1.16.14

When I run apt-get update on a client configured to use the Pulp server, Pulp responds with the 302 redirect pointing to the HTTP load-balancer I've set up in front of Minio. So far so good.

The problem is that the redirect URL contains a semicolon as a query separator, which none of the load-balancers I've tried seem to handle correctly (the filename parameter in response-content-disposition seem to get discarded). The apt client always gets a 4XX error (e.g. 401 unauthorized).

This seems to happen because content-proxies (that is, the load-balancers) will strip semicolons from query parameters, because that is what is recommended by the WHATWG since december 2017 and somewhat recently discovered cache poisoning attacks seems to have sped up efforts to follow this recommendation among languages and frameworks (see CVE-2021-23336).

These two comments in the golang issue tracker helped me come to this conclusion: https://github.com/golang/go/issues/25192#issuecomment-385662789 https://github.com/golang/go/issues/25192#issuecomment-789799446

I've managed to hackishly solve the issue (apt clients can now use my repos!) with the below patch, but I'm not sure if it's actually the correct solution or even the safest since it still involves a semicolon as a query separator. The ideal would maybe be to avoid the semicolon entirely, but I'm not sure if the AWS S3 specs allow for that.

diff --git a/pulpcore/content/handler.py b/pulpcore/content/handler.py
index 1d8e834c6..0db26d1eb 100644
--- a/pulpcore/content/handler.py
+++ b/pulpcore/content/handler.py
@@ -773,7 +773,8 @@ class Handler:
         if settings.DEFAULT_FILE_STORAGE == "pulpcore.app.models.storage.FileSystem":
             return FileResponse(os.path.join(settings.MEDIA_ROOT, artifact_name), headers=headers)
         elif settings.DEFAULT_FILE_STORAGE == "storages.backends.s3boto3.S3Boto3Storage":
-            content_disposition = f"attachment;filename={content_artifact.relative_path}"
+            content_disposition = f"attachment%3Bfilename={content_artifact.relative_path}"
             parameters = {"ResponseContentDisposition": content_disposition}
             if headers.get("Content-Type"):
                 parameters["ResponseContentType"] = headers.get("Content-Type")
Actions #1

Updated by fao89 11 months ago

  • Description updated (diff)
  • Status changed from NEW to CLOSED - DUPLICATE

Also available in: Atom PDF