Issue #9669

Updated by fao89 over 2 years ago

 

 **Ticket moved to GitHub**: "pulp/pulpcore/2075":https://github.com/pulp/pulpcore/issues/2075 




 ---- 


 Here's the simplified setup I've used to reproduce this issue: 

 ~~~ 
 Pulp server: 
 CentOS 8 
 pulp_installer/pulpcore 3.17.2 
 pulp_deb 2.16 

 Minio server: 
 Ubuntu 18.04.6 
 minio 2022-01-08T03:11:54Z 

 HTTP load-balancer in front of Minio: 
 CentOS 8 
 haproxy 1.8.27 (also tried with nginx and sidekick) 

 apt client: 
 Ubuntu 18.04.6 
 apt 1.16.14 
 ~~~ 

 When I run `apt-get update` on a client configured to use the Pulp server, Pulp responds with a 302 redirect pointing to the HTTP load-balancer I've set up in front of Minio. So far so good. 

 The problem is that the redirect URL contains a semicolon as a query separator, which none of the load-balancers I've tried seems to handle correctly (the `filename` parameter in `response-content-disposition` seems to get discarded). The apt client always gets a 4XX error (e.g. `401 unauthorized`). 

 This seems to happen because content proxies (that is, the load-balancers) strip semicolons from query parameters, [as recommended by the WHATWG since December 2017](https://www.w3.org/TR/2017/REC-html52-20171214/). The somewhat recently discovered [cache poisoning attacks](https://snyk.io/blog/cache-poisoning-in-popular-open-source-packages/) seem to have sped up efforts to follow this recommendation among languages and frameworks (see [CVE-2021-23336](https://bugs.python.org/issue42967)). 

 These two comments in the golang issue tracker helped me come to this conclusion: 
 https://github.com/golang/go/issues/25192#issuecomment-385662789 
 https://github.com/golang/go/issues/25192#issuecomment-789799446 
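
 For illustration, here is a minimal sketch of the parsing difference described above, using only Python's standard library. The URL, host name and file name are made up, and the `;` splitting is emulated by hand rather than taken from any particular proxy, so treat it as a rough model of the behaviour and not as what haproxy, nginx or sidekick actually do internally.

```python
from urllib.parse import parse_qs, urlsplit

# Hypothetical redirect URL; host, path and file name are invented for this example.
url = (
    "http://minio-lb.example.com/bucket/artifact/ab/cdef"
    "?response-content-disposition=attachment;filename=Packages.gz"
    "&X-Amz-Expires=60"
)
query = urlsplit(url).query

# A parser that splits only on "&" keeps the whole header value together.
# (The `separator` argument needs Python 3.9.2+ or a patched older release.)
amp_only = parse_qs(query, separator="&")

# Emulate a parser that also treats ";" as a separator: the value is cut in
# two and "filename" turns into a separate query parameter.
semicolon_too = parse_qs(query.replace(";", "&"), separator="&")

print(amp_only["response-content-disposition"])
print(semicolon_too["response-content-disposition"], semicolon_too.get("filename"))
```

 If two hops on the request path disagree like this, they do not see the same set of query parameters, which would be consistent with the rejected requests I'm getting.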

 I've managed to work around the issue with the hack below (apt clients can now use my repos!), but I'm not sure it's the correct solution, or even the safest one, since it still involves a semicolon as a query separator. Ideally the semicolon would be avoided entirely, but I'm not sure whether the AWS S3 specs allow for that. 


 ~~~ diff 
 diff --git a/pulpcore/content/handler.py b/pulpcore/content/handler.py 
 index 1d8e834c6..0db26d1eb 100644 
 --- a/pulpcore/content/handler.py 
 +++ b/pulpcore/content/handler.py 
 @@ -773,7 +773,8 @@ class Handler: 
          if settings.DEFAULT_FILE_STORAGE == "pulpcore.app.models.storage.FileSystem": 
              return FileResponse(os.path.join(settings.MEDIA_ROOT, artifact_name), headers=headers) 
          elif settings.DEFAULT_FILE_STORAGE == "storages.backends.s3boto3.S3Boto3Storage": 
 -              content_disposition = f"attachment;filename={content_artifact.relative_path}" 
 +              content_disposition = f"attachment%3Bfilename={content_artifact.relative_path}" 
              parameters = {"ResponseContentDisposition": content_disposition} 
              if headers.get("Content-Type"): 
                  parameters["ResponseContentType"] = headers.get("Content-Type") 
 ~~~ 
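
 As a rough sanity check of the idea behind the patch, the sketch below shows that a percent-encoded semicolon survives even a parser that splits on `;`, and decodes back to the original value. It only uses `urllib.parse`; the file name is made up, and it says nothing about how boto3, S3 or Minio sign or re-encode the value, which is the part I'm unsure about.

```python
from urllib.parse import parse_qs, quote

# Hypothetical header value; the file name is invented for this example.
value = "attachment;filename=Packages.gz"

# Percent-encode every reserved character, including ";" and "=".
encoded = quote(value, safe="")
query = f"response-content-disposition={encoded}&X-Amz-Expires=60"

# Emulate a parser that treats both "&" and ";" as separators. No raw ";"
# is left in the query, so the value is not split.
parsed = parse_qs(query.replace(";", "&"), separator="&")

print(encoded)
print(parsed["response-content-disposition"])
```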
