Story #3421

As a plugin writer I have HTTPDownloaders which provide exponential backoff for HTTP 429 errors

Added by daviddavis over 1 year ago. Updated 6 months ago.

Status:
MODIFIED
Priority:
Normal
Assignee:
Category:
-
Sprint/Milestone:
Start date:
Due date:
% Done:

100%

Platform Release:
Blocks Release:
Backwards Incompatible:
No
Groomed:
Yes
Sprint Candidate:
No
Tags:
QA Contact:
Complexity:
Smash Test:
Verified:
No
Verification Required:
No
Sprint:
Sprint 35

Description

When HTTP downloads fail with a 429 response (Too Many Requests), the HttpDownloader could wait and retry later with possible success. This would allow downloads to complete over time instead of the first few completing and then all subsequent requests being rejected. For example, this is how GitHub rate limits you if you try to download too many files from them at once.

Use Cases

  • The downloaders back off exponentially and retry when an HTTP 429 response is encountered
  • As a plugin writer, I can disable the exponential backoff and retry feature

Implementation Options

Having a coordinated rate-limiting implementation is straightforward, but it would then need to be configured, probably site by site, by either users or plugin writers. This is painful and error prone. Instead, a simple exponential backoff behavior can cause rejected requests to retry automatically, up to 10 times, when the server is overloaded.

We could do something like use backoff-async and pass those options through to the plugin writer, but I'm hoping for something really simple, hence the hardcoded 10 retries.
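As a rough illustration of the proposed behavior, here is a minimal sketch of such a retry loop using only the standard library (not the actual Pulp or 'backoff' implementation; the names `download_with_backoff`, `TooManyRequests`, and `flaky_fetch` are invented for this example):

```python
import asyncio

class TooManyRequests(Exception):
    """Stand-in for an HTTP 429 response (hypothetical; real code would
    inspect the response status from the HTTP client)."""

async def download_with_backoff(fetch, max_tries=10, base_delay=0.01):
    """Retry an async `fetch` callable, sleeping exponentially longer
    after each 429-style rejection; give up after `max_tries` attempts."""
    for attempt in range(max_tries):
        try:
            return await fetch()
        except TooManyRequests:
            if attempt == max_tries - 1:
                raise  # final failure after max_tries attempts
            await asyncio.sleep(base_delay * 2 ** attempt)

# Demo: a fake server that rejects the first two requests.
calls = {"n": 0}

async def flaky_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TooManyRequests()
    return "payload"

result = asyncio.run(download_with_backoff(flaky_fetch))  # succeeds on try 3
```

The key design point is that the caller never sees the intermediate 429s; it only sees either the final payload or, after 10 failed attempts, the final exception.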

When to finally fail?

After 10 tries. Backing off exponentially, it would wait for roughly 17 minutes in total.
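The 17-minute figure can be checked with a quick calculation, assuming a 1-second initial delay that doubles before each of the 10 attempts (the actual base delay and factor are an assumption here):

```python
# Hypothetical delay schedule: 1 s initial wait, doubling each retry.
def backoff_delays(max_tries=10, base=1.0, factor=2.0):
    """Seconds to wait before each retry attempt."""
    return [base * factor ** i for i in range(max_tries)]

delays = backoff_delays()                 # [1, 2, 4, ..., 512]
total_minutes = sum(delays) / 60          # 1023 s total, about 17 minutes
```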

Associated revisions

Revision d8ea62f4 View on GitHub
Added by bmbouter over 1 year ago

Adds 429 Retry Support

All retry support is provided by the 'backoff' library which also
provides an asyncio compatible interface.

This updates the docs and docstrings to call out the new retry
functionality.

https://pulp.plan.io/issues/3421
closes #3421

History

#1 Updated by bmbouter over 1 year ago

Yes, we need to add some sort of rate limiting to the HttpDownloaders.

What about having the downloaders do an exponential backoff and retry with the 429 errors?

#2 Updated by daviddavis over 1 year ago

+1. That's a good option.

#3 Updated by daviddavis over 1 year ago

  • Tracker changed from Issue to Story
  • % Done set to 0

#4 Updated by bmbouter over 1 year ago

  • Project changed from Ansible Plugin to Pulp
  • Subject changed from Github rate limits requests to As a plugin writer I have HTTPDownloaders which provide exponential backoff for HTTP 429 errors
  • Description updated (diff)

Rewriting to be a core story for the downloaders since all plugins would benefit from this feature.

#5 Updated by bmbouter over 1 year ago

  • Sprint Candidate changed from No to Yes

#6 Updated by bmbouter over 1 year ago

  • Description updated (diff)

#7 Updated by daviddavis over 1 year ago

  • Groomed changed from No to Yes

#8 Updated by bmbouter over 1 year ago

  • Status changed from NEW to ASSIGNED
  • Assignee set to bmbouter
  • Sprint set to Sprint 34

Moving onto the sprint since it's groomed and a core beta deliverable. I'm also taking it as assignee and plan to post something on it tomorrow.

#9 Updated by bmbouter over 1 year ago

  • Status changed from ASSIGNED to CLOSED - NOTABUG
  • Assignee deleted (bmbouter)
  • Sprint Candidate changed from Yes to No
  • Sprint deleted (Sprint 34)
  • Tags deleted (Pulp 3 MVP)

I couldn't reproduce this issue, and it's possible that the remote servers were having some issues the day it was a problem. In those situations maybe the right thing to do is to fail and not continue retrying anyway?

Since we can't reproduce the problem, I think we should hold off on adding retry support for now. I'm removing it from the sprint and closing as NOTABUG. We should reopen if the need for 429 retry support arises.

#10 Updated by bmbouter over 1 year ago

  • Status changed from CLOSED - NOTABUG to ASSIGNED
  • Assignee set to bmbouter
  • Sprint set to Sprint 35
  • Tags Pulp 3 MVP added

After working through some pulp_ansible bugs I can now reproduce this regularly. I'm reopening to add the 429 retry support. Currently the tasks get a lot of errors like this one:

        {
            "code": null,
            "description": "429, message='Too Many Requests'",
            "traceback": null
        },

#11 Updated by bmbouter over 1 year ago

  • Description updated (diff)
  • Status changed from ASSIGNED to POST

Since we went with the 'backoff' library, we really just had to choose how many times to retry. Since we had already said 10, I went with that and updated the ticket description accordingly.

It got pulp_ansible syncing everything out of github with no failures!

PR available at: https://github.com/pulp/pulp/pull/3425

#12 Updated by bmbouter over 1 year ago

  • Status changed from POST to MODIFIED
  • % Done changed from 0 to 100

#13 Updated by dkliban@redhat.com over 1 year ago

  • Sprint/Milestone set to 3.0

#14 Updated by bmbouter 6 months ago

  • Tags deleted (Pulp 3, Pulp 3 MVP)
