
Story #3421

Updated by bmbouter about 6 years ago

When HTTP downloads fail with 429 responses (Too Many Requests), the HttpDownloader could wait and retry later with possible success. This would allow downloads to complete over time instead of the first few completing and then all subsequent requests being rejected. For example, this is how GitHub responds after 100 or so roles are downloaded from it. It looks like GitHub rate limits you if you try to download too many files from them at once.

h3. Use Cases

* As a user, the downloaders will exponentially back off and retry when an HTTP 429 response is encountered
* As a plugin writer, I can disable the exponential backoff and retry feature
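The two use cases above could be sketched roughly as follows. This is a minimal illustration, not Pulp code: @fetch_with_backoff@, @TooManyRequests@, and the @retry_backoff@ flag are all assumed names, and a real HttpDownloader would wrap its aiohttp request rather than a generic coroutine.

```python
import asyncio

class TooManyRequests(Exception):
    """Stand-in for a 429 (Too Many Requests) error from the server."""

MAX_RETRIES = 15  # the hardcoded cap proposed in this story

async def fetch_with_backoff(fetch, retry_backoff=True, sleep=asyncio.sleep):
    """Run ``fetch`` (a coroutine function), retrying on 429 with
    exponential backoff: sleep 1, 2, 4, ... seconds between attempts.

    ``retry_backoff=False`` models the plugin-writer opt-out from the
    second use case: the first 429 is raised immediately.
    """
    for attempt in range(MAX_RETRIES):
        try:
            return await fetch()
        except TooManyRequests:
            if not retry_backoff:
                raise
            await sleep(pow(2, attempt))  # 1, 2, 4, 8, ... seconds
    # 16th and final attempt: a 429 here propagates uncaught
    return await fetch()
```

The @sleep@ parameter is injectable only so the behavior is easy to test; production code would use @asyncio.sleep@ directly.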

h3. Implementation Options

Having a "coordinated rate limiting":https://quentin.pradet.me/blog/how-do-you-rate-limit-calls-with-aiohttp.html implementation is straightforward, but then it needs to be configured site-by-site, probably by either users or plugin writers. This is a pain and prone to many problems. Instead, a dumb exponential backoff behavior can cause rejected requests to retry automatically, up to 15 times, when the server is overloaded.

We could do something like use "backoff-async":https://pypi.python.org/pypi/backoff-async/2.0.0 and pass its options through to the plugin writer, but I'm hoping for something really simple, hence the hardcoded 15 retries.
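For comparison, the decorator shape such a passthrough might take can be sketched as below. This is a hypothetical hand-rolled sketch, not the backoff-async API: @retry_429@ and @TooManyRequests@ are assumed names, and hardcoding @max_tries=15@ instead of exposing it is the simple alternative preferred above.

```python
import asyncio
import functools

class TooManyRequests(Exception):
    """Stand-in for a 429 error; real code would inspect the HTTP response."""

def retry_429(max_tries=15, sleep=asyncio.sleep):
    """Decorate a coroutine function so 429s are retried with
    exponential backoff; ``max_tries`` is the option a plugin writer
    could tune if we chose to pass it through."""
    def decorator(coro):
        @functools.wraps(coro)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_tries - 1):
                try:
                    return await coro(*args, **kwargs)
                except TooManyRequests:
                    await sleep(pow(2, attempt))  # 1, 2, 4, ... seconds
            return await coro(*args, **kwargs)  # final try raises uncaught
        return wrapper
    return decorator
```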

h3. When to finally fail?

When the HttpDownloader gets a 429 over and over, it backs off, sleeping for X seconds using the following exponential values: 1, 2, 4, 8, 16, 32, ...

<pre><code class="python">
>>> [pow(2, i) for i in range(0,15)]
[1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192, 16384]    # These are all in seconds
>>> sum([pow(2, i) for i in range(0,15)]) / float(3600)
9.101944444444445    # This is 9.1 hours
</code></pre>

This will bail after 15 retries, causing each downloader to wait a maximum of 9.1 hours before letting the 429 exception get raised uncaught.
