bmbouter, 10/20/2017 11:26 PM
# Downloader Performance Comparison

## Overview

The Pulp3 Plugin API currently has two sets of "downloaders": the [asyncio downloaders](http://docs.pulpproject.org/en/3.0/nightly/plugins/plugin-api/asyncio.html) and the [futures downloaders](http://docs.pulpproject.org/en/3.0/nightly/plugins/plugin-api/futures.html).

## Goal

Run some basic performance comparisons between the asyncio and futures downloaders.

## The tests

The tests all use the file-example fixture data hosted [here](https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/file-example/).

There are 5 different manifests in that repo, each with the following characteristics:

| num_files | size (MB) | url |
|---|---|---|
| 100 | 407.679 | https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_100 |
| 200 | 819.088 | https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_200 |
| 300 | 1206.11 | https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_300 |
| 400 | 1565.02 | https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_400 |
| 500 | 1945.95 | https://repos.fedorapeople.org/repos/pulp/pulp/demo_repos/file-example/PULP_MANIFEST_500 |

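As a quick sanity check on the manifest listing above, the average file size per manifest can be derived from the listed file counts and total sizes. This is a minimal sketch; the `MANIFESTS` mapping and the `average_file_size_mb` helper are illustrative names and are not part of the testing scripts.

```python
# Total size in MB keyed by number of files, copied from the manifest listing.
MANIFESTS = {
    100: 407.679,
    200: 819.088,
    300: 1206.11,
    400: 1565.02,
    500: 1945.95,
}


def average_file_size_mb(num_files, total_mb):
    """Return the mean size of a single file in MB."""
    return total_mb / num_files


if __name__ == "__main__":
    for num_files, total_mb in MANIFESTS.items():
        avg = average_file_size_mb(num_files, total_mb)
        print(f"PULP_MANIFEST_{num_files}: {avg:.2f} MB per file")
```

Every manifest averages roughly 4 MB per file, so the total download size grows close to linearly with the file count.
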
## Methodology

Compare the mean runtime and standard deviation of the asyncio and futures downloaders across the 5 different repos. The methodology needs to control for several factors:

#### Differences in how the downloaders are operated

[pulp_example](https://github.com/pulp/pulp_example/) contains one importer that works with the "asyncio" downloader and one that works with the "futures" downloader. The two implementations are very similar and differ meaningfully only in how they use the downloader. Neither of them uses the changesets; both importers under test use their respective downloader directly.

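For readers unfamiliar with the two concurrency styles, the sketch below contrasts them using only the Python standard library. It does not use the Pulp Plugin API or the pulp_example importers: the URLs, the `fake_fetch_*` functions, and the `run_with_*` helpers are all hypothetical stand-ins, and the "download" is simulated with a short sleep.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical URLs; nothing is actually fetched in this sketch.
URLS = [f"https://example.invalid/file-{i}" for i in range(5)]


def fake_fetch_blocking(url):
    """Stand-in for a blocking download, run in a worker thread."""
    time.sleep(0.01)
    return url, len(url)


async def fake_fetch_async(url):
    """Stand-in for a coroutine-based download on the event loop."""
    await asyncio.sleep(0.01)
    return url, len(url)


def run_with_futures(urls, workers=4):
    """Futures style: a thread pool runs blocking calls concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_fetch_blocking, urls))


async def _gather_all(urls):
    return await asyncio.gather(*(fake_fetch_async(u) for u in urls))


def run_with_asyncio(urls):
    """Asyncio style: one thread, coroutines interleaved by the event loop."""
    return asyncio.run(_gather_all(urls))


if __name__ == "__main__":
    print(run_with_futures(URLS) == run_with_asyncio(URLS))
```

Both helpers return results in input order, which mirrors the point made above: the surrounding code can stay the same while only the downloader mechanism changes.
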
#### Code changes between tests

All tests use the same commits: pulp at ( 1c8040150e11e9808d0cf84fe067a3e1e9da48c3 ) and pulp_example at ( 75087c7b1d5ca4c3678f4f121b55196f979e05d5 ). These commits are used as-is, with no changes applied to them throughout testing.

#### Network conditions

Each test is run 10 times, and the final runtime average and standard deviation are computed from those 10 runs. For any given repo (100, 200, etc.), the asyncio and futures downloaders are run immediately back-to-back, which should subject them to mostly similar network conditions. The different repo tests (100, 200, etc.) are not expected to have run under similar network conditions, so only results for the same repo size should be compared: asyncio-100 vs. futures-100 is OK to compare, but asyncio-100 vs. asyncio-200 is not.

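The comparison rule above can be sketched as follows. This is an illustration under stated assumptions, not the analysis actually used for the spreadsheet: the `results` layout, the `summarize` and `compare` names, and the choice of the sample standard deviation are all assumptions (the text does not say whether the sample or population form was used).

```python
from statistics import mean, stdev


def summarize(runtimes):
    """Return (mean, sample standard deviation) of runtimes in seconds."""
    return mean(runtimes), stdev(runtimes)


def compare(results, num):
    """Compare asyncio vs. futures for a single repo size only.

    `results` maps (downloader, num) -> list of runtimes in seconds.
    Different repo sizes are never mixed, since they were not measured
    under similar network conditions.
    """
    a_mean, a_std = summarize(results[("asyncio", num)])
    f_mean, f_std = summarize(results[("futures", num)])
    return {
        "num": num,
        "asyncio": (a_mean, a_std),
        "futures": (f_mean, f_std),
        "mean_diff": a_mean - f_mean,  # negative means asyncio was faster
    }


if __name__ == "__main__":
    # Placeholder runtimes for illustration only; NOT measurements from this study.
    placeholder = {
        ("asyncio", 100): [10.0, 12.0],
        ("futures", 100): [11.0, 13.0],
    }
    print(compare(placeholder, 100))
```
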
#### Local data or state

Three things are done before each test to reset the state: a) the Vagrant environment's database is fully reset by running pclean on it; b) all downloaded data from previous runs is removed from the artifact directory; c) all services are restarted.

## Test Environment Setup

1. Start a Pulp3 Vagrant VM with pulp at ( 1c8040150e11e9808d0cf84fe067a3e1e9da48c3 ) and pulp_example at ( 75087c7b1d5ca4c3678f4f121b55196f979e05d5 ).
2. Ensure you have the [testing scripts](https://gist.github.com/bmbouter/5a4f341e4b304edc39547dfedcd7e480) checked out (3 files).
3. Install jq: `sudo dnf install jq`.
4. Run the tests from the pulp virtualenv. You can activate it by running `workon pulp`.

## Test Plan

1. Edit perftest.sh to configure your test. Set `asyncio=1` to test asyncio, or `asyncio=0` to test futures. Set `num` to the test type you want, e.g. 200.
2. Run the test with `./manytests.sh`.
3. Read the runtimes (in seconds) from the data file perf_data.txt.

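Once a batch of runs has finished, the runtimes can be read back for analysis. The sketch below assumes that perf_data.txt holds one runtime in seconds per non-empty line; that layout is an assumption, since the file format is not described here, and `parse_runtimes` and `load_runtimes` are hypothetical helper names.

```python
from pathlib import Path
from statistics import mean, stdev


def parse_runtimes(text):
    """Parse runtimes, assuming one number (seconds) per non-empty line."""
    return [float(line) for line in text.splitlines() if line.strip()]


def load_runtimes(path="perf_data.txt"):
    """Read and parse the data file written by the test scripts."""
    return parse_runtimes(Path(path).read_text())


if __name__ == "__main__":
    data_file = Path("perf_data.txt")
    if data_file.exists():
        runtimes = load_runtimes(data_file)
        if len(runtimes) >= 2:  # stdev needs at least two data points
            print(f"runs={len(runtimes)} mean={mean(runtimes):.2f}s "
                  f"stdev={stdev(runtimes):.2f}s")
```

If the file uses a different layout, only `parse_runtimes` needs to change.
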
## Data Collected

https://docs.google.com/spreadsheets/d/1E4sRA_xKMq1kNjOvWLz6BkGOvzKNPuTozm7xUzN1Snc/edit?usp=sharing

## Data Summary

#### Mean Download Time
![](https://docs.google.com/spreadsheets/d/e/2PACX-1vRDjR6jLHoNV9aYT4-UVnE1LJy1SYDJbMeMYsNgwuhAVY_0PAG7AHW2Mhj6_kVfEU_Lpa62FxgqBbkG/pubchart?oid=1035172647&format=image)

#### Std. Deviation of Download Time

![](https://docs.google.com/spreadsheets/d/e/2PACX-1vRDjR6jLHoNV9aYT4-UVnE1LJy1SYDJbMeMYsNgwuhAVY_0PAG7AHW2Mhj6_kVfEU_Lpa62FxgqBbkG/pubchart?oid=198818531&format=image)

## Conclusions

TBD