Issue #2689

closed

Don't use ssh connection sharing in rsync distributor

Added by rmcgover almost 7 years ago. Updated almost 5 years ago.

Status: CLOSED - CURRENTRELEASE
Priority: Normal
Category: -
Sprint/Milestone: -
Start date:
Due date:
Estimated time:
Severity: 2. Medium
Version: 2.12.2
Platform Release: 2.13.1
OS:
Triaged: Yes
Groomed: Yes
Sprint Candidate: No
Tags: Easy Fix, Pulp 2
Sprint: Sprint 19
Quarter:

Description

In Pulp platform, in server/pulp/plugins/rsync/publish.py, the ssh command for the rsync distributor is assembled like this:

cmd = ['ssh', '-l', user]
key = self.get_config().flatten()["remote"]['ssh_identity_file']
cmd += ['-i', key,
        '-o', 'StrictHostKeyChecking no',
        '-o', 'UserKnownHostsFile /dev/null',
        '-S', '/tmp/rsync_distributor-%r@%h:%p',
        '-o', 'ControlMaster auto',
        '-o', 'ControlPersist 10']

Can the unconditional use of ControlMaster please be removed?

The use of ControlMaster=auto here causes the following problem: if an ssh process is the ControlMaster, it won't exit until all clients using that control socket have exited.

Consider the following scenario to understand why this is an issue:

- there's an idle Pulp with 8 celery workers on one host
- many repo rsync publishes are enqueued
- worker 1 starts publishing with the rsync distributor; when it starts rsync, its ssh process becomes the ControlMaster
- workers 2 through 8 start publishing other repos and connect to the ControlMaster
- worker 1's upload finishes, but its ssh process can't exit until the control socket is idle
- worker 1's task can't complete until workers 2 through 8 are all idle at the same time, which is unlikely to happen until the entire queue of rsync publishes has completed

Overall, the impact is that one worker's rsync publish can be unpredictably blocked from completing until publishes on other workers finish. This reduces throughput, because resources stay locked longer than they should, and makes reported repo publish times misleading.

I note that if the options relating to ControlMaster here were removed, it would still be possible for Pulp users to enable the feature by editing the ssh_config on the system where Pulp is deployed, if they have a use-case for that.
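
For illustration, here is a minimal sketch of what the requested change might look like: the same argument list as above, with the `-S`, `ControlMaster`, and `ControlPersist` options dropped. The function name `build_ssh_command` and its parameters are hypothetical and are not taken from publish.py; the real code reads the key path from the distributor config.

```python
def build_ssh_command(user, identity_file):
    """Assemble an ssh argument list without connection-sharing options.

    Hypothetical helper for illustration only. It mirrors the snippet
    quoted in this issue, minus '-S', 'ControlMaster' and 'ControlPersist'.
    """
    return [
        'ssh', '-l', user,
        '-i', identity_file,
        # Host key options kept exactly as in the quoted snippet.
        '-o', 'StrictHostKeyChecking no',
        '-o', 'UserKnownHostsFile /dev/null',
    ]


if __name__ == '__main__':
    # Placeholder user and key path, chosen only for the example.
    print(' '.join(build_ssh_command('pulp', '/path/to/ssh_identity_file')))
```

Because the sketch sets no control options at all, rather than forcing them off, any ControlMaster, ControlPath, or ControlPersist settings in the system's ssh_config would presumably still apply for users who want connection sharing.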
