rsync distributor without force_full incorrectly skips publishing some units
In the pulp.plugins.rsync.publish.Publisher#__init__ method there is the following logic:

    if self.is_fastforward():
        start_date = self.last_published
        end_date = None
        if self.predistributor:
            end_date = self.predistributor["last_publish"]
        date_filter = self.create_date_range_filter(start_date=start_date, end_date=end_date)
This code calculates a date range for the distributor to process.
In summary, it will only process units associated with the repo between the last publish of the rsync distributor and the last publish of the predistributor.
That seems to be incorrect. If association and publish are done in a certain order, units can be permanently missing from the rsync publish (until a publish is explicitly done with "force_full").
Using a yum repo as an example, here's a sequence of events which demonstrates the problem:
- Trigger yum publish.
- Time A: yum publish completes
- Time B: associate x.rpm into yum repo
- Trigger rsync publish
- Time C: rsync publish completes
(Note: this publish will not include x.rpm since yum publish hasn't happened for that unit yet)
- Trigger yum publish
- Time D: yum publish completes
- Trigger rsync publish, wait for it to complete
Expected result: after last step, repository is fully published, including x.rpm
Actual result: x.rpm is still not published, because the rsync distributor only processed units associated between time C and time D, and x.rpm was associated at time B. Republishing won't fix it. Explicitly publishing with force_full: True will fix it.
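
To make the sequence above concrete, here is a minimal sketch in plain Python that models only the date-range arithmetic. It is not Pulp code: in_date_range and units_to_publish are hypothetical helpers, the times are plain integers, the boundary handling (exclusive start, inclusive end) is an assumption, and the first rsync publish is assumed to be a fast-forward from an earlier rsync publish at time 0.

```python
# Hypothetical model of the fast-forward date range; not the actual Pulp implementation.

def in_date_range(associated_at, start_date, end_date):
    """Return True if a unit's association time falls inside the range.

    Assumption: start is exclusive and end is inclusive; None means unbounded.
    """
    if start_date is not None and associated_at <= start_date:
        return False
    if end_date is not None and associated_at > end_date:
        return False
    return True


def units_to_publish(units, rsync_last_published, predistributor_last_publish):
    """Select the units a fast-forward rsync publish would process."""
    return sorted(
        name
        for name, associated_at in units.items()
        if in_date_range(associated_at, rsync_last_published, predistributor_last_publish)
    )


# Timeline from the report: A < B < C < D.
TIME_A, TIME_B, TIME_C, TIME_D = 1, 2, 3, 4
units = {"x.rpm": TIME_B}  # x.rpm is associated at time B

# First rsync publish (completes at time C): range is (0, A], so x.rpm is outside it.
first = units_to_publish(units, rsync_last_published=0, predistributor_last_publish=TIME_A)

# Second rsync publish: range is (C, D], and x.rpm (time B) is still outside it.
second = units_to_publish(units, rsync_last_published=TIME_C, predistributor_last_publish=TIME_D)

# A publish with no date filter (modelling force_full) picks the unit up.
full = units_to_publish(units, rsync_last_published=None, predistributor_last_publish=None)

print(first, second, full)
```

Under these assumptions, both fast-forward publishes select nothing and only the unfiltered publish includes x.rpm, which matches the actual result described above.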
Note that I haven't attempted to reproduce this; the bug report is based on code review of the latest master (6fc2861fd14793f8461d232cb641b5112d271519).