Issue #3999

Updated by jortel@redhat.com over 5 years ago

h1. Problem 

 The ostree plugin creates a `Branch` content (unit) for each commit in each branch history.    Each `Branch` has a _created timestamp that the publisher uses to determine the HEAD of each branch.    However, Branch._created is the time at which pulp created the record in the DB, not the timestamp of the commit. As a result, when a subsequent sync fetches a deeper history (depth increased), a newer Branch content (unit) can be created for an older commit.    This causes the publisher, which sorts on _created, to determine the branch HEAD incorrectly.    Publishing the branch with a HEAD that is not the latest commit in the history means the history is not deep enough to satisfy the depth requirement. 

 Example: 

 depth=3 branch=A 

 Fetch (PULL) history for branch=A: 
 commit-9 
 commit-8 
 commit-7 

 Create content (unit): 

 commit-7    _created: 2018-01-01 
 commit-8    _created: 2018-01-02 
 commit-9    _created: 2018-01-03 

 Change the depth to 4: 

 Fetch (PULL) history for branch=A: 
 commit-9 
 commit-8 
 commit-7 
 commit-6 

 Create content (unit): 

 commit-6    _created: 2018-01-04 

 List content (sorted by _created): 
 commit-7    _created: 2018-01-01 
 commit-8    _created: 2018-01-02 
 commit-9    _created: 2018-01-03 
 commit-6    _created: 2018-01-04 <-- uh-oh 

 Now the publisher (sorting by _created) incorrectly determines that commit-6 is the branch HEAD and tries to do a PULL-LOCAL on commit-6 with depth=4.    libostree will attempt to pull commit-6's parent (commit-5), but it is not in pulp's storage (ostree) repository because depth=4 kept it from being fetched. 

 GLib.Error('Importing commit-5.commit: linkat: No such file or directory', 'g-io-error-quark', 1) (Katello::Errors::PulpError) 
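
 The mis-ordering in the example can be reproduced with a minimal sketch. The dictionaries below are hypothetical stand-ins for Branch content (units), not the plugin's actual model, and `head_by_created` is an illustrative name for the sort-on-_created behavior described above.

```python
from datetime import date

# Hypothetical stand-ins for Branch content (units); only the `_created`
# values mirror the example in this issue.
units = [
    {"commit": "commit-7", "_created": date(2018, 1, 1)},
    {"commit": "commit-8", "_created": date(2018, 1, 2)},
    {"commit": "commit-9", "_created": date(2018, 1, 3)},
    {"commit": "commit-6", "_created": date(2018, 1, 4)},  # added by the depth=4 sync
]


def head_by_created(units):
    """Pick the unit with the newest _created (illustrative helper)."""
    return max(units, key=lambda u: u["_created"])["commit"]


# The oldest commit wins because its DB record is the newest.
wrong_head = head_by_created(units)
```

 Here `wrong_head` ends up as commit-6 even though commit-9 is the real branch HEAD.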

 h1. Proposed Solution 

 Discontinue storing and using <pre>Branch._created</pre> to determine branch commit ordering.    Replace it with <pre>Branch.timestamp</pre>, where _timestamp_ is set by the importer using the timestamp in the OSTree commit.    This provides a reliable way for the publisher to sort the Branch (content) and select the newest commit associated with a given branch, even when the depth is increased and the branch HEAD moves. 

 Then, add <pre>Branch.parent</pre>, which is set to the parent commit ID when the unit is inserted.    Basically, model the DB with the parent/child commit chain just as OSTree does.    This will provide a deterministic method of maintaining the commit chain that will support rebasing and changing traversal depth. 

 The importer would pull the history (as it does currently) and set the parent when creating a Branch content (unit).    Further, it would need to update any existing Branch content (units) that have been re-parented by either rebase or an increase in traversal depth. 

 The publisher would determine the branch HEAD based on walking the commit chain. 
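
 Both ordering ideas named in this section (a commit `timestamp` and a `parent` link) can be sketched as follows. This is a minimal illustration, not the plugin's code: the dictionaries, the commit dates, and the helper names `head_by_timestamp` and `head_by_parent_chain` are all assumptions made for the example.

```python
from datetime import datetime

# Hypothetical units in DB insertion order (commit-6 was inserted last).
# The commit timestamps are invented for illustration.
units = [
    {"commit": "commit-7", "parent": "commit-6", "timestamp": datetime(2017, 12, 2)},
    {"commit": "commit-8", "parent": "commit-7", "timestamp": datetime(2017, 12, 3)},
    {"commit": "commit-9", "parent": "commit-8", "timestamp": datetime(2017, 12, 4)},
    {"commit": "commit-6", "parent": "commit-5", "timestamp": datetime(2017, 12, 1)},
]


def head_by_timestamp(units):
    """HEAD is the unit whose commit timestamp is newest."""
    return max(units, key=lambda u: u["timestamp"])["commit"]


def head_by_parent_chain(units):
    """HEAD is the only stored commit that no other stored commit names as its parent."""
    parents = {u["parent"] for u in units}
    heads = [u["commit"] for u in units if u["commit"] not in parents]
    if len(heads) != 1:
        raise ValueError("expected exactly one HEAD, found %r" % heads)
    return heads[0]


head_a = head_by_timestamp(units)
head_b = head_by_parent_chain(units)
```

 Neither helper depends on when the record was written to the DB, so both select commit-9 regardless of the order in which the units were created.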


 This will require a migration that would need to iterate over all of the Branch content (units) and _unset_ <pre>Branch._created</pre> and _set_ <pre>Branch.parent</pre> using the history. 
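
 The shape of such a migration could look roughly like the sketch below. It operates on plain dictionaries and a newest-first history list, both of which are assumptions for illustration; it does not use pulp's migration framework or database API, and `migrate` is a hypothetical name.

```python
def migrate(units, history):
    """Unset `_created` and set `parent` on each unit.

    `history` is assumed to list commit IDs newest first, so each commit's
    parent is the entry that follows it. The oldest known commit gets
    parent=None because its parent is outside the fetched history.
    """
    parent_of = dict(zip(history, history[1:]))
    for unit in units:
        unit.pop("_created", None)  # unset the DB-creation timestamp
        unit["parent"] = parent_of.get(unit["commit"])
    return units


# Hypothetical data mirroring the example in this issue.
history = ["commit-9", "commit-8", "commit-7", "commit-6"]
units = [
    {"commit": "commit-7", "_created": "2018-01-01"},
    {"commit": "commit-8", "_created": "2018-01-02"},
    {"commit": "commit-9", "_created": "2018-01-03"},
    {"commit": "commit-6", "_created": "2018-01-04"},
]
migrated = {u["commit"]: u for u in migrate(units, history)}
```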

 h1. Questions 

 1. When a commit is (re)parented, the Branch.parent will need to be updated as well.    What problems might this cause? 
