For all details and references of the project one can visit darcs-GSoC-2014-History-Reordering-Performance-and-Features.

jueves, 12 de junio de 2014

Third Week (02-06 june)

Well, well... Now with the solution already implemented here are a couple of time tests that show the improvement.

For the repository of the issue2361:

"let it run for 2 hours and it did not finish"

real    0m5.929s
user    0m5.683s
sys     0m0.260s

For the repository generated by forever.sh, that in summarize has 12600~ patches, a bundle unrevert and doing reorden implies move 1100~ patches forward passing by 11500~ patches.

(Interrupted!)
real    73m9.894s
user    71m28.256s
sys     1m11.439s

real    2m23.405s
user    2m17.347s
sys     0m6.030s

The repository generated by bigRepo.sh has 600~ patches, with only one tag and a very small bundle unrevert.

real        0m34.049s
user        0m33.386s
sys         0m0.665s

real        0m1.053s
user        0m0.960s
sys         0m0.152s

One last repository generated by bigUnrevert.sh, has 13 patches and a really big bundle unrevert (~10MB).

real    0m1.304s
user    0m0.499s
sys     0m0.090s

real    0m0.075s
user    0m0.016s
sys     0m0.011s

The repository with more examples is in here: ExamplesRepos.

martes, 3 de junio de 2014

Second Week (26-30 may)

Luckily, this week with Guillaume we found a "solution" for the issue 2361. But before of entering in details, let's review how the command darcs optimize --reorder does for reorder the patches.

So, suppose we have the following repositories than, reading it from left to right we have the first patch till the last patch, besides with $p_{i,j}$ we denote the $i$-th patch who belongs to the $j$-th repository, and when we want to specify that a patch $p_{i,j}$ is a tag we write $t_{i,j}$.

$r_1$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$

$r_2$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$ $p_{k+1,2}$ $\ldots$ $p_{l,2}$

where the red part represent when $r_2$ was cloned from $r_1$, and the rest is how each repository was evolved. Now, suppose we make a merge of $r_1$ and $r_2$ in $r_1$ making a bundle of the patches of $r_2$ and appling it in $r_1$. Thus, after the merge we have that

$r_1$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$ $p_{k+1,2}$ $\ldots$ $p_{l,2}$

and we found the situation where the tag $t_{1,2}$ is dirty because the green part is in the middle. And now we are in conditions of finding out how darcs does the reorder of patches.
So, the first task is to select the first tag seeing $r_1$ in the reverse way, suppose $t_{1,2}$ is the first (ie, $p_{k+1,2}$ $\ldots$ $p_{l,2}$ are not tags), and split the set of patches (the repository) in

$ps_{t_{1,2}}$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$

and the rest of the patch set,

$rest$ $=$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$ $p_{k+1,2}$ $\ldots$ $p_{l,2}$

this is done by splitOnTag, which I don't totally understand yet, so for the moment... simply do the above :) Then, the part that interest us now is $rest$, we want to delete all the patches of $rest$ that exist in $r_1$ and then add them again, causing that they show up to the right. This job is done by tentativelyReplacePatches, which first calls tentativelyRemovePatches and then calls tentativelyAddPatches.

So, tentativelyRemovePatches of $r_1$ and $rest$ makes,

$r_{1}'$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$

and, tentativelyAddPatches of $r_{1}'$ and $rest$,

$r_{1}''$ $=$ $p_{1,1}$ $p_{2,1}$ $\ldots$ $p_{n,1}$ $p_{1,2}$ $\ldots$ $p_{k,2}$ $t_{1,2}$ $p_{n+1,1}$ $\ldots$ $p_{m,1}$  $p_{k+1,2}$ $\ldots$ $p_{l,2}$

leaving $t_{1,2}$ clean.

Well, all of this was for understanding the "solution" for the issue, we are almost there but before let's look at the function tentativelyRemovePatches. It attempts to remove patches with one special care: when one does darcs revert, a special file is generated, called unrevert in _darcs/patches, which is used for darcs unrevert in case that one makes a mistake with darcs revert. One important difference with unrevert is that unlike all the other files in _darcs/patches, unrevert in not a patch but a bundle, that contains a patch and a context. This context allows to know if the patch is applicable. So when one removes a patch (running for example oblitarete, unrecord or amend) that patch has to be removed from the bundle-revert (bundle of the file _darcs/patches/unrevert). It's now always possible to adjust the unrevert bundle, in which case, the operation continues only if the user agrees to delete the unrevert bundle.

But now a question emerge. Is it necessary to accommodate the bundle-revert in the case of reorder?; the answer is no, and it's because we don't delete any patch of $r_1$ so we still can apply the bundle-revert in $r_{1}''$.

So, finally! we find out that for reorder we need a special case of removing, which doesn't try to update the unrevert bundle. And this ends up being the "solution" for the issue, since the reorder blocks in that function. But! beyond this solves the issue something weird is happening, that is the reason of the double quotes for solution :)

This is more o less the step forward for now. The tasks ahead are, documenting the code in various parts and make the special case for the function tentativelyRemovePatches. On the way I will probably understand more about some of the functions that I mention before so probably I will add more info and rectify whatever is needed.