Project 25 View Three-Way Differences
“I’ve got two sets of edits to the same file. How do I merge the changes so I can see both sets of changes in one file?”
This project shows how to compare two files against a common ancestor file and view their differences relative to the common file. It shows how to merge the two sets of changes to create a single file that contains both sets and how to make patch files that bring either of the two files up to date. The project covers commands diff3 and patch. It builds on concepts covered in Project 24, which uses commands diff, sdiff, and patch to compare and update pairs of files. Project 26 explores sorting of text files, and techniques for picking out commonalities and differences between sorted files.
Three-Way Comparison with diff3
We use the diff3 command to compare two files against a common ancestor file. If two people copy and then change the same file independently, diff3 can be used to merge both sets of changes into a new file. diff3 can also create a patch file describing the additional changes in one file compared with the other. When the patch is applied to the other file, the result is a file that contains both sets of changes.
The diff3 command reports on lines that have been added, deleted, and changed. It is a line-oriented tool designed for text files, so we’ll stick to text files in this project; they also make it simpler to illustrate the principles involved, because they can be displayed.
The diff3 command is useful when two (or more) people make independent changes to their own copies of the same original file. We have two (or more) new files, both of which are relevant, so we must merge both sets of changes. At first glance, two people editing their own copies of the same file, at the same time, may seem a silly thing to do, but it is done often in software development. Many software engineers work on the same project in parallel, each engineer copying the same base set of files and applying his particular updates. Mostly, the engineers will change different files from one another, but occasionally, changes overlap. When development is complete, all sets of changes must be merged. This principle is employed by concurrent version control systems such as CVS, which uses diff3 to merge the changes as each developer returns his set of files to the base set.
Let’s look at an example that uses two independent sets of changes, made in files day1 and day2, to a common base file, orig. The three files are shown side by side. In day1, line 1 is edited, and line 5 is deleted. In day2, line 3 is edited, and line 7 is added. Our aim is to produce a merged file such that lines 1 and 3 are edited, line 5 is deleted, and line 7 is added, with respect to orig.
day1       orig       day2
line 11    line 1     line 1
line 2     line 2     line 2
line 3     line 3     line 33
line 4     line 4     line 4
line 6     line 5     line 5
           line 6     line 6
                      line 7
We use diff3 to compare the changes made in day1 with those made in day2, relative to the base file orig.
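The comparison can be reproduced in a scratch directory. The following is a minimal sketch, assuming diff3 and mktemp are installed; the temporary directory is an illustrative choice, not part of the original example.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

# Recreate the three example files; printf repeats its format for each argument.
printf 'line %s\n' 11 2 3 4 6     > day1   # line 1 edited, line 5 deleted
printf 'line %s\n' 1 2 3 4 5 6    > orig   # the common ancestor
printf 'line %s\n' 1 2 33 4 5 6 7 > day2   # line 3 edited, line 7 added

# Three-way comparison; the common ancestor goes in the middle.
diff3 day1 orig day2
```

In the report, each group of differences begins with a line of equal signs; with GNU diff3, a trailing digit names the one file that differs from the other two.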
You may be asking why the third file, orig, is required. It tells us that it was day1 that deleted line 5. Without comparing against orig, we couldn’t be sure that it wasn’t day2 that added line 5. Similarly, we know that day2 added line 7 and not that day1 deleted it. This information is important when the two files are merged, ensuring that diff3 knows to delete line 5 and add line 7 to produce the merged file.
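To see the ambiguity for yourself, compare day1 and day2 directly with the two-way diff from Project 24. This is a sketch under the same assumptions as the rest of the project's examples (the scratch directory is hypothetical); diff can only report that line 5 and line 7 appear in day2 and not in day1, with no way to say which side made the change.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

printf 'line %s\n' 11 2 3 4 6     > day1
printf 'line %s\n' 1 2 33 4 5 6 7 > day2

# Two-way comparison: no ancestor is consulted, so additions in one
# file are indistinguishable from deletions in the other.
status=0
diff day1 day2 || status=$?
echo "diff exit status: $status"   # 1 means the files differ
```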
Merge Two Sets of Changes
Having determined that day1 and day2 both contain changes to the original file, we now merge both sets of changes into day3. Recall that our aim is to produce a file such that lines 1 and 3 are edited, line 5 is deleted, and line 7 is added. To do this, simply specify option -m (merge) and redirect output to day3.
$ diff3 -m day1 orig day2 > day3
$ cat day3
line 11
line 2
line 33
line 4
line 6
line 7
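A scripted version of the merge can confirm that the result is what we set out to produce. This is a minimal sketch; the file name expected and the scratch directory are assumptions introduced for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

printf 'line %s\n' 11 2 3 4 6     > day1
printf 'line %s\n' 1 2 3 4 5 6    > orig
printf 'line %s\n' 1 2 33 4 5 6 7 > day2

# Merge both sets of changes into day3.
diff3 -m day1 orig day2 > day3

# The goal stated above: lines 1 and 3 edited, line 5 deleted, line 7 added.
printf 'line %s\n' 11 2 33 4 6 7 > expected

if cmp -s day3 expected; then
    echo "day3 contains both sets of changes"
else
    echo "day3 differs from the expected merge" >&2
fi
```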
Patch the Differences
A more roundabout approach makes a patch file. First, use option -A and direct the output to the patch file.
$ diff3 -A day1 orig day2 > patchfile
Displaying patchfile, we see that it contains instructions, in the form of an ed script, on how to update day1 to incorporate the changes made in day2.
$ cat patchfile
5a
line 7
.
3c
line 33
.
Apply the patch to day1.
$ patch day1 patchfile
$ cat day1
line 11
line 2
line 33
line 4
line 6
line 7
Now day1 incorporates both sets of changes.
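The patch file produced here is an ed script, and patch typically hands such scripts to the ed editor, so this route depends on ed being installed. The sketch below guards for that; the guard and the variable name applied are illustrative assumptions, not part of the original procedure.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

printf 'line %s\n' 11 2 3 4 6     > day1
printf 'line %s\n' 1 2 3 4 5 6    > orig
printf 'line %s\n' 1 2 33 4 5 6 7 > day2

# Describe day2's changes as an ed script written against day1.
diff3 -A day1 orig day2 > patchfile

# Apply the script only when both patch and ed are available.
applied=no
if command -v patch >/dev/null 2>&1 && command -v ed >/dev/null 2>&1; then
    if patch day1 patchfile; then
        applied=yes
    fi
else
    echo "patch or ed not found; day1 left unchanged" >&2
fi
echo "patch applied: $applied"
```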
The patch could equally well be applied the other way around, as long as the diff is also done the other way around.
$ diff3 -A day2 orig day1 > patchfile
$ patch day2 patchfile
Resolve Conflicts
One final consideration: It’s quite possible for the same line to be changed in both files, creating a conflict that must be resolved manually. (Computers are dumb.) This situation is illustrated below, where line 3 was changed in both files day1 and day2. You’ll notice that diff3 produces a sequence of lines surrounded by <<<<<<< and >>>>>>> highlighting the area of conflict.
$ diff3 -m day1 orig day2
line 11
line 2
<<<<<<< day1
line 333
||||||| orig
line 3
=======
line 33
>>>>>>> day2
line 4
line 6
line 7
Whenever you merge two files, check for conflicts. You can check the return status of diff3 by examining the special shell variable $?. It’s 0 for success and 1 for conflicts, and 2 means trouble (as the man page puts it).
$ diff3 -m day1 orig day2 > day3
$ echo $?
1
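In a script, the return status can drive what happens next. The following is a sketch of one way to branch on it, using the conflicting version of day1 (line 3 changed to line 333); the messages and the variable name status are illustrative.

```shell
# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

printf 'line %s\n' 11 2 333 4 6   > day1   # line 3 edited here...
printf 'line %s\n' 1 2 3 4 5 6    > orig
printf 'line %s\n' 1 2 33 4 5 6 7 > day2   # ...and here, differently

# Capture the return status without aborting on a nonzero value.
status=0
diff3 -m day1 orig day2 > day3 || status=$?

case $status in
    0) echo "merged cleanly into day3" ;;
    1) echo "conflicts found: edit day3 by hand" ;;
    *) echo "diff3 reported trouble (status $status)" >&2 ;;
esac
```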
Alternatively, dry-run the merge and count the number of conflicts.
$ diff3 -m day1 orig day2 | grep "<<<<" | wc -l
1
If you get conflicts, complete the merge and then edit the new file manually, choosing the appropriate line and removing the conflict markers.
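After editing, it is worth confirming that no conflict markers were left behind. The sketch below defines a hypothetical helper, has_markers, that searches for the four marker styles shown earlier; the printf that rewrites day3 merely stands in for the manual edit (here, keeping day2's version of line 3).

```shell
# Succeeds, and lists the offending lines, if any conflict markers remain.
has_markers() {
    grep -n -E '^(<{7} |[|]{7} |={7}$|>{7} )' "$1"
}

# Work in a throwaway directory so no real files are touched.
cd "$(mktemp -d)" || exit 1

printf 'line %s\n' 11 2 333 4 6   > day1
printf 'line %s\n' 1 2 3 4 5 6    > orig
printf 'line %s\n' 1 2 33 4 5 6 7 > day2

# Complete the merge; a status of 1 (conflicts) is expected here.
diff3 -m day1 orig day2 > day3 || true

if has_markers day3; then
    echo "day3 still needs manual editing"
fi

# Stand-in for the manual edit: choose "line 33" and drop the markers.
printf 'line %s\n' 11 2 33 4 6 7 > day3

if has_markers day3; then
    echo "markers remain in day3" >&2
else
    echo "day3 is free of conflict markers"
fi
```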