Getting the difference between binary files using VCDIFF
VCDIFF is a format and an algorithm for delta encoding. It is described in RFC 3284.
Delta encoding is a way of representing data as the difference (delta) between successive versions of the data instead of the data itself.
"copy text copy" (source.txt)
"copy changes copy" (target.txt)
We need to get the difference between the files:
"changes" (source.txt -> target.txt)
"text" (target.txt -> source.txt)
I use xdelta3, but I think any program that works with the VCDIFF format will do.
How to get the difference
We will need another file filled with spaces:
xdelta3 -e -A -n -s source.txt target.txt | xdelta3 -d -s spaces.txt
The flags used are:
-e - creates a delta (encode)
-A - disables the application header (extra metadata in the delta)
-n - disables the checksum (with the checksum present, the delta cannot be applied to a different source)
-s [file] - the source file against which the target file is compared
-d - reconstructs the target file from the delta and the source (decode)
How it works
If you run the command:
xdelta3 -e -A -n -s source.txt target.txt | xdelta3 printdelta
Then, after all the headers, we see the VCDIFF instructions:
Offset Code Type1 Size1 @ Addr1 + Type2 Size2 @ Addr2
??? CPY_0 9 S @ 0
??? ADD 9
??? CPY_0 9 S @ 14
The VCDIFF format is inherently very simple. It consists of three instructions:
COPY (copy) - copies data from the source or from the part of the target that has already been reconstructed
ADD (add) - writes to the target file data stored in the delta (unique data that is not in the source)
RUN (repeat) - repeats a single byte stored in the delta the specified number of times
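The three instructions can be modelled with a few shell functions. This is only an illustrative sketch, not how xdelta3 is implemented: the function names `copy`, `add` and `run` are hypothetical, the byte offsets are worked out by hand for the English example strings (they are not taken from the printdelta listing), and COPY here reads only from the source, not from the target.

```shell
src='copy text copy'   # assumed contents of source.txt (14 bytes)
out=''                 # the target being rebuilt

# COPY ADDR LEN: append LEN bytes of the source, starting at 0-based offset ADDR.
copy() { out="$out$(printf '%s\n' "$src" | cut -b "$(( $1 + 1 ))-$(( $1 + $2 ))")"; }

# ADD DATA: append data carried inside the delta itself.
add() { out="$out$1"; }

# RUN BYTE COUNT: append one byte from the delta, repeated COUNT times.
run() {
  i=0
  while [ "$i" -lt "$2" ]; do
    out="$out$1"
    i=$(( i + 1 ))
  done
}

copy 0 5         # "copy " taken from the source
add 'changes'    # the only data the delta has to store
copy 9 5         # " copy" taken from the source
printf '%s\n' "$out"   # prints: copy changes copy

out=''
run '-' 3        # RUN is not needed for the example files; shown separately
printf '%s\n' "$out"   # prints: ---
```

Only the argument of `add` (and the single byte given to `run`) would have to be stored in the delta; everything else is addressed by offset and length.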
The delta stores only the unique data and copies the rest from the source. If you run the command:
xdelta3 -e -A -n -s source.txt target.txt > target.vcdiff
then the delta will contain only the word "changes", which exists only in the target file.
(JSON does not handle special characters well, so I converted them to hex.)
If the delta is applied to the source (source.txt), we get the target file (target.txt):
xdelta3 -d -s source.txt target.vcdiff
copy changes copy
By replacing the source (source.txt) with a file filled with spaces (spaces.txt), we replace the data shared by the source and target files with spaces:
xdelta3 -d -s spaces.txt target.vcdiff
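Why the swap works can be sketched without xdelta3. The hypothetical `replay` function below hard-codes the same COPY / ADD / COPY sequence, with byte ranges assumed for the English example strings, and applies it to whatever source it is given. The second call pipes the result through `tr` only to make the spaces visible.

```shell
# replay SOURCE: rebuild the target from SOURCE using a fixed instruction list.
replay() {
  printf '%s\n' "$1" | cut -b 1-5 | tr -d '\n'   # COPY 5 bytes from offset 0
  printf 'changes'                               # ADD the data stored in the delta
  printf '%s\n' "$1" | cut -b 10-14              # COPY 5 bytes from offset 9
}

spaces=$(printf '%14s' '')                       # 14 spaces, same size as the source

replay 'copy text copy'                          # prints: copy changes copy
replay "$spaces" | tr ' ' '.'                    # prints: .....changes.....
```

The COPY instructions do not care what the source contains; they only read bytes at the recorded offsets, so a filler source leaves just the ADD data recognisable.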
Any other character can be used in spaces.txt. The main condition is that spaces.txt is at least as large as the source file.
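One possible way to build such a filler file is to size it from the source. A minimal sketch, assuming a single-byte filler character and a `head` that supports `-c`; the function name `make_filler` and the `demo_*` file names are hypothetical.

```shell
# make_filler SRC OUT [CHAR]: write as many CHAR bytes to OUT as SRC contains.
# CHAR defaults to a space and is assumed to be a single-byte character.
make_filler() {
  size=$(( $(wc -c < "$1") ))                    # byte count of the source file
  head -c "$size" /dev/zero | tr '\0' "${3:- }" > "$2"
}

printf 'copy text copy' > demo_source.txt        # assumed example contents
make_filler demo_source.txt demo_spaces.txt
wc -c < demo_spaces.txt                          # same byte count as demo_source.txt
```

The resulting file can then be passed to `xdelta3 -d -s` in place of spaces.txt.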