Compare

by Harvey Block
(2023/11/23)

Back in the '80s, when I began to learn about the manuscripts of the Bible, I learned that there were very many manuscripts (the word means "handwritten"), and that no two were the same, except perhaps some very short fragments. I began to want to see the differences with my own eyes.

I had learned some computer programming in college and on the job, so I started thinking about how to write a "compare program" that would let me easily see these differences.

I should explain that I was not thinking of comparing the actual manuscripts, or even photographs of them. But when the manuscript texts had been entered into "text files" on the computer, then they could be compared.

There were compare programs already available on the computers I was using on my job, but they were not at all like what I had floating around in my mind. And this was in the days before "personal computers." The only access I had to a computer was at my job, and I couldn't do much without my own computer. But I could think about how I might write the program.

Then in 1990, if I remember correctly, around the time I was laid off my job at Honeywell, I had the idea of what I call the basic "compare engine." It was, I think, a somewhat "elegant" algorithm. That part of the program I still use today.

Over the intervening years, more compare programs have become available. Some are for sale, and some are free. I have used a number of them, and some are very useful. The one I use the most, which I had to pay for, is called "Beyond Compare", and it is very useful as a "programmer's tool." But for what I had in mind for comparing manuscript text files, it is, I have to say, horrible.

So with my "compare engine" idea, I started working on the code. In about 1993 or 1994 my dad bought me my first computer with Microsoft Windows on it. It was then that I made enough progress on my compare program to get it complete enough to use. I had written it for DOS rather than Windows for a variety of reasons.

It worked, but before long, I knew I needed it to run on Windows. That was a major move, and I essentially started over. About the only part I kept was the basic compare engine, which really is a fairly small amount of the code. Most of the program was completely new and different. There were some features that I had to drop, but the gains in other areas were worth the loss.

I continued working on that version over the years until I finally made the switch from Microsoft Windows to Linux. So again I was faced with a complete rewrite. This time, I researched and found a platform in which I could write the program code in a way that could be "compiled" to run on any computer. So I eventually completed a version that ran on Linux. Some time later I attempted to "compile" it for Windows, and it ran with only a few very minor changes.

So I have fiddled with this project on and off for over 30 years! I have made improvements and optimizations and added features many times. Oh, and fixed "bugs" (things that didn't work right). And I am still finding more bugs. I just found another bug yesterday (2023/11/21). There are also a few ideas I want to add. A project like this is never "done."

So why am I telling you all this?

Well, along the way, I always had in the back of my mind the idea of making it into a product available for sale to the world. But I have become discouraged with that idea for multiple reasons. First, I doubt it will ever be "done enough" to send it out on the open market. And second, the idea of trying to service users out there, if it became popular, scares me to death! I'm getting older and programming is getting harder on my brain.

But I still desire to share it with people who could really benefit from it, and especially those who might find it useful in comparing Biblical text files.

So next I want to show you how it differs from other compare programs out there. There are two major ways that it is different. Most compare programs display the two files in two window panes, one file in each pane. My program has only one window pane. Both files are displayed in that one window with the differences brightly colored. The second difference is, many compare programs out there are "line oriented." But mine is "character oriented."
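To make the distinction between "line oriented" and "character oriented" concrete, here is a minimal sketch in Python using the standard difflib module. It is not the compare engine described in this article, only an illustration of the general idea. The sample texts and the function names line_changes and char_changes are made up for this example.

```python
import difflib

# Hypothetical sample texts: the same sentence, once as a single long line
# and once hard-wrapped onto two lines.
file_a = "A long time ago in a land far away there lived a king.\n"
file_b = "A long time ago in a land\nfar away there lived a king.\n"


def line_changes(a, b):
    """Line-oriented view: compare whole lines and return the non-equal spans."""
    a_lines = a.splitlines(keepends=True)
    b_lines = b.splitlines(keepends=True)
    matcher = difflib.SequenceMatcher(None, a_lines, b_lines, autojunk=False)
    return [(tag, a_lines[i1:i2], b_lines[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def char_changes(a, b):
    """Character-oriented view: compare single characters instead of lines."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(tag, a[i1:i2], b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


# No whole line is identical, so the line view reports everything as changed.
print(line_changes(file_a, file_b))

# The character view reports only the space that became a line break.
print(char_changes(file_a, file_b))  # [('replace', ' ', '\n')]
```

In the line-oriented view, the whole sentence counts as changed because no complete line matches. In the character-oriented view, the only reported change is one space replaced by one new-line code, and everything else stays lined up.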

Here I want to give you an example where "character oriented" compare makes a huge difference:

Here are two small files that are almost the same. Can you find the differences?

File A:

File B:

There are more than two differences in these files, but it will be impossible for you to find more than two. If you compare word for word, by moving your fingers along the same words in the two files, one difference is not too hard to find. The other one, I wonder how many people will find it.

Now let's try the program I paid for, "Beyond Compare":

What is that?

If you look closely at the word "long" in the beginning line of the last paragraph, you may notice the letters of the word are red. That indicates the difference. So now notice that in the left pane "long" is the fifth word, but in the right pane it is the fourth. That is the difference: the word "long" is moved. Or, to think of it another way, "time long ago" was changed to "long time ago".

The other difference is also marked by the red color. In the fourth line, the scripture reference chapter and verse are separated by a semi-colon (;) in the left file but a colon (:) in the right file.

But the big glaring difference is that the file on the right has the paragraphs broken into fixed lines, but on the left each paragraph is just one long line. Having a paragraph as one long line is usually preferred because then the text will be "word-wrapped" to fit the width of its display area.

But now notice that the words "and lasting impres" on the left are also red, and strangely, only the words "and las" are red on the right. That is not the part that is different either! Only the word "impres" (partly showing) is different on the left side, because it is on the next line on the right pane file.

And finally, the worst part is that you can only see differences in the first part of the paragraph. For the rest of the paragraph, wrapped down in multiple lines, you have no clue what differences there may be.

Here it is in my compare program:

In my single pane window every difference is very obvious, brightly colored. Whatever has a white background is the same in the two files. I color the background of the letters, not the letters themselves. This way even differences in spaces and tabs are colored; in Beyond Compare such a difference would be completely hidden.

And finally, the hard line endings are plainly visible with a down arrow, indicating a "new-line" (or line-feed) code. This is because my program is "character-oriented" rather than "line-oriented" as Beyond Compare is. So the files remain synchronized, rather than trying to match only line for line.
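The single pane idea can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the code of the program described here. The function name render_single_pane is hypothetical, plain-text markers stand in for the colored backgrounds ([-...-] for text only in the first file, {+...+} for text only in the second), and the sketch draws the down arrow only for new-line codes that are part of a difference.

```python
import difflib

DOWN_ARROW = "\u2193"  # stands in for the down arrow that marks a new-line code


def visible(text):
    """Show each new-line code inside a marked difference as a down arrow."""
    return text.replace("\n", DOWN_ARROW)


def render_single_pane(a, b):
    """Merge two texts into one stream, marking only the parts that differ.

    Unchanged text is copied as-is. Text found only in `a` is wrapped in
    [-...-] and text found only in `b` in {+...+}. These markers are an
    assumption of this sketch, used in place of colored backgrounds.
    """
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    out = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            out.append(a[i1:i2])
            continue
        if i2 > i1:  # characters present only in the first text
            out.append("[-" + visible(a[i1:i2]) + "-]")
        if j2 > j1:  # characters present only in the second text
            out.append("{+" + visible(b[j1:j2]) + "+}")
    return "".join(out)


# A semi-colon changed to a colon in a made-up reference.
print(render_single_pane("Psalm 23;1 a", "Psalm 23:1 a"))

# A space changed to a hard line ending.
print(render_single_pane("a land far", "a land\nfar"))
```

Because the markers wrap the characters themselves, a changed space or line ending shows up just as clearly as a changed letter, which is the point of marking the background rather than the letters.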

If you are interested in this work, or have a need for a better compare program, send me an email. You will have to type it in from this picture:




Copyright © 2023 by Harvey Block
(2023/11/23) on HarveyBlock.Net