Computerized Bible Criticism

While the theory that God dictated his book to Moses cannot be disproved, there is a different, more falsifiable component of widespread traditional thought: that the Pentateuch is a unified text, rather than an amalgam of distinct compositions. At the very least, the fact that our results align so closely with previous scholars’ proposals should indicate that Bible critics are on to something in their quest to unravel the threads of the Pentateuch.

By Idan Dershowitz
Hebrew University
August 2011

Roughly a month ago, the Associated Press and other media outlets reported on a new, automated method of biblical analysis that several colleagues and I have recently been working on. We are a team of four Israelis from two very different backgrounds: computer science (Moshe Koppel, Navot Akiva, and Nachum Dershowitz) and Bible studies (me). Together, we have tried to develop a tool that will allow us to analyze biblical texts in ways that were not previously possible. Our method has started producing some interesting results, among them a possible corroboration of certain scholarly theories on the composition of the Pentateuch. We reported on these at the annual meeting of the Association for Computational Linguistics, which met in Portland in late June. (Our article can be found here.)

While the media reports convey the general gist of our work, I thought I’d take this opportunity to explain what we’ve been doing in a little more detail. To set the stage, let’s begin with a brief review of the state of affairs in the field of Pentateuch criticism.

Many scholars are of the opinion that one or more redactors compiled the Five Books of Moses from a number of independent works. Each of these compositions had its own independent structure, ideology, and style. While these books, or documents, likely had their own tortuous histories prior to compilation, by the time they reached the hands of the redactor(s), they had coalesced into relatively cohesive units. The four primary documents are usually referred to as J, E, D, and P, short for the German equivalents of the Yahwistic, Elohistic, Deuteronomistic, and Priestly sources. This model is known as the Documentary Hypothesis, and it enjoyed great currency for nearly a century after its popularization by Julius Wellhausen in the late nineteenth century. However, a large and growing number of Bible scholars disagree with the notion of multiple freestanding documents, preferring a supplementary model, in which a series of writers made additions to previously existing texts, each expanding upon his (or less likely, her) predecessor’s version. Some of these scribes may have written a great deal of their own material, reflecting their personal ideology and style, but each scribe’s work was fundamentally dependent on the previous edition.

Both the Documentary Hypothesis and the Supplementary Hypothesis have numerous variants, and other models abound as well. For example, even among so-called documentarians, many express doubt about the scope (or the very existence) of the E source. And even scholars who believe that J and E are well-represented in the Pentateuch sometimes offer substantially different analyses of those two sources. While it might have been tempting to delve into these disputes, we felt that our first experiment should examine the least controversial claim of biblical source critics, namely that the Pentateuch can be divided into two categories of texts: priestly, and non-priestly. No scholar would suggest that these two categories correspond to two individual authors, but the dichotomy is nevertheless meaningful. Documentarians and supplementarians alike see a common stylistic and topical thread running through the priestly material to the extent that one may speak of a “priestly school.” Similarly, but to a lesser degree, the non-priestly texts have much more in common with one another — in terms of both subject matter and style — than any of them do with the priestly component. We treat this dichotomy as our scholarly “consensus,” and our experiment aimed to determine if that consensus holds water.

Now, our method must be significantly different from the traditional Bible scholar’s method if it is to add anything to the conversation. In particular, we knew that we must focus only on the criteria that are least open to interpretation, since critics of modern Bible analysis often point to what they maintain are fundamental methodological flaws. For instance, scholars often draw attention to repetitions, which they think can indicate a composite text. To illustrate the point, Genesis 7:1–5 reads:

Then the LORD said to Noah, “Go into the ark, with all your household, for you alone have I found righteous before Me in this generation. Of every clean animal you shall take seven pairs, males and their mates, and of every animal that is not clean, two, a male and its mate; of the birds of the sky also, seven pairs, male and female, to keep seed alive upon all the earth. For in seven days’ time I will make it rain upon the earth, forty days and forty nights, and I will blot out from the earth all existence that I created.” And Noah did just as the LORD commanded him.

In the above passage, Noah is instructed to enter the ark, together with his household and representatives of every animal. But a reader may be forgiven for feeling a touch of déjà vu after having read in the immediately preceding verses (Genesis 6:13–22):

God said to Noah, “I have decided to put an end to all flesh, for the earth is filled with lawlessness because of them: I am about to destroy them with the earth. Make yourself an ark of gopher wood; make it an ark with compartments, and cover it inside and out with pitch [...] and you shall enter the ark, with your sons, your wife, and your sons’ wives. And of all that lives, of all flesh, you shall take two of each into the ark to keep alive with you; they shall be male and female. From birds of every kind, cattle of every kind, every kind of creeping thing on earth, two of each shall come to you to stay alive. For your part, take of everything that is eaten and store it away, to serve as food for you and for them.” Noah did so; just as God commanded him, so he did.

Since the commandment to enter the ark with beast and kin appears twice, as does the report that Noah did as commanded, Bible scholars infer that each belongs to a distinct version of the Flood narrative. Nevertheless, their critics might suggest that what to a modern reader seems like a jarring recapitulation is in fact the product of an ancient literary style that favored deliberate repetition. We must therefore make certain that our method does not rely on duplication to determine multiple authorship.

But repetition is not the only feature Bible scholars point to in this section of the Flood narrative. They, like several traditional commentators before them, note a seeming contradiction between the verse that says “Of every clean animal you shall take seven pairs, males and their mates, and of every animal that is not clean, two, a male and its mate” and the verse that reads “And of all that lives, of all flesh, you shall take two of each into the ark.” Which is it: two of every species, or two only of the unclean species? Whatever the merit of this argument, it too is open to legitimate criticism. After all, identifying alleged inconsistencies is a fundamentally subjective exercise. Perhaps the latter verse intentionally goes into less detail than the former, giving some modern readers the mistaken impression of an internal disagreement. We see that our algorithm must also ignore apparent contradictions if we wish to shield it from critiques of this nature.

For similar reasons, our method also ignores potential breaks in narrative flow (a crucial tool in the Bible scholar’s toolbox) and focuses on a single attribute: word usage. Of all the features that critics are on the lookout for, this is the least subjective one. But word usage has its own problem: it can be intertwined with subject matter. Critics of Bible scholars’ methodology argue that the preponderance of a certain term, say “מטה” (tribe, staff) in Numbers 1, might be due to that section’s topic, not its author. This is a fair criticism, so we decided to focus on just two aspects of word usage. First and foremost: synonym choice. The mere use of the word “מטה” is ignored, and only the choice to use that term and not “שבט,” its synonym, is recorded. The algorithm works as follows: after breaking up the text into blocks of verses, it measures how alike each pair of blocks is on the basis of synonym choice, and it then clusters the similar ones together. The advantage of this technique is that the selection of one term over its synonymous counterpart is a stylistic one and entirely independent of subject matter.

Our method differs from that of previous Bible critics in another regard. Scholars have identified a large number of words and phrases that they believe are characteristic of one source or another. Typically, however, the terms on these lists are ones used almost exclusively by a single putative author. This is just a small part of the picture, since an author might prefer some synonyms by only a small margin. Such slight preferences can be prohibitively difficult for a human to detect, but our computerized method can spot subtle tendencies, adding them up to build a statistically compelling case.
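The synonym-choice idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the transliterated synonym list `SYNSETS`, the toy blocks, the helper names, and above all the clustering step, which is a simple stand-in (seed two clusters with the least similar pair of blocks, then assign every block to the closer seed) and not necessarily the procedure our tool uses.

```python
from collections import Counter
from itertools import combinations
from math import sqrt

# Hypothetical synonym sets (transliterated). A real list would come from a lexicon.
SYNSETS = [("matteh", "shevet"), ("vayomer", "vayedaber")]

def synonym_profile(tokens):
    """Share of each synonym within its own set. A set that never occurs in the
    block contributes zeros, so the mere presence of a topic is not recorded."""
    counts = Counter(tokens)
    profile = []
    for synset in SYNSETS:
        total = sum(counts[w] for w in synset)
        profile.extend(counts[w] / total if total else 0.0 for w in synset)
    return profile

def cosine(u, v):
    """Cosine similarity of two profiles (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def two_cluster(profiles):
    """Stand-in clustering: the least similar pair of blocks seeds the two
    clusters, and each block joins the seed it resembles more."""
    i, j = min(combinations(range(len(profiles)), 2),
               key=lambda pair: cosine(profiles[pair[0]], profiles[pair[1]]))
    return [0 if cosine(p, profiles[i]) >= cosine(p, profiles[j]) else 1
            for p in profiles]

# Toy blocks; the third prefers "matteh" only by a margin (2 of 3 uses).
blocks = [
    ["matteh", "matteh", "vayedaber"],
    ["shevet", "vayomer", "vayomer"],
    ["matteh", "matteh", "shevet", "vayedaber"],
    ["shevet", "shevet", "vayomer"],
]
labels = two_cluster([synonym_profile(b) for b in blocks])
print(labels)  # [0, 1, 0, 1]
```

Note that the third block is grouped with the first even though it uses both synonyms for "tribe": a preference by a margin still counts, which is the kind of subtle tendency described above.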

After initially grouping the text into two clusters according to synonym choice, the method proceeds to a second stage in which it refines its results by looking at a related but slightly different feature: the distribution of the most common and widespread words in the Bible. Given that these generic words appear in numerous and diverse sections of the Bible, this stage in the classification process should also be relatively unswayed by subject matter.
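As a rough illustration of this second stage, the sketch below represents each block by the relative frequency of the most common words overall and then reassigns each block to the nearer cluster average. The nearest-average rule, the English stand-in words, and the function names are assumptions chosen for brevity; this article does not specify how the refinement is actually carried out.

```python
from collections import Counter

def top_words(blocks, k):
    """The k most frequent words across all blocks."""
    totals = Counter(word for block in blocks for word in block)
    return [word for word, _ in totals.most_common(k)]

def frequency_vector(block, vocab):
    """Relative frequency of each vocabulary word within one block."""
    counts = Counter(block)
    size = len(block) or 1
    return [counts[word] / size for word in vocab]

def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def refine(blocks, labels, k=3):
    """One refinement pass (assumed rule): move each block to the cluster
    whose average common-word distribution is closest."""
    vocab = top_words(blocks, k)
    vectors = [frequency_vector(b, vocab) for b in blocks]
    centers = {c: centroid([v for v, l in zip(vectors, labels) if l == c])
               for c in set(labels)}

    def distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return [min(centers, key=lambda c: distance(v, centers[c])) for v in vectors]

# Toy blocks of function words; the last one starts in the "wrong" cluster.
toy_blocks = [
    "and and and the of".split(),
    "the the the of and".split(),
    "and and the and of".split(),
    "the the of the and".split(),
    "and and and of the".split(),
]
initial = [0, 1, 0, 1, 1]
print(refine(toy_blocks, initial))  # [0, 1, 0, 1, 0]
```

In this toy run the last block is moved from cluster 1 to cluster 0 because its mix of common words matches the first cluster more closely.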

But who is to say that our method works at all? Perhaps it is perfectly objective and unbiased, but also perfectly wrong. To test our method against an established benchmark, we needed a biblical text that was unquestionably written by multiple authors, and whose correct source-division was known in advance. Since no such text exists, we were left with no choice but to make one of our own. We selected the books of Jeremiah and Ezekiel, both of which belong to the same genre and time period, and each of which is thought to be primarily the work of a single author. We then chopped the two books into small chunks and mixed them together. We called our new book “Jeriel.” When tasked with splitting Jeriel into two constituents, our algorithm produced two clusters of text that were nearly identical to the actual books of Jeremiah and Ezekiel themselves.
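The "Jeriel" test can be sketched in a few lines of Python. The chunk size, the fixed random seed, the placeholder verse strings, and the function names are all hypothetical; the point is only to show how a mixed book with a known answer key can be built and how a two-way clustering can be scored against it.

```python
import random

def chunk(verses, size):
    """Split a list of verses into consecutive chunks of at most `size` verses."""
    return [verses[i:i + size] for i in range(0, len(verses), size)]

def make_mixed_book(book_a, book_b, size, seed=0):
    """Chop both books into chunks, shuffle them together, and keep the
    true origin of every chunk as an answer key."""
    labeled = ([(c, "A") for c in chunk(book_a, size)] +
               [(c, "B") for c in chunk(book_b, size)])
    random.Random(seed).shuffle(labeled)  # fixed seed: reproducible mix
    chunks = [c for c, _ in labeled]
    truth = [origin for _, origin in labeled]
    return chunks, truth

def agreement(predicted, truth, first="A"):
    """Share of chunks where a 0/1 clustering matches the answer key.
    Cluster numbers are arbitrary, so the better of the two possible
    cluster-to-book mappings is taken."""
    hits = sum((p == 0) == (t == first) for p, t in zip(predicted, truth))
    return max(hits, len(truth) - hits) / len(truth)

# Placeholder "books": strings standing in for verses.
book_a = ["a1", "a2", "a3", "a4", "a5", "a6"]
book_b = ["b1", "b2", "b3", "b4"]
chunks, truth = make_mixed_book(book_a, book_b, size=2)

print(agreement([0, 0, 1, 1], ["A", "A", "B", "A"]))  # 0.75
print(agreement([1, 1, 0, 0], ["A", "A", "B", "B"]))  # 1.0
```

The second call scores 1.0 even though the cluster numbers are "swapped," since a clustering algorithm has no way of knowing which cluster should be called Jeremiah and which Ezekiel.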

Having established that our method is capable of classifying biblical texts according to author, we could finally apply it to the Pentateuch. In order to test the consensus P/non-P dichotomy, we asked the program to separate the text into the two most distinct constituents. The results at this point could reasonably have been almost anything. For instance, it would have made perfect sense for the program to divide the Pentateuch according to genre: legal and narrative. Remarkably, however, each constituent had ample representation of both genres, and the two clusters agreed with the consensus source-division in over 90% of cases. Given that our respective methods are so different, this result is a strong affirmation of Bible scholars’ prior work. And since claims of subjectivity and anachronistic subjugation of the text to modern literary conventions are less applicable to our algorithm, it stands to reason that scholars’ identification of P and non-P material was relatively untainted by such deficiencies.

Having said all that, we would be remiss not to discuss our method’s disagreements with the consensus source-division. A notable example is Genesis 1, the seven-day creation story, which scholars almost unanimously consider priestly, but which our algorithm clustered with the non-priestly material. There are several such examples, although they make up only a small minority of cases. At this stage, it is too early to determine what to make of each disagreement; they must be examined on an individual basis. But we must not forget that for all its successes, our tool is imperfect, as scholars’ analyses surely are as well. A priori, any given disagreement could arise because scholars are wrong, but it could just as plausibly be due to a flaw in our own method. In the case of Genesis 1, it is not impossible that our method was thrown off by the repeated use of the term vayomer (he said), which is ever so slightly more common than the synonym vayedaber in the non-priestly texts. (I say thrown off because the chapter is unique in having several statements that are said to no one, and there is almost no data vis-à-vis word preference under those circumstances.) Another limitation of our method is that it looks for a convergence of linguistic features in a given section, which means that it is not built to disentangle passages that switch rapidly between one source and another. It is possible that this contributed to the fact that nearly all of the aforementioned Flood narrative was assigned to a single cluster. In any event, these instances of disagreement are the rare exception, not the rule. We offer our entire dataset to the scientific community in the hope that it will prove useful in fine-tuning scholars’ analyses.

In conclusion, what has our experiment achieved? Those who already believe that there is a P/non-P dichotomy in the Pentateuch will now surely have their convictions reinforced. Those who think that the division of the Pentateuch into multiple strands rests on subjective analysis or the retrojection of modern literary norms onto ancient texts may wish to reconsider their views, given that such criticisms are less applicable to our methods. Those whose objection to previous scholarship rests on religious or traditional considerations have different issues to consider. Much has been made in the press and blogosphere of a somewhat tangential sentence in our article: “Those for whom it is a matter of faith that the Pentateuch is not a composition of multiple writers can view the distinction investigated here as that of multiple styles.” While the theory that God dictated his book to Moses cannot be disproved, there is a different, more falsifiable component of widespread traditional thought: that the Pentateuch is a unified text, rather than an amalgam of distinct compositions. At the very least, the fact that our results align so closely with previous scholars’ proposals should indicate that Bible critics are on to something in their quest to unravel the threads of the Pentateuch.

Comments (4)

Nice explanation of a new method. It looks sound to me, yet like all methods will probably undergo improvement with experience. Good luck and keep up the good work!
#1 - Paul Flesher - 08/09/2011 - 20:25

I'm pleased that biblical scholarship has finally proven the accuracy of computers and algorithms. This should put an end to unwarranted skepticism. Well done.
#2 - Ron Hendel - 08/10/2011 - 00:09

Do you have any plans to apply this method to the book of Isaiah?
#3 - cjbatch - 08/10/2011 - 19:57

Frank Cross once remarked, after carbon dating was used successfully on the Dead Sea Scrolls, that paleography had proven the validity of carbon dating. It now seems that source criticism has proven the validity of computational linguistics. Well done!
#4 - Ron Hendel - 08/10/2011 - 22:29

