18 August 2009

Unraveling Ancient Documents

Computer science and humanities departments have joined forces at Ben-Gurion University (BGU) in Beersheba to decipher historical Hebrew documents, a large number of which have been overwritten with Arabic stories. The algorithm used to recover the wording was developed by BGU computer scientists. The documents are searched electronically, letter by letter, for similarities in handwriting that help determine the date and author of each text. The documents being deciphered at BGU are degraded texts from sources such as the Cairo Geniza, the Al-Aksa manuscript library in Jerusalem, and the Al-Azhar manuscript library in Cairo. Altogether, the corpus consists of 100,000 medieval Hebrew codices and their fragments, which represent the book production output of only the last six centuries of the Middle Ages. The purpose of the project is to classify the handwritten documents and determine their authorship. One problem is that many of the original Hebrew texts found in the Cairo Geniza were scratched off so that the parchment could be reused for an Arabic text.

Although the texts are in Hebrew, deciphering them is difficult because the documents have degraded over time. The foreground lettering is hard to separate from the background, and smudged ink across much of the text darkens the background coloring. Furthermore, ink showing through from the reverse side of the document adds blotches to the lettering. To solve the problem, the algorithm overlays the text with a dark grey tone, against which lighter pixels are marked as background space and darker pixels are identified as outlining the original Hebrew lettering. Two academic disciplines have an interest in driving this project forward. First, linguistic specialists seek a deeper appreciation of the origins of the Hebrew language. Second, Jewish philosophers are interested in studying ancient forms of prayer that are thought to be contained in the texts. With the new algorithm, researchers hope to create a catalogue of all the texts and piece together the ancient prayers and other documents, including those citing Jewish law.
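The separation of dark lettering from a lighter background described above can be illustrated with a simple fixed grey threshold. The sketch below is not the BGU algorithm, whose details are not given here. It only assumes that a page is a grid of grey values from 0 (black) to 255 (white). The function names, the sample values, and the threshold of 96 are hypothetical choices made for illustration.

```python
from typing import List

# Assumption: a page is a list of rows of grey values, 0 (black) to 255 (white).
Grey = List[List[int]]


def binarize(image: Grey, grey_level: int = 96) -> List[List[bool]]:
    """Return a mask that is True where a pixel is darker than grey_level.

    grey_level is a hypothetical dark grey reference: pixels lighter than it
    (or equal to it) are treated as background, darker pixels as ink.
    """
    return [[pixel < grey_level for pixel in row] for row in image]


def render(mask: List[List[bool]]) -> str:
    """Draw the mask as text: '#' for ink, '.' for background."""
    return "\n".join("".join("#" if ink else "." for ink in row) for row in mask)


# Made-up sample: light parchment (220+), mid-grey smudges (140-150),
# and dark ink strokes (30-50).
page = [
    [230, 225, 140, 228],
    [40, 150, 35, 220],
    [45, 50, 30, 145],
]

print(render(binarize(page)))
```

In this toy sample the mid-grey smudges fall on the background side of the threshold, so only the darkest strokes survive. A real degraded manuscript would likely need a threshold chosen per page or per region rather than one fixed value.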

More information:

http://www.jpost.com/servlet/Satellite?cid=1249418581591&pagename=JPost%2FJPArticle%2FShowFull