Researchers use statistical indices to measure variations in letters
Two scientists working at The Institute of Mathematical Sciences (IMSc), Chennai, have figured out a way to computationally estimate whether a script is written from left to right or otherwise. Most interestingly, they have studied the Indus script and calculated that it must flow from right to left.
“Professor Iravatham Mahadevan [the well-known Indus scholar] was one of the experts who had figured out that the Indus script ran from right to left by observing how the writing got a little cramped as it ran towards the left, suggesting that the writer started writing at the right end and ended up running out of space as he or she reached the left end,” says Sitabhra Sinha of IMSc, one of the scientists who carried out the study. “In a workshop at Roja Muthiah Research Library, he asked the audience whether it was possible to come up with a mathematically rigorous technique to infer the ‘handedness’ of a script, that is, to deduce whether the script was written from left to right or right to left,” adds Mr. Sinha. This question set Mohammed Ashraf, a research scholar with B.S. Abdur Rahman University, Chennai, thinking and led him to this collaboration and discovery.
We know intuitively that in a language, some words are used more often than others. Similarly, some letters of the alphabet occur more often at the start of words and others are more common at the end of words. How unevenly the letters are distributed at a given position may be measured using two independent statistical indices: the Gini index and Shannon’s entropy. Mr. Sinha and Mr. Ashraf have established that these measures differ when calculated for the first letter and for the last letter of words. This difference between the start and the end of a word makes it possible for them to identify whether the script is written from left to right or the other way around.
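To make the idea concrete, here is a minimal Python sketch of how the two indices could be computed for the letters that begin and end words. It is an illustration only, not the researchers' actual procedure: the function names (`gini`, `shannon_entropy`, `edge_statistics`), the particular Gini formula and the toy word list are all assumptions made for this example. The article also does not say how the gap between the two values is turned into a reading direction, so the sketch stops at computing the indices.

```python
from collections import Counter
from math import log2


def gini(counts):
    """Gini index of a list of non-negative counts (0 = perfectly even)."""
    xs = sorted(counts)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted formula on the ascending-sorted counts.
    weighted = sum((i + 1) * x for i, x in enumerate(xs))
    return 2 * weighted / (n * total) - (n + 1) / n


def shannon_entropy(counts):
    """Shannon entropy, in bits, of the distribution given by the counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    return sum(p * log2(1 / p) for p in probs)


def edge_statistics(words, alphabet):
    """Return (Gini, entropy) for first letters and for last letters.

    `words` is a list of non-empty strings and `alphabet` lists every sign
    that can occur, so that unused signs are counted as zeros.
    """
    first = Counter(w[0] for w in words if w)
    last = Counter(w[-1] for w in words if w)
    first_counts = [first.get(ch, 0) for ch in alphabet]
    last_counts = [last.get(ch, 0) for ch in alphabet]
    return {
        "first": (gini(first_counts), shannon_entropy(first_counts)),
        "last": (gini(last_counts), shannon_entropy(last_counts)),
    }


if __name__ == "__main__":
    # Toy data, invented for illustration: every word starts with "a".
    stats = edge_statistics(["ab", "ac", "ad"], "abcd")
    print(stats["first"], stats["last"])
```

In this toy example the first-letter distribution is far more lopsided than the last-letter one, so its Gini index is higher and its entropy lower. A real analysis would need a large word list and a rule, not given in the article, for reading a direction off that gap.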
In most of the 24 languages studied, including Arabic, Chinese, Korean, and Sumerian, the duo's results matched the known direction of writing: using their computation alone, they could predict whether the words in that language were written left to right or otherwise. For the hitherto undeciphered Indus script, too, they predict that the words are written from right to left.