Sunday, 13 November 2011

An Indian Interlingua - Antarbhasa

I have a confession to make. To my knowledge, no Interlingua equivalent exists for the Indian languages. But that doesn't mean one isn't possible. The languages of India are close enough that a common vocabulary can be extracted and joined together by a simple grammar.

So, what's the language situation like in India?

Well, there are 103 languages spoken in the country, and the speakers of the 12 major languages account for 90 per cent of the population. That's a major relief because instead of more than a hundred languages, we can safely limit ourselves to the Big 12 and assume at least 9 out of 10 people will be able to relate to the result.

These Big 12 are: 

1. Hindi (40%)
2. Bengali (8-9%)
3. Telugu (8-9%)
4. Marathi (6-7%)
5. Tamil (5-6%)
6. Urdu (5-6%)
7. Gujarati (4-5%)
8. Kannada (3-4%)
9. Malayalam (3-4%)
10. Oriya (3-4%)
11. Punjabi (2-3%)
12. Assamese (1-2%)

Of the Big 12, eight (Hindi, Bengali, Marathi, Urdu, Gujarati, Oriya, Punjabi and Assamese) are direct descendants of Sanskrit, and the remaining four belong to the Dravidian family of languages. Let's deal with them one at a time:

Sanskrit Descendants:

Those descended from Sanskrit have similar vocabulary and almost identical grammars. For an analogy, you can think of Sanskrit as Latin and its descendants as Romance languages.

Also, Hindi and Urdu have so much in common that in colloquial speech it is impossible to tell them apart.

The learned vocabulary of these languages is derived from Sanskrit. Urdu is the sole exception here: it prefers to borrow from Persian and Arabic.

Each of these languages, Marathi being an exception, has a script of its own, and these scripts are mutually unintelligible.

On the plus side we have more or less a common word stock and similar grammars to begin with.

Dravidian Family: 

I haven't got much to say here because I don't speak any Dravidian language. But I do know that their grammars have nothing in common with the first group. 

Fortunately, they have borrowed from Sanskrit. I read a couple of years ago in The Tribune that 90% of the words in Malayalam are of Sanskrit origin. Tamil is the farthest you can get in terms of non-Sanskrit vocabulary. Even so, I have read that 40% of the vocabulary in a Tamil dictionary comes from Sanskrit.

Using the same analogy, you can think of the Dravidian group as English that has borrowed a lot from Latin but has kept its grammatical features intact.

Now that we have a word stock to start with, we can start registering and standardising a pan-Indian vocabulary. But how are we to go about it?

One way forward is to pick a word and see how it appears in all the languages of the Big 12. If it has a similar form and meaning in at least 7 of them, it is accepted into Antarbhasa. A downside of this approach is that we will not have many Dravidian words, because there are only four Dravidian languages in the list. Also, it is not wise to give Hindi and Assamese equal voting power. Hindi is understood by more than half of all Indians, whereas Assamese is hardly spoken outside the tiny state of Assam. To overcome this, here is my proposal:

Divide these languages into three groups. The first group will include Hindi and Urdu; the second will include Oriya, Bengali, Marathi, Assamese, Punjabi and Gujarati; and the third will include the Dravidian languages.

After assigning the languages to separate groups, we treat each group as a single entity, and a word becomes a part of Antarbhasa only when it is common to at least two of the groups. If there is no common ground, we consult English and Persian. That's because these two languages, at different times, have been the official languages of the country since 1400 or 1500 AD. And if even that doesn't help, let Sanskrit come to the rescue.
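To make the group vote concrete, here is a rough sketch in Python of how it might be automated. Several things in it are my own assumptions rather than part of the proposal: words are taken to be already romanised and normalised, so "similar form" is reduced to plain string equality; a group counts as backing a form only when more than half of its languages share it; and English is tried before Persian. The names GROUPS, FALLBACKS, group_form and choose_word are hypothetical, and the sample word forms are made-up placeholders, not dictionary data.

```python
from collections import Counter

# The three groups from the proposal (tuples keep the order stable).
GROUPS = {
    "hindi_urdu": ("Hindi", "Urdu"),
    "other_sanskrit_descendants": (
        "Oriya", "Bengali", "Marathi", "Assamese", "Punjabi", "Gujarati",
    ),
    "dravidian": ("Telugu", "Tamil", "Kannada", "Malayalam"),
}

# Assumed fallback order: English, then Persian, then Sanskrit.
FALLBACKS = ("English", "Persian", "Sanskrit")


def group_form(forms, members):
    """Return the form shared by more than half of a group, else None.

    `forms` maps a language name to its (already normalised) word form.
    The strict-majority rule is an assumption, not part of the proposal.
    """
    counts = Counter(forms[lang] for lang in members if lang in forms)
    if not counts:
        return None
    form, n = counts.most_common(1)[0]
    return form if 2 * n > len(members) else None


def choose_word(forms):
    """Pick a form backed by at least two of the three groups.

    If no form has two groups behind it, fall back to English,
    Persian and finally Sanskrit. Returns None if nothing is found.
    """
    group_votes = Counter()
    for members in GROUPS.values():
        form = group_form(forms, members)
        if form is not None:
            group_votes[form] += 1

    if group_votes:
        form, n = group_votes.most_common(1)[0]
        if n >= 2:
            return form

    for lang in FALLBACKS:
        if lang in forms:
            return forms[lang]
    return None


# Placeholder data: "forma", "formb" and "formc" stand in for real words.
sample = {
    "Hindi": "forma", "Urdu": "forma",
    "Oriya": "forma", "Bengali": "forma", "Marathi": "forma",
    "Assamese": "formb", "Punjabi": "forma", "Gujarati": "formb",
    "Telugu": "formc", "Tamil": "formc", "Kannada": "formb",
    "Malayalam": "formc",
}
print(choose_word(sample))  # prints "forma"
```

In the sample, the first two groups both settle on "forma", so it wins even though the Dravidian group prefers "formc". A real version would need a proper measure of how similar two word forms are, which is the hard part and is left out here.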

That's only my idea. I may be way off the mark. But it would be interesting to work on such a project. I don't know if Nikhil wants to work on something like this.

Once we have the vocabulary at our disposal we can start thinking of grammar. And given that there are eleven different scripts for these Big 12, the question of an official script is going to cause much heated debate.