Sunday, 13 November 2011

An Indian Interlingua - Antarbhasa

I have a confession to make. To my knowledge, no Interlingua equivalent exists for the Indian languages. But that doesn't mean it isn't possible. The languages of India are close enough that a common vocabulary can be extracted and joined together by a simple grammar.

So, what's the language situation like in India?

Well, there are 103 languages spoken in the country, and the speakers of the 12 major languages account for 90 per cent of the population. That's a major relief because, instead of more than a hundred languages, we can safely limit ourselves to the Big 12 and assume that at least 9 out of 10 people will be able to relate to the result.

These Big 12 are: 

1. Hindi (40%)
2. Bengali (8-9%)
3. Telugu (8-9%)
4. Marathi (6-7%)
5. Tamil (5-6%)
6. Urdu (5-6%)
7. Gujarati (4-5%)
8. Kannada (3-4%)
9. Malayalam (3-4%)
10. Oriya (3-4%)
11. Punjabi (2-3%)
12. Assamese (1-2%)

Of the Big 12, eight (Hindi, Bengali, Marathi, Urdu, Gujarati, Oriya, Punjabi and Assamese) are direct descendants of Sanskrit and the remaining four (Telugu, Tamil, Kannada and Malayalam) belong to the Dravidian family of languages. Let's deal with them one at a time:

Sanskrit Descendants:

Those descended from Sanskrit have similar vocabulary and almost identical grammars. For an analogy, you can think of Sanskrit as Latin and its descendants as Romance languages.

Also, Hindi and Urdu have so much in common that it is impossible to tell their colloquial speech apart.

The learned vocabulary is derived from Sanskrit; Urdu is the sole exception here. It prefers to borrow from Persian and Arabic.

Each of these languages, Marathi being an exception, has a script of its own, and the scripts are mutually incomprehensible.

On the plus side we have more or less a common word stock and similar grammars to begin with.

Dravidian Family: 

I haven't got much to say here because I don't speak any Dravidian language. But I do know that their grammars have nothing in common with the first group. 

Fortunately, they have borrowed from Sanskrit. I read a couple of years ago in The Tribune that 90% of the words in Malayalam are of Sanskrit origin. Tamil is the farthest you can get in terms of non-Sanskrit vocabulary. Still, I read in a dictionary that 40% of its vocabulary comes from Sanskrit.

Using the same analogy, you can think of the Dravidian group as English that has borrowed a lot from Latin but has kept its grammatical features intact.

Now that we have a word stock to start with, we can start registering and standardising a pan-Indian vocabulary. But how are we to go about it?

One way forward is to pick a word and see how it appears in all the languages of the Big 12. If it has a similar form and meaning in 7 languages, it is accepted into Antarbhasa. A downside of this approach is that we will not have many Dravidian words because there are only four Dravidian languages in the list. Also, it is not wise to give Hindi and Assamese equal voting power. Hindi is understood by more than half of all Indians, whereas Assamese is hardly spoken outside the tiny state of Assam. To overcome this, here is my proposal:

Divide these languages into three groups. The first group will include Hindi and Urdu; Oriya, Bengali, Marathi, Assamese, Punjabi and Gujarati will be in the second group; and the Dravidian languages will go in the third group.

After assigning the languages to separate groups, we treat each group as a single entity, and only when a word is common to at least two of them does it become a part of Antarbhasa. English and Persian will be referred to if there isn't any common ground. That's because these two languages, at different times, have been the official languages of the country since 1400 or 1500 AD. And if even that doesn't help, let Sanskrit come to the rescue.
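The group vote above could be sketched roughly as follows in Python. This is only an illustration under several assumptions of mine: the function names (`group_votes`, `select_word`), the `min_share` threshold that decides when a group "has" a form, and the idea that similar forms have already been normalised to one shared spelling are all hypothetical and not part of the proposal. The forms in the usage example are placeholders, not real words.

```python
from collections import defaultdict

# The three groups from the proposal above.
GROUPS = {
    "hindi_urdu": ["Hindi", "Urdu"],
    "other_sanskrit_descended": [
        "Oriya", "Bengali", "Marathi", "Assamese", "Punjabi", "Gujarati",
    ],
    "dravidian": ["Tamil", "Telugu", "Kannada", "Malayalam"],
}


def group_votes(forms, min_share=0.5):
    """Return {form: set of group names that back it}.

    `forms` maps a language name to its (already normalised) form of the word.
    ASSUMPTION: a group backs a form when at least `min_share` of its member
    languages share it. The post does not say how a group decides internally.
    """
    backing = defaultdict(set)
    for group, languages in GROUPS.items():
        counts = defaultdict(int)
        for lang in languages:
            form = forms.get(lang)
            if form is not None:
                counts[form] += 1
        for form, n in counts.items():
            if n / len(languages) >= min_share:
                backing[form].add(group)
    return dict(backing)


def select_word(forms, fallbacks=(), min_groups=2):
    """Pick a form backed by at least `min_groups` groups.

    ASSUMPTION: `fallbacks` is an ordered list of candidate forms, e.g. the
    English form, the Persian form, then the Sanskrit form. The first
    non-empty one is used when no form wins the group vote.
    """
    votes = group_votes(forms)
    winners = [form for form, groups in votes.items() if len(groups) >= min_groups]
    if winners:
        # Prefer the form backed by the most groups.
        return max(winners, key=lambda form: len(votes[form]))
    for candidate in fallbacks:
        if candidate:
            return candidate
    return None


# Placeholder data: "form_a" and "form_b" stand in for real cognates.
example = {
    "Hindi": "form_a", "Urdu": "form_a",
    "Bengali": "form_a", "Oriya": "form_a", "Marathi": "form_a",
    "Tamil": "form_b", "Telugu": "form_b",
}
print(select_word(example))
```

Here the first two groups back `form_a` and only the Dravidian group backs `form_b`, so `form_a` is the one that gets accepted. Deciding what counts as "a similar form and meaning" is the hard part and is left outside this sketch.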

That's only my idea. I may be way off the mark. But it would be interesting to work on such a project. I don't know if Nikhil wants to work on something like this.

Once we have the vocabulary at our disposal we can start thinking of grammar. And given that there are eleven different scripts for these Big 12, the question of an official script is going to cause much heated debate.

4 comments:

  1. Sellamat Eto !
    This looks interesting (and this would be "interbahsa" in Sambahsa), and may help Westerners to have a clearer overview of Indian languages. I think you should adopt the Roman script to help its understanding outside India. As you said, Dravidian languages have borrowed a lot from Sanskrit. At the very end, it may look like a simplified Sanskrit ! ;-)
    A man from Maine, USA, has been learning Sambahsa for a week on the Sambahsa Yahoo Group and he knows by now the basics of the language. You should have a look at this !

    Olivier

  2. Shalom.
    For some words, more than one version can be allowed. Like in Interlingua: "etiam", "anque", "tamben", "alsi" for "also" or "sed", "ma", "mais" for "but". I used "tamben" because I had previously learned Spanish, but now I prefer to use "anque".

    Krzysztof

    Ps. My mother tongue is Polish

  3. Firstly, I would wildly speculate that most of the remaining 10% have second-language fluency in at least one of the 12 languages. Or they speak a language that is closely related to one or more of the 12, so including the minor languages would add little to the analysis. To use the Interlingua analogue, Interlingua doesn't refer to Catalan. But despite this, it is still likely to be just as easy for a Catalan speaker as for a Spanish speaker.
    Secondly, I would suggest that you give consideration to the fact that some of these languages are spoken not just in India. Obviously Urdu in Pakistan, Bengali in Bangladesh and Tamil in Sri Lanka. Others as well. Consider knocking down the partition borders (for the purposes of this conlang!) and making a language for the wider sub-continent. So include/consider some of the languages of Pakistan, Nepal, Bangladesh and Sri Lanka also.
    You may coincidentally find of interest my proposal for an auxlang tentatively called "Oriental" in a recent article on my blog http://konstspraik.blogspot.com/

    It's a proposal for a language based on major international languages around the Indian Ocean Rim, most of which are influenced by Arab-Islamic expansion, but it also includes the influence of Sanskrit. The six source languages of my proposed auxlang are from 5 different language families, with only Hindi-Urdu and Persian being distantly related. The six source languages are unlikely to share words for the most basic core vocabulary. The basic core vocabulary would probably need to be based on lexical features from Proto-Indo-Aryan. So it would be an Indo-European language with superstrates of Arabic and Turkic and Graeco-Romance internationalisms.

  4. @ Olivier:

    That's interesting. And I have joined the Sambahsa Yahoo group too. There I am seeing some interesting developments. Like the lessons are complete for the guy from the US and you are working on a collaborative translation project. That's great!!!

    I apologise for replying so late. I just wasn't well.

    @ Krzysztof

    Thanks ! That would be of great help!

    @ David Parke

    That sounds great! But I doubt if such a language is possible. At best what we can do, I believe, is create a simplified Arabic. That would be comprehensible to many Indians, Iranians, Indonesians, Africans.... at first glance. That would also solve the problem of getting a common everyday vocabulary.
