Friday, 28 January 2011

Why have I decided to learn Sambahsa and Japanese?

In his foreword to The Sambahsa Grammar, Dr. Olivier Simon writes something fantastic. There, the creator of the language rightly says:

"Since they (IALs) first became popular in the late 19th century, [most of the] auxiliary languages have placed simplicity above all else."

Before I read this, I'd tried to learn Esperanto, Interlingua and Lojban and had read about Ido, Latino Sine Flexione, Lingwa de Planeta and a whole lot of others.

Most of them - with the exception of Klingon and Na'vi in my mind - claim to be easier than any other language for one reason or another.

Despite their supposed ease, I couldn't understand why I was unable to learn them. This often made me think that learning languages wasn't my cup of tea. But after reading these sentences in the same foreword, I'm beginning to doubt this:

"The new state of Israel, at its inception, could easily have gone with one of the many international auxiliary languages and yet went with a language that was not created to be easy, a language [that] appealed to people for its spirit and heritage, and not [for] its simplicity or international character."

Dr. Simon is right in saying that we are more likely to spend time learning something that is appealing to us than something that's easy but doesn't evoke any emotions.

A corollary of this statement is that languages are more than mere words linked together by some rules of a grammar.

This is something most conlangers overlook when they set out to design a language. I read in a blog (probably by Rick Harrison) that conlangers conceive of their language as a product and try to improve it, thinking that a better product will attract more speakers.

This is as far from reality as it can be. If a better language (I don't know what makes a language better) could attract speakers, auxlangs wouldn't have, on average, fewer than 25 speakers per language, especially when almost every conlanger claims that his or her language is better than the others.

Now, to answer why I've decided to learn a language that had no more than two fluent speakers at last count, I don't need more than these three words: on a whim.

I'm not a professional linguist who can analyze a language and then make judgments about its 'linguistic superiority' (does such a term exist?). I've decided to learn Sambahsa because its creator has explicitly mentioned in the foreword that learning his language is not going to be an easy task.

I know that learning languages is not easy. The creator of Sambahsa doesn't try to hide the difficulties a learner would face by playing with words. To me this is 'good' and has positive connotations. You can also say it's the absence of exaggeration of any kind about the language that has attracted me to it.

I don't know if I will ever be able to learn Sambahsa to fluency or not. But I do know one thing: I'll not drop the language as long as I can link it to goodness as I perceive it.

As for Japanese, I've decided to learn it for exactly the same reason. The Introduction to the Complete Grammar Guide by Tae Kim doesn't take any shortcuts or fool learners into believing that this is going to be a cakewalk.

Another reason is that I'm curious to know what the Japanese felt when they first encountered Europeans, who were technologically more advanced than the Japanese back in the 16th and 17th centuries. Also, I want to read Osamu Dazai in the original.

My plan is to spend thirty minutes every day, five days a week, on Sambahsa, and I'm looking forward to reading Ithacus is Calator within three months.

As for Japanese, I'll spend 45 minutes every day, six days a week, and will still consider myself lucky if I can complete the Basic Grammar on Tae Kim's website and get the gist of even a single text on Aozora, the Japanese equivalent of Project Gutenberg, in the same period.

Wednesday, 26 January 2011

An Illusion of Two Different Languages

Hindustāni meiṅ Absraik (Abstract in Hindustani)

Āj bhī jab Pākistān aur Bhārat ke log ek dūsre ko jab bolte huye sunte haiṅ to vo ek dūsre ki zubān samajhte haiṅ. Aur ye sirf ek dhokhā he ke Hindī aur Urdū do alag-alag bhāshā he. Is kā sabūt ye he ke Urdū meiṅ gāne Bhārat meiṅ kāfī pasand kiye jāte haiṅ aur Hindī meiṅ filmeiṅ Pākistān meiṅ bahut caltī hai. Is ārṭikal meiṅ yahī batāne kī koshish kar rahā hūṅ.

Resumo en Esperanto (Abstract in Esperanto)

Kiam personoj el Pakistano kaj Barato renkontiĝas kaj parolas inter si, ili facile komprenas unu la alian. Estas nur iluzio, ke ekzistas du malsimilaj lingvoj - hinda kaj urdua. La kantoj en urdua estas popularaj en Barato kaj televidprogramoj en hinda estas famaj en Pakistano. Se hinda kaj urdua estus apartaj lingvoj, ĝi ne estus eble.

The Article

It's been more than 60 years since we gained political independence from Britain. The freedom came at a huge cost: the country, originally known as Hindustan, was partitioned into a Muslim-majority Pakistan and a Hindu-majority India.

The partition was far from a calm affair - it displaced more than ten million people across the newly drawn border, and approximately a million people were killed in religious riots between fanatic Muslims on the one side and zealot Hindus and Sikhs on the other.

The partition did not only divide the territory; the common language of the country, Hindustani, also fell victim to it.

The fundamentalists started promoting Sanskrit, the sacred language of Hindus, in India and Arabic, the language revered by the Muslims, in Pakistan. But because they could not convince or force their populations to speak in Sanskrit or Arabic, they played a game.

In India, religious gurus littered Hindustani with words from Sanskrit and started calling it Hindi. The fundamentalists from Pakistan reciprocated in the same manner by borrowing unnecessarily from Arabic and Persian and declared Urdu (along with English) the official language of Pakistan. And thus, almost overnight, two new official languages - Hindi and Urdu - were born. 

To give the illusion of two separate languages, the Indian government encouraged the use of Devanagari (a script associated with Sanskrit) to write the language, and the government in Pakistan adopted Nastaleeq (a script based on Arabic-Persian characters). The official stand on the writing systems used in the two countries is still the same.

Though people in the two countries don't write it in the same script, the colloquial spoken language is only as different as British and American English or Castilian and Latin American Spanish.

To cite an example, here is a song I've been listening to a lot in the past few days. It is sung by Atif Aslam from Pakistan for an Indian movie. Ask anyone from Pakistan or India to listen to it and then tell you if it's in Hindi or Urdu.

An Indian will swear to God that 'it's in Hindi'. In contrast, a common man in Pakistan will laugh at you if you suggest it's not in Urdu! As a result of decades of propaganda, people here are afraid of Urdu and consider it an "enemy language." Many here don't even have an idea of what Urdu is like or how close it is to Hindi. The same, I believe, must be true of Pakistan too.

That song by Atif Aslam is the rule rather than the exception. Indian movies and television serials (in Hindi) are popular in Pakistan, and shaer-o-shaeri (in Urdu) is widely appreciated in India.

Incomprehensibility occurs only when you swap your everyday vocabulary for more Sanskrit or Arabic (or Persian) words. And that's what the textbooks do in both countries. It's easier for a Pakistani to read a newspaper in Hindi, and for an Indian to read one in Urdu, than for either to read the other's textbooks.

To conclude, I'd say that the masses in both countries continue to use the pre-partition language in daily life. The name Hindustani has been completely erased from their memories, and there is only an illusion of two different languages.

Saturday, 15 January 2011

Punjabi - The Case for Roman Characters

Resumo en Esperanto (Abstract in Esperanto)

La panĝaba estas la dekdua aŭ dektria plej parolata lingvo en la mondo. La parolantoj de la lingvo uzas tri malsimilajn skribsistemojn por skribi la lingvon kaj nur malofte parolanto konas skribsistemon krom la sia. Tiu kaŭzas multe da malkompreneco inter tiuj, kiuj parolas panĝaban. En tiu ĉi artikolo, mi proponas, ke, malkompreneco povas malkreskigi, se ĉiuj parolantoj de la panĝaba komencas uzi la latinajn literojn krom sia propra skribsistemo. Tiu ago ne nur faros popolojn en ambaŭ Panĝaboj pli amikaj pri unu la alian sed ankaŭ helpos normigi la lingvon.

La Artikolo (the Article)

Punjabi is the twelfth or thirteenth most spoken language in the world. There are already three different scripts - Gurmukhi, Shahmukhi and Devanagari - used to write the language. Of these, Gurmukhi is the only script with an official status. It is recognized in the Indian Punjab. Speakers of the language in Pakistani Punjab write it using Shahmukhi, and Punjabi speakers in India (outside Punjab) employ Devanagari to pen their thoughts in the language.

What I just said is common knowledge. But what most people don't know is that when the English, through the East India Company, firmly established their rule over Punjab (both Pakistani and Indian) after defeating the Sikh armies in both Anglo-Sikh wars during the 1840s, they tried to introduce Roman characters to write the language.

In fact, in 1894, Lieutenant Colonel J A L Montgomery, the Officiating Commissioner and Superintendent of Rawalpindi Division, wrote a letter(1) to the then Under Secretary of Government in which he, citing Mr. Wilson, suggested:
  1. "In all Government schools and colleges and in all Government offices only the Roman character should be used.
  2. "In all primary schools education should be carried on only in the Punjabi language written in the Roman character. 
  3. "A committee of scholars should assemble to draw up in Punjab and in the Roman character a grammar and dictionary of the authorized Punjabi language and school-books composed in that language and that character."
As to why the change in the script was required, many reasons were put forward. Among the primary ones were the need to standardize the language and make it easier for Punjabi to adopt English (European) scientific and technological vocabulary.

That's what the experts were talking about more than a hundred years ago. Now the question arises whether these attempts still hold any relevance today. The answer is yes.

The learned vocabulary of Punjabi on each side of the border is being increasingly either Persianized or Sanskritized. A Punjabi speaker in India calls 'astronomy' khagol-vigyān (from Sanskrit), but a Punjabi speaker in Pakistan will not understand the term unless you use the word falkiat (from Urdu, Persian).

Also, you rarely find a person who is able to read Punjabi in a script other than his own. Shahmukhi is Greek to Indian Punjabis, and a Pakistani Punjabi doesn't know how to decipher Gurmukhi or Devanagari. A disadvantage of this is that the people in the two Punjabs are virtually unaware of each other's literature.

That said, the problems resulting from the incomprehensibility of the written word and from over-Sanskritization or over-Persianization are not insuperable. A way out could be to first standardize the script and then coin new words either from existing roots or from widely recognized international scientific vocabulary.

Of the three scripts in current use, Gurmukhi is the most suitable to write the language. The only problem is that three-quarters of Punjabi speakers in the world don't know the script and are unlikely to adopt it for various cultural or religious reasons. But if a new script based on the Latin alphabet is introduced, no one is going to raise objections, as such a script would be culturally neutral to all sides - Muslims, Sikhs and Hindus.

Also, this script does not need to be created from scratch because there already exists a system for transliterating Sanskrit and Pali into Roman characters. It is known as the International Alphabet of Sanskrit Transliteration. An extension of this alphabet, called ISO 15919 (Transliteration of Devanagari and Related Indic Scripts into Latin Characters), can be adopted to write Punjabi. The ISO 15919 system is currently used in dictionaries and other scholarly works.

Here is what Punjabi written as per this system looks like:

Sab To Khatarnāk(2)
Kirat di luṭ sab to khatarnāk nahī hundi
Pulas di kuṭ sab to khatarnāk nahī hundi
Gadārī-lobh di muṭh sab to khatarnāk nahī hundi

Baiṭhe suteā faṛe jāṇā burā ta he
Ḍurū jehī cup vic maṛhe jāṇā burā ta he
Sab to khatarnāk nahī hundi


Sab to khatarnāk hunda he
Murdā shāti nāl bhar jāṇā,
Nā hoṇā taṛap dā, sab sehe kar jāṇā
Gharā to nikalaṇā kam te
Te kam toṅ ghar jāṇā
Sab to khatarnāk hunda he
Sāḍe supneāṅ dā mar jāṇā...

The use of diacritics - dashes and dots - is a necessity because I don't think it is possible to write the language in Roman characters in a manner that is both readily comprehensible to most people and not weird looking!

Before I conclude, I'd like to talk about another system which, though less perfect, is more popular with the people here. It is colloquially called Panjābi English Vich and it roughly translates as 'Punjabi [written] in [the] English [or Roman script].'

This system was not devised by linguists; it developed out of necessity. Mobile phones are a commonplace item in both Punjabs but, surprisingly, very few of these devices support Gurmukhi or Shahmukhi characters.

This system, as the name suggests, is based on transliterating Punjabi through English values of Roman characters. Here are the main idiosyncrasies of this system that I've noticed: 
  1. Long vowels and short vowels are only distinguished if it is difficult to interpret them from the context. For example, ਆਜ਼ਾਦ is normally written as azad but it is  also not unusual to insert another 'a' and write it as azaad if the meaning is not clear from the context. 
  2. Both a regular 'n' and nasalization are written as 'n'.
  3. Finally, the system does not differentiate between retroflex consonants (ਟ, ਠ, ਡ, ਢ, ਣ) and dental consonants (ਤ, ਥ, ਦ, ਧ, ਨ). Both series are transliterated as t, th, d, dh and n. This may give the impression that the distinction between dental and retroflex consonants is not important, but that's not true.
  • To cite an example, here are two sentences: 
ਉਹ ਦਾਦਾ ਹੈ। (Oh dada he.)
ਉਹ ਡਾਡਾ ਹੈ। (Oh dada he.) 
  • Though the Panjābi English Vich system makes no distinction between the two sentences, they are far from interchangeable. The first means 'He is a grandfather', while the second means 'He is a sadist'.
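
The collapse described in point 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the table names and the `transliterate` function are hypothetical, the tables cover just the ten consonants listed above plus the long vowel sign ਾ, and the diacritic spellings follow the ISO-style convention of marking retroflex consonants with a dot below. It is not a full transliterator for either system.

```python
# Illustrative tables only (hypothetical names): the ten consonants from
# point 3 plus the long-a vowel sign. Not a complete transliteration scheme.
DENTAL = {"ਤ": "t", "ਥ": "th", "ਦ": "d", "ਧ": "dh", "ਨ": "n"}

# Diacritic-based spelling: retroflex consonants carry a dot below.
RETROFLEX_DIACRITIC = {"ਟ": "ṭ", "ਠ": "ṭh", "ਡ": "ḍ", "ਢ": "ḍh", "ਣ": "ṇ"}

# Informal "English values" spelling: retroflex letters fall together
# with the dental ones.
RETROFLEX_INFORMAL = {"ਟ": "t", "ਠ": "th", "ਡ": "d", "ਢ": "dh", "ਣ": "n"}

LONG_A_SIGN = "ਾ"  # Gurmukhi vowel sign aa


def transliterate(word, informal):
    """Map each known Gurmukhi character to Roman; pass others through."""
    retroflex = RETROFLEX_INFORMAL if informal else RETROFLEX_DIACRITIC
    table = {**DENTAL, **retroflex, LONG_A_SIGN: "a" if informal else "ā"}
    return "".join(table.get(ch, ch) for ch in word)


if __name__ == "__main__":
    for word in ("ਦਾਦਾ", "ਡਾਡਾ"):
        print(word,
              transliterate(word, informal=True),
              transliterate(word, informal=False))
```

With the informal table both words come out as the same Roman string, while the diacritic table keeps them apart, which is exactly the ambiguity the two example sentences show.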
To summarize, I'd say that Punjabi will gain significantly by shifting to a script based on the Latin alphabet from the current Gurmukhi, Shahmukhi and, to some extent, Devanagari. The shift will not mean a ban on the use of the old scripts - people will continue to use their respective scripts for liturgical purposes if they so wish - it will only mean more open communication, fewer differences and an increased feeling of shared heritage between Muslims, Sikhs and Hindus.

    (1) = Bringing Order to Linguistic Diversity in India: Language Planning in the British Raj; Ranjit Singh Rangila, MS Thirumalai and B Mallikarjun
    (2) = Sampūran Pāsh Kāv, pg 256, Chetna Parkashan

    Wednesday, 12 January 2011

    Two Childish Questions about China

    Why are there only 340,230 articles on the Chinese Wikipedia and merely 37,000 on the Hindi Wikipedia when, in contrast, the English Wikipedia boasts over 3.5 million articles?

    The primary reason why there are not even half a million articles in a language that is spoken by almost a fifth of the world's population is simple - the Chinese don't use Wikipedia. Instead, they turn to two home-grown encyclopedias - Hudong and Baike Baidu - for reference. Hudong has over four million(1) articles and Baike Baidu crossed the 2.8 million(2) mark in 2010.

    To get a glimpse of these Chinese encyclopedias, compare the articles on Manmohan Singh, the current Indian Prime Minister, on Wikipedia, Hudong and Baike Baidu.

    As far as the Hindi Wikipedia is concerned, there are two main reasons for the dearth of articles: 

    a. There are more illiterate people in India than in any other country(3) and
    b. those who are lucky enough to have completed their education contribute mainly to the English Wikipedia(4)

    If the Chinese government has banned Google, how do the Chinese look for information on the internet?

    Google was never as popular in China as it is in India; the proportion of the population in China that used Google was only about as significant as the proportion that uses Yahoo! or Bing in this country.

    Also, as far as I know, there are five countries where Google is not the most used search engine: China, the Czech Republic, Japan, Russia and South Korea. Except for Japan, all of them have developed their own search engines; Japan uses the Japanese-language version of Yahoo!, called Yahoo! Japan.

    The Chinese love Baidu, the Czechs have developed an affection for Seznam, nostalgic Russians' response to American Google is Yandex and the South Koreans don't let a day pass without using their Naver.

    Moreover, these are not small players by any standard. Baidu is the sixth most popular website(5) in the world and Yandex's Alexa Traffic Rank is 24. The tech loving Japanese use Yahoo! Japan so much that Alexa has ranked it the 12th most visited website globally.


    (1) = Wikipedia Article on Hudong
    (2) = Wikipedia Article on Baike Baidu
    (3) = A The Indian Express Report on Illiteracy
    (4) = Aapka Apna Wikipedia; Eye, The Sunday Express
    (5) = Alexa Traffic Rank - Baidu