Wednesday 29 August 2012

六百一十二 - 612

我知道六百一十二個漢字。

It says, I know 612 Chinese characters. 

I just did a test on NCIKU and based on my scores their estimate is 612. Here are the results: 


What do I think about it? They're generous! 

I am trying to read texts aimed at beginners who know at least 500 characters, yet an unknown Hanzi crops up in every second or third sentence. No, I am not complaining. I love it. Perhaps I just hadn't realised that readers also introduce some new vocabulary.

Now I know how to say: 明天我跟外星人一起去看電影。(I am going to watch a film with an alien tomorrow.)

Tuesday 28 August 2012

Chinese studies continue

It's a miracle! I am still studying Chinese. Usually I give up a language after a couple of months at most.

Today I read another short text in the language. Here it is if you are interested: 


Beautiful, isn't it?

It isn't some old manuscript. I used GIMP (the Linux equivalent of Photoshop) and other tools to create this image. This helps me learn more about my OS, and in the process I can revise Chinese.

Tuesday 7 August 2012

WW2 propaganda in Roman Hindustani

Found an interesting webpage a few days ago. It gives a fascinating account of the propaganda war between the British Colonial Government on the one side and the Axis powers (Germany and Japan) and a group of Indian freedom fighters on the other.

I have yet to read the whole thing. But skimming through it I noticed something damn interesting. The British, Germans, Japanese and Indians were using Romanised Hindi/Urdu to reach the masses. 

The title of a leaflet from the Indian National Army of the Provisional Indian Government reads: 

Burma par dobara qabza karna ger-munkin hai.

(It's impossible to conquer Burma again.)

The leaflet refers to the British attempts to recapture the territory from the INA and its Japanese allies.

A German propaganda pamphlet asks Indians whether they are aware of what is happening in the world.

It reads:

Hindustánio! Tum ko kuchh khabar haí kih dunyá men kyá ho ráhá haí?

Another example of German efficiency. They mark long vowels! It makes the text easier to read.

That was more than 60 years ago. What makes it so fascinating is that the people writing them were not scientists experimenting with a new script. They were propagandists of desperate regimes trying to woo Indians. And this makes me wonder: why would they choose to put forward their side in Roman script? I don't have the faintest idea.

Aside from the script, the propaganda is chiefly in four languages: English, Hindi/Urdu, Bengali and Tamil. English probably for the educated urban populace, Hindi/Urdu for the North and the West, Bengali for the Eastern parts and Tamil for the South.

Monday 6 August 2012

Largest online encyclopaedias

Wikipedia is the largest encyclopaedia ever written. There are more than 22 million articles on it (as of 23 July 2012) in more than 100 languages. With a little more than four million articles, the English encyclopaedia is the largest. Next comes the German Wikipedia with approximately one and a half million (1.44 million) articles. The French version will soon cross the 1.3 million mark. Currently these, along with the Dutch Wikipedia and its more than 100,000 bot-generated articles, are the only Wikipedias to have more than a million articles. The Italian, Spanish and Polish versions are set to join the club before the end of this year.

Compared to this, the Chinese Wikipedia's performance is dismal. Although Chinese has more than a billion speakers, the article count is barely above half a million. Even Japanese (with far fewer speakers) boasts close to 900,000 articles.

So are the Chinese (just like the Arabs and Indians) uninterested, or are we missing something?

It turns out, they are just as excited about voluntarily contributing online as anyone else in the world. The Chinese Wikipedia may not have an article count comparable to English, German or French, but then it's not the only Chinese collaborative encyclopaedia online.

Baidu Baike (百度百科) and Hudong (互動) are two such giants. There is also Soso (搜搜). Of these, Hudong is the largest with 6.4 million articles (as of 1 July 2012). Then comes Baidu Baike (5 million), followed by Soso (900,000).

I got the numbers from a table in the Chinese Wikipedia's article on Baidu Baike.

Here is a screenshot: 


And here are the numbers:


Translation help:

First row from second column onward:

維基百科:Wikipedia
中文維基百科:Chinese Wikipedia
百度百科:Baidu Baike
互動百科:Hudong
搜搜百科:Soso

First column, third row from top:

條目數:Article count

So, it turns out the largest online encyclopaedia is Hudong. The second largest is Baidu Baike, followed by the English, German and French Wikipedias.
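For anyone who wants to check the ranking, here is a minimal Python sketch that sorts the figures quoted in this post. The numbers are my rounded mid-2012 values in millions (English taken as roughly 4.0, French as roughly 1.3), and the names `article_counts` and `rank_by_articles` are just illustrative, not from any library.

```python
# Article counts in millions, rounded from the figures quoted in this post
# (mid-2012). These are approximations, not exact statistics.
article_counts = {
    "Chinese Wikipedia": 0.5,
    "Soso": 0.9,
    "French Wikipedia": 1.3,
    "German Wikipedia": 1.44,
    "English Wikipedia": 4.0,
    "Baidu Baike": 5.0,
    "Hudong": 6.4,
}


def rank_by_articles(counts):
    """Return (name, count) pairs sorted from largest to smallest."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


ranking = rank_by_articles(article_counts)
for position, (name, millions) in enumerate(ranking, start=1):
    print(f"{position}. {name}: about {millions} million articles")
```

With these inputs the two Chinese encyclopaedias come out on top, ahead of the three largest Wikipedias.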

Saturday 4 August 2012

Saanjo - a script for Punjabi

Yesterday I found Saanjo, a script created by Ejaz Mahmood to write Punjabi. What do I like about it? Unlike Gurmukhi and Shahmukhi, it does represent tones. Though the creator doesn't call them tone markers, my understanding is that the "Supportive Signs" are indeed there to show tone. However, I may be wrong.

Also unlike Gurmukhi, which is an Abugida, and Shahmukhi, an Abjad, Saanjo is an alphabetic script. There is a guide to the script on the website for those who wish to learn.