Monday, September 11, 2006

How to blog in Complex Asian / Indian Languages?


Blogging in Indic Languages? Correct your CSS first!


Blogs have redefined the use of the internet. There are already 47.5 million blog sites, and 75 thousand more are added every day. Every hour, 50 thousand blogs are updated. Most of these blogs are written in English, and chances are you are writing one yourself or, if not, certainly reading a few regularly.

Blog: why other than in English?

If you are writing, or planning to write, a blog in English, you will quickly realize that among the millions of existing English blogs it is really tough to find an audience and get noticed. But if you write in a language other than English, you will certainly get noticed by most users of that language, because the number of bloggers in other languages, especially Asian languages with complex text layout such as Hindi or Tamil, is very small. For example, across the whole internet there are presently only about 300 bloggers in Hindi and around 2000 in Tamil. There are some Indian languages in which the number of bloggers has not even reached double digits. Across these 300-odd Hindi blogs, about 10 posts are published daily on average. There is every chance that these 10 posts get read by each and every Hindi user on the internet.

Now the picture is quite clear, isn’t it? If you are planning to write a blog, or are already writing one in English but facing a readership crisis, then it is high time you started one in your own native language. You will get bands of readers and followers from day one, guaranteed.

Available tools and platforms:

Indic languages use Complex Text Layout (CTL), which creates problems in rendering correctly across different platforms and applications. Unicode and CTL display technology are comparatively new, and you will find that many applications and platforms still do not support them. To manage Indic blogs you need Windows 2000 or later, or Fedora Core 3 Linux or later (or another distribution released around the same time). Otherwise, you will find managing and posting to an Indic blog a little difficult. Even these operating systems have no out-of-the-box support for Indic languages, and you have to set up many things manually.

Almost all popular blog platforms, such as Blogger (http://www.blogger.com), Wordpress (http://www.wordpress.com), Livejournal (http://www.livejournal.com), Yahoo360 and MSN, now support Unicode UTF-8 encoding by default, so you can start writing a blog in Indic languages straightaway. Your blog content will remain usable even if display glitches appear later because of problems in the template's CSS design.

[Image: IndicBlog0005] This is how your text will display in Mozilla Firefox with templates that have character spacing settings by default.







Common Problems in Indic Language Blogs – Causes and Fixes

Indic languages use CTL fonts, and because of this they tend to display differently in different rendering engines. Moreover, in CTL, characters jump and take up positions that do not follow the order of the keystrokes. This confuses a plain application unless it is programmed to handle this behavior of CTL fonts. This has long been considered the main hurdle to a strong presence of Indic languages on the Internet. Many basic issues have been addressed, but some glitches remain, and you have to keep the following small points in mind while selecting and modifying your blog template.

  • Remove every instance of character spacing (letter spacing) settings from your template, either by commenting them out or by setting them to zero; otherwise your text will display strangely. Character spacing disturbs the default positions of Indic marks such as the Matra and Anuswar, so they do not display correctly. Browsers like Internet Explorer and Opera are designed to ignore these settings when displaying Indic languages, but other applications like Mozilla are not, and there are already a million Mozilla users and counting.
  • Similarly, your post’s text alignment should never be justified or right-aligned. For the reason stated above, Opera and Internet Explorer display such text well, but Firefox cannot handle it where Indic languages are concerned and displays the text weirdly.
  • In the template source, add a definition for your language in addition to English, such as lang="hi-IN" for Hindi and lang="ta-IN" for Tamil. (The HTML lang attribute takes a hyphenated language tag, not the underscore form hi_IN used for system locale names.)

[Image: IndicBlog0003] You need to add a language definition to your template, for example lang="hi-IN" for Hindi and lang="ta-IN" for Tamil.














  • It is a good idea to add a language-specific Unicode font to the template wherever fonts are defined. For example, add Mangal if you are writing in Hindi.
  • If a charset is defined in the template, make sure that it is “UTF-8”; otherwise your page may not display correctly.
  • Try to write your post directly in the text area of the built-in post editor. If you copy and paste content from another source, be ready for strange display when you publish your post. This happens because character formatting is sometimes copied along with the text, which may disturb font rendering in some browsers.

[Image: IndicBlog0004] Remove or comment out every instance of letter spacing, right text alignment and justified text settings from the template source; otherwise your page will not display correctly.
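
The template clean-up described above can also be sketched in code. The Python snippet below is only an illustration and is not part of any blog platform: the function names `comment_out_problem_css` and `charset_is_utf8` are made up for this example, and the regular expressions assume simple `property: value;` declarations rather than parsing CSS or HTML fully. Always keep a backup of your template before changing it.

```python
import re

# Matches either an existing CSS comment (left untouched) or one of the
# declarations the article recommends disabling: any letter-spacing setting,
# and text-align set to justify or right. This is a simplification and does
# not parse CSS fully.
_PROBLEM_CSS = re.compile(
    r"/\*.*?\*/"
    r"|letter-spacing\s*:[^;}]*;?"
    r"|text-align\s*:\s*(?:justify|right)\s*;?",
    re.IGNORECASE | re.DOTALL,
)


def _comment_out(match):
    text = match.group(0)
    if text.startswith("/*"):
        # Already a comment: return it unchanged to avoid nested comments.
        return text
    return "/* " + text.strip() + " */"


def comment_out_problem_css(css):
    """Wrap letter-spacing and justify/right text-align declarations in comments."""
    return _PROBLEM_CSS.sub(_comment_out, css)


def charset_is_utf8(template):
    """Return True if the template declares no charset, or declares UTF-8."""
    found = re.search(r"charset\s*=\s*[\"']?([\w-]+)", template, re.IGNORECASE)
    return found is None or found.group(1).lower() == "utf-8"


if __name__ == "__main__":
    sample = "h1 { letter-spacing: .2em; text-align: justify; color: red; }"
    print(comment_out_problem_css(sample))
    print(charset_is_utf8('<meta charset="iso-8859-1">'))
```

Run on the sample rule, the first function leaves `color: red;` alone and wraps the other two declarations in `/* ... */`, which mirrors the "comment it out" advice; setting the spacing to zero by hand works just as well.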














Indic Language Keyboards – Online and Offline tools:

Except for some of the latest Linux releases, such as Fedora Core 5, your operating system does not come with an Indic keyboard installed by default. Even Windows XP does not come with one. However, the new Windows XP Starter Edition has out-of-the-box support for Hindi and Tamil, and work is under way to add many more Indic languages. You can always add a keyboard from Regional and Language Options in the Windows Control Panel. If your language is not listed there, you can download one of the many language-specific IME tools from BhashaIndia (http://www.bhashaindia.com/). In the Windows IME, you will find many keyboards, from Remington to Phonetic, in almost all Indic language packs.

Similarly, nearly all new Linux distributions come preloaded with the Inscript Indic language keyboard. You simply need to activate the language-specific keyboard you want. See http://raviratlami1.blogspot.com/2005/06/using-indian-language-in-linux-fedora.html for detailed instructions on setting up an Indic keyboard in Linux.

There are also many beautiful online keyboards that you can use if you can’t install a language-specific keyboard locally. There is a browser-based tool on Sourceforge that lets you type in many Indian languages - http://sourceforge.net/project/showfiles.php?group_id=144879&package_id=185123 - and there is an online keyboard that lets you type in many of the world’s languages, including Indic languages - http://www.gate2home.com/?language=hi&sec=2

Present and future of Indic language blog

Google has predicted that in the near future, Internet content will be dominated by languages from China and India. Because Unicode technology is relatively new, not all applications and platforms support it, and hence Indic language blogs have not been able to show a strong presence. But things are changing really fast and, if we believe Google’s prediction, the future is really bright for Indic language blogs.

*******

Microsoft - BhashaIndia Indic Blogger Award

[Image: IndicBlog0006] Results of the first ever Microsoft BhashaIndia Indic Blogger Awards are out.











In June 2006, Microsoft BhashaIndia announced its first ever Indic Blogger Awards for blogs written in 11 Indian languages (see http://www.bhashaindia.com/contests/iba/Winners.aspx for details about the 2006 winners). The languages were Bengali, Hindi, Tamil, Marathi, Malayalam, Gujarati, Konkani, Sanskrit, Telugu, Kannada and Punjabi. One blog in each language was given a Best Blog award, and, across all languages, there were best blog awards in these categories: Activism / Social Activities, Art and Literature, Entertainment, Journal, Political, Sports, Technology, and Topical. Winners were chosen on the basis of six broad criteria: Quality of Content, Quality of Language, Frequency of Posts, Visual Aesthetics, Popularity and Features.

Microsoft stated that the Indic Bloggers Awards were conceptualized with the aim of encouraging those people who till date have been not only expressing their opinions but also promoting the use of an Indian language on the Internet.

Indic blog awards are not new. The Indibloggies awards (see http://indibloggies.org/ for details) have been running successfully for the last two years, giving many prominent awards in many categories to Indian bloggers in all Indian languages, including English. So far, these awards have been given on the basis of online voting.

2 comments:

John samuel said...

Nice blog. Useful for bloggers striving hard to publish their blogs in Indian languages.

Raviratlami said...

Hi John,

Thanks for your comments :)
