Hailing from the small town of Shimoga in Karnataka, 34-year-old Ram Prakash Hanumanthappa, who holds a BTech degree in Computer Science and Engineering from IIT Madras, always wondered why Indian languages were not popular on the Internet. His own limited exposure to spoken English also prompted him to do some research. He soon realised that the cause was simply the lack of an easy way to type in Indian languages. He explains, “There were complicated rule-based Indian language transliteration methods, which I thought a common user would not have any incentive to learn. They expected users to strictly follow key combinations so that it was easy for computers to convert them to Indian languages. For one it put the onus on the user to follow strict rules, and on the other hand kept the role of the software trivial. This I thought should be other way round.”
In fact, he strongly believes that users should be able to type intuitively and that the software should bear the onus of producing the correct output. The fact that he hadn’t touched a computer before he joined IIT Madras helps him relate to people who are new to computers. This guides him in designing tools that let people use software without having to learn a lot of new things. It also spurred his interest in machine intelligence and fuelled his aspiration to build intelligent systems that people can interact with naturally. He saw problems in the existing transliteration technology and decided to develop a solution.
When developing Quillpad, Ram drew on his own experiences, which helped him create an intuitive tool.
Outlining the issues, he says, “The transliteration rules were not natural as the attempt was to provide phonetic, one-to-one correspondence between English alphabets with 26 characters to Indian language alphabets with 50+ phones. So the key mappings had to use upper case, lower case and even characters like ~,^ etc to complete the mapping. For example, to write a word like 'rashtrapati', user was expected to input 'rAShTrapatI'. Only correct capitalization and exact spellings would give correct output. The second method was that of using specific keyboard layouts for Indian languages. This too wasn’t easy, as the layout had to be imagined on the English keyboard because Indian language keyboards weren't easily available.”
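To make the problem concrete, here is a minimal sketch of a strict, case-sensitive key table of the kind Ram describes. The table and the sound labels are hypothetical and heavily simplified; they are not taken from any real transliteration scheme. The point is only that a naturally typed spelling silently maps to different sounds.

```python
# Hypothetical, simplified key table: every key maps to exactly one sound,
# and upper/lower case select different sounds. Not a real scheme.
STRICT_KEYS = {
    "r": "r", "p": "p",
    "a": "a", "A": "aa",
    "i": "i", "I": "ii",
    "t": "dental-t", "T": "retroflex-t",
    "sh": "palatal-sh", "Sh": "retroflex-sh",
}

def strict_transliterate(text):
    """Convert text to a list of sound labels using longest-match lookup."""
    sounds = []
    pos = 0
    while pos < len(text):
        for size in (2, 1):  # try two-letter keys before one-letter keys
            key = text[pos:pos + size]
            if len(key) == size and key in STRICT_KEYS:
                sounds.append(STRICT_KEYS[key])
                pos += size
                break
        else:
            raise ValueError(f"no key for {text[pos]!r} at position {pos}")
    return sounds

expected = strict_transliterate("rAShTrapatI")  # spelling the scheme demands
casual = strict_transliterate("rashtrapati")    # spelling a user types naturally
mismatches = [i for i, (e, c) in enumerate(zip(expected, casual)) if e != c]
print(mismatches)
```

In this toy table the casual spelling is accepted without complaint but differs from the intended output at four positions, which is the burden on the user that the quote describes.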
The tech behind Quillpad
Back then, Ram had just started to understand machine learning algorithms and had the idea of applying them to build a predictive transliteration engine. What started in 2006 slowly took shape as Quillpad, which allows users to type in Indian languages using intuitive phonetic spellings instead of following any particular rules. Interestingly, he says he didn’t face much trouble developing Quillpad; the most difficult part was convincing others of the need for such a tool. He says, “The first method I tried to build the Quillpad predictive transliteration engine worked beautifully. It still remains the best approach in comparison with other predictive transliteration solutions available. Since the prediction engine was modeled as a statistical one, I didn't have any trouble making it work for multiple languages in the first go. I actually don't remember facing any challenge in developing Quillpad except for challenges in convincing people that this can change the way people would use Indian languages on computers. Somehow people thought everybody in India knows English and wondered who would use Quillpad.”
Needless to say, the skeptics were proved wrong: today, over 3,00,000 words are typed every day on the Quillpad site. Quillpad also won Ram the MIT Technology Review’s ‘Young Innovator under 35’ award in 2010. Explaining the technology behind Quillpad, he says, “It uses statistical machine learning to convert each input character in English to a corresponding Indian language sound, depending on the context of other letters in the input. The method is completely language independent and can learn to predict transliteration between any two alphabet based languages in the world. We use decision trees to learn and encode the transliteration rules. This tree is automatically built by statistically learning the prediction rules from a given language corpus. For instance, Quillpad allows users to type 'rashtrapati', 'raashtrapathi' etc as long as it is readable phonetically. The correct output for each of the characters is predicted depending on the context. For example, if you type 'vishesh' in Quillpad, the first 'sh' and the second 'sh' will get correctly predicted to be different 'sha' sounds in Hindi.”
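The idea of predicting a sound from its surrounding letters can be sketched with a tiny decision tree. What follows is an illustration only, not Quillpad's implementation: the seven hand-labelled training contexts, the three context features (`prev1`, `prev2`, `next1`) and the ID3-style tree builder are all assumptions made for this example, and a real engine would learn from a large corpus rather than a handful of words.

```python
from collections import Counter
import math

def sh_contexts(word):
    """Return one feature dict per 'sh' in word (neighbouring letters)."""
    contexts = []
    start = word.find("sh")
    while start != -1:
        contexts.append({
            "prev1": word[start - 1] if start >= 1 else "^",
            "prev2": word[start - 2] if start >= 2 else "^",
            "next1": word[start + 2] if start + 2 < len(word) else "$",
        })
        start = word.find("sh", start + 2)
    return contexts

# Hand-labelled toy data (an assumption): which 'sha' each 'sh' stands for.
TRAINING = [
    ("vishesh", ["palatal", "retroflex"]),
    ("desh", ["palatal"]),
    ("rashtrapati", ["retroflex"]),
    ("kasht", ["retroflex"]),
    ("shanti", ["palatal"]),
    ("shakti", ["palatal"]),
]

def entropy(labels):
    total = len(labels)
    return -sum(c / total * math.log2(c / total) for c in Counter(labels).values())

def build_tree(rows, features):
    """ID3-style tree over categorical features; rows are (context, label)."""
    labels = [label for _, label in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not features:
        return {"leaf": majority}

    def remaining_entropy(feature):
        groups = {}
        for ctx, label in rows:
            groups.setdefault(ctx[feature], []).append(label)
        return sum(len(g) / len(rows) * entropy(g) for g in groups.values())

    best = min(features, key=remaining_entropy)  # largest information gain
    rest = [f for f in features if f != best]
    branches = {}
    for ctx, label in rows:
        branches.setdefault(ctx[best], []).append((ctx, label))
    return {"feature": best, "default": majority,
            "branches": {v: build_tree(sub, rest) for v, sub in branches.items()}}

def predict(tree, ctx):
    while "leaf" not in tree:
        child = tree["branches"].get(ctx[tree["feature"]])
        if child is None:          # unseen context: fall back to majority
            return tree["default"]
        tree = child
    return tree["leaf"]

rows = [(ctx, label) for word, labels in TRAINING
        for ctx, label in zip(sh_contexts(word), labels)]
tree = build_tree(rows, ["prev1", "prev2", "next1"])
print([predict(tree, ctx) for ctx in sh_contexts("vishesh")])
```

With so little data the tree simply memorises its training contexts, so it says nothing about accuracy on unseen words. It does show the mechanism in the quote: the same two typed letters resolve to different sounds depending on the letters around them.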
In simple terms, the prediction is modelled as a statistical process. The team takes a corpus of Unicode words for each language from the Internet and feeds it to the process. Each time a user types an English letter, a decision tree listing the various phonetic probabilities is formed and the best candidate is selected based on the history of usage. This is what makes the process so simple, he explains. “It learns the nuances of the language accurately from the corpus of Unicode words we just scrape from the Internet. There is no manual tuning or linguistic rules to be incorporated in our code. Each language and its nuances can be modeled in a simple text file and such a model can be created by anybody who knows to speak the language, within an hour or two. We don't encode any linguistic rules and hence we don't need linguistic experts to help us make Quillpad work for any new language.”
Quillpad is marketed by Tachyon Technologies, primarily a technology licensing company, whose client roster includes names like Indiatimes, Rediff.com, Hungama, in.com, OnMobile and Yahoo!. Apart from this, Quillpad has also been put to use by the Unique Identification Authority of India (UIDAI). This not only validates Ram’s belief about the need for such a tool, but also shows how well it competes with other transliteration platforms, several of which are free. Speaking about what sets Quillpad apart, Ram says, “Prediction accuracy, natural support for various phonetic spelling variants and most of all allowing users to mix English words with native English spellings is what gives us an edge over others. The last feature is particularly important as English words are a strong part of our vocabulary. When typing English words, users may either type the word phonetically, or most likely they will use native English spelling. Since native English spellings are not phonetic, we can't interpret them phonetically. We have an engine that converts native spellings to Indian English pronunciation and then further transliterates. This allows users to type any English word in the midst of any Indian language sentence, adding fluency to the interface. Try typing words like ‘flight’ or 'purse' in Quillpad and on Google Indic transliteration to see the difference.”
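The two-stage handling of English words that Ram describes can be sketched as a preprocessing step. Everything here is assumed for illustration: the two-entry lexicon, the phonetic respellings and the function names are hypothetical, and the phonetic transliteration stage that would follow is left out.

```python
# Hypothetical lexicon: native English spelling -> rough phonetic respelling.
# The respellings are illustrative guesses, not Quillpad's data.
ENGLISH_PRONUNCIATIONS = {
    "flight": "flait",
    "purse": "pars",
}

def to_phonetic(token):
    """Return (phonetic_form, was_english) for one typed token."""
    key = token.lower()
    if key in ENGLISH_PRONUNCIATIONS:
        return ENGLISH_PRONUNCIATIONS[key], True
    return token, False  # assume the token is already phonetic

def prepare_sentence(sentence):
    """Respell known English words so every token can be read phonetically."""
    return [to_phonetic(token) for token in sentence.split()]

print(prepare_sentence("mera purse kho gaya"))
```

Each phonetic form would then be passed to the phonetic transliterator, so an English word typed in its native spelling goes through the same path as the Indian language words around it.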
The Quillpad Touch app is available for download from the Apple App Store.
The focus now is on making Quillpad available to the vast majority of Indians. As a step forward, Quillpad now has an app as well as a social platform. Speaking about the initiatives, Ram says, “With the ever-increasing popularity of tablet devices, we wanted to build a version that can make the best of touch gesture capabilities. We hit upon the idea of building an interface that combined the best of keyboard and handwritten gesture input. We started working on it in November 2011 and released the first version on January 8, 2012. Quillpad Touch can be downloaded for free from the iOS App Store and right now only supports Hindi. We are also working on an Android app. Besides this, on the site, we have users writing everything from novels and movie scripts to Facebook and Twitter messages. Encouraged by this we plan to add more features that will make it easy for them to use Quillpad more effectively. As a step in this direction we recently beta launched Quillpad Social from which users can post to Facebook or Twitter directly from the Quillpad website. Otherwise they would cut and paste after composing the message on Quillpad.in.” Quillpad currently supports Bengali, Gujarati, Hindi, Kannada, Malayalam, Marathi, Punjabi, Tamil and Telugu.