Friday, December 10, 2010

LinguaSys - USA Today

An article about an Android app built by our partners at LinguaSys, which uses Carabao as a backend (among others), has been published in USA Today and other news sources.

Monday, December 6, 2010

ALTA 2010

Digital Sonata was invited to present at the Australian Language Technology Association 2010 workshop on Thursday, December 9, 2010. Overview and directions are here.

Vadim Berman will be speaking some time between 3:30pm and 5:30pm.

Monday, June 7, 2010

Digital Sonata Signs Long Term Exclusive Agreement with LinguaSys™ For Use Of Carabao For Machine Translation

Digital Sonata signed a long-term deal with LinguaSys™ for the exclusive use of Carabao in machine translation (MT) solutions on March 24, 2010.

Carabao is a hybrid language translation system using both statistical and rules-based methodologies. Vadim Berman, CEO of Digital Sonata in Australia, and Chief Technology Officer and a co-founder of LinguaSys, is the author of Carabao. Berman has a wealth of dedicated experience in the field of MT and text analysis.

Brian Garr, CEO of LinguaSys, said, “We are very excited that this incredible technology from Digital Sonata will help us create the next generation of language translation solutions.”

LinguaSys is a new, next-generation machine translation company. LinguaSys’ Carabao language middleware uses language processing methodologies offering excellent comprehension in the least amount of time at low cost. LinguaSys enables enterprises to translate volumes of information, including text chat, e-mail, web pages and documents, quickly, accurately and automatically. LinguaSys offers the creation of new MT languages, customized lexical services, ease of use, compatibility with existing natural language software, security behind the firewall, availability, integration and lower memory requirements.

Monday, May 3, 2010

Carabao Language Kit released

The version is now available for download.


  • Handling of control priority greater than 2 when some of the members have no feasible agreement graph. As a result, some parts of the sequence worked and some didn't.
  • Truncation of very long sentences


  • A utility to validate and correct rule unit values
  • A generic support for formatted processing, e.g. HTML, XML, SGML including embedded formatting elements in the text flow
  • GUI to test formatted processing in Carabao Test Console
  • Automatic conversion of double-byte space characters into standard single-byte spaces


  • Regular expressions for segmentation into character classes for double-byte languages
  • Perl-compatible regular expressions have been introduced for unknown heuristics
  • Frequency-based backtracking added to the tokenization algorithm
  • Unicode clipboard support in Carabao desktop suites is now bidirectional: when leaving the application and when coming back to the application

Wednesday, January 27, 2010

English - Swedish OLIF dictionary released

An English - Swedish OLIF dictionary has been added to the list of OLIF lexicons distributed by Digital Sonata. The dictionary is available for download from

Sunday, January 10, 2010

Bilingual OLIF dictionaries released

Digital Sonata released a set of low-cost, royalty-free bilingual dictionaries in OLIF format, optimized for use in NLP and content management applications. Each entry includes a translation, the part of speech, and a thesaurus article. The dictionaries are available at Currently the following dictionaries are available:

  • English -> Finnish
  • English -> French
  • English -> German
  • English -> Japanese
  • English -> Korean
  • English -> Russian
  • English -> Spanish

Tuesday, January 5, 2010

Carabao Language Kit released

The version is now available for download.


  • Transliteration to empty string
  • Partial transliteration


  • Change log which allows distributed collaboration


  • Processing speed
  • Entry matching accuracy