Thursday, February 28, 2008

Carabao Language Kit 1.1.0.0 released

Version 1.1.0.0 is now available for download.

Fixed:

  • Volatility of newly assigned rule units in late sequences
  • Inconsistencies in the generation of inflected forms at design time

Added:

  • Friendly GUI for meta-rules such as lemmatized forms and generation of inflected forms
  • MorphoLogic now inspects the design-time data generation meta-rules when generating inflected forms

Improved:

  • Processing speed and memory consumption
  • Increased maximum length of the meta-rule content field
  • Increased some fields to accommodate large sequences and a lot of grammatical data
  • Concurrency during long processing

NOTE: if you are upgrading from 1.0 and would like to keep your data, please run the convertTo11.exe executable on your data.

Sunday, February 24, 2008

Our products are now available at ComponentSource

ComponentSource, the largest online reseller of software components, is now selling Carabao DeepAnalyzer, with Carabao MorphoLogic and Carabao Translation Server on the way. Here is a direct link to our page:

http://www.componentsource.com/features/digital-sonata/index.html

It took us a while (over 2 months) to sign up, with all the checks, examinations, questions, and reviews.

ComponentSource provides corporate customers with a more convenient mode of purchase, compliant with their supply chain procedures, and establishes higher visibility for our products.

Why is MT so formal?

I came across an interesting discussion about machine translation on LinkedIn:

http://www.linkedin.com/answers/international/internationalization-localization/INT_INZ/172005-2191793

Among the obvious stuff (obligatory "spirit is willing, flesh is weak", "out of sight, out of mind" quotes and recommendations from professional translators to hire professional translators instead), there was one curious comment that machine translation is "unnecessary formal".

Brushing aside the exaggerated expectations (you don't expect your computer to have a Jerry Seinfeld inside, do you?), I now recall that when I myself first encountered MT software (it was PARS in the early 1990s), what struck me was the unnecessarily formal style of the output (OK, nowadays they also have SMTS, which produces "porridge o' words" style).

Really, why does it have to be so formal?

If you fly often, and it's usually not business class, then you have probably developed a strong aversion to airplane food and collocations like "sky chefs". While the food is usually well preserved and reasonably fresh, it rarely tastes like real food with real flavour. I love spicy food and I frequently fly Asian airlines, but I never got to taste real spice there. Aside from the safety concerns (sick people on the plane don't really make it fly faster), I think the reason is that they are aiming for the bland, politically correct average that is acceptable and good enough for everyone. No one gets offended. No one gets hurt.

It is the same with MT. The formality is not a product of technical limitations. It is possible to implement all styles even in older generation systems, but it is more difficult to maintain them. So essentially, the developer needs to pick one style. And if it has to be one style, the best bet is a bland, politically correct, formal language.

And just as the airlines do not set themselves the goal of providing a unique culinary experience, no MT system has ever promised to produce a literary masterpiece. Just as you pay the airlines to get you from point A to point B, you use MT to get your "cargo" from a source language to a target language, with as little damage to the wares as possible.