Canku Ota

(Many Paths)

An Online Newsletter Celebrating Native America


January 26, 2002 - Issue 54



Scientists at CMU Join Forces to Create a Program to Preserve Vanishing Languages

by Byron Spice, Science Editor, Post-Gazette

Petu afi ti purum means "The dance is ending" in the language of Chile's Mapuche Indians. Not many people still speak the Mapudungun language, however, and so the dance may indeed be ending -- for Mapudungun.

Chile has the largest Mapuche population in the world, but its official language, the language taught in schools, is Spanish. Mapudungun has been reduced to a "kitchen language" that few people speak and fewer young people learn.

It's the same story 8,000 miles away, on Alaska's North Slope. Schools once punished Inupiat Eskimo children who spoke Inupiaq, telling them the brain has a limited capacity that should be reserved for English. Now, thanks in part to the pervasive influence of television and radio, few Inupiats younger than 45 speak the language of their ancestors.

Mapudungun and Inupiaq are just two examples of the thousands of languages considered endangered; the Worldwatch Institute last year estimated that 50 percent to 90 percent of the world's 6,800 languages might disappear by the end of the century.

But researchers at Carnegie Mellon University's Language Technologies Institute are working with the Mapuche, the Inupiat and indigenous peoples in Peru and Colombia to develop new machine translation tools that may help them revitalize and preserve their languages.

Computer programs that could automatically translate English into Inupiaq, for instance, would make it possible to create up-to-date science textbooks for Inupiat classrooms. Translation programs for Mapudungun to Spanish would help the Mapuche interact with their government and make greater use of the Internet.

Making these machine translation programs a reality, however, will require the "democratization" of language software development, said Jaime Carbonell, institute director.

Existing machine translation programs, he explained, are expensive, requiring perhaps a person-century of work for each pair of languages. Spending $20 million to develop such a translating machine is worthwhile for the couple of dozen languages, such as English, Japanese and Spanish, that are both widely used and important for commerce. But that kind of money simply isn't available for minority languages.

So Carbonell and his colleagues, including linguist Lori Levin and computer scientist Alon Lavie, are looking for shortcuts that would allow the indigenous groups to do much of the leg work -- actually, tongue work -- that otherwise might be done by computer and linguistic experts.

"We're essentially trying to empower them to do this for themselves," Carbonell said. That accomplishes two goals: reducing the cost of software development and giving the people a greater stake and greater control over their own language tools.

The hope is to take advantage of whatever dictionaries and other language references already exist and to design computer software that can discern for itself many of the usage rules by analyzing sets of "elicitations" -- a long list of sentences that native speakers translate into their language.

"We're not assuming that the people we have access to will necessarily have a deep understanding of computer skills," Lavie said. But if they can translate the elicitations, the computer program can learn many of the usage rules.

If it works, this process should yield a translating machine with a few person-years of work by native speakers, supplemented by a few person-months of work by experts, rather than a person-century.

That doesn't mean it is an easy process. The Siona people of southern Colombia, for instance, found the set of elicitations -- "The feather fell," "The feather is falling," "A feather fell," etc. -- extremely boring.

"They started out enthusiastically," Carbonell recalled, "and then after a while they said, 'What else have you to do?'"

But many indigenous groups have come to appreciate that preserving their language is worth the effort.

"Language and culture are synonymous," said Peter Wilkniss, president of the Transnational Arctic and Antarctic Institute in Anchorage, Alaska. As languages are lost, cultures lose touch with the knowledge and wisdom of their ancestors.

"In Alaska, people are desperately trying to keep their identity," he added. The Inupiat call it "living with one spirit in two worlds" -- their own rural, subsistence culture based on whaling, hunting and fishing, and the side-by-side high-technology culture of the oil industry and the video lessons beamed into their tiny schools from distant cities.

Edna Ahgeak MacLean, a former language professor who is now president of Ilisagvik College in Barrow, Alaska, said a concerted effort began five years ago to once again teach Inupiaq to children. Knowing the language of your ancestors provides a feeling of satisfaction and self-worth, she said, that is reflected in other school work.

"When you're feeling good about yourself ... learning is much faster and more meaningful," she added.

The Inupiaq language is attuned to the Arctic environment, where little distinction is made in the winter between the frozen, snow-covered sea and the flat, featureless, snow-covered land. A single word can encompass several pieces of information, such as an object's motion, its visibility, whether it is up or down, and its orientation to a horizon. Samna, for instance, describes something that can't be seen, covers a large area and is equidistant between the people who are talking.

Machine translation tools could help compensate for the lack of Inupiaq textbooks, MacLean noted. Also, Inupiat children are technologically savvy, so the availability of computer-assisted learning tools could make learning their ancestral language more appealing.

George Aaron Broadwell, chairman of the Linguistic Society of America's committee on endangered languages, said machine translation might well raise the prestige of the local language and help speakers realize its value.

"And if parents believe that their language is important and valuable, they are much more likely to speak it to their children," added Broadwell, who teaches at the State University of New York at Albany.

The systems being developed at Carnegie Mellon might also pay a dividend to linguists by providing documentation that helps them better understand the languages.

Levin noted that linguists have their own stake in the preservation of minority languages. The study of language, she explained, can be seen as a cognitive science, providing insights into how the brain functions. "But if you lose 90 percent of your data" -- the world's languages -- "what can language tell you about the structure of the mind?"

In addition to the Inupiats, Mapuches and Sionas, the Carnegie Mellon group has made contact with the Quechua Indians of Peru; the Quechua language is a derivative of the old Incan language.

The National Science Foundation last fall awarded the group $2.5 million for the next five years for the project. Initial efforts are focused on developing text-to-text translators, but if that is successful the project could expand to include the technically trickier task of speech-to-speech translation.

Broadwell cautioned that, as helpful as machine translation might be, it can play only a supporting role in saving a language from extinction. Ultimately, parents and grandparents must decide that the language is important and teach it to their children.

"Computers, machine translation programs and linguists can play some part in saving endangered languages, but the most important part will always be played by the local community."


Canku Ota is a free Newsletter celebrating Native America, its traditions and accomplishments. We do not provide subscriber or visitor names to anyone. Some articles presented in Canku Ota may contain copyright material. We have received appropriate permissions for republishing any articles. Material appearing here is distributed without profit or monetary gain to those who have expressed an interest. This is in accordance with Title 17 U.S.C. section 107.

Canku Ota is a copyright © 2000, 2001, 2002 of Vicki Lockard and Paul Barry.


The "Canku Ota - A Newsletter Celebrating Native America" web site and its design is the Copyright © 1999, 2000, 2001, 2002 of Paul C. Barry.

All Rights Reserved.

Thank You