|
Despite its scientific, political,
and practical value, comprehensive information about human
languages, in all their variety and complexity, is not readily
obtainable. Such information exists in many incompatible
formats, such as texts and audio and video clips, scattered across individual
websites and the personal computers of linguists, and it is difficult
to find because it is not centrally indexed. In addition, access
rights and procedures are diverse, idiosyncratic, and frequently
undocumented. What is needed is a single, accessible avenue
to comprehensive and up-to-date information about the languages
of the world, living and extinct, with an emphasis on lesser
known languages. In response to these needs, we have embarked
on the following two initiatives:
- We are organizing the existing
scattered and incongruent data into a Language Information
Grid, a distributed digital library of language information
which will be at once comprehensive in scope, immediately
accessible, and flexible enough to serve the needs of
multiple user communities.
- We are developing a graduate
education and training program by introducing a PhD in
Computer Science with a concentration in Language Engineering
that leverages the research opportunities and infrastructure
present in the Language Information Grid. This PhD program
in Language Engineering will not only provide computer
science students with training in the emerging field of
language-related technologies but also offer them the
opportunity to be a part of an innovative and important
language technology project, the implementation of the
Language Information Grid.

Through this multi-institutional, multidisciplinary project,
which has a strong and well-integrated international component,
we plan to train and graduate ten doctoral students in
language engineering, a new discipline we are helping to
establish, within five years. The project has
the following broad objectives:
- Language Engineering Degree Programs. We propose
to establish a graduate training opportunity in the much-needed
and emerging discipline of language technology
by introducing a new Computer Science PhD concentration
and a graduate certificate course in Language Engineering
at Wayne State University. New curriculum and instructional
materials necessary for these programs will be developed
in close, complementary collaboration with
the participating universities, keeping the needs and strengths
of each participant in mind. Practical training and internship
opportunities will be made available at the Universities
of Alaska and Melbourne to complement the curriculum offered
at the participating institutions. Annual international
summer schools and workshops on language engineering will
be organized to further supplement the formal training.
- Research and Training Opportunities for IGERT Students and Faculty.
Students and faculty will have the opportunity to
take up cutting-edge, practical dissertation research
projects on language information integration, text and
multimedia data annotation, cross-linguistic comparison,
analysis, and querying, among other topics. They will also have the
opportunity to work with leading researchers involved
with the Language Information Grid, the LINGUIST List,
and the preservation of endangered language data in Alaska
and Australia.
- Community Outreach. With the goal of monitoring
and enhancing the quality of the application data, the
IGERT trainees will also have the opportunity to participate
in a supporting effort grounded in outreach to the linguistics
community and focused on resource discovery, data preparation,
and modification of existing linguistic tools.

This project offers an opportunity
to train and graduate a new generation of linguists, skilled
in modern information technology tools, as professional
language engineers. Research and education in this new discipline
are important because information about human languages is
critical to our nation's scientific endeavors, to the multinational
enterprises of our government and business communities, and
to the vitality, security, and diversity of our social life.
Within the scientific community, anthropologists, archaeologists,
historians, ethnologists, psychologists, sociologists, and
linguists depend upon language data to reconstruct historical
movements, decode genetic affiliations, and delineate human
mental capacities. Within government and business, information
about languages and the cultural information embedded in them
are often critical to making well-informed decisions in the
global arena and to implementing those decisions with confidence. Finally,
language education is essential to our attempt to weave a
durable social fabric from our nation's multi-ethnic, multicultural
strands.
|