Project Details

Project Name: 
On-line language documentation for Biak (Austronesian)
Principal Investigator / Director: 
Mary Dalrymple
Oxford participants: 
Mary Dalrymple (Main Contact); Suriel Mofu
Other Participants: 
not specified
  • Division: Humanities
  • Unit: Linguistics, Philology & Phonetics Faculty
  • Sub-Unit: not specified
Start Date: 
10/2009
End Date: 
10/2010
Partner organizations (inside or outside Oxford): 
Universitas Cenderawasih and Universitas Negeri Papua
Funder: 
not specified
Subject Area: 
Linguistics
Project Description: 

This collaborative project brought together the University of Oxford and two universities in Papua, Universitas Cenderawasih and Universitas Negeri Papua, to create an on-line database of digital audio texts, with linguistically annotated transcriptions and translations, for Biak, an Austronesian language with about 50,000-70,000 speakers in Papua. The annotated transcriptions were produced using Toolbox, a freely available data management and analysis tool for language documentation. Toolbox supports the creation of resources in several forms: transcribed texts with free translations (of most use to the Biak-speaking community and for pedagogical use in Papua), and linguistically annotated transcriptions in two forms, namely a standard human-readable form like the paper-based corpora familiar to linguists, and a conversion of this form to XML, produced with the utility tools for Toolbox and suitable for computer analysis and database search. These resources provide a snapshot of audio and textual data on the language. They are therefore useful for language preservation efforts, for ongoing efforts to produce teaching materials in the indigenous languages of Papua, and as a basis for the creation of dictionaries and glossaries in the language. Because they are linguistically annotated, they are also valuable to linguists conducting research on Biak and related Austronesian languages. The project began in October 2009 and ran for one year.
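
To illustrate the kind of conversion described above, the following is a minimal sketch of turning backslash-marked interlinear records into XML. It is not the project's actual pipeline and does not use the Toolbox utility tools themselves. The field markers (\ref, \tx, \mb, \ge, \ft) are assumed here as commonly used interlinear markers, the element names (corpus, record) are hypothetical, and the sample values are dummy tokens rather than real Biak data.

```python
import xml.etree.ElementTree as ET

# Placeholder input. The markers are assumed conventions and the values are
# dummy tokens, not real Biak data from the project.
SAMPLE = r"""
\ref text01.001
\tx WORD1 WORD2
\mb MORPH1 MORPH2
\ge gloss1 gloss2
\ft Free translation placeholder.

\ref text01.002
\tx WORD3
\mb MORPH3
\ge gloss3
\ft Another placeholder.
"""


def parse_records(text, record_marker="ref"):
    """Split backslash-marked text into a list of {marker: value} dicts."""
    records = []
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            # Blank lines and wrapped continuation lines are ignored in this sketch.
            continue
        marker, _, value = line[1:].partition(" ")
        if marker == record_marker:
            # Each record marker opens a new record.
            current = {}
            records.append(current)
        if current is not None:
            current[marker] = value.strip()
    return records


def records_to_xml(records):
    """Build an XML tree with one <record> element per parsed record."""
    root = ET.Element("corpus")  # hypothetical root element name
    for rec in records:
        node = ET.SubElement(root, "record", id=rec.get("ref", ""))
        for marker, value in rec.items():
            if marker == "ref":
                continue  # already stored as the id attribute
            ET.SubElement(node, marker).text = value
    return root


records = parse_records(SAMPLE)
xml_text = ET.tostring(records_to_xml(records), encoding="unicode")
print(xml_text)
```

Keeping each tier (text, morphemes, glosses, free translation) as its own element is one simple way to make the annotated text searchable by tier; a real conversion would also need to align morphemes with their glosses and handle fields that wrap across lines.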

ICT Methods: 
Category | Sub-Headings | Details
Data Capture | Data Reuse | Use of existing digital data
Data analysis | Text Analysis | Content analysis
- | - | Indexing
- | - | Text mining
Data structuring and enhancement | Text Encoding | Lemmatisation
- | - | Text encoding - descriptive
- | - | Text encoding - referential
Last updated: 
25/06/2015 16:24:50
Updated by: 
martinw@ox.ac.uk