Speak Up! 2024: Three-Minute Research Talk Presentations
Symposium by ForagerOne

Preserving Endangered Languages with Morphological Analysis


Voiceover

Speak Up! 2024(s)

Sydney DeFilippo

Abstract or Description

This project integrates a set of morphological rules into an automatic speech recognition (ASR) and AI language model system to efficiently transcribe speech to Interlinear Glossed Text (IGT). IGT is a linguistic annotation format that segments the words of a language into their morphological units. This allows linguists to document languages without translating them into their own language, so the syntax and unique semantic meaning of every component are preserved. My work focuses on the Zongozotla dialect of Totonac, a language indigenous to the Sierra Norte de Puebla region of Mexico, with only 5,000 speakers. I am working with a native speaker to develop full documentation of the vocabulary and rules of the language. I am constructing a program that will parse any given text and segment it along its morpheme boundaries. My experiment compares this method of segmentation to traditional algorithms such as BPE and Morfessor, to see whether segmenting along explicit morphological rules improves the results of the ASR model. Generating glosses directly from speech, rather than transcribing and glossing by hand, can significantly reduce the time and resources necessary for documenting these under-resourced languages. The main goal of my project is to improve current computational tools that can be used to preserve not only Totonac, but any under-resourced language.
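As a rough illustration of what segmenting along explicit morphological rules can look like, here is a minimal sketch in Python. It is not the presenter's program. The affix lists are hypothetical English placeholders chosen only to make the example readable, the function names `segment` and `to_segmented_line` are invented for this sketch, and real Totonac morphology would need a far richer rule set developed with a native speaker. Statistical methods such as BPE or Morfessor would instead learn their subword units from a corpus.

```python
from typing import List

# Hypothetical English affixes, for illustration only (not Totonac data).
PREFIXES = ["un", "re"]
SUFFIXES = ["ness", "ing", "ed", "s"]


def segment(word: str, min_stem: int = 3) -> List[str]:
    """Greedily strip known prefixes, then known suffixes, longest first.

    An affix is only stripped if at least `min_stem` characters remain,
    which keeps short words such as "red" from being split.
    """
    front: List[str] = []
    back: List[str] = []
    stem = word

    # Peel prefixes from the left until none applies.
    stripped = True
    while stripped:
        stripped = False
        for p in sorted(PREFIXES, key=len, reverse=True):
            if stem.startswith(p) and len(stem) - len(p) >= min_stem:
                front.append(p)
                stem = stem[len(p):]
                stripped = True
                break

    # Peel suffixes from the right until none applies.
    stripped = True
    while stripped:
        stripped = False
        for s in sorted(SUFFIXES, key=len, reverse=True):
            if stem.endswith(s) and len(stem) - len(s) >= min_stem:
                back.insert(0, s)
                stem = stem[:-len(s)]
                stripped = True
                break

    return front + [stem] + back


def to_segmented_line(sentence: str) -> str:
    """Join each word's morphemes with hyphens, as in an IGT segmentation line."""
    return " ".join("-".join(segment(w)) for w in sentence.split())


if __name__ == "__main__":
    print(to_segmented_line("unlocked doors"))
```

A hand-written rule set like this is transparent and needs no training data, which matters for a language with few written resources, but it only covers the affixes someone has listed. That trade-off is the kind of thing the comparison against BPE and Morfessor described above would measure.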

Mentor

Lori Levin


Comments

Tony Downs, 1 year ago
Thanks for the presentation. I was able to understand the importance of your research for preserving languages, how the current investigation fits in the broader context of the field, and how you will go about using software to analyze the Totonac language. As someone without a background in this field, some of the technical vocabulary was a little hard to follow and a next step might include showing more of the research process on the slide for the audience to follow or incorporating other visual aids. Slowing down may also help the audience because the quick pace of the speech led to some confusion. Good luck with your project!
Korryn Mozisek, 1 year ago
Very interesting research, Sydney. I will offer that I did intercollegiate debate as an undergrad and talking fast was a valued strategy; I can also offer that breaking the speed when presenting normally was also hard for me and something that I had to work on to ensure others could follow me. Only you can know why you were so rushed -- feeling nervous, concerned about the amount of content you wanted to cover, rate habit (like in my story), or something else -- but I will offer that it was quite noticeable and did make a fairly technical topic that much more difficult to follow. You clearly have a handle on the technical dimensions of your research; more introduction of the novelty and importance is needed earlier in the presentation and more defining of key technical terms is needed. Good luck with your research! Preserving languages is of critical importance.