Two-Level Qazan Tatar Morphology

Dublin Core

Title

Two-Level Qazan Tatar Morphology

Author

Gökgöz, Ercan
Kurt, Atakan
Kulamshaev, Kalmamat
Kara, Mehmet

Abstract

In this paper we present a two-level description of the Tatar language. Tatar is a Turkic language and the official language of Tataristan, spoken by millions of people. We describe Tatar orthography using the two-level rules of Koskenniemi. These orthographic rules, which govern the phonology of the language during word formation, are essential to morphological parsing and generation. We then represent Tatar morphotactics using finite state machines (FSMs). The FSMs for nominal and verbal morphotactics describe in detail how the words of the language can be formed. The orthographic rules and morphotactics are implemented in the Dilmac Machine Translation Framework by encoding them in XML files in a language-independent way.

Keywords

Conference or Workshop Item
PeerReviewed

Date

2011-05

Extent

63
