Zeitschrift für Sprachwissenschaft
Towards a broad-coverage graphemic analysis of large historical corpora
Waldenberger Sandra [1], Dipper Stefanie [2], Lemke Ilka [3]
[1] Germanistisches Institut, Ruhr-Universität Bochum, Universitätsstr. 150, 44801 Bochum, Germany; [2] Sprachwissenschaftliches Institut, Ruhr-Universität Bochum, Universitätsstr. 150, 44801 Bochum, Germany; [3] Westfälische Wilhelms-Universität Münster, Germanistisches Institut, Abteilung Sprachwissenschaft, Schlossplatz 34, Raum: VSH 146, 48143 Münster, Germany
Keywords: graphemic variation; Middle High German; corpus-based analysis; quantitative analysis
DOI: 10.1515/zfs-2021-2037
Source: DOAJ
【 Abstract 】
This paper presents a method which we are developing to explore graphemic variation in large historical corpora of German. Historical corpora provide an amount of data at the level of graphemics which cannot be handled exhaustively using common methods of manual evaluation. To deal with this challenge, we apply methods from computational linguistics to pave the way for a broad-coverage graph(em)ic analysis of large historical corpora. In this paper, we show how our approach can be applied to the Reference Corpus of Middle High German. Illustrating our method and linguistic analysis, we present findings from our investigations into diatopic and/or diachronic variation as documented in 13th and 14th century charters (Urkunden) from the corpus.
【 License 】
Unknown