About Problems With Coreference
PDF (Polish)

Keywords

reference
coreference
corpus
annotation

Abstract

DOI: http://doi.org/10.26333/sts.xxviii.17

This paper presents some of the problems encountered during coreference annotation. The analysis was carried out for the project CORE – Computer-based methods for coreference resolution in Polish texts (managed by Maciej Ogrodniczuk). The project's main goal was to create innovative methods and tools for automated anaphora and coreference resolution in Polish texts.

The difficulties with coreference resolution in Polish had several sources. At the pragmatic and semantic level, it was not easy to decide whether two objects were identical or merely similar. Another problem was the lack of specialist knowledge, which made it very hard for the annotator to recognize coreference between phrases in highly specialized texts. At the grammatical level, some properties of Polish made annotation difficult. Polish has no definite or indefinite articles, so it was very hard to determine whether the speaker meant the same object throughout or different specimens of the same class. Long subjectless sentences also made it difficult to define the coreference chains between the analyzed phrases.
