RT Dissertation/Thesis T1 Thematic patterning in English and Spanish: contrastive annotation of a bilingual newspaper corpus for linguistic and computational applications T2 La estructuración temática en inglés y español: anotación contrastiva de un corpus bilingüe para aplicaciones lingüísticas y computacionales A1 Moratón Gutiérrez, Lara AB Thematization is recognized by different linguistic schools as a fundamental phenomenon in the construction of messages and texts. Thematic position within a text privileges the elements that guide the reader in the orientation and interpretation of discourse at different levels. Thematizing a linguistic unit by locating it in the initial position of a clause, paragraph, or text confers upon it a special status: it signals the organizational strategy that characterizes different text types, and it acts as a variable in the distinction of registers, text types and genres. However, in spite of the importance of thematization for message and textual structuring, to date no linguistic studies have undertaken the task of validating its aspects in a comparative manner, either for linguistic or computational purposes. This study therefore fills a research gap by implementing a methodology based on contrastive corpus annotation, which makes it possible to empirically validate aspects of the phenomenon of Thematization in English and Spanish. It also seeks to develop a bilingual English-Spanish comparable corpus of newspaper texts automatically annotated with thematic features at clausal and discourse levels. The empirically validated categories (Thematic Field and its elements: Textual Theme, Interpersonal Theme, PreHead and Head) are used to annotate a larger corpus of three newspaper genres (news reports, editorials and letters to the editor) in terms of thematic choices. This characterization reveals interesting results, such as the use of genre-specific strategies in thematic position. 
In addition, the thesis investigates the possibility of automating the annotation of thematic features in the bilingual corpus through the development of a set of Java rules implemented in GATE. It also shows the efficacy of this method in comparison with the manual annotation results... PB Universidad Complutense de Madrid YR 2016 FD 2016-10-25 LK https://hdl.handle.net/20.500.14352/21285 UL https://hdl.handle.net/20.500.14352/21285 LA eng NO Unpublished thesis of the Universidad Complutense de Madrid, Facultad de Filología, Departamento de Filología Inglesa, defended on 04-12-2015 DS Docta Complutense RD 3 May 2024