Keyword and multi-word terms extraction from a textual corpus of news about climate change

Authors

Vanesa Álvarez Torres, University of Cádiz

DOI:

https://doi.org/10.31637/epsir-2024-1133

Keywords:

text corpus, news, climate change, extraction, keywords, multi-word terms

Abstract

Introduction: In this paper we present the compilation of a textual corpus on climate change and the extraction of keywords and multi-word terms to analyze it. Methodology: We used Sketch Engine both to compile the corpus and to extract keywords and multi-word terms from it. We compiled the texts from the web and subsequently downloaded the keyword lists. Results and analysis: The resulting data show that, in addition to the most frequent words used when talking about climate change (such as “calentamiento” ['warming'] and “calor” ['heat']), a large part of the keywords are names of people who, from politics, science or activism, have dealt with this issue in one way or another. The keyword list therefore depends on the social moment in which the texts were published. Multi-word terms, by contrast, do not contain proper names; they are made up of lexical elements. Conclusions: Through this work we have approached the highly topical, cross-cutting issue of climate change. We intend to use this corpus in future studies.
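
As an illustration of the kind of keyword extraction described above, the following is a minimal sketch of a "simple maths" style keyness score, the approach documented for Sketch Engine's keyword tool. The abstract does not state which scoring settings were used, so the smoothing value of 1, the function name `keyness_scores` and the toy Spanish sentences are assumptions made only for this example.

```python
from collections import Counter


def keyness_scores(focus_tokens, reference_tokens, smoothing=1.0):
    """Rank words of a focus corpus against a reference corpus.

    Score = (fpm_focus + smoothing) / (fpm_ref + smoothing), where fpm is
    the frequency per million tokens. The smoothing constant is an
    assumption for this sketch, not a setting reported in the paper.
    """
    if not focus_tokens or not reference_tokens:
        raise ValueError("both corpora must contain at least one token")

    focus_counts = Counter(focus_tokens)
    ref_counts = Counter(reference_tokens)
    focus_size = len(focus_tokens)
    ref_size = len(reference_tokens)

    scores = {}
    for word, count in focus_counts.items():
        # Normalise raw counts to frequency per million tokens.
        fpm_focus = count / focus_size * 1_000_000
        fpm_ref = ref_counts.get(word, 0) / ref_size * 1_000_000
        scores[word] = (fpm_focus + smoothing) / (fpm_ref + smoothing)

    # Highest score first: words most typical of the focus corpus.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy corpora, already tokenised by whitespace.
focus = "el calentamiento global aumenta el calor".split()
reference = "el tiempo pasa y el calor vuelve".split()
ranking = keyness_scores(focus, reference)
```

Words that occur in the focus corpus but are rare or absent in the reference corpus rise to the top of `ranking`, which is consistent with proper names tied to a particular news period appearing among the keywords.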

Author Biography

Vanesa Álvarez Torres, University of Cádiz

Vanesa Álvarez Torres holds a degree in Linguistics, a teaching degree specializing in Foreign Language, and a PhD in Linguistics from the University of Cádiz, for which she received the Extraordinary Doctorate Award. She has worked as technical support staff under the State Program for the Promotion of Talent and its Employability in R+D+i of MINECO at the University Institute for Research in Applied Linguistics (ILA) of the University of Cádiz. She is currently an Assistant Professor in the area of General Linguistics (Department of Philology) at the University of Cádiz and a member of the research group “Semaínein” and of the University Institute for Research in Applied Linguistics.

Published

2024-10-29

How to Cite

Álvarez Torres, V. (2024). Keyword and multi-word terms extraction from a textual corpus of news about climate change. European Public & Social Innovation Review, 9, 1–17. https://doi.org/10.31637/epsir-2024-1133

Issue

Section

INNOVATING IN CUTTING-EDGE TECHNOLOGIES