This research examines TUI, one of the world’s largest tour operators, through 20 years of its German-language catalogs for Egypt, Cyprus, Malta, Turkey, the Balearic Islands, and the Canary Islands. The primary objective is to analyze the image projected for these destinations through language and visuals, systematically comparing the text and visual content of each destination's introductory pages over two decades.
To conduct this analysis, scraping and data-structuring routines were written in Python to extract texts and images from the PDF catalogs by matching specific text patterns. The extracted information was stored in dictionaries, which were then cleaned, organized, and integrated into a database covering 20 years of historical records.
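As a rough illustration of the pattern-based structuring step, the sketch below splits already-extracted catalog text into one entry per destination and collects the entries in nested dictionaries keyed by destination and year. It is not the project's actual code: the heading layout (a destination name alone on a line), the German destination labels, and the function names `split_by_destination` and `build_records` are all assumptions made for the example, and the PDF-to-text step is left out.

```python
import re
from collections import defaultdict

# Assumed German headings for the six destinations (hypothetical layout:
# each destination name stands alone on its own line in the extracted text).
DESTINATIONS = ["Ägypten", "Zypern", "Malta", "Türkei", "Balearen", "Kanaren"]
HEADING = re.compile(
    r"^(%s)[ \t]*$" % "|".join(re.escape(d) for d in DESTINATIONS),
    re.MULTILINE,
)


def split_by_destination(page_text):
    """Return {destination: intro text} for one block of extracted text."""
    sections = {}
    matches = list(HEADING.finditer(page_text))
    for i, match in enumerate(matches):
        # A section runs from the end of its heading to the next heading.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(page_text)
        body = page_text[match.end():end]
        # Collapse line breaks and repeated spaces left over from the PDF layout.
        sections[match.group(1)] = " ".join(body.split())
    return sections


def build_records(catalogs):
    """Turn {year: extracted text} into {destination: {year: intro text}}."""
    records = defaultdict(dict)
    for year, text in catalogs.items():
        for destination, intro in split_by_destination(text).items():
            records[destination][year] = intro
    return dict(records)
```

Keying the outer dictionary by destination keeps each destination's 20-year series together, which suits year-over-year comparison; a flat table of (destination, year, text) rows would work equally well.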
For the sentiment analysis, the German-language SentiWS lexicon was used together with the spaCy library, whose German language processing provides the linguistic context needed to evaluate the emotional tone of the texts for each destination. Through textual content analysis, the research team identified the most frequent words and those that contribute most to sentiment, enabling detailed comparisons between the tourist destinations.
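The lexicon-based scoring described above can be sketched in a few lines. This is a simplified illustration, not the team's pipeline: it reads entries in the SentiWS file layout (`Lemma|POS`, a tab, a weight, a tab, comma-separated inflected forms), but the three sample entries and their weights are placeholders invented for the example, and a plain regular-expression tokenizer stands in for spaCy's German processing. The names `load_lexicon`, `tokenize`, and `score_text` are hypothetical.

```python
import re
from collections import Counter

# Entries in the SentiWS file layout. The words and weights here are
# placeholders for illustration, not values taken from the real lexicon.
SAMPLE_LEXICON = "\n".join([
    "schön|ADJX\t0.5\tschöne,schöner,schönes,schönen",
    "Traum|NN\t0.3\tTraums,Träume,Träumen",
    "laut|ADJX\t-0.2\tlaute,lauter,lautes,lauten",
])


def load_lexicon(raw):
    """Map every lemma and inflected form (lowercased) to its weight."""
    weights = {}
    for line in raw.splitlines():
        parts = line.split("\t")
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        lemma = parts[0].split("|")[0]
        weight = float(parts[1])
        forms = [lemma]
        if len(parts) > 2 and parts[2]:
            forms.extend(parts[2].split(","))
        for form in forms:
            weights[form.lower()] = weight
    return weights


def tokenize(text):
    """Lowercase the text and keep runs of letters (a stand-in for spaCy)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def score_text(text, weights):
    """Return (total sentiment, word counts, per-word sentiment contribution)."""
    counts = Counter(tokenize(text))
    contributions = {
        word: n * weights[word] for word, n in counts.items() if word in weights
    }
    return sum(contributions.values()), counts, contributions
```

Here `counts.most_common(k)` gives the most frequent words, and sorting `contributions` by absolute value gives the words that move the score most. Lowercasing discards the noun/adjective distinction that German capitalization carries, which is one reason a part-of-speech-aware pipeline is preferable to this simplification.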
The results so far show significant differences in the tone and vocabulary TUI uses to present each destination in its catalogs. As a final phase, an artificial intelligence model is currently being trained to automate the calculation of the projected image index.
