January 26, 2022

TourBERT: A language model for the tourism industry

BERT, a language model developed by Google, is among the most powerful and widely used models in Natural Language Processing. In recent years, several domain-specific BERT models, such as BioBERT, ClinicalBERT and FinBERT, have been developed to better meet the requirements of individual sectors. Now, with TourBERT, there is also a model for the tourism industry.

Young woman pointing at a transparent screen showing holiday photos

FH Prof. Dr. Roman Egger, Senior Lecturer and Head of eTourism in the Innovation & Management in Tourism degree programme, has developed "TourBERT" together with Veronika Arefieva from JKU Linz. "We have been working intensively on 'TourBERT' for one and a half years. TourBERT was trained on a tourism-specific text corpus of over 3.5 million documents for one million training steps. In tourism-specific tests, TourBERT beats the original BERT model in all areas."

Broad areas of application in business and tourism

With TourBERT, the two researchers have pulled off a coup. Roman Egger: "We hope that businesses will use TourBERT for their Natural Language Processing projects." TourBERT is available as open source on Hugging Face, a hub for language models, and was downloaded more than 500 times on the day of its publication.

The paper is available at: https://arxiv.org/abs/2201.07449

Further publications in journals are planned.

Read more: http://www.datascience-in-tourism.com/?p=461