Bubble up – A Fine-tuning Approach for Style Transfer to Community-specific Subreddit Language


Different online communities (social media bubbles) can be identified with their use of language. We looked at different social media bubbles and explored the task of translating between the language of one bubble into another while maintaining the intended meaning. We collected a dataset of Reddit comments from 20 different subreddits and for a smaller subset of them we obtained style-neutral versions generated by a large language model. Then we used the dataset to fine-tune different (smaller) language models to learn style transfers between social media bubbles. We evaluated the models on unseen data from four unseen social media bubbles to assess to what extent they had learned the style transfer task and compared their performance with the zero-shot performance of a larger, non-fine tuned, language model. We show that with a small amount of fine-tuning the smaller models achieve satisfactory performance, making them more attractive than a larger, more resource-intensive model.

Proceedings of the 3rd Workshop on Computational Linguistics for the Political and Social Sciences
Alessandra Zarcone
Alessandra Zarcone
Professor of Language Technologies and Cognitive Assistants

Computational linguist with a background in NLP and in psycholinguistics, working on human-machine interaction.