Language change and innovation are constant in online and offline communication, and have led to new words entering people's lexicons and even modern-day dictionaries, with recent additions including 'e-cig' and 'vape'. However, the manual work required to identify these 'innovations' is both time-consuming and subjective. In this work we demonstrate how such innovations in language can be identified across two different Online Social Networks (OSNs) through the operationalisation of known language acceptance models that incorporate relatively simple statistical tests. By grounding our work in language theory, we identified three statistical tests that can be applied: variation in frequency, form and meaning. Each shows a different success rate across the two networks (a geo-bound Twitter sample and a sample of Reddit). These tests were also applied at different community levels within the two networks, allowing different innovations to be identified across different community structures, for instance regional variation across Twitter and variation across groupings of Subreddits, where example innovations identified included 'casualidad' and 'cym'.
Funding
This work is funded by the Digital Economy programme (RCUK Grant EP/G037582/1), which supports the High-Wire Centre for Doctoral Training (http://highwire.lancs.ac.uk).
History
School
Business and Economics
Department
Business
Published in
WSDM'16
Citation
KERSHAW, D., ROWE, M. and STACEY, P.K., 2016. Towards modelling language innovation acceptance in online social networks. IN: Proceedings of the Ninth ACM International Conference on Web Search and Data Mining (WSDM'16), San Francisco, 22-25 Feb. pp.553-562.
This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at: https://creativecommons.org/licenses/by-nc-nd/4.0/