Loughborough University

SABINE: A multi-purpose dataset of semantically-annotated social content

conference contribution
posted on 2019-03-06, 14:15 authored by Silvana Castano, Alfio Ferrara, Enrico Gallinucci, Matteo Golfarelli, Stefano Montanelli, Lorenzo Mosca, Stefano Rizzi, Cristian Vaccari
Social Business Intelligence (SBI) is the discipline that combines corporate data with social content to let decision makers analyze the trends perceived from the environment. SBI poses research challenges in several areas, such as IR, data mining, and NLP; unfortunately, SBI research is often held back by the lack of publicly available, real-world data for experimenting with approaches, and by the difficulty of determining a ground truth. To fill this gap we present SABINE, a modular dataset in the domain of European politics. SABINE includes 6 million bilingual clips crawled from 50,000 web sources, each associated with metadata and sentiment scores; an ontology with 400 topics, their occurrences in the clips, and their mapping to DBpedia; and two multidimensional cubes for analyzing and aggregating sentiment and semantic occurrences. We also propose a set of research challenges that can be addressed using SABINE; notably, the presence of an expert-validated ground truth makes it possible to test approaches to the whole SBI process as well as to each single task.

Funding

Supported by the Italian Ministry of Education “Future in Research 2012” initiative (code RBFR12BKZH) for the project “Building Inclusive Societies and a Global Europe Online: Political Information and Participation on Social Media in Comparative Perspective” (www.webpoleu.net).

School

  • Social Sciences and Humanities

Department

  • Communication and Media

Published in

International Semantic Web Conference; Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

Volume

11137 LNCS

Pages

70 - 85

Citation

CASTANO, S. ... et al, 2018. SABINE: A multi-purpose dataset of semantically-annotated social content. IN: Vrandecic, D. ... et al (eds). The Semantic Web – ISWC 2018, International Semantic Web Conference, Monterey, CA, USA, 8-12 October 2018, pp.70-85.

Publisher

© Springer Nature

Version

  • AM (Accepted Manuscript)

Publisher statement

This work is made available according to the conditions of the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0) licence. Full details of this licence are available at: https://creativecommons.org/licenses/by-nc-nd/4.0/

Acceptance date

2018-05-26

Publication date

2018

ISBN

9783030006679

ISSN

0302-9743

eISSN

1611-3349

Book series

Lecture Notes in Computer Science;11137

Language

  • en

Location

Monterey, CA, USA