Toward Selectivity-Based Keyword Extraction for Croatian News

Authors: S. Beliga, A. Meštrović, S. Martinčić-Ipšić.
Year: 2014
Venue: CEUR Proceedings of the Workshop on Surfacing the Deep and the Social Web (SDSW 2014), Vol. 1310, pp. 1-8, Riva del Garda, Trentino, Italy
Product of the Action: Yes

Our approach proposes a novel network measure, node selectivity, for the task of keyword extraction. Node selectivity is defined as the average strength of the node. First, we show that selectivity-based keyword extraction slightly outperforms extraction based on the standard centrality measures: in-degree, out-degree, betweenness, and closeness. Furthermore, from a data set of Croatian news we extract keyword candidates and expand the extracted nodes to word-tuples ranked by the highest in/out selectivity values. The obtained sets are evaluated against manually annotated keywords: for the set of extracted keyword candidates, the average F1 score is 24.63% and the average F2 score is 21.19%; for the extracted word-tuple candidates, the average F1 score is 25.9% and the average F2 score is 24.47%. Selectivity-based extraction does not require linguistic knowledge, since it is derived purely from the statistical and structural information of the network.
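
As an illustration of the measure, the following is a minimal sketch of how in- and out-selectivity could be computed on a directed, weighted word network. It rests on assumptions not stated in the abstract: "average strength" is read as a node's strength (sum of link weights) divided by its degree (number of links), and the network is built by linking each word to the word that directly follows it. The function names `build_cooccurrence` and `selectivity` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def build_cooccurrence(tokens):
    """Assumed construction: a directed link (u, v) whose weight counts
    how often word v directly follows word u in the token sequence."""
    weights = defaultdict(int)
    for u, v in zip(tokens, tokens[1:]):
        weights[(u, v)] += 1
    return dict(weights)


def selectivity(weights):
    """Return (in_selectivity, out_selectivity) dictionaries.

    Assumed reading of the definition: selectivity = strength / degree,
    computed separately for incoming and outgoing links. Nodes with no
    links in a given direction get 0.0 for that direction."""
    in_strength, out_strength = defaultdict(int), defaultdict(int)
    in_degree, out_degree = defaultdict(int), defaultdict(int)
    for (u, v), w in weights.items():
        out_strength[u] += w
        out_degree[u] += 1
        in_strength[v] += w
        in_degree[v] += 1

    nodes = set(in_degree) | set(out_degree)
    in_sel = {
        n: in_strength.get(n, 0) / in_degree[n] if in_degree.get(n) else 0.0
        for n in nodes
    }
    out_sel = {
        n: out_strength.get(n, 0) / out_degree[n] if out_degree.get(n) else 0.0
        for n in nodes
    }
    return in_sel, out_sel


# Toy example (invented tokens, not data from the paper).
tokens = "a b a b a c".split()
weights = build_cooccurrence(tokens)
in_sel, out_sel = selectivity(weights)

# Rank keyword candidates by the larger of the two selectivity values.
ranked = sorted(in_sel, key=lambda n: max(in_sel[n], out_sel[n]), reverse=True)
```

In this toy sequence, word `a` has two outgoing links with a total weight of 3, so its out-selectivity under the assumed definition is 1.5. Ranking by the maximum of in- and out-selectivity is only one possible choice made for the sketch.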