Profiles WG1-WG2 meeting



Location: Aldemar Royal Mare Resort, Hersonissos, Crete, Greece

Date: 25 May 2014


Preliminary Program

11:00 – 11:30 Start with coffee

11:30 – 11:40 Welcome and introduction of the agenda

11:40 – 12:10 Keynote talk: Dr. Thanassis Tiropanis

12:10 – 12:30 Introduction of the participants

12:30 – 14:00 Lunch break

14:00 – 14:30 Keynote talk: Dr. Stefan Dietze

14:30 – 16:00 Work in groups

16:00 – 16:30 Coffee break

16:30 – 17:00 Work in groups (continued)

17:00 – 18:00 Presentation of group results, conclusions

In the evening: Go out together



Dr. Thanassis Tiropanis
University of Southampton, UK

Title: Web Observatory Architectures


Web Observatories are conceptualised as global distributed resources that can engage communities with analytics and make the most of existing and emergent datasets including those of the linked data cloud, social media, online archives and media archives. There is a plethora of Web Observatories engaging communities in longitudinal studies over different datasets. However, as the number and size of datasets and analytic tools are increasing, scalable Web Observatory architectures and standardisation will be critical. This talk will discuss those requirements and architectural responses.

Slides: Web Observatory Architectures


Dr. Stefan Dietze
L3S Research Center, Leibniz University of Hannover, Germany

Title: From data to knowledge – profiling and interlinking datasets on the Web


While the Web of Data, and in particular Linked Data, has seen tremendous growth over the past years, take-up, usage and reuse of data are still limited and often focused on well-known reference datasets such as DBpedia. Datasets vary heavily with respect to their characteristics, such as the amount, quality, domain coverage or currency of exposed data. While this heterogeneity is natural, given the distributed nature of Linked Data, it also poses challenges for finding, adopting and reusing data across the Web. This problem is further exacerbated by the lack of reliable information (or metadata) about such dataset characteristics. Hence, judging the suitability and trustworthiness of arbitrary datasets for a given task is challenging. In addition, the evolving nature of Linked Data calls for scalable and efficient techniques that can automatically (re-)compute dataset metadata (profiles) and links. This talk will give an overview of ongoing research on dataset profiling and interlinking. Profiling involves the automated extraction of metadata, in particular about the type or topic coverage of datasets. Interlinking, on the other hand, investigates methods not only for linking entities across the Web of Data but also for identifying candidate datasets for interlinking tasks. The talk will introduce methods, techniques and resulting datasets, aimed at the overall goal of improving take-up and reuse of Web datasets.

Slides: From data to knowledge – profiling and interlinking datasets on the Web


Discussion topics for work in groups:

  1. Representation and analytics of structured data sources
  2. Analytics of structured data sources in the context of query routing and search
  3. Scenarios of interactions for improvement of keyword-based retrieval
  4. Linked Data: federated keyword search, mining and preservation
  5. Semantic keyword search in structured data
  6. Aspects of semantic keyword search in big data
  7. Collaborative search for learning scenarios
  8. Privacy aspects of keyword search



Minutes of the meeting on 25 May 2014

See also details
