Gianluca Demartini
building Data Scientists since 2014

Datasets

This page contains a collection of links to datasets produced for my research work.

Crowdsourcing Truthfulness Judgements
https://github.com/KevinRoitero/crowdsourcingTruthfulness/

INEX Entity Ranking (XER) Evaluation collections

TRank: Ranking Entity Types Using the Web of Data
http://trank.exascale.info/

The Arabic Keyphrase Extraction Corpus (AKEC)
https://ailab-uniud.github.io/akec/

Pick-A-Crowd: Tell Me What You Like, and I'll Tell You What to Do
http://exascale.info/PickACrowd/

Combining Inverted Indices and Structured Search for Ad-hoc Object Retrieval
http://diuf.unifr.ch/main/xi/HybridAOR/

ZenCrowd - Leveraging Probabilistic Reasoning and Crowdsourcing Techniques for Large-Scale Entity Linking
http://exascale.info/zencrowd/

A Dataset for Evaluating Entity Retrieval over Time
http://www.l3s.de/~demartini/deert/

A Wikipedia Dataset for Evaluating Entity Ranking
http://www.l3s.de/~iofciu/wikipediaER/

Gianluca Demartini, Ph.D.
School of Information Technology and Electrical Engineering,
University of Queensland

GP South Building, Staff House Road
St Lucia
QLD 4072 Australia

Office: +61 7 336 58325
demartini@acm.org

Dr. Gianluca Demartini is an Associate Professor in Data Science at the University of Queensland, School of Information Technology and Electrical Engineering. His main research interests are Information Retrieval, Semantic Web, and Human Computation. His research has been supported by the Australian Research Council (ARC), the UK Engineering and Physical Sciences Research Council (EPSRC), the EU H2020 framework program, Facebook, and Google. He received Best Paper Awards at the AAAI Conference on Human Computation and Crowdsourcing (HCOMP) in 2018 and at the European Conference on Information Retrieval (ECIR) in 2016, the Best Short Paper Award at ECIR in 2020, and the Best Demo Award at the International Semantic Web Conference (ISWC) in 2011. He has authored more than 100 peer-reviewed scientific publications, including papers at major venues such as WWW, ACM SIGIR, VLDBJ, ISWC, and ACM CHI.
He has given several invited talks, tutorials, and keynotes at academic conferences (e.g., ISWC, ICWSM, WebScience, and the RuSSIR Summer School), companies (e.g., Facebook), and Dagstuhl seminars. He has been a senior member of the ACM since 2020 and an ACM Distinguished Speaker since 2015, and he was a TEDx speaker in 2019.
He serves as co-editor-in-chief for the Human Computation journal, area editor for the Journal of Web Semantics, and editorial board member for the Information Retrieval journal. He is General co-Chair for the ACM International Conference on Information and Knowledge Management (CIKM) 2021. He has been a Senior Program Committee member for, among others, the ACM Conference on Research and Development in Information Retrieval (SIGIR), the ACM Web Search and Data Mining (WSDM) Conference, the AAAI Conference on Human Computation and Crowdsourcing (HCOMP), and the International Conference on Web Engineering (ICWE). He is a Program Committee member for several conferences including WWW, SIGIR, KDD, IJCAI, AAAI, ISWC, and ICWSM. He was Crowdsourcing and Human Computation Track co-Chair at WWW 2018 and co-chair for the Human Computation and Crowdsourcing Track at ESWC 2015. He co-organized several workshops and tutorials at international conferences as well as the Entity Ranking Track at the Initiative for the Evaluation of XML Retrieval in 2008 and 2009.
Before joining the University of Queensland, he was a Lecturer at the University of Sheffield in the UK, a post-doctoral researcher at the eXascale Infolab at the University of Fribourg in Switzerland, a visiting researcher at UC Berkeley, a junior researcher at the L3S Research Center in Germany, and an intern at Yahoo! Research in Spain. In 2011, he obtained a Ph.D. in Computer Science at the Leibniz University of Hanover, focusing on Semantic Search.
