Unsupervised ranking for plagiarism source retrieval: Notebook for PAN at CLEF 2013

Kyle Williams, Hung Hsuan Chen, Sagnik Ray Choudhury, C. Lee Giles

Research output: Contribution to journal › Conference article › peer-review

4 Scopus citations


The source retrieval task for plagiarism detection involves using a search engine to retrieve candidate sources of plagiarism for a suspicious document. It provides an efficient way to identify candidate documents so that more accurate comparisons can then take place. We describe a strategy for source retrieval that uses an unsupervised ranking method to rank the results returned by a search engine by their similarity with the query document, and that retrieves only documents that are likely to be sources of plagiarism. In the evaluation, our approach achieved the highest F1 score (0.47) among all task participants.
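The ranking-and-filtering idea in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure or the cutoff, so everything below is an assumption made for illustration: cosine similarity over raw term counts stands in for the ranking method, and the names `tokenize`, `cosine`, `rank_results`, and the `threshold` value are hypothetical rather than taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_results(suspicious_text, results, threshold=0.2):
    """Rank (url, snippet) search results by similarity to the suspicious
    document and keep only those scoring at or above the threshold.

    The threshold is an illustrative assumption, not a value from the paper.
    """
    query_vec = Counter(tokenize(suspicious_text))
    scored = [
        (url, cosine(query_vec, Counter(tokenize(snippet))))
        for url, snippet in results
    ]
    # Highest similarity first; only likely sources are kept for retrieval.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(url, score) for url, score in scored if score >= threshold]


if __name__ == "__main__":
    suspicious = "the quick brown fox jumps over the lazy dog"
    results = [
        ("a", "stock market prices fell sharply today"),
        ("b", "a quick brown fox jumps over a lazy dog"),
        ("c", "the dog sleeps"),
    ]
    for url, score in rank_results(suspicious, results):
        print(url, round(score, 3))
```

Filtering before download is the point of the design: only results that pass the similarity cutoff would be fetched for detailed comparison, which trades some recall for fewer unnecessary retrievals.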

Original language: English (US)
Journal: CEUR Workshop Proceedings
State: Published - 2013
Event: 2013 Cross Language Evaluation Forum Conference, CLEF 2013 - Valencia, Spain
Duration: Sep 23 2013 to Sep 26 2013

All Science Journal Classification (ASJC) codes

  • General Computer Science

