Project Description

Raid on Code Pirate (ROCOP) is a plagiarism detection system. It checks an input document for plagiarism by comparing it against an internal database. The system is composed of two interlinked components:

  1. Web base component
  2. Plagiarism detection component

The term "web base" is derived from Stanford's Web Base. It is a local repository created by crawling web pages from a shortlist of websites. The plagiarism detection component is responsible for fingerprint creation as well as the comparison process. It is implemented in the following successive steps:

  1. Generation of k-grams from the standard string
  2. Generation of hash values from the k-grams using the Karp-Rabin rolling hash function
  3. Selection of fingerprints, which are a subset of the hash values, using the winnowing algorithm
  4. Comparison of the fingerprints against the internal database
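
Steps 1 to 3 can be sketched roughly as follows. This is a minimal illustration and not ROCOP's actual code: the normalization rule (lowercase, letters and digits only) is only a guess at what "standard string" means, and the function names, the hash base, the modulus and the window size w are all assumptions chosen for the example.

```python
import re


def normalize(text):
    """Assumed normalization: lowercase and keep only letters and digits."""
    return re.sub(r"[^a-z0-9]", "", text.lower())


def kgrams(text, k):
    """Step 1: every contiguous substring of length k."""
    return [text[i:i + k] for i in range(len(text) - k + 1)]


def rolling_hashes(text, k, base=257, mod=(1 << 61) - 1):
    """Step 2: Karp-Rabin hash of every k-gram.

    Each hash is derived from the previous one in constant time by
    removing the outgoing character and appending the incoming one.
    """
    if len(text) < k:
        return []
    high = pow(base, k - 1, mod)  # weight of the leading character
    h = 0
    for ch in text[:k]:
        h = (h * base + ord(ch)) % mod
    hashes = [h]
    for i in range(k, len(text)):
        h = (h - ord(text[i - k]) * high) % mod
        h = (h * base + ord(text[i])) % mod
        hashes.append(h)
    return hashes


def winnow(hashes, w):
    """Step 3: keep the minimum hash of each window of w hashes.

    Ties are broken in favour of the rightmost minimum, and a position
    is recorded only once. Returns a list of (hash, position) pairs.
    """
    if not hashes:
        return []
    w = min(w, len(hashes))  # a short input is treated as one window
    fingerprints = []
    last = -1
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        pos = start + (w - 1 - window[::-1].index(smallest))
        if pos != last:
            fingerprints.append((smallest, pos))
            last = pos
    return fingerprints


# Illustrative hash values, chosen by hand to show the selection rule.
sample = [77, 74, 42, 17, 98, 50, 17, 98, 8, 88, 67, 39, 77, 74, 42, 17, 98]
print(winnow(sample, 4))  # [(17, 3), (17, 6), (8, 8), (39, 11), (17, 15)]
```

In a full run the three functions would be chained, for example winnow(rolling_hashes(normalize(document), k), w), with k and w tuned to the kind of text being checked.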

The web base component also uses the plagiarism detection component to generate the fingerprints of the crawled web pages.
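
One way to picture how the two components meet is an index from fingerprint hashes to the pages that contain them. The sketch below is hypothetical: the class FingerprintIndex, its methods, the page identifiers and the scoring rule (the share of the query's fingerprints found on a page) are assumptions made for illustration and do not describe ROCOP's real database.

```python
from collections import defaultdict


class FingerprintIndex:
    """In-memory stand-in for the internal database (hypothetical)."""

    def __init__(self):
        # fingerprint hash -> set of page ids containing that hash
        self._pages_by_hash = defaultdict(set)

    def add_page(self, page_id, fingerprints):
        """Store the fingerprints produced for one crawled page."""
        for h in set(fingerprints):
            self._pages_by_hash[h].add(page_id)

    def query(self, fingerprints):
        """Score each stored page by the fraction of the query's
        distinct fingerprints that also occur on that page."""
        wanted = set(fingerprints)
        if not wanted:
            return {}
        shared = defaultdict(int)
        for h in wanted:
            for page_id in self._pages_by_hash.get(h, ()):
                shared[page_id] += 1
        return {page: count / len(wanted) for page, count in shared.items()}


index = FingerprintIndex()
index.add_page("page-a", [17, 8, 39])   # fingerprints of a crawled page
index.add_page("page-b", [5, 39])
scores = index.query([17, 39, 100, 200])  # fingerprints of the input document
print(scores["page-a"], scores["page-b"])  # 0.5 0.25
```

A page with a high score shares many fingerprints with the input document and would be a candidate source; the threshold for reporting a match is left open here.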

Project News

  • February 20, 2011 - Uploaded the project website to SourceForge

Documents