Plagiabot provides an API for testing articles with the Turnitin engine (see https://en.wikipedia.org/wiki/Wikipedia:Turnitin). Here is an example of the API output for a specific article: http://tools.wmflabs.org/eranbot/plagiabot/api.py?action=suspected_diffs&page_title=Rajesh_Khanna&report=1. (It returns an array of 1 or more potential violations.)
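For reference, a minimal sketch of how the tool might query this endpoint. It assumes the response body is a JSON array (the task only says "an array"), and the helper names `build_query_url`, `parse_response`, and `fetch_suspected_diffs` are hypothetical, not part of Plagiabot or the Copyvio Detector. It is written against the Python 3 standard library, so it may need adapting to whatever Python version the tool runs on.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint taken from the example URL in this task.
API_URL = "http://tools.wmflabs.org/eranbot/plagiabot/api.py"


def build_query_url(page_title):
    """Build the suspected_diffs query URL for one article title."""
    params = {"action": "suspected_diffs", "page_title": page_title, "report": 1}
    return API_URL + "?" + urlencode(params)


def parse_response(body):
    """Decode the response body.

    Assumption: the body is a JSON array of potential violations.
    Anything that is not a list is treated as "no results".
    """
    data = json.loads(body)
    return data if isinstance(data, list) else []


def fetch_suspected_diffs(page_title, timeout=10):
    """Fetch and decode the suspected diffs for a page (performs a network call)."""
    with urlopen(build_query_url(page_title), timeout=timeout) as resp:
        return parse_response(resp.read().decode("utf-8"))
```

Calling `fetch_suspected_diffs("Rajesh_Khanna")` would request the same URL as the example above. Keeping URL building and parsing separate from the network call makes both easy to test offline.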
It would be great if the Copyvio Detector tool (https://tools.wmflabs.org/copyvios/) had the option of using Turnitin as well as Yahoo BOSS for detecting possible copyright violations.
Acceptance criteria:
- In the "Copyvio search" options, add a new option for "Use Turnitin" (off by default for now).
- If "Use Turnitin" is checked, add an extra box to the output (between the generation-time div and the cv-result div) that shows the results from the Plagiabot query.
- If there are no matches from the Plagiabot query, the div should use class=green-box and say something like "Turnitin found no matching sources."
- If there are matches from the Plagiabot query, the div should use class=red-box and say something like "Turnitin found sources that may have been plagiarized. Please review them." It should then include output similar to the Source column at https://en.wikipedia.org/wiki/User:EranBot/Copyright/2#Added, but in a nicer format. Specifically, it should include a link to the full report followed by a tabular display of the source matches, confidence, etc. Try to make it fairly similar to the formatting of the existing Copyvio Detector tool output.
- Do not feed the results from Plagiabot into the Copyvio Detector's list of sources to check.
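The box described in the criteria above could be rendered roughly as follows. This is only a sketch: the `source` and `confidence` keys are assumed field names for a match (the actual Plagiabot output may use different ones), `render_turnitin_box` is a hypothetical helper, and the real tool would more likely do this in its existing templates than with string formatting.

```python
from html import escape


def render_turnitin_box(matches, report_url=None):
    """Return the HTML for the Turnitin result box.

    `matches` is a list of dicts; the "source" and "confidence" keys
    are assumptions about the Plagiabot output, not confirmed names.
    """
    if not matches:
        return '<div class="green-box">Turnitin found no matching sources.</div>'

    # One table row per matched source, with all values HTML-escaped.
    rows = "".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(
            escape(str(m.get("source", ""))),
            escape(str(m.get("confidence", ""))),
        )
        for m in matches
    )

    parts = [
        '<div class="red-box">',
        "Turnitin found sources that may have been plagiarized. "
        "Please review them.",
    ]
    if report_url:
        # Link to the full report, if one is available.
        parts.append('<a href="{}">Full report</a>'.format(escape(report_url, quote=True)))
    parts.append(
        "<table><tr><th>Source</th><th>Confidence</th></tr>{}</table>".format(rows)
    )
    parts.append("</div>")
    return "\n".join(parts)
```

The function only produces the extra box; it deliberately returns nothing that could be passed on to the Copyvio Detector's own source list.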
Source code for Copyvio Detector tool: https://github.com/earwig/copyvios