Plagiarism Detection API (Beta)

Overview

The plagiarism detection API allows organizations to evaluate the authenticity of written content. This API provides a score indicating the originality of the text, helping organizations maintain trust and accountability and ensure proper citation of sources.

This plagiarism detection API developer guide explains how to create a score request, upload a document for scoring, and get the document scoring status and result.

The API is intended for programmatic consumption and is accessible through an HTTP REST interface.

How scoring works

Grammarly compares the text of the uploaded document to billions of web pages and academic papers in private databases, looking for sentences or paragraphs that have been published elsewhere. The API then calculates an originality score: the higher the score, the more original the text.

Accessing the Plagiarism Detection API

The base URL for accessing the plagiarism detection API:

https://api.grammarly.com/ecosystem/api/v1/plagiarism

For authentication, API requests must include an access token in the Authorization header, using the Bearer type. For more information, please refer to the OAuth 2.0 article.

The scopes required for the Access token are plagiarism-api:read and plagiarism-api:write.

Requesting a transaction

POST https://api.grammarly.com/ecosystem/api/v1/plagiarism

This API endpoint initiates transactions for plagiarism detection.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| filename | String | Yes | Name of the file that will be uploaded. |

Example Request

An example cURL request to create a score request:

```bash
curl -X POST \
  'https://api.grammarly.com/ecosystem/api/v1/plagiarism' \
  -H 'Authorization: Bearer <ACCESS_TOKEN>' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'user-agent: API client' \
  -d '{ "filename": "example.doc" }'
```
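
The same request can be sketched in Python with only the standard library. This is a minimal illustration rather than an official client: the function names `build_score_request` and `create_score_request` and the 30-second socket timeout are assumptions made for the example, while the URL, headers, and body mirror the cURL request.

```python
import json
import urllib.request

BASE_URL = "https://api.grammarly.com/ecosystem/api/v1/plagiarism"


def build_score_request(access_token: str, filename: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a score request."""
    body = json.dumps({"filename": filename}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
            "User-Agent": "API client",
        },
    )


def create_score_request(access_token: str, filename: str) -> dict:
    """Send the request and return the parsed JSON response body."""
    request = build_score_request(access_token, filename)
    # The 30-second timeout is an assumption, not a documented value.
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping the request construction separate from the network call makes it possible to inspect the method, headers, and body without contacting the API.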

Response Format

The response body contains the score_request_id and file_upload_url fields, which will be required in subsequent steps.

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| score_request_id | String | Request ID that is used to request the plagiarism detection evaluation results. |
| file_upload_url | String | URL for uploading the requested file. |

Example response

An example response body:

```json
{
  "score_request_id":"401124dd-c774-46af-b652-f6627a4ca5f6",
  "file_upload_url":"https://prod-writingscore-file-upload.s3-external-1.amazonaws.com/625270646/401124dd-c774-46af-b652-f6627a4ca5f6?..."
}
```

Upload file

Upload the document to be scored to the pre-signed URL returned in the file_upload_url field of the response. The upload must start before the pre-signed URL expires, which happens 120 seconds after the score request is submitted.

The upload is a simple PUT request that does not require the Authorization header. All security parameters are specified as query parameters of file_upload_url.

An example cURL request to upload a file to the pre-signed URL:

```bash
curl -T example.doc <file_upload_url>
```
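
A rough Python equivalent of the upload is sketched below using `http.client`, which, like `curl -T`, sends the body without adding a Content-Type header. The helper names `split_upload_url` and `upload_document` and the 60-second socket timeout are assumptions for illustration; the guide does not state which response status a successful upload returns, so the sketch simply returns whatever status the server sends.

```python
import http.client
from pathlib import Path
from urllib.parse import urlsplit


def split_upload_url(file_upload_url: str) -> tuple[str, str]:
    """Return (host, path plus query string) for the pre-signed URL."""
    parts = urlsplit(file_upload_url)
    target = parts.path + ("?" + parts.query if parts.query else "")
    return parts.netloc, target


def upload_document(file_upload_url: str, path: str) -> int:
    """PUT the file to the pre-signed URL and return the HTTP status code."""
    host, target = split_upload_url(file_upload_url)
    # The socket timeout is an assumption; it is unrelated to the URL expiry.
    connection = http.client.HTTPSConnection(host, timeout=60)
    try:
        # No Authorization header: the query string carries the security parameters.
        connection.request("PUT", target, body=Path(path).read_bytes())
        return connection.getresponse().status
    finally:
        connection.close()
```

The query string is passed through unchanged because altering the signed parameters of a pre-signed URL would typically invalidate it.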

Requesting the evaluation result

GET https://api.grammarly.com/ecosystem/api/v1/plagiarism/<score_request_id>

This API endpoint returns the result of the plagiarism detection evaluation.

Request Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| score_request_id | String | Yes | Request ID from the previous step, passed as the last segment of the URL path. |

Example Request

An example cURL request to check the status of the score:

```bash
curl -X GET \
  'https://api.grammarly.com/ecosystem/api/v1/plagiarism/<score_request_id>' \
  -H 'Authorization: Bearer <ACCESS_TOKEN>' \
  -H 'Accept: application/json' \
  -H 'user-agent: API client'
```

Response Format

The response body contains fields describing the status of the evaluation process. If evaluation results are ready, they are provided in the score object.

Response Parameters

| Parameter | Type | Description |
| --- | --- | --- |
| score_request_id | String | Request ID that is used to request the plagiarism detection score evaluation results. |
| status | String | Request processing status (PENDING, FAILED, COMPLETED). |
| updated_at | DateTime | Date and time of the status update. |
| score | Object | Object with the resulting value. |
| score.originality | Number | A number from 0 to 1 that represents the originality of the text. The higher the score, the more original the text. Therefore, a completely plagiarised text will have a score of 0. |

Example response

An example response body:

```json
{
  "score_request_id": "4bed7ce8-95f1-4e84-85d4-8f1e5744f950",
  "status": "COMPLETED",
  "updated_at": "2024-11-18T14:55:27.337443115Z",
  "score": {
    "originality": 0.89
  }
}
```
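
Because the status can be PENDING before it becomes COMPLETED or FAILED, a client usually has to poll this endpoint. The sketch below shows one way to do that in Python. `wait_for_score`, the 2-second poll interval, and the limit of 30 attempts are assumptions chosen for illustration, and `fetch_status` is a hypothetical callable that performs the GET request and returns the parsed JSON body.

```python
import time
from typing import Callable, Optional


def wait_for_score(
    fetch_status: Callable[[], dict],
    poll_interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[float]:
    """Poll until the request is COMPLETED or FAILED.

    Returns score.originality, or None when the request completed
    without a score (the score field is null).
    """
    for _ in range(max_attempts):
        result = fetch_status()
        status = result.get("status")
        if status == "COMPLETED":
            score = result.get("score") or {}
            return score.get("originality")
        if status == "FAILED":
            raise RuntimeError(f"Scoring failed: {result}")
        # Any other status is treated as still pending.
        sleep(poll_interval)
    raise TimeoutError("Score request did not finish within the attempt limit")
```

Passing the fetch and sleep functions as arguments keeps the loop independent of any particular HTTP library and lets it run instantly in tests.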

Supported Document Formats

The plagiarism API supports the following document formats:

| Format | MIME type | Extension |
| --- | --- | --- |
| Microsoft Word | application/msword | .doc |
| Microsoft Word (OpenXML) | application/vnd.openxmlformats-officedocument.wordprocessingml.document | .docx |
| OpenDocument Text | application/vnd.oasis.opendocument.text | .odt |
| Text | text/plain | .txt |
| Rich Text Format | application/rtf | .rtf |
| (Adobe) Portable Document Format | application/pdf | .pdf |
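
A client may want to reject unsupported files before creating a score request. The following Python sketch encodes the table above as a lookup keyed by extension. The names `SUPPORTED_FORMATS` and `mime_type_for` are illustrative, and the check looks only at the file name; how the API itself determines the format is not described in this guide.

```python
from pathlib import Path
from typing import Optional

# Extension to MIME type, copied from the supported formats table.
SUPPORTED_FORMATS = {
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".odt": "application/vnd.oasis.opendocument.text",
    ".txt": "text/plain",
    ".rtf": "application/rtf",
    ".pdf": "application/pdf",
}


def mime_type_for(filename: str) -> Optional[str]:
    """Return the MIME type for a supported extension, or None otherwise."""
    return SUPPORTED_FORMATS.get(Path(filename).suffix.lower())
```

For example, `mime_type_for("example.doc")` returns `"application/msword"`, while a file such as `image.png` yields `None`.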

Constraints

Max File Size

4 MB. The file size limit is 4 megabytes (MB) or 4,194,304 bytes. Uploaded documents larger than 4 MB will result in the FAILED status.

Common error reason: document_size_exceeds_limit

Max Text Length

100,000 characters. The text limit is 100,000 characters, including spaces and new lines. Text extracted from a document exceeding the limit will result in the FAILED status.

Common error reason: content_length_exceeds_limit

Minimum Text Length

30 words. The minimum word count is 30, including special characters (e.g., emoji). Text extracted from a document containing fewer than 30 words will not produce a score. A document that contains at least 1 word will result in the COMPLETED status with no score (i.e., score is null).

Common error reasons: file_upload_failed (blank document, 0 bytes), text_extraction_failed (blank document, > 0 bytes)
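
For plain-text input, the size and length limits above can be approximated locally before a score request is created. The Python sketch below is only a rough pre-check under stated assumptions: it treats the text as UTF-8, counts words by splitting on whitespace, and uses the hypothetical name `preflight_problems`. The API's own text extraction and word counting may differ, particularly for formats other than plain text.

```python
MAX_FILE_BYTES = 4_194_304  # 4 MB
MAX_TEXT_CHARS = 100_000    # includes spaces and new lines
MIN_WORDS = 30


def preflight_problems(text: str) -> list[str]:
    """Return the constraints that the text appears to violate."""
    problems = []
    # Assumes the text would be uploaded as a UTF-8 encoded .txt file.
    if len(text.encode("utf-8")) > MAX_FILE_BYTES:
        problems.append("file size exceeds 4 MB")
    if len(text) > MAX_TEXT_CHARS:
        problems.append("text exceeds 100,000 characters")
    # Whitespace splitting only approximates the API's word count.
    if len(text.split()) < MIN_WORDS:
        problems.append("text has fewer than 30 words")
    return problems
```

An empty list means no problem was detected locally; it does not guarantee that the API will accept the document.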

Max Timeout Duration

120 seconds. The maximum duration for uploading a document after submitting a score request is 120 seconds. A score request without an uploaded document will result in the FAILED status.

Common error reason: document_text_not_found

Score Variance

Score results may vary for document formats that support content other than text (e.g., images, macros, metadata).

Max Request Rates

Consider using exponential backoff to retry requests that return status code 429. Load test results indicate an ideal base factor of 2 seconds.

10 requests per second. The maximum number of requests per second is 10 for POST /ecosystem/api/v1/plagiarism.

50 requests per second. The maximum number of requests per second is 50 for GET /ecosystem/api/v1/plagiarism/<score_request_id>.
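
One way to apply exponential backoff to 429 responses is sketched below in Python. It interprets the 2-second base factor as the delay before the first retry and doubles it on each subsequent attempt. The 60-second cap, the limit of 5 retries, and the names `backoff_delay` and `call_with_backoff` are assumptions for illustration, and `send` is a hypothetical callable that performs the request with `urllib` and raises `urllib.error.HTTPError` on an error status.

```python
import time
import urllib.error
from typing import Callable, TypeVar

T = TypeVar("T")
BASE_DELAY_SECONDS = 2.0  # base factor suggested by the load test results


def backoff_delay(attempt: int, base: float = BASE_DELAY_SECONDS, cap: float = 60.0) -> float:
    """Delay before retry number `attempt` (0-based): base * 2**attempt, capped."""
    return min(cap, base * (2 ** attempt))


def call_with_backoff(
    send: Callable[[], T],
    max_retries: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call send(), retrying with exponential backoff while it fails with 429."""
    for attempt in range(max_retries + 1):
        try:
            return send()
        except urllib.error.HTTPError as error:
            # Re-raise anything other than 429, or 429 once retries run out.
            if error.code != 429 or attempt == max_retries:
                raise
            sleep(backoff_delay(attempt))
    raise RuntimeError("unreachable: the loop always returns or raises")
```

With these defaults the waits between attempts are 2, 4, 8, 16, and 32 seconds.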

Score Retention

The score is accessible via the API for 30 days starting from the date it was requested.

Data Retention

Grammarly retains documents only for the duration necessary to perform the analysis, but not longer than 24 hours.