Plagiarism Detection API (Beta)

Overview

The plagiarism detection API allows organizations to evaluate the authenticity of written content. This API provides a score indicating the originality of the text, helping organizations maintain trust and accountability and ensure proper citation of sources.

This plagiarism detection API developer guide explains how to create a score request, upload a document for scoring, and get the document scoring status and result.

The API is intended for programmatic consumption and is accessible through an HTTP REST interface.

How scoring works

Grammarly compares the text of the uploaded document to billions of web pages and academic papers in private databases, looking for sentences or paragraphs that have been published elsewhere. The API then calculates an originality score: the higher the score, the more original the text.

Accessing Plagiarism Detection API

The base URL for accessing the plagiarism detection API:

https://api.grammarly.com/ecosystem/api/v1/plagiarism

To authenticate, each API request must include an access token in the Authorization header, using the Bearer type. For more information, refer to the OAuth 2.0 article.

The access token requires the plagiarism-api:read and plagiarism-api:write scopes.

Requesting a transaction

POST https://api.grammarly.com/ecosystem/api/v1/plagiarism

This API endpoint initiates transactions for plagiarism detection.

Request Parameters

Parameter	Type	Required	Description
`filename`	String	Yes	Name of the file that will be uploaded.

Example Request

An example cURL request to create a score request:

bash

curl -X POST \
  'https://api.grammarly.com/ecosystem/api/v1/plagiarism' \
  -H 'Authorization: Bearer <ACCESS_TOKEN>' \
  -H 'Accept: application/json' \
  -H 'Content-Type: application/json' \
  -H 'user-agent: API client' \
  -d '{ "filename": "example.doc" }'

Response Format

The response body contains the score_request_id and file_upload_url fields, which will be required in subsequent steps.

Response Parameters

Parameter	Type	Description
`score_request_id`	String	Request ID that is used to request the plagiarism detection evaluation results.
`file_upload_url`	String	URL for uploading the requested file.

Example response

An example response body:

json

{
  "score_request_id":"401124dd-c774-46af-b652-f6627a4ca5f6",
  "file_upload_url":"https://prod-writingscore-file-upload.s3-external-1.amazonaws.com/625270646/401124dd-c774-46af-b652-f6627a4ca5f6?..."
}
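
For clients that do not use cURL, the request and response above can be handled along these lines. This is a minimal Python sketch using only the standard library; the function names build_score_request and parse_score_request_response are illustrative and not part of the API.

```python
import json
import urllib.request
from typing import Tuple

BASE_URL = "https://api.grammarly.com/ecosystem/api/v1/plagiarism"


def build_score_request(access_token: str, filename: str) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a score request."""
    body = json.dumps({"filename": filename}).encode("utf-8")
    return urllib.request.Request(
        BASE_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Accept": "application/json",
            "Content-Type": "application/json",
            "User-Agent": "API client",
        },
    )


def parse_score_request_response(raw: bytes) -> Tuple[str, str]:
    """Return (score_request_id, file_upload_url) from the response body."""
    payload = json.loads(raw)
    return payload["score_request_id"], payload["file_upload_url"]
```

To send the request, pass the result of build_score_request to urllib.request.urlopen and feed the bytes it returns to parse_score_request_response.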

Upload file

Upload the document to be scored using the pre-signed URL returned in the file_upload_url field of the response. The upload must start within 120 seconds, before the pre-signed URL expires.

This is a simple PUT request that does not require the Authorization header. All security parameters are specified as query parameters of file_upload_url.

An example cURL request to upload a file to the pre-signed URL:

bash

curl -T example.doc <file_upload_url>
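
The upload step and its 120-second window could be handled as in the following Python sketch. The helper names seconds_left and build_upload_request are hypothetical, and the caller is assumed to record the time at which the score request was created.

```python
import urllib.request

# Lifetime of the pre-signed URL, as stated in this guide.
UPLOAD_WINDOW_SECONDS = 120


def seconds_left(created_at: float, now: float) -> float:
    """Seconds remaining before the pre-signed URL expires (never negative)."""
    return max(0.0, UPLOAD_WINDOW_SECONDS - (now - created_at))


def build_upload_request(file_upload_url: str, content: bytes) -> urllib.request.Request:
    """Build (but do not send) the PUT request. No Authorization header is set."""
    return urllib.request.Request(file_upload_url, data=content, method="PUT")
```

A caller might take created_at and now from time.monotonic() and skip the upload when seconds_left returns 0. Note that urllib adds a default Content-Type header when it sends a body, which cURL's -T option does not; this guide does not say whether the pre-signed URL tolerates that header, so verify it before relying on this approach.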

Requesting the evaluation result

GET https://api.grammarly.com/ecosystem/api/v1/plagiarism/<score_request_id>

This API endpoint returns the result of the plagiarism detection evaluation.

Request Parameters

Parameter	Type	Required	Description
`score_request_id`	String	Yes	Request ID returned when the score request was created, passed as the last segment of the URL path.

Example Request

An example cURL request to check the status of the score:

bash

curl -X GET \
  'https://api.grammarly.com/ecosystem/api/v1/plagiarism/<score_request_id>' \
  -H 'Authorization: Bearer <ACCESS_TOKEN>' \
  -H 'Accept: application/json' \
  -H 'user-agent: API client'

Response Format

The response body contains fields describing the status of the evaluation process. If evaluation results are ready, they are provided in the score object.

Response Parameters

Parameter	Type	Description
`score_request_id`	String	Request ID that is used to request the plagiarism detection score evaluation results.
`status`	String	Request processing status (`PENDING`, `FAILED`, `COMPLETED`).
`updated_at`	DateTime	Date and time of the status update.
`score`	Object	Object with the resulting value.
`score.originality`	Number	A number from 0 to 1 that represents the originality of the text. The higher the score, the more original the text. Therefore, a completely plagiarised text will have a score of 0.
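
The updated_at value in this guide's example response carries nine fractional-second digits, which is more precision than Python's datetime stores. The sketch below truncates the fraction to microseconds. It assumes every timestamp is in UTC with a trailing Z, as in that example; the guide itself only states the type as DateTime, and parse_updated_at is a hypothetical helper name.

```python
import re
from datetime import datetime, timezone

# Assumed shape: YYYY-MM-DDTHH:MM:SS[.fraction]Z, based on the example response.
_TIMESTAMP = re.compile(r"(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})(?:\.(\d+))?Z")


def parse_updated_at(value: str) -> datetime:
    """Parse an updated_at string into an aware UTC datetime, truncating to microseconds."""
    match = _TIMESTAMP.fullmatch(value)
    if match is None:
        raise ValueError(f"unexpected timestamp format: {value!r}")
    base, fraction = match.groups()
    # Keep at most six fractional digits and pad shorter fractions with zeros.
    micros = int((fraction or "")[:6].ljust(6, "0"))
    parsed = datetime.strptime(base, "%Y-%m-%dT%H:%M:%S")
    return parsed.replace(microsecond=micros, tzinfo=timezone.utc)
```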

Example response

An example response body:

json

{
  "score_request_id": "4bed7ce8-95f1-4e84-85d4-8f1e5744f950",
  "status": "COMPLETED",
  "updated_at": "2024-11-18T14:55:27.337443115Z",
  "score": {
    "originality": 0.89
  }
}
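
Because a score request stays in the PENDING status until evaluation finishes, clients typically poll this endpoint. The following Python sketch reads the status and score from a response body and polls until the status changes. The polling interval and attempt limit are assumptions, not values from this guide, and fetch stands for any caller-supplied function that performs the GET request and returns the response body.

```python
import json
import time
from typing import Callable, Optional, Tuple


def read_result(raw: str) -> Tuple[str, Optional[float]]:
    """Return (status, originality); originality is None when no score is present."""
    payload = json.loads(raw)
    score = payload.get("score") or {}
    return payload["status"], score.get("originality")


def poll_until_done(
    fetch: Callable[[], str],
    max_attempts: int = 30,          # assumed limit
    interval_seconds: float = 2.0,   # assumed interval
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, Optional[float]]:
    """Call fetch() until the status is no longer PENDING or attempts run out."""
    for attempt in range(max_attempts):
        status, originality = read_result(fetch())
        if status != "PENDING":
            return status, originality
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    return "PENDING", None
```

A COMPLETED status with an originality of None corresponds to a null score, so callers should check both values rather than the status alone.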

Supported Document Formats

The plagiarism API supports the following document formats:

Format	MIME type	Extension
Microsoft Word	application/msword	.doc
Microsoft Word (OpenXML)	application/vnd.openxmlformats-officedocument.wordprocessingml.document	.docx
OpenDocument Text	application/vnd.oasis.opendocument.text	.odt
Text	text/plain	.txt
Rich Text Format	application/rtf	.rtf
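
A client can check a file against the table above before creating a score request. The Python sketch below looks up the MIME type by file extension. Matching the extension case-insensitively is an assumption, since this guide does not say how the API identifies the format, and mime_type_for is a hypothetical helper name.

```python
import os
from typing import Optional

# Extension-to-MIME-type pairs copied from the table of supported formats.
SUPPORTED_FORMATS = {
    ".doc": "application/msword",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".odt": "application/vnd.oasis.opendocument.text",
    ".txt": "text/plain",
    ".rtf": "application/rtf",
}


def mime_type_for(filename: str) -> Optional[str]:
    """Return the MIME type for a supported file name, or None if unsupported."""
    extension = os.path.splitext(filename)[1].lower()
    return SUPPORTED_FORMATS.get(extension)
```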

Constraints

Max File Size

4 MB. The file size limit is 4 megabytes (MB) or 4,194,304 bytes. Uploaded documents larger than 4 MB will result in the FAILED status.

Common error reason: document_size_exceeds_limit

Max Text Length

100,000 characters. The text limit is 100,000 characters, including spaces and new lines. Text extracted from a document exceeding the limit will result in the FAILED status.

Common error reason: content_length_exceeds_limit

Minimum Text Length

30 words. The minimum word count is 30, including special characters (e.g., emoji). A document whose extracted text contains at least 1 word but fewer than 30 words will not fail; it results in the COMPLETED status with no score (i.e., score is null).

Common error reasons: file_upload_failed (blank document, 0 bytes), text_extraction_failed (blank document, > 0 bytes)
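
The size and length limits above can be checked locally before any request is made, which avoids spending a score request on a document that cannot be scored. The Python sketch below assumes the caller already has both the raw file bytes and the document's text, and it counts words by splitting on whitespace, which may differ from how the API counts them. The returned labels are local to this sketch and are not the API's error reasons.

```python
from typing import List

MAX_FILE_BYTES = 4 * 1024 * 1024  # 4,194,304 bytes
MAX_TEXT_CHARS = 100_000          # includes spaces and new lines
MIN_WORDS = 30


def preflight_problems(file_bytes: bytes, text: str) -> List[str]:
    """Return local labels for every limit the document would violate."""
    problems = []
    if len(file_bytes) == 0:
        problems.append("file_empty")
    if len(file_bytes) > MAX_FILE_BYTES:
        problems.append("file_too_large")
    if len(text) > MAX_TEXT_CHARS:
        problems.append("text_too_long")
    # Assumption: a word is any whitespace-separated token.
    if len(text.split()) < MIN_WORDS:
        problems.append("too_few_words")
    return problems
```

An empty list means the document passes all three checks under these assumptions.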

Max Timeout Duration

120 seconds. The maximum duration for uploading a document after submitting a score request is 120 seconds. A score request without an uploaded document will result in the FAILED status.

Common error reason: document_text_not_found

Score Variance

Score results may vary for document formats that support content other than text (e.g., images, macros, metadata).

Max Request Rates

Consider exponential backoff to handle responses with status code 429 (Too Many Requests). Load test results indicate an ideal base factor of 2 seconds.

10 requests per second. The maximum number of requests per second is 10 for POST /ecosystem/api/v1/plagiarism.

50 requests per second. The maximum number of requests per second is 50 for GET /ecosystem/api/v1/plagiarism/<score_request_id>.
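
One way to apply the backoff guidance above is sketched below in Python. It interprets the base factor of 2 seconds as the first delay, doubling on each retry; that interpretation, the 60-second cap, and the retry limit are assumptions. The send argument stands for any caller-supplied function that performs the request and returns the HTTP status code.

```python
import time
from typing import Callable, List


def backoff_delays(
    max_retries: int,
    base_seconds: float = 2.0,   # base factor suggested in this guide
    cap_seconds: float = 60.0,   # assumed upper bound
) -> List[float]:
    """Delays of base * 2**attempt seconds, each capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** attempt) for attempt in range(max_retries)]


def call_with_backoff(
    send: Callable[[], int],
    max_retries: int = 5,        # assumed limit
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() again after each 429 until it succeeds or retries run out."""
    status = send()
    for delay in backoff_delays(max_retries):
        if status != 429:
            break
        sleep(delay)
        status = send()
    return status
```

With the defaults, the waits are 2, 4, 8, 16, and 32 seconds. Adding random jitter to each delay is a common refinement when many clients retry at once.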

Score Retention

The score is accessible via the API for 30 days starting from the date it was requested.
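
A client that caches score request IDs may want to know when a score stops being retrievable. The Python sketch below computes that point from the time the score was requested. It assumes the client records that time itself, because the response fields described in this guide include only the time of the last status update; both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Retention period stated in this guide.
SCORE_RETENTION = timedelta(days=30)


def score_expires_at(requested_at: datetime) -> datetime:
    """Return the time after which the score is no longer expected to be available."""
    return requested_at + SCORE_RETENTION


def is_score_available(requested_at: datetime, now: datetime) -> bool:
    """True while the current time is still inside the retention period."""
    return now < score_expires_at(requested_at)
```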

Data Retention

Grammarly retains documents only for as long as the analysis requires, and never longer than 24 hours.
