Apply Adobe Document AI to OCR scanned PDF files
Jul 1, 2023
Aug 15, 2024
Adobe recently updated its PDF Services API. With the Free Tier, you can OCR up to 500 PDF files each month.
{/* truncate */}
The steps below show how to apply for credentials and run the API with Node.js.
 

Create Credentials from Adobe

Remember to select Node.js as the preferred language.
 
Once you finish, the site prompts you to download a .zip file that contains the credentials JSON file.
 

Create a script to run the OCR process

 
Since Adobe updated the service, the .zip file no longer includes private.key. You need to use the new credential method, based on a client ID and client secret, to OCR your PDFs.
 

Prepare the Credential details

After you unzip the downloaded file, you will find pdfservices-api-credentials.json in the root folder. Remember to replace the client-id and client-secret in the following script with your own values.
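If you would rather not paste the values into each script, they can be read from the file at run time. The sketch below is one way to do that. It assumes the JSON has a top-level client_credentials object with client_id and client_secret keys, so check your own download before relying on it. The function name loadCredentials is only an illustration.

```javascript
const fs = require("fs");
const path = require("path");

// Read the client ID and secret from the unzipped credentials file.
// ASSUMPTION: the file has a top-level "client_credentials" object with
// "client_id" and "client_secret" keys. Verify this against your own file.
function loadCredentials(folder) {
  const file = path.join(folder, "pdfservices-api-credentials.json");
  const parsed = JSON.parse(fs.readFileSync(file, "utf8"));
  const creds = parsed.client_credentials || {};
  if (!creds.client_id || !creds.client_secret) {
    throw new Error(`client_id or client_secret not found in ${file}`);
  }
  return { clientId: creds.client_id, clientSecret: creds.client_secret };
}
```

For example, `const { clientId, clientSecret } = loadCredentials(".");` reads the file from the current folder.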

OCR a single PDF file

If you only want to OCR a single file, save the following code as ocr-pdf.js:

OCR multiple files or a folder

If you plan to OCR a whole folder, save the following code as ocr-pdf-multi.js:
 
In my code, the folder template is saved in an Excel file. I read the file and collect the c1 information to map the folders.
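The folder-walking part can be sketched with the Node.js standard library alone. This version leaves out the Excel mapping step and simply collects every .pdf under a folder. The names listPdfFiles and ocrFolder are illustrative, and ocrOne stands for any async function that OCRs one file and returns the output path.

```javascript
const fs = require("fs");
const path = require("path");

// Recursively collect every .pdf under `dir`, sorted for a stable order.
function listPdfFiles(dir) {
  const found = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      found.push(...listPdfFiles(full));
    } else if (entry.isFile() && path.extname(entry.name).toLowerCase() === ".pdf") {
      found.push(full);
    }
  }
  return found.sort();
}

// Run `ocrOne` (hypothetical: an async function that OCRs a single file)
// on each PDF in turn. A failure is recorded and does not stop the rest.
async function ocrFolder(dir, ocrOne) {
  const results = [];
  for (const file of listPdfFiles(dir)) {
    try {
      results.push({ file, output: await ocrOne(file) });
    } catch (err) {
      results.push({ file, error: err.message });
    }
  }
  return results;
}
```

Files are processed one at a time to keep the sketch simple. The returned list shows which files succeeded and which failed, so a failed file can be retried on its own.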
