BCL easyPDF SDK
easyPDF SDK User Manual

ExtractText Method

Extract text from a PDF file.

Sub ExtractText(InputFileName As String,
                OutputFileName As String,
                [Password],
                [From],
                [To],
                [PageSeparator],
                [OutputCodePage])

void ExtractText(string InputFileName,
                 string OutputFileName,
                 string Password,
                 int From,
                 int To,
                 string PageSeparator,
                 int OutputCodePage)

void ExtractText(String InputFileName,
                 String OutputFileName,
                 String Password,
                 int From,
                 int To,
                 String PageSeparator,
                 int OutputCodePage) throws PDFProcessorException

Parameters

Return Values

N/A.

Remarks

  1. This method extracts the text of a PDF file and discards all formatting information.
  2. The extracted text is useful for purposes such as text indexing.
  3. Page numbers are zero-based: the first page of the document is page 0.
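
Because From and To are zero-based, a caller that works with 1-based page numbers has to subtract one from each before calling ExtractText. The following Java sketch shows one way to do that conversion. PageRangeSketch and toZeroBased are hypothetical names introduced for illustration and are not part of the easyPDF SDK. The sketch assumes that To is inclusive, which matches the example below where the first 5 pages are requested as From 0, To 4.

```java
// Hypothetical helper for illustration only; not part of the easyPDF SDK.
final class PageRangeSketch {

    /**
     * Converts an inclusive 1-based page range into the zero-based
     * {from, to} pair that a zero-based API expects.
     */
    static int[] toZeroBased(int firstPage, int lastPage) {
        if (firstPage < 1 || lastPage < firstPage) {
            throw new IllegalArgumentException(
                "Expected 1 <= firstPage <= lastPage, got "
                + firstPage + ".." + lastPage);
        }
        return new int[] { firstPage - 1, lastPage - 1 };
    }

    public static void main(String[] args) {
        // Pages 1 through 5 in 1-based terms.
        int[] range = toZeroBased(1, 5);
        System.out.println("From=" + range[0] + ", To=" + range[1]); // From=0, To=4
    }
}
```

The two array elements can then be passed as the From and To arguments.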

Example Usage

Set oProcessor = CreateObject("easyPDF.PDFProcessor.8")
 
' Extract the whole document with default options
oProcessor.ExtractText "C:\test\input1.pdf", "C:\test\output1.txt"
 
' Extract the first 5 pages (pages 0 through 4) of the input PDF file
oProcessor.ExtractText "C:\test\input2.pdf", _
                       "C:\test\output2.txt", _
                       From:=0, _
                       To:=4
 
' Extract using all options
oProcessor.ExtractText "C:\test\input3.pdf", _
                       "C:\test\output3.txt", _
                       Password:="my_password", _
                       From:=0, _
                       To:=4, _
                       PageSeparator:="[MY_PAGE_SEP]", _
                       OutputCodePage:=PRC_CP_UTF8
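
When a custom PageSeparator such as "[MY_PAGE_SEP]" is used, the output text can be split back into per-page strings, for example to index each page separately. The following Java sketch works on text that has already been read from the output file. PageSplitSketch and splitPages are hypothetical names, not part of the easyPDF SDK. This page does not say whether the separator is written only between pages or also after the last page, so the sketch assumes either is possible and drops a single empty trailing chunk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not part of the easyPDF SDK.
final class PageSplitSketch {

    /**
     * Splits extracted text into one string per page.
     * Empty pages in the middle are kept so that list indexes
     * still line up with page positions.
     */
    static List<String> splitPages(String extractedText, String pageSeparator) {
        // Pattern.quote makes the separator literal; "[" and "]" are regex
        // metacharacters. A limit of -1 keeps trailing empty chunks so that
        // the trailing-separator case can be handled explicitly below.
        String[] chunks = extractedText.split(Pattern.quote(pageSeparator), -1);

        List<String> pages = new ArrayList<>();
        for (String chunk : chunks) {
            pages.add(chunk.trim());
        }

        // Assumption: an empty final chunk means the separator also followed
        // the last page, so it is not a real page.
        if (pages.size() > 1 && pages.get(pages.size() - 1).isEmpty()) {
            pages.remove(pages.size() - 1);
        }
        return pages;
    }

    public static void main(String[] args) {
        String text = "first page\n[MY_PAGE_SEP]\nsecond page\n[MY_PAGE_SEP]\n";
        List<String> pages = splitPages(text, "[MY_PAGE_SEP]");
        System.out.println(Arrays.toString(pages.toArray()));
    }
}
```

A separator that cannot occur in the document text itself keeps the split unambiguous. Under the stated assumption, a final page that is genuinely blank is indistinguishable from a trailing separator and is dropped.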