BCL Technologies
Getting Started

To try the following, please contact us to download a free trial.

The SDK is distributed as a ZIP file, which you can unpack anywhere under your home directory. Extraction produces the following directory structure:

easyConverter/
   readme.pdf
   include/
      EasyConverter.h
   lib/
      libEasyConverter.so
   share/
      easyConverter/
         cmaps/*
         Fonts/*
         easyConverter.ini
         english.dic
   Samples/
      build.sh
      build.txt
      PDF2WordBasic
      PDF2WordBasic.c
      PDF2WordBatch
      PDF2WordBatch.c

The “include” directory contains the C header file “EasyConverter.h”, which holds the SDK declarations. You can use this header file from both C and C++.

The “lib” directory is the place for the easyConverter shared object. This is the dynamic library module that you link to your own executable in order to use the SDK.

The “share” directory contains important resource and configuration files necessary for the SDK to work properly. This includes Adobe character maps (“cmap”), TrueType font files (“Fonts”), the main product configuration file (“easyConverter.ini”), and a hyphenation dictionary (“english.dic”).

The “Samples” directory includes some sample applications to demonstrate the usage of the SDK. “PDF2WordBasic” is the simplest possible command-line utility, which takes an input PDF file and creates an output RTF document. Basic error handling and progress indication are included.

“PDF2WordBatch” is a more complex command-line tool that is capable of batch processing.
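As a quick sanity check after unpacking, the layout described above can be verified with a small script. This is only a sketch: the function name check_layout is hypothetical, and the list of paths is taken from the directory tree shown earlier.

```shell
# Sketch: verify that an extracted SDK tree contains the expected entries.
# Prints each missing path and returns non-zero if anything is absent.
check_layout() {
    root=$1
    missing=0
    for path in \
        include/EasyConverter.h \
        lib/libEasyConverter.so \
        share/easyConverter/cmaps \
        share/easyConverter/Fonts \
        share/easyConverter/easyConverter.ini \
        share/easyConverter/english.dic
    do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $root/$path"
            missing=1
        fi
    done
    return "$missing"
}

# Usage: check_layout "$HOME/easyConverter"
```

The check only tests for existence; it does not validate the contents of any file.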

Running Sample Applications

After you have extracted the ZIP file, you can try to execute one of the sample applications. Open a terminal, and run the following commands:

cd easyConverter/Samples
./PDF2WordBasic input.pdf output.rtf

This is going to open input.pdf and convert it into a file called output.rtf, which can be opened with Microsoft Office or OpenOffice.

You can also batch convert multiple PDF files in a given directory using the following command:

./PDF2WordBatch directory-to-convert

This will enumerate all PDF files in the specified directory and automatically convert them each into RTF.

Note: The default conversion timeout is 5 minutes for each PDF file. Should it take more time to process a PDF file, the program will skip the current document and go on to the next one.
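If you prefer to drive the basic sample from a script instead of using PDF2WordBatch, the same skip-on-timeout behavior can be approximated as follows. This is a sketch under assumptions: the function name convert_dir is hypothetical, the converter is assumed to exit with a non-zero status on failure (the exit codes of the samples are not documented here), and the coreutils timeout command must be installed.

```shell
# Sketch: convert every *.pdf in a directory with a per-file time limit.
# Assumption: the converter is called as "<converter> input.pdf output.rtf"
# and returns non-zero when a conversion fails.
convert_dir() {
    converter=$1          # e.g. ./PDF2WordBasic
    dir=$2                # directory holding the PDF files
    limit=${3:-300}       # seconds per file; 300 mirrors the 5-minute default
    failed=0
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # no PDFs: the glob stays literal
        rtf=${pdf%.pdf}.rtf
        if ! timeout "$limit" "$converter" "$pdf" "$rtf"; then
            echo "skipped: $pdf" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed"                     # number of files that were skipped
}

# Usage: convert_dir ./PDF2WordBasic directory-to-convert
```

Each output file is written next to its input with the extension changed to .rtf. The function prints the number of skipped files on standard output and reports each skipped file on standard error.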

If you wish to compile the C programs on your own, you can run the ./build.sh shell script, which executes the following commands:

gcc PDF2WordBasic.c ../lib/libEasyConverter.so -L ../lib -o PDF2WordBasic
gcc PDF2WordBatch.c ../lib/libEasyConverter.so -L ../lib -o PDF2WordBatch

This is perhaps the simplest way to compile the sample applications. The -L switch sets the library path, while the -o switch specifies the output executable filename.

Note: Before you build, please supply your license key at the top of each *.c file. There is a comment that reads: “YOUR LICENSE KEY COMES HERE”. You must edit that line.
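A simple guard can catch a forgotten license key before the build starts. The sketch below assumes the placeholder comment appears in the source files exactly as quoted above; the function name check_license_key is hypothetical.

```shell
# Sketch: list the *.c files that still contain the license placeholder.
# Returns non-zero if at least one file has not been edited yet.
check_license_key() {
    dir=$1
    # grep -l prints only the names of the files that match
    pending=$(grep -l "YOUR LICENSE KEY COMES HERE" "$dir"/*.c 2>/dev/null)
    if [ -n "$pending" ]; then
        echo "$pending"
        return 1
    fi
    return 0
}

# Usage, from the Samples directory: check_license_key . && ./build.sh
```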

When you create your own projects, you are free to put the libEasyConverter.so file anywhere, as long as your system can access it. It is recommended that you use a relative library reference, such as “./libEasyConverter.so”, or “../lib/libEasyConverter.so”. Alternatively, you could link using the “-l EasyConverter” GCC switch, and install libEasyConverter.so to a location that is on your library path.
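To illustrate the “-l EasyConverter” alternative, the sketch below prints (but does not run) a build command and a matching run command. The function name show_link_commands and the file name myapp.c are hypothetical. It also assumes that the library directory is not on the default loader path, which is why LD_LIBRARY_PATH is set at run time.

```shell
# Sketch: print the commands for linking against libEasyConverter.so by name
# and for running the result. Nothing is executed here.
show_link_commands() {
    src=$1       # C source file, e.g. myapp.c (hypothetical)
    libdir=$2    # directory that holds libEasyConverter.so
    out=$3       # name of the output executable
    # -L adds libdir to the link-time search path, -l selects the library
    printf 'gcc %s -L %s -l EasyConverter -o %s\n' "$src" "$libdir" "$out"
    # Assumption: libdir is not a system library directory, so the dynamic
    # loader is pointed at it explicitly when the program starts.
    printf 'LD_LIBRARY_PATH=%s ./%s\n' "$libdir" "$out"
}

# Usage: show_link_commands myapp.c ../lib myapp
```

If libEasyConverter.so is installed in a directory that the loader already searches, the LD_LIBRARY_PATH setting is unnecessary.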

Although you are free to install libEasyConverter.so anywhere on the hard drive, the “../share” directory must be available relative to the shared object’s location. In other words, there must be a “share” directory under the parent directory where libEasyConverter.so is located. The “../share/easyConverter/” path is hard-coded into EasyConverter, and this location cannot be changed. However, this convention is compatible with how Linux libraries are deployed. For example, if you install libEasyConverter.so under “/usr/lib/”, you can put the configuration and resource files under “/usr/share/easyConverter/”, which is standard practice under Linux:

/usr/
   include/
      EasyConverter.h
   lib/
      libEasyConverter.so
   share/
      easyConverter/
         cmaps/*
         Fonts/*
         easyConverter.ini
         english.dic

Optionally you could place this hierarchy under /usr/local/, which is where developers usually put their work libraries. Or you could just as easily leave everything under your home directory, which is what we recommend if you are unsure about your decision. You can always create symbolic links from your project to the easyConverter directories.
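Copying the SDK into a prefix such as /usr/local while keeping the lib/ and share/ directories side by side could look like the sketch below. The function name install_easyconverter is hypothetical, and writing to a system prefix normally requires root privileges.

```shell
# Sketch: copy the header, the shared object and the resource directory
# into a prefix, preserving the lib/ and share/ relationship.
install_easyconverter() {
    src=$1       # extracted SDK directory, e.g. $HOME/easyConverter
    prefix=$2    # target prefix, e.g. /usr/local
    mkdir -p "$prefix/include" "$prefix/lib" "$prefix/share" || return 1
    cp "$src/include/EasyConverter.h" "$prefix/include/" || return 1
    cp "$src/lib/libEasyConverter.so" "$prefix/lib/" || return 1
    # -R copies cmaps/, Fonts/, the INI file and the dictionary in one go
    cp -R "$src/share/easyConverter" "$prefix/share/" || return 1
}

# Usage: install_easyconverter "$HOME/easyConverter" /usr/local
```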

Configuration

The main product configuration file is called easyConverter.ini. It is located at “../share/easyConverter/easyConverter.ini” relative to where libEasyConverter.so is located.

The INI file may contain the following configuration options:

  • [Directories] section:
    • Fonts: Path to a directory where all the TrueType font files are located. This can be a relative path from the INI file’s location, or an absolute path. Fonts are not enumerated recursively. Only files with lowercase .ttf extension are used. Unsupported or corrupt fonts are ignored. Default value: “./Fonts”.
  • [Fonts] section:
    • DefaultSerifFont: When the PDF file contains a non-existing serif font, it will be substituted by this one. Default value: Times New Roman.
    • DefaultSansSerifFont: When the PDF file contains a non-existing sans serif font, it will be substituted by this one. Default value: Arial.

 

Example INI file:

[Directories]
Fonts=./Fonts
[Fonts]
DefaultSerifFont=Times New Roman
DefaultSansSerifFont=Arial

Since the default values are very reasonable, you may not have to edit this INI file at all. In most cases you can simply leave it empty.

You are required to supply your own fonts in the share/easyConverter/Fonts directory. For legal reasons we are not able to distribute Microsoft Windows TrueType fonts, such as Arial and Times New Roman. The Linux version of easyConverter comes with no fonts at all. Note that the product is not going to work unless at least one serif and one sans-serif TrueType font is present in the Fonts directory.
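Because only files with a lowercase .ttf extension directly inside the Fonts directory are used, a quick scan can reveal files that would be silently ignored. The following is a sketch with a hypothetical function name, check_fonts. It cannot tell serif from sans-serif fonts, so it only confirms that at least one usable file is present.

```shell
# Sketch: count usable font files and report files that will be ignored.
# Returns non-zero if no lowercase *.ttf file is found.
check_fonts() {
    dir=$1
    usable=0
    for f in "$dir"/*; do
        [ -f "$f" ] || continue        # subdirectories are not scanned
        case $f in
            *.ttf) usable=$((usable + 1)) ;;
            *.[Tt][Tt][Ff]) echo "ignored (extension not lowercase): $f" ;;
            *.[Tt][Tt][Cc]) echo "ignored (TTC files are not supported): $f" ;;
        esac
    done
    echo "usable: $usable"
    [ "$usable" -gt 0 ]
}

# Usage: check_fonts share/easyConverter/Fonts
```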

There are two approaches to the font issue:

  • Choice 1: The customer supplies the Windows versions of Arial, Courier, Times New Roman, and Symbol, as well as other fonts commonly found in typical PDF documents. This is the easiest way to ensure that the Linux version of easyConverter produces the same RTF output as the Windows version. On a Windows installation the font files are located in the C:\Windows\Fonts directory. Note that easyConverter currently does not support TTC (TrueType Collection) files, and the .ttf extension must be lowercase. TTF files that work under Windows are well supported by easyConverter for Linux. It is the customer’s responsibility to properly license their fonts before copying them to a Linux installation.
  • Choice 2: Free open-source fonts are available. For example, the company (URW)++ (formerly known as URW) offers a GPL-licensed TrueType font collection that can be downloaded from the Internet. These fonts are not called Arial, Courier, and Times New Roman, but Nimbus Sans L, Nimbus Mono, and Nimbus Roman, respectively (the actual names may be slightly different), so the INI file’s DefaultSerifFont and DefaultSansSerifFont settings must be configured accordingly. The drawback is that the Nimbus fonts do not exist on a typical end user’s computer, so an RTF file that uses them may not render accurately there. Despite being free, the URW fonts are protected by copyright and a license agreement, just like any commercial font.
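For the second choice, the INI file has to name the substitute fonts. The sketch below generates such a file; the function name write_ini is hypothetical. The font names in the usage line are assumptions, so check the names that your font files actually report before relying on them.

```shell
# Sketch: write an easyConverter.ini with custom default fonts.
# The section and key names are the ones documented above.
write_ini() {
    ini=$1       # path of the INI file to create (overwritten if present)
    serif=$2     # value for DefaultSerifFont
    sans=$3      # value for DefaultSansSerifFont
    {
        printf '[Directories]\n'
        printf 'Fonts=./Fonts\n'
        printf '[Fonts]\n'
        printf 'DefaultSerifFont=%s\n' "$serif"
        printf 'DefaultSansSerifFont=%s\n' "$sans"
    } > "$ini"
}

# Usage (font names are assumptions, not verified values):
# write_ini share/easyConverter/easyConverter.ini "Nimbus Roman No9 L" "Nimbus Sans L"
```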

Note that all of these configuration options are global to the entire system. libEasyConverter.so reads the easyConverter.ini file only once, when the library is first loaded. If you change the settings in the INI file, the changes are not going to be visible unless you restart every process that uses libEasyConverter.so.

Similarly, should you decide to add or remove TrueType font files, you are expected to stop every process that uses easyConverter first. If easyConverter needs to open a font file that you just deleted, a conversion error might occur.
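Before editing the INI file or the Fonts directory, it can help to find out which running processes have the library loaded. The sketch below scans the Linux /proc filesystem. The function name procs_using_lib is hypothetical, and processes owned by other users may not be visible unless the script runs as root.

```shell
# Sketch: print the PIDs of processes that have a given library mapped.
procs_using_lib() {
    name=${1:-libEasyConverter.so}
    for maps in /proc/[0-9]*/maps; do
        # maps files of other users' processes may be unreadable; skip them
        if grep -q -F "$name" "$maps" 2>/dev/null; then
            pid=${maps#/proc/}
            echo "${pid%/maps}"
        fi
    done
}

# Usage: procs_using_lib libEasyConverter.so
```

An empty result suggests that no visible process has the library loaded.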

There are several other PDF to RTF conversion settings that affect the conversion’s result. However, those are reentrant (thread-safe) settings passed programmatically to the SDK, as opposed to via global INI files.

Distribution

The following files and directories are not for re-distribution:

include/
   EasyConverter.h
Samples/
   build.sh
   build.txt
   PDF2WordBasic
   PDF2WordBasic.c
   PDF2WordBatch
   PDF2WordBatch.c

Product documentation is not redistributable either.

When you distribute your application, the following files should be shipped and installed:

lib/
   libEasyConverter.so
share/
   easyConverter/
      cmaps/*
      Fonts/*
      easyConverter.ini
      english.dic
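One way to collect only the redistributable runtime files is to archive lib/ and share/ and leave everything else behind. The following is a sketch with a hypothetical function name, package_runtime. It archives whatever is in the Fonts directory, so make sure those fonts may legally be shipped.

```shell
# Sketch: create a tarball that contains only the runtime files.
package_runtime() {
    src=$1        # extracted SDK directory
    archive=$2    # output file, e.g. /tmp/easyconverter-runtime.tar.gz
    # -C switches to the SDK directory so that the archive paths start at
    # lib/ and share/; include/ and Samples/ are deliberately left out.
    tar -czf "$archive" -C "$src" lib/libEasyConverter.so share/easyConverter
}

# Usage: package_runtime "$HOME/easyConverter" /tmp/easyconverter-runtime.tar.gz
```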

Note that on Linux, easyConverter cannot automatically locate the fonts installed on your computer. You are expected to ship your fonts under the “../share/easyConverter/Fonts” directory, or configure the easyConverter.ini file to point to a directory where all the TrueType fonts you want to use are located. Since font file enumeration is not recursive, all the *.ttf files must be directly under the same directory. Another option is to make “../share/easyConverter/Fonts” a symbolic link to the directory that actually holds the fonts.

Note that this symbolic link option is also available for the entire “share” directory, as well as “libEasyConverter.so”. The actual physical files can be located anywhere, as long as a symbolic link points to them. For example, EasyConverter may be installed under your home directory, while symbolic links can be present under “/usr/lib/” and “/usr/share/”. The only true restriction is that “../share/easyConverter” must be present relative to the physical libEasyConverter.so file that your processes loaded.
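The symbolic link approach for the Fonts directory might be scripted as in the sketch below. The function name link_fonts is hypothetical. The function refuses to replace an existing real directory, so remove or rename that directory first if it exists.

```shell
# Sketch: make share/easyConverter/Fonts a symbolic link to another directory.
link_fonts() {
    share=$1      # the share/easyConverter directory next to the library
    fontdir=$2    # directory that really holds the *.ttf files
    if [ -e "$share/Fonts" ] && [ ! -L "$share/Fonts" ]; then
        echo "refusing to replace real directory: $share/Fonts" >&2
        return 1
    fi
    # -n keeps ln from following an existing Fonts link into its target
    ln -sfn "$fontdir" "$share/Fonts"
}

# Usage: link_fonts "$HOME/easyConverter/share/easyConverter" /path/to/your/fonts
```

Use an absolute path for the font directory. A relative link target is resolved relative to the directory that contains the link, not relative to the current working directory.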

Note: Most font files are not legally redistributable. Proprietary fonts are licensed by their respective authors. Open-source fonts are often under the GPL license.


 

 

 
© 1993 - , BCL Technologies.
All other trademarks are the property of their respective owners.