BCL easyConverter SDK
easyConverter SDK User Manual

ConvertToHTML3 Method

Converts a PDF stream to an array of streams: the HTML stream, followed by a file name stream and a data stream for each image referenced from the HTML. This method ignores the AbsolutePositioning property.

Function ConvertToHTML3(InStream As Variant,
                        [Password] As Variant,
                        [From] As Variant,
                        [To] As Variant)
                        As Variant

byte[][] ConvertToHTML3(byte[] InStream,
                        string Password,
                        int From,
                        int To)

byte[][] ConvertToHTML3(byte[] InStream,
                        String Password,
                        int From,
                        int To) throws PDF2HTMLException


Parameters

InStream

Input PDF stream. This is a variant array of bytes.

Password (optional)

Password to open the PDF document, if it has one.

From (optional)

The starting page number to convert.

To (optional)

The ending page number to convert.

Return Values

Array of streams (array of arrays of bytes). The array has 1 + 2N elements, where N is the number of images referenced in the HTML document. The first stream in the array contains the HTML code. The second stream, if there is any image, contains the name of the first image referenced in the HTML as an ASCII-encoded byte array. The third stream contains the image data for the first image. The image name and data streams keep alternating until all images have been listed.
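The layout described above can be unpacked with a small helper. The following is only a sketch: Html3Result and fromStreams are hypothetical names that are not part of the SDK, and the sketch assumes that image names are unique within one conversion result. It takes the byte[][] form of the return value and checks that the element count matches 1 + 2N before splitting it.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of the SDK): splits the array returned by
// ConvertToHTML3 into the HTML stream and a name -> data map of images.
final class Html3Result {
    final byte[] html;
    final Map<String, byte[]> images;

    private Html3Result(byte[] html, Map<String, byte[]> images) {
        this.html = html;
        this.images = images;
    }

    static Html3Result fromStreams(byte[][] streams) {
        // A well-formed result has 1 + 2N elements, so its length is always odd.
        if (streams == null || streams.length % 2 == 0) {
            throw new IllegalArgumentException("expected 1 + 2N streams");
        }
        // LinkedHashMap keeps the images in the order they were returned.
        Map<String, byte[]> images = new LinkedHashMap<>();
        for (int i = 1; i + 1 < streams.length; i += 2) {
            // Odd index: ASCII-encoded image name; next index: image data.
            String name = new String(streams[i], StandardCharsets.US_ASCII);
            images.put(name, streams[i + 1]);
        }
        return new Html3Result(streams[0], images);
    }

    public static void main(String[] args) {
        // Hand-built stand-in for a conversion result with one image.
        byte[][] streams = {
            "<html></html>".getBytes(StandardCharsets.US_ASCII),
            "1x1.jpg".getBytes(StandardCharsets.US_ASCII),
            new byte[] {1, 2, 3}
        };
        Html3Result result = fromStreams(streams);
        System.out.println("Number of images = " + result.images.size());
    }
}
```

With the native Java or native C# API the returned byte[][] can be passed in directly. With the COM API the outer array would first have to be copied out of the Variant or SafeArray.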

Exception Handling

Please refer to the list of return exceptions.

Example Usage in C# COM object


byte[] pdfBytes = File.ReadAllBytes(PDFFileName);

PDF2HTML oConverter = new PDF2HTML();
Array data = oConverter.ConvertToHTML3(pdfBytes);
File.WriteAllBytes(HtmlFileName, (byte[])data.GetValue(0));
for (int i = 1; i < data.Length - 1; i+=2)
{
    File.WriteAllBytes(HtmlFilePath + System.Text.Encoding.ASCII.GetString((byte[])data.GetValue(i)), (byte[])data.GetValue(i+1));
}

Example Usage in Native C#

byte[] pdfBytes = File.ReadAllBytes(PDFFileName);

using(PDF2HTML oConverter = new PDF2HTML())
{
    byte[][] data = oConverter.ConvertToHTML3(pdfBytes);
    File.WriteAllBytes(HtmlFileName, data[0]);
    for (int i = 1; i < data.Length - 1; i += 2)
    {
        File.WriteAllBytes(HtmlFilePath + System.Text.Encoding.ASCII.GetString(data[i]), data[i + 1]);
    }
}

Full Java COM Sample Code


import com.bcl.easyConverter.*;
import com.bcl.easyConverter.EasyConverterHTML.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import com.jacob.com.SafeArray;
import com.jacob.com.Variant;

public class TestConverterMem
{
   public static void main(String[] args) throws Exception
   {
      if (args.length == 2)
      {
         File inputFile = new File(args[0]);
         String inputFileName = inputFile.getCanonicalPath();

         File htmlFile = new File(args[1]);
         String htmlFileName = htmlFile.getCanonicalPath();

         EasyConverter.initialize();

         PDF2HTML pdf2html = new PDF2HTML();

         // Read the whole PDF into memory; a single InputStream.read() call may return fewer bytes.
         byte[] inputStream = java.nio.file.Files.readAllBytes(inputFile.toPath());
         SafeArray output = pdf2html.ConvertToHTML3(inputStream, null, null, null).toSafeArray();
         int indexBase = output.getLBound(1);
         int outputCount = output.getUBound(1) - indexBase + 1;
         byte[] htmlStream = output.getVariant(indexBase).toSafeArray().toByteArray();
         FileOutputStream htmlFileStream = new FileOutputStream(htmlFile.getCanonicalPath());
         htmlFileStream.write(htmlStream);
         htmlFileStream.close(); // release the file handle once the HTML is written
         int imagesCount = (outputCount - 1) / 2;
         System.out.print("Number of images = ");
         System.out.println(imagesCount);
         for(int i = 0; i < imagesCount; ++i)
         {      
            String imageFilename = new String(output.getVariant(i * 2 + 1 + indexBase).toSafeArray().toByteArray(), "US-ASCII");
            byte[] imageStream = output.getVariant(i * 2 + 2 + indexBase).toSafeArray().toByteArray();
            File imageFile = new File(new File(htmlFileName).getParent(), imageFilename); // compose path from output directory + image filename
            FileOutputStream imageFileStream = new FileOutputStream(imageFile);
            imageFileStream.write(imageStream);
            imageFileStream.close(); // close each image file before moving to the next
         }  

         EasyConverter.uninitialize();
      }
      else
      {
         System.out.println("Usage: java TestConverterMem <input.pdf> <output.html>");
         System.out.println("For example:");
         System.out.println("java TestConverterMem c:\\input\\smile.pdf c:\\output\\smile.html");
      }
   }
}

Java COM Sample Code Explanation

Here is how it works. ConvertToHTML3 returns a Variant, which we convert into SafeArray using toSafeArray().

The output array has 1 + 2*N elements, where N is the number of images. getLBound(1) is the index of the first element, which is always 0. getUBound(1) is the index of the last element, which is always 2*N. That is a total of 1 + 2*N elements.

Use output.getVariant(i) to get the ith element from the outer array. The result of this expression is a Variant, which is a SafeArray of bytes.

So the following expression is used to get a particular stream:

byte[] stream = output.getVariant(i).toSafeArray().toByteArray();

Stream 0 is the HTML content. Stream 1 is the name of the first image file (ASCII encoded). Stream 2 is the data of the first image, and so on. There is always exactly one HTML stream, but any number of images may be present (even zero).

Because the image file names arrive as ASCII-encoded byte arrays, they must be converted into a proper String first. The String class's constructor can create a string directly from a byte array and a charset name.
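As a minor variation on the conversion described above, the charset can be passed as a constant instead of by name. This is a sketch rather than the SDK's own code: ImageNameDecoder is a hypothetical class, and the only difference from the sample is that StandardCharsets.US_ASCII avoids the checked UnsupportedEncodingException that the "US-ASCII" string overload declares.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of the SDK): decodes an image name stream.
final class ImageNameDecoder {
    static String decode(byte[] nameStream) {
        // The Charset overload of the String constructor throws no checked exception.
        return new String(nameStream, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // The ASCII bytes for "1x1.jpg", standing in for a name stream.
        byte[] raw = {0x31, 0x78, 0x31, 0x2E, 0x6A, 0x70, 0x67};
        System.out.println(decode(raw)); // prints 1x1.jpg
    }
}
```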

Note that the image filename is not a full path, only the name + extension portion, such as "1x1.jpg". The image files must go to the same directory as the HTML file. The File object's getParent() method is used to get the directory portion from the HTML filename.
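The path composition described above can be isolated into one small function. The sketch below is an illustration only: ImagePaths and besideHtml are hypothetical names, not SDK calls. It makes two defensive choices that the sample code does not: it resolves the HTML file to an absolute path so that a parent directory exists even for a bare relative name, and it keeps only the last component of the image name in case it ever contained a directory part.

```java
import java.io.File;

// Hypothetical helper (not part of the SDK): builds the target path of an
// image so that it lands in the same directory as the HTML file.
final class ImagePaths {
    static File besideHtml(File htmlFile, String imageName) {
        // getAbsoluteFile() gives a parent directory even for a relative name like "smile.html".
        File directory = htmlFile.getAbsoluteFile().getParentFile();
        // Keep only the name + extension portion of the image name.
        String nameOnly = new File(imageName).getName();
        return new File(directory, nameOnly);
    }

    public static void main(String[] args) {
        File target = besideHtml(new File("output", "smile.html"), "1x1.jpg");
        System.out.println(target.getPath());
    }
}
```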


Full Native Java Sample Code

import com.bcl.easyconverter.html.*;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;

public class TestConverterMem
{
   public static void main(String[] args) throws Exception
   {
      if (args.length == 2)
      {
         File inputFile = new File(args[0]);
         String inputFileName = inputFile.getCanonicalPath();

         File htmlFile = new File(args[1]);
         String htmlFileName = htmlFile.getCanonicalPath();

         IPDF2HTML pdf2html = new IPDF2HTML();

         try
         {
            // Read the whole PDF into memory; a single InputStream.read() call may return fewer bytes.
            byte[] inputStream = java.nio.file.Files.readAllBytes(inputFile.toPath());
            byte[][] output = pdf2html.ConvertToHTML3(inputStream, "", -1, -1);
            int outputCount = output.length;
            byte[] htmlStream = output[0];
            FileOutputStream htmlFileStream = new FileOutputStream(htmlFile.getCanonicalPath());
            htmlFileStream.write(htmlStream);
            htmlFileStream.close(); // release the file handle once the HTML is written
            int imagesCount = (outputCount - 1) / 2;
            System.out.print("Number of images = ");
            System.out.println(imagesCount);
            for(int i = 0; i < imagesCount; ++i)
            {   
               String imageFilename = new String(output[i * 2 + 1], "US-ASCII");
               byte[] imageStream = output[i * 2 + 2];
               File imageFile = new File(new File(htmlFileName).getParent(), imageFilename); // compose path from output directory + image filename
               FileOutputStream imageFileStream = new FileOutputStream(imageFile);
               imageFileStream.write(imageStream);
               imageFileStream.close(); // close each image file before moving to the next
            }
         }
         finally
         {
            pdf2html.dispose();
         }

      }
      else
      {
         System.out.println("Usage: java TestConverterMem <input.pdf> <output.html>");
         System.out.println("For example:");
         System.out.println("java TestConverterMem c:\\input\\smile.pdf c:\\output\\smile.html");
      }
   }
}