Tracker issue: CF-3038557

Title:

Bug 77227: When a PDF page contains tables, cfpdf extracttext does not extract the text properly


Status/Resolution/Reason: Closed/Fixed/

Reporter/Name(from Bugbase): Ahamad Patan / Ahamad Patan (apatan)

Created: 05/14/2009

Components: Document Management, PDF manipulation

Versions: 9.0

Failure Type: Unspecified

Found In Build/Fixed In Build: 0000 / 238738

Priority/Frequency: Normal / Unknown

Locale/System: English / Platforms All

Vote Count: 0

Problem:

When a PDF page contains tables, the cfpdf extracttext action does not extract the text properly.

Method:

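Run the extracttext action on a page of the CFML Reference PDF that contains tables.
The result: the headers are extracted first and the tables after them, instead of the text coming out in the order it appears on the page.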

Code:
<cfpdf action="extracttext" source="cfmlref.pdf" pages="21" name="mystring">

<cfdump var="#mystring#">

Result:
Hello <?xml version="1.0" encoding="UTF-8"?> <DocText xmlns="http://ns.adobe.com/DDX/DocText/1.0/"> <TextPerPage> <Page pageNumber="21">13 CENTAUR CFML REFERENCEReserved Words and Variables CGI client variablesThe following table describes common CGI environment variables the browser creates and passes in the request header:CGI client certificate variablesColdFusion makes available the following client certificate data . These variables are available when running Microsoft IIS 4.0 or Netscape Enterprise under SSL if your web server is configured to accept client certificates.PATH_TRANSLATED Translated version of PATH_INFO after any virtual - to - physical mapping.SCRIPT_NAME Virtual path to the script that is executing ; used for self - referencing URLs.QUERY_STRING Query information that follows the ? in the URL that referenced this script.REMOTE_HOST Hostname making the request . If the server does not have this information , it sets REMOTE_ADDR and does not set REMOTE_HOST.REMOTE_ADDR IP address of the remote host making the request.AUTH_TYPE If the server supports user authentication , and the script is protected , the protocol - specific authentication method used to validate the user.REMOTE_USER AUTH_USER If the server supports user authentication , and the script is protected , the username the user has authenticated as . ( Also available as AUTH_USER . ) REMOTE_IDENT If the HTTP server supports RFC 931 identification , this variable is set to the remote username retrieved from the server . Use this variable for logging only.CONTENT_TYPE For queries that have attached information , such as HTTP POST and PUT , this is the content type of the data.CONTENT_LENGTH Length of the content as given by the client.CGI client variableDescriptionHTTP_REFERER The referring document that linked to or submitted form data.HTTP_USER_AGENT The browser that the client is currently using to send the request . 
Format : software / version library / version.HTTP_IF_MODIFIED_SINCE The last time the page was modified . The browser determines whether to set this variable , usually in response to the server having sent the LAST_MODIFIED HTTP header . It can be used to take advantage of browser - side caching.CGI client certificate variableDescriptionCERT_SUBJECT Client - specific information provided by the web server . This data typically includes the client's name , e - mail address , and so on , for example:O = " VeriSign , Inc . " , OU = VeriSign Trust Network , OU = " www.verisign.com / repository / RPA Incorp . by Ref . , LIAB.LTD ( c ) 98 " , OU = Persona Not Validated , OU = Digital ID Class 1 - Microsoft , CN = Matthew Lund , E = mlund @ . comCERT_ISSUER Information about the authority that provided the client certificate , for example:O = " VeriSign , Inc . " , OU = VeriSign Trust Network , OU = " www.verisign.com / repository / RPA Incorp . By Ref . , LIAB.LTD ( c ) 98 " , CN = VeriSign Class 1 CA Individual Subscriber - Persona Not ValidatedCGI server variableDescription</Page> </TextPerPage> </DocText> 



----------------------------- Additional Watson Details -----------------------------

Watson Bug ID:	3038557

Deployment Phase:	Release Candidate

External Customer Info:
External Company:  
External Customer Name: Ahamad Patan
External Customer Email: 7EFE2BCC447C3E589920157F
External Test Config: 05/14/2009

Attachments:

Comments: