tracker issue : CF-3678093

Title:

cfloop looping over a file can't handle unicode encoded files

Status/Resolution/Reason: Closed/Fixed/

Reporter/Name(from Bugbase): Rafael Salomon / Rafael Salomon (Rafael Salomon)

Created: 12/02/2013

Components: File Management

Versions: 10.0

Failure Type: Data Corruption

Found In Build/Fixed In Build: Final / 287681

Priority/Frequency: Major / All users will encounter

Locale/System: English / Win 2008 Server x64

Vote Count: 0

Problem Description: cfloop can't handle files that are encoded in anything other than iso-8859-1

Steps to Reproduce:

Step 1: create a test file with the following content and save it as Unicode (on Windows you can use Notepad: File, Save As..., Encoding: Unicode) with the filename test.txt (I've also uploaded one you can use):
abcdefghijklmnopqrstuvwxy*()__+[]\{}|;':",./<>?

Step 2: create a ColdFusion file called test.cfm with the following content (make sure test.txt and test.cfm are in the same folder):
<cfset folder = ExpandPath(".") & "\">
<cfset fileName = "test.txt">
<cfset filePath = folder & fileName>

<cffile action="read" file="#filePath#" variable="cffileRead">
<cfoutput><br>[#cffileRead#]</cfoutput>

<cfloop index="cfloopRead" file="#filePath#">
  <cfoutput><br>[#cfloopRead#]</cfoutput>
</cfloop>

<cfif (cffileRead EQ cfloopRead)>
  <cfoutput><br>[ok]</cfoutput>
<cfelse>
  <cfoutput><br>[problem]</cfoutput>
</cfif>

Step 3: run test.cfm in a browser

Actual Result:
cffileRead does not equal cfloopRead

Expected Result:
cffileRead should equal cfloopRead

Any Workarounds:
You can read the file with cffile, write it back out with cffile action="write" and charset="iso-8859-1", and then read the rewritten file with cfloop.
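
A minimal sketch of this workaround, assuming test.txt sits in the same folder as the template. The file name test_latin1.txt and the variable names are hypothetical, and any character that iso-8859-1 cannot represent would be lost in the rewritten copy:

```cfml
<!--- Sketch only: tempPath and fileContent are hypothetical names --->
<cfset filePath = ExpandPath("./test.txt")>
<cfset tempPath = ExpandPath("./test_latin1.txt")>

<!--- Read the original file with cffile, which handles the Unicode encoding --->
<cffile action="read" file="#filePath#" variable="fileContent">

<!--- Write a copy re-encoded as iso-8859-1; addnewline="no" avoids appending an extra line break --->
<cffile action="write" file="#tempPath#" output="#fileContent#" charset="iso-8859-1" addnewline="no">

<!--- Loop over the re-encoded copy instead of the original --->
<cfloop index="line" file="#tempPath#">
  <cfoutput><br>[#line#]</cfoutput>
</cfloop>
```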

What cfloop needs is the same functionality as cffile for handling different character sets.  It should have a charset attribute and be able to automatically detect the correct character set.

----------------------------- Additional Watson Details -----------------------------

Watson Bug ID:	3678093

Deployment Phase:	Release Candidate

External Customer Info:
External Company:  
External Customer Name: Rafael Salomon
External Customer Email:  
External Test Config: My Hardware and Environment details:



Windows 2008 Server, 64bit

ColdFusion 10

Attachments:

  1. December 03, 2013 00:00:00: 1_test.txt

Comments:

cfloop does have a charset attribute, which is used for reading the file. However, it is missing from the documentation and needs to be documented.
Comment by Rupesh K.
13895 | January 03, 2014 07:11:28 AM GMT
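
For illustration, a minimal sketch of the charset attribute described in the comment above, assuming the test file from the ticket was saved by Notepad as "Unicode" (UTF-16 with a byte-order mark). The charset value here is an assumption and should be adjusted to the file's actual encoding:

```cfml
<!--- Sketch only: assumes test.txt is UTF-16 encoded with a byte-order mark --->
<cfset filePath = ExpandPath("./test.txt")>

<!--- Pass the file's encoding to cfloop through the charset attribute --->
<cfloop index="line" file="#filePath#" charset="utf-16">
  <cfoutput><br>[#line#]</cfoutput>
</cfloop>
```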
There's still a bug here, even if the docs need updating. Look at this gist: https://gist.github.com/daccfml/8255274 NB: use the same input file as per the ticket, but make sure to save it *with a byte-order mark*. CFFILE and CFLOOP handle the BOM differently. They should not. NB: Railo does this correctly. -- Adam
Comment by External U.
13896 | January 04, 2014 07:54:42 AM GMT
I have updated the docs (https://learn.adobe.com/wiki/display/coldfusionen/cfloop%3A+looping+over+a+list%2C+a+file%2C+or+an+array). I can't help but think you probably could have done this, Rupesh, instead of simply saying it ought to be done. -- Adam
Comment by External U.
13897 | January 04, 2014 08:00:24 AM GMT
With the fix, the content of a Unicode-encoded file read through cfloop matches the content read with cffile.
Comment by Akhila K.
13898 | January 20, 2014 05:08:16 PM GMT