tracker issue : CF-3342141

Title:

Source code files should not need special charset encoding instructions to compile properly


Status/Resolution/Reason: Closed/Withdrawn/CannotReproduce

Reporter/Name(from Bugbase): Adam Cameron / Adam Cameron (Adam Cameron)

Created: 10/07/2012

Components: General Server

Versions: 10.0

Failure Type: Data Corruption

Found In Build/Fixed In Build: Final /

Priority/Frequency: Critical / Most users will encounter

Locale/System: English / Win All

Vote Count: 2

Problem Description:
If a source code file uses, for example, UTF-8 encoding, then one needs to TELL the CF compiler this with a <cfprocessingdirective> tag.

Steps to Reproduce:
Put some non-ASCII-equivalent UTF-8 text into a CFM file and compile it, inspect results.  EG, something like this:
<!---<cfprocessingdirective pageencoding="UTF-8">--->
<cfset message = "हैलो दुनिया">
<cfoutput>#message#</cfoutput>

Actual Result:
हैलो द�निया

Expected Result:
हैलो दुनिया

Any Workarounds:
Don't care.  Should just work.  Any text processor worth its salt (even NOTEPAD) can just work it out for itself.  CF should be able to as well.
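
As a rough illustration of the "work it out for itself" idea, here is a minimal Java sketch of how a tool might guess that a file is UTF-8: check for a byte order mark, then try a strict decode. The class and method names (Utf8Sniffer, hasUtf8Bom, isValidUtf8, looksLikeUtf8) are hypothetical and not part of ColdFusion; this is a heuristic under those assumptions, not a description of what any particular editor or CF itself does.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
final class Utf8Sniffer {

    // True if the bytes start with the UTF-8 byte order mark (EF BB BF).
    static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
            && (bytes[0] & 0xFF) == 0xEF
            && (bytes[1] & 0xFF) == 0xBB
            && (bytes[2] & 0xFF) == 0xBF;
    }

    // True if the whole byte array decodes as strictly valid UTF-8.
    // Note that pure ASCII also passes, since ASCII is a subset of UTF-8.
    static boolean isValidUtf8(byte[] bytes) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
            .onMalformedInput(CodingErrorAction.REPORT)
            .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    // Heuristic: a BOM or a clean strict decode suggests UTF-8.
    static boolean looksLikeUtf8(byte[] bytes) {
        return hasUtf8Bom(bytes) || isValidUtf8(bytes);
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file itself ASCII-only.
        byte[] utf8 = "\u0939\u0948\u0932\u094B".getBytes(StandardCharsets.UTF_8);
        byte[] latin1 = "caf\u00E9".getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(looksLikeUtf8(utf8));   // prints "true"
        System.out.println(looksLikeUtf8(latin1)); // prints "false"
    }
}
```

A strict decode cannot prove a file is UTF-8, but non-ASCII text in a single-byte encoding rarely happens to form valid UTF-8 sequences, which is why this kind of check tends to work in practice.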

----------------------------- Additional Watson Details -----------------------------

Watson Bug ID:	3342141

External Customer Info:
External Company:  
External Customer Name: Adam Cameron.
External Customer Email:  
External Test Config: My Hardware and Environment details:

Attachments:

Comments:

Hi Adam,
I see the correct result, but it's b/c I use the JVM arg: -Dfile.encoding=UTF-8 to patch-up that UTF-8 hole. (I say 'patch-up' just b/c I want UTF-8 everywhere, by default.)
Basically, CF reads URL and Form input as UTF-8 by default. And it encodes output as UTF-8 by default. BUT, it reads files in the OS's default encoding by default. =(
If CF can't include the -Dfile.encoding=UTF-8 JVM arg by default, then this would probably be a good CF Admin setting. I'd kinda like the CF Admin's JVM page to have a set of checkboxes for the most common args that people add (rather than having to remember them).
Additionally, it'd be nice if CF had a charsetDetect() function which tried to guess the encoding of a string and of a text file.
Thanks,
-Aaron
Comment by External U.
17696 | October 08, 2012 12:31:06 PM GMT
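
The effect described in the comment above can be sketched in plain Java: the same file bytes come out differently depending on which charset the reader assumes. The class name DefaultCharsetDemo and its decode helper are hypothetical, and ISO-8859-1 is used here only as a stand-in for a single-byte OS default such as windows-1252; this illustrates the mechanism and is not ColdFusion's actual file-loading code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class DefaultCharsetDemo {

    // Decode raw file bytes using whichever charset the reader assumes.
    static String decode(byte[] fileBytes, Charset assumed) {
        return new String(fileBytes, assumed);
    }

    public static void main(String[] args) {
        // Four Devanagari code points, written as escapes so this
        // source file does not depend on its own encoding.
        String message = "\u0939\u0948\u0932\u094B";
        byte[] fileBytes = message.getBytes(StandardCharsets.UTF_8);

        // What the JVM would assume when no charset is given explicitly.
        System.out.println("JVM default charset: " + Charset.defaultCharset());

        // Correct assumption: the original 4 characters come back.
        String asUtf8 = decode(fileBytes, StandardCharsets.UTF_8);
        System.out.println(asUtf8.equals(message)); // prints "true"

        // Wrong assumption: each of the 12 bytes becomes its own character.
        String asLatin1 = decode(fileBytes, StandardCharsets.ISO_8859_1);
        System.out.println(asLatin1.length());      // prints "12"
    }
}
```

On JVMs older than JDK 18 the reported default follows the OS locale unless -Dfile.encoding is set, which is why the same page can render correctly on one server and as garbage on another.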
Very valid, I'm having to publish code with cfprocessingdirective because of that now, as I don't wanna have to change the JVM args on all my instances.
Vote by External U.
17698 | November 05, 2012 06:26:26 AM GMT
As someone in a hosted environment, I can't change the JVM args to get around this. I run a manga/anime review site and a personal site which regularly uses Japanese romaji characters. This bug has caused me so many headaches trying to figure out why my characters suddenly went haywire even though I had all the proper encodings set for display and in the database. Please fix this, hopefully in both 10 and 9.
Vote by External U.
17699 | June 04, 2013 09:00:13 PM GMT
Unable to observe the issue with an unpatched CF10 (Win Server 2008/IIS) with no encoding arguments in the jvm.config file. @Adam, have you saved that test CFM file in UTF-8 format? Can we take a look at your jvm.config file?
Comment by Piyush K.
17697 | July 24, 2013 03:56:20 AM GMT