Status/Resolution/Reason: Closed/Fixed/
Reporter/Name(from Bugbase): Adam Cameron / Adam Cameron (Adam Cameron)
Created: 05/06/2012
Components: Language
Versions: 9.0.1
Failure Type: Data Corruption
Found In Build/Fixed In Build: 9.0.1 / 287710
Priority/Frequency: Critical / Some users will encounter
Locale/System: English / Windows 7
Vote Count: 1
I have a UTF-8-encoded file, which xmlParse() appears to read as ISO-8859-1, or whatever encoding CF assumes all files to be unless told otherwise.
The result is that any non-ASCII characters are munged.
The code I'm using is:
<cfset x = xmlParse(expandPath("./junk.xml"))>
<cfdump var="#x#">
The XML I am using is:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<DATAPACKET Version="2.0">
<ROWDATA>
<ROW Data="20000717" NotePub="La società debitrice è in liquidazione coatta amministrativa. Abbiamo provveduto a comunicare al Commissario liquidatore l'importo complessivo del credito. Ammessi al chirografo per L.2.473.460." AttIst="FALSE" Concordato="TRUE" Fallimento="FALSE" RagSocDeb="ORTO PIU' SOC. COOP. A R.L." CodDeb="RP49" NumDec="" Acconti="0.0000" Emissione="20001230" ValMat="0.0000" Notifica="20010103" Deposito="20001115" ValCap="1001.1899" IDCliente="3" IDPratica="196"/>
<ROW Data="20030120" NotePub="DIFFIDA INVIATA E NON RICEVUTA. 1 RICEVUTA Si è depositata la domanda di ammissione al passivo AMMESSO TUTTO AL CHIROGRAFO" AttIst="FALSE" Concordato="FALSE" Fallimento="TRUE" RagSocDeb="NIKO MARKET S.R.L." CodDeb="VQ23" Acconti="0.0000" ValMat="0.0000" ValCap="430.4600" IDCliente="3" IDPratica="1375" UdiVer="20030922"/>
</ROWDATA>
</DATAPACKET>
NB: this code, on the other hand, works fine:
<cfset s = fileRead(expandPath("./junk.xml"), "UTF-8")>
<cfset x = xmlParse(s)>
<cfdump var="#x#">
If xmlParse() is going to read from the file system, it needs to understand that files can be stored in different character encodings. If it cannot work that out for itself, then it should take an argument specifying the encoding (but it should just be able to get it right).
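The munging is consistent with UTF-8 bytes being decoded as a single-byte charset. The following is a minimal sketch of that suspected failure mode, not a confirmed description of what xmlParse() does internally: it assumes ISO-8859-1 is the wrong charset being applied, and the variable names are purely illustrative.

```cfml
<!--- chr(232) is "è"; using chr() avoids depending on this template's own file encoding --->
<cfset original = chr(232)>
<!--- Encode the character to its UTF-8 bytes (C3 A8) --->
<cfset utf8Bytes = charsetDecode(original, "utf-8")>
<!--- Assumption: decode those bytes with the wrong charset, as xmlParse() appears to do --->
<cfset munged = charsetEncode(utf8Bytes, "iso-8859-1")>
<!--- Decoding with the correct charset round-trips cleanly --->
<cfset correct = charsetEncode(utf8Bytes, "utf-8")>
<!--- One character becomes two when mis-decoded --->
<cfoutput>#len(munged)# / #len(correct)#</cfoutput>
```

Under that assumption, each accented character in the NotePub attributes would come out as two characters, while the fileRead() version above avoids this because the charset is stated explicitly.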
--
Adam
----------------------------- Additional Watson Details -----------------------------
Watson Bug ID: 3183072
Deployment Phase: Release Candidate
External Customer Info:
External Company:
External Customer Name: Adam Cameron.
External Customer Email:
External Test Config: My Hardware and Environment details:
Attachments:
Comments: