Status/Resolution/Reason: To Fix//BugVerified
Reporter/Name(from Bugbase): James M. / ()
Created: 06/14/2019
Components: Language, Validation
Versions: 2016
Failure Type: Incorrectly functioning
Found In Build/Fixed In Build: 2016,0,10,314028 (and 2018) /
Priority/Frequency: Normal / Few users will encounter
Locale/System: English / Platforms All
Vote Count: 0
Problem Description:
While sanitizing external data that may contain Unicode characters, I discovered that Unicode is treated differently when using isValid("email") versus isValid("url").
I created a fake domain that uses the Unicode character 'CIRCLED LATIN SMALL LETTER G' (U+24D6) in ".orⓖ". Email validation returns TRUE, but URL validation returns FALSE. (I also tested Lucee for comparison, and both validations consistently failed there.)
When using java.net.URL to determine whether the URL is valid, it returns TRUE. A Java regex rule that I use and HTML5 URL field validation both return TRUE.
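For context on the java.net.URL result, here is a minimal Java sketch of that kind of check. The class name, method name, and the "www.example.orⓖ" host are hypothetical stand-ins, not the code from the gist. It assumes "valid" means only that the URL constructor did not throw, which is a lenient test: the constructor rejects a missing or unknown protocol but does not validate the characters in the host.

```java
import java.net.MalformedURLException;
import java.net.URL;

// Hypothetical helper for illustration; not the code from the report's gist.
final class UrlParseCheck {

    // Returns true when java.net.URL accepts the string.
    // The constructor checks the protocol and basic syntax only;
    // it does not validate which characters appear in the host.
    @SuppressWarnings("deprecation") // URL constructors are deprecated in recent JDKs
    static boolean parsesAsUrl(String candidate) {
        try {
            new URL(candidate);
            return true;
        } catch (MalformedURLException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // \u24D6 is CIRCLED LATIN SMALL LETTER G
        System.out.println(parsesAsUrl("http://www.example.or\u24D6/")); // true
        System.out.println(parsesAsUrl("no-scheme.example"));            // false (no protocol)
    }
}
```

Because this check is so permissive, a TRUE from java.net.URL says the string is parsable, not that the host is a registrable domain name.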
Steps to Reproduce:
I've created a gist that can be run at TryCF.com. (CFFiddle.org doesn't allow CreateObject, so java.net.URL can't be tested there.)
https://gist.github.com/JamoCA/4f9951753f2fe3ed79ef5d7945d2f926
https://www.trycf.com/gist/4f9951753f2fe3ed79ef5d7945d2f926
Actual Result:
Email = true; URL = false; java.net.URL = true
Expected Result:
Email = true; URL = true; java.net.URL = true
Any Workarounds:
Use your own UDF.
Use regex.
Use java.net.URL.
Convert to 7-bit ASCII and test (especially if internally comparing against a URL blacklist).
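
The ASCII-conversion workaround could look something like the Java sketch below. It assumes the conversion is done on the host name with java.net.IDN.toASCII, which is one possible approach and not necessarily what the reporter used. The class name, method names, and sample host are hypothetical. IDN.toASCII follows the older IDNA2003 rules, so compatibility characters such as U+24D6 are normalized before any Punycode encoding, and the exact output is worth verifying against your own data.

```java
import java.net.IDN;

// Hypothetical helper for illustration; names are not from the report.
final class HostAsciiCheck {

    // Converts a host name to its ASCII form using the JDK's IDNA2003 support.
    // Throws IllegalArgumentException if the input cannot be converted.
    static String toAsciiHost(String host) {
        return IDN.toASCII(host);
    }

    // True when every character is in the 7-bit ASCII range.
    static boolean isAscii(String s) {
        return s.chars().allMatch(c -> c < 128);
    }

    public static void main(String[] args) {
        // \u24D6 is CIRCLED LATIN SMALL LETTER G
        String host = "www.example.or\u24D6";
        String ascii = toAsciiHost(host);
        System.out.println(ascii + " (ascii-only: " + isAscii(ascii) + ")");
    }
}
```

Comparing the converted host, not the raw input, against a blacklist avoids a Unicode look-alike slipping past an ASCII-only entry.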
Attachments:
Comments: