AltME: Ann-Reply

Messages

Gregg
Thanks Chris!

Chris
I know JSON parsers are a saturated space, but I was curious once again about the Red conversion process and needed the surrogate pairs -> UTF string.
PeterWood
I don't really understand the surrogate pairs to UTF issue. Does AltJSON UTF-16 encode the JSON string?
Chris
Yes--it should possibly be optional, but is part of the RFC7159 spec.
It's--as I understand it--the only prescribed way to encode characters above the BMP as ASCII.
Chris
Is also in the icky-sounding ECMA-404.
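[Editor's sketch, not part of the original thread] The surrogate-pair encoding Chris refers to can be illustrated with a few lines of Python. This is only a minimal sketch of the arithmetic, not AltJSON's code; the helper names to_surrogate_pair and from_surrogate_pair are hypothetical, and U+1F606 is the code point discussed later in the thread.

```python
import json

def to_surrogate_pair(code_point):
    """Split a code point above the BMP into a UTF-16 (high, low) surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not above the BMP")
    offset = code_point - 0x10000          # 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low

def from_surrogate_pair(high, low):
    """Recombine a (high, low) surrogate pair into a single code point."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)

high, low = to_surrogate_pair(0x1F606)

# The ASCII-only JSON form: two 4-digit \u escapes, one per surrogate.
escaped = "\\u{:04X}\\u{:04X}".format(high, low)

# Python's json module decodes the pair back into one character.
decoded = json.loads('"' + escaped + '"')
```

Python's own json.dumps produces the same pair (in lowercase hex) when its default ensure_ascii=True is in effect.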
PeterWood
I haven't looked at the RFC7159 spec as I followed the link on JSON.org to the ECMA standard. The ECMA standard doesn't seem to mention text encoding but does mention that JSON strings are Unicode code points and refers to the \u notation.
JSONLint validates {"str": "\u1F606"}
Is the issue that you want to load the JSON directly in JavaScript?
Chris
\u notation is strictly 4 hex digits.
No, my first encounter was someone saying my code errored out on an emoji symbol. I then encountered it myself in the wilds.
PeterWood
I thought the \u notation requires a minimum of 4 hex digits. This is what Mozilla says at https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Text_formatting
Unicode escape sequences
The Unicode escape sequences require at least four hexadecimal digits following \u.
Chris
It also uses surrogate pairs--5 hex digits requires curly brackets.
In JSON "\u1F606" is the same as "^(1F60)6".
PeterWood
Thanks for the clarification, though I'm now puzzled as to why JSONLint.com validates
{"str": "\u1F606"}
Chris
It's still valid.
There's no end delimiter--it's just the character followed by a 6.
PeterWood
Thanks.
Chris
Like Red's "^(1F60)6" -- which is why Rebol/Red gets it right :)
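[Editor's sketch, not part of the original thread] The point about "\u1F606" being valid JSON, yet not the emoji, can be checked with any conforming parser. The following is a minimal sketch using Python's standard json module as a stand-in; it is an assumption that other parsers (including JSONLint's) behave the same way, though the fixed four-digit rule implies they should.

```python
import json

# "\u1F606" is read as the 4-digit escape \u1F60 followed by a literal "6".
parsed = json.loads('{"str": "\\u1F606"}')["str"]
first_code_point = ord(parsed[0])   # 0x1F60, not 0x1F606
trailing = parsed[1]                # "6"

# To get U+1F606 itself, JSON needs the surrogate pair.
emoji = json.loads('"\\uD83D\\uDE06"')
emoji_code_point = ord(emoji)       # 0x1F606

print(len(parsed), hex(first_code_point), trailing)
print(len(emoji), hex(emoji_code_point))
```

The first string has two characters and the second has one, which matches the Red literal "^(1F60)6" quoted above.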

DideC
Very nice Chris. I remember enjoying R-forces articles at my beginning. CodeConscious was the second one.
Gregg
Thanks for doing that Chris. Wonderful memories of Allen and Rebol Forces.

Endo
Thanks a lot Chris!

Last message posted 108 weeks ago.