I see that the documentation suggests that (entity-charset) is supposed to
return a symbol. However, it nearly always returns a string. In particular,
it appears to me that it returns a symbol only when it returns its default,
'us-ascii.
I feel compelled to repair this, but there are two ways to fix it:
1) make it match the docs and always return a symbol, or
2) change the docs and the default to return a string.
It looks to me like #2 will break less code, though it's certainly
possible that people depend on the default value's being a symbol.
Opinions? In my tree, I've added contract checks on the structure exports
and changed the documentation and default to always return a string. If
people like this, I can just submit it as a pull request.
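For code that has to cope with both behaviours in the meantime, a small normalizing helper might be enough. This is only a sketch: `charset->string` is a hypothetical name, not something exported by the library, and lowercasing the result is an assumption on my part (charset names are case-insensitive, so it seemed harmless).

```racket
#lang racket/base
(require racket/contract)

;; Hypothetical helper (not part of any library): accept a charset value
;; that is either a symbol (such as the 'us-ascii default) or a string,
;; and always return a lowercase string.
(define/contract (charset->string cs)
  (-> (or/c symbol? string?) string?)
  (string-downcase (if (symbol? cs) (symbol->string cs) cs)))

;; Both kinds of value normalize to a string.
(charset->string 'us-ascii)
(charset->string "Windows-1252")
```

The contract on the helper mirrors the idea of checking the structure exports: anything other than a symbol or a string is rejected at the boundary.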
John
Post by Matthew Flatt
You can use "windows-1252" as an encoding name with, for example,
  (read-line (reencode-input-port (open-input-bytes #"\xA3")
                                  "windows-1252"))
  "£"
Perfect!
I went looking for a place where I might add a "windows-1252" search term,
but it looks like it might be hard, since the list of supported encodings
is apparently platform dependent. Would it make sense simply to attach a
free-floating search tag of "windows-1252" to this part of the
documentation?
Post by Matthew Flatt
For handling e-mail, see also `generalize-encoding` from `net/unihead`.
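A minimal sketch of how the two suggestions might be combined, assuming the charset name has already been pulled out of the message headers. `decode-with-charset` is a hypothetical name, and since the set of supported encodings is platform dependent, the conversion may not be available everywhere.

```racket
#lang racket/base
(require net/unihead   ; generalize-encoding
         racket/port)  ; reencode-input-port, port->string

;; Hypothetical helper: decode a byte string using the charset a message
;; declares, after passing the name through `generalize-encoding`.
(define (decode-with-charset bs charset)
  (define in
    (reencode-input-port (open-input-bytes bs)
                         (generalize-encoding charset)))
  (port->string in))

;; The same byte used in the example above.
(decode-with-charset #"\xA3" "windows-1252")
```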
That probably saved me another half-hour of searching and head-scratching.
Thanks!
John
(p.s.: no one whose mailer checks DMARC records will get this e-mail,
sadly. Can't wait to change to Google Groups.)
Post by Matthew Flatt
I'm trying to process a bunch of e-mail, and I've discovered that lots of
it is encoded using the "windows-1252" charset. It looks pretty
straightforward to map this to unicode, but I thought I'd check: has anyone
written this code already?
John Clements
____________________
http://lists.racket-lang.org/users