Trying to figure out how to print a UTF-32 character in C and so far the answer seems to be "you can't".
@eniko On conforming implementations, printf("%lc", unicode_codepoint_val);
@dalias what type is unicode_codepoint_val
@eniko wint_t, but default promotions from wchar_t should be fine.
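A minimal sketch of the `%lc` approach described above. The helper name `print_codepoint_lc` is made up for illustration. It assumes the program has already called `setlocale(LC_ALL, "")`, that the locale can represent the character, and that `wchar_t` values are Unicode code points on the platform (not something every implementation guarantees).

```c
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: print one code point through printf's %lc.
 * Returns printf's result (bytes written, negative on failure),
 * or -1 if the value does not fit in this platform's wchar_t
 * (e.g. code points above U+FFFF where wchar_t is 16 bits wide). */
int print_codepoint_lc(unsigned long cp)
{
    if (cp > (unsigned long)WCHAR_MAX)
        return -1;
    /* %lc expects a wint_t argument. */
    return printf("%lc", (wint_t)cp);
}
```

Typical use would be `setlocale(LC_ALL, "");` once at startup, then `print_codepoint_lc(0x20AC);`. If the current locale cannot encode the character, printf reports an error rather than printing it.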
@dalias everything i've found tells me not to use wchar_t because it is unclear what width it's going to be
@eniko Because Windows is wrong. If wchar_t is too narrow for full Unicode you're not allowed to support all of Unicode. C explicitly forbids "multi-wchar_t chars" (and thus UTF-16 in wchar_t), which is what Windows ends up needing because they insisted on contradicting the experts in the early 90s who told them 16 bits wasn't enough, and got themselves stuck. C11 strongly prefers that wchar_t numeric values be UCS codepoints (there's a macro, __STDC_ISO_10646__, that tells you this) and unless I'm misremembering, C23 requires it.
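The macro mentioned here is `__STDC_ISO_10646__`. A rough sketch of how a program might check it, along with whether `wchar_t` is wide enough for every code point; both function names are hypothetical:

```c
#include <wchar.h>

/* Returns 1 if the implementation states that wchar_t values are
 * ISO 10646 (Unicode) code points, 0 if it makes no such promise. */
int wchar_is_ucs(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Returns 1 if wchar_t can hold every code point up to U+10FFFF. */
int wchar_fits_all_unicode(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}
```

An implementation with a 16-bit `wchar_t` fails the second check, which is the Windows situation described above.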
@dalias ok so then how do i support printing cross platform 32-bit unicode code points
@eniko With modern Windows, you can set the locale codepage to UTF-8 and it should just work doing everything in UTF-8, not touching wchar_t. Arguably this is the best way to do things, but it doesn't respect legacy Unix systems with non-UTF-8 encodings. Modern C also has char32_t (always UTF-32), which can be used if you're worried the system wchar_t is broken like on Windows, but what you can easily do with it is limited.
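One way to do "everything in UTF-8" without touching wchar_t is to encode the code point by hand and write the bytes out. This is only a sketch: `utf8_encode` is a made-up name, and it assumes whatever receives the output (terminal, file, pipe) expects UTF-8, which is exactly the case that legacy non-UTF-8 locales break.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * Writes up to 4 bytes to out and returns the byte count, or 0 if
 * cp is a surrogate (U+D800..U+DFFF) or above U+10FFFF. */
size_t utf8_encode(uint32_t cp, unsigned char out[4])
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0; /* surrogates are not valid scalar values */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}
```

The result can then be written with something like `fwrite(buf, 1, n, stdout)`. The encoder itself does not depend on the locale or on the width of wchar_t.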
@dalias from what I read char32_t isn't actually guaranteed to be utf32 and also I couldn't find a way to print it
@eniko Unfortunately the only way to print it is to use c32rtomb to convert it to a multibyte char string in the current locale encoding (in any reasonable setup this is UTF-8).
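A sketch of the c32rtomb route, assuming a C11 library that ships `<uchar.h>` (not every platform does) and that `setlocale(LC_ALL, "")` has already been called so the conversion targets the user's locale encoding. The wrapper name `print_c32` is hypothetical.

```c
#include <limits.h>
#include <stdio.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: convert one char32_t to the current locale's
 * multibyte encoding and write it to fp. Returns the number of bytes
 * written, or -1 if the locale cannot represent the character or
 * the write fails. */
int print_c32(char32_t c, FILE *fp)
{
    char buf[MB_LEN_MAX];
    mbstate_t st = {0};            /* start in the initial shift state */
    size_t n = c32rtomb(buf, c, &st);

    if (n == (size_t)-1)
        return -1;                 /* not representable in this locale */
    return fwrite(buf, 1, n, fp) == n ? (int)n : -1;
}
```

In a UTF-8 locale, `print_c32(0x1F600, stdout)` should write the four-byte UTF-8 form. In the default "C" locale, anything outside ASCII will most likely fail, which is why the setlocale call matters.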
@dalias i found https://beej.us/guide/bgc/html/split/unicode-wide-characters-and-all-that.html earlier and it says:
are values in these stored in UTF-16 or UTF-32? Depends on the implementation.
But you can test to see if they are. If the macros __STDC_UTF_16__ or __STDC_UTF_32__ are defined (to 1) it means the types hold UTF-16 or UTF-32, respectively.
@eniko C23 now mandates that.
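For compilers older than C23, the check quoted from Beej's guide could look roughly like this. The function name is made up, and the sketch only reports what the implementation claims through the macro.

```c
#include <uchar.h>

/* Returns 1 if the implementation defines __STDC_UTF_32__, meaning
 * char32_t values are UTF-32 encoded, and 0 if it makes no such claim. */
int char32_is_utf32(void)
{
#ifdef __STDC_UTF_32__
    return 1;
#else
    return 0;
#endif
}
```

A build that depends on UTF-32 could instead fail at compile time with `#error` when the macro is missing.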