After a long period of quiet, I have released an update to the `unicode-age` #Python package
https://pypi.org/project/unicode-age/
The package now supports #Unicode 16.0
When I wrote `unicode-age` I just sorta felt like writing it in Cython as a fun exercise, but upon reflection (and naturally, immediately after updating it), I'm wondering if it can be converted into a pure Python module.
The main waste of parsing ages into `list[int | None]` is that it ignores the span-oriented nature of DerivedAge.txt
A quick sketch suggests that holding the span information in memory as `list[tuple[int, int, int, int]]` comes to ~300 KiB. That's ~10x the Cython approach (mostly because CPython's integers take >=24 bytes each), but still pretty small.
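For the curious, here's roughly what the span-oriented version could look like. This is a minimal sketch, not the package's actual code: the `(start, end, major, minor)` tuple layout, the `parse_spans` and `lookup` names, and the hand-written sample lines (in the DerivedAge.txt format, not a verbatim excerpt) are all assumptions for illustration.

```python
from bisect import bisect_right

# Hand-written sample in the DerivedAge.txt line format (assumption: not a verbatim excerpt)
SAMPLE = """\
# comment lines and blank lines are skipped

0000..001F    ; 1.1 #  [32] <control-0000>..<control-001F>
0220          ; 3.2 #       LATIN CAPITAL LETTER N WITH LONG RIGHT LEG
"""


def parse_spans(text):
    """Parse lines into sorted (start, end, major, minor) tuples (hypothetical layout)."""
    spans = []
    for line in text.splitlines():
        # Drop the trailing comment, then skip anything that is now empty
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        rng, age = (part.strip() for part in line.split(";"))
        start, _, end = rng.partition("..")
        major, minor = age.split(".")
        lo = int(start, 16)
        # A line with a single code point is a span of length one
        hi = int(end, 16) if end else lo
        spans.append((lo, hi, int(major), int(minor)))
    spans.sort()
    return spans


def lookup(spans, starts, codepoint):
    """Return (major, minor) for codepoint, or None if no span covers it."""
    # Find the last span whose start is <= codepoint
    i = bisect_right(starts, codepoint) - 1
    if i >= 0:
        lo, hi, major, minor = spans[i]
        if codepoint <= hi:
            return (major, minor)
    return None


spans = parse_spans(SAMPLE)
starts = [span[0] for span in spans]
```

Keeping `starts` as a separate list lets `bisect_right` do the search without a `key=` argument, so this works on older Pythons too. Lookup is O(log n) in the number of spans rather than O(1) like a flat per-code-point list, which seems like a fine trade for this.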
We'll see.
I'll file an issue about it for the next update and forget about it until Fall (or whenever Unicode 16.1 lands, if there is one)
I may actually write up the pure Python implementation (which will be my first serious use of `struct.Struct`!) and open a PR
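One way `struct.Struct` could fit in here, as a sketch under assumptions rather than a plan: pack each span into a fixed-size record so the whole table lives in one `bytes` object instead of a pile of tuples and ints. The `"<IIBB"` record format (two 32-bit code points plus one byte each for major and minor version), the helper names, and the example spans are all hypothetical.

```python
import struct

# Hypothetical record layout: start, end (uint32), major, minor (uint8).
# "<" means little-endian with no padding, so each record is 10 bytes.
SPAN = struct.Struct("<IIBB")

# Example spans in (start, end, major, minor) form, sorted by start (illustrative only)
EXAMPLE_SPANS = [(0x0000, 0x001F, 1, 1), (0x0220, 0x0220, 3, 2)]


def pack_spans(spans):
    """Pack sorted span tuples into one contiguous bytes blob."""
    return b"".join(SPAN.pack(*span) for span in spans)


def lookup_packed(blob, codepoint):
    """Binary search the packed records; return (major, minor) or None."""
    lo, hi = 0, len(blob) // SPAN.size
    while lo < hi:
        mid = (lo + hi) // 2
        start, end, major, minor = SPAN.unpack_from(blob, mid * SPAN.size)
        if codepoint < start:
            hi = mid
        elif codepoint > end:
            lo = mid + 1
        else:
            return (major, minor)
    return None


blob = pack_spans(EXAMPLE_SPANS)
```

The records are only unpacked on demand during the search, so the resident cost is 10 bytes per span plus the `bytes` header, at the price of a few `unpack_from` calls per lookup.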
@SnoopJ could you write it in the pure python cython subset?
@graingert maybe, but then I'd still have Cython in my life which isn't really worth it for this project
@SnoopJ what about mypyc?
@graingert might be slightly less of a bother but I'm trying to move *away* from an extension module here, not tweak how I get it