Firefox's new offline translation is certainly a nice feature.

My question is, does clicking on these buttons install non-free software on my computer?

I don't think that the free software community has settled on an answer to this question.

Did a quick test with a fresh install of Firefox, and it automatically uses the ML model for detecting language without being configured. It also automatically downloaded the Spanish translation model even though I didn't ask it to translate anything.

I'm unsure if the language detection model is being shipped in Firefox or downloaded on first run.

The fastText model (fasttext.cc/) is shipped in the Firefox package, verified with this experiment.

edit: turns out it isn't, see the follow-ups to this post

or did I? I forgot about .cache/mozilla

Looking in the source, they have horrible things like 100+ kB of minified JS that loads the fasttext LLM, but I didn't find the LLM binary itself

yeah, verified this using a fresh user account
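For anyone wanting to repeat that kind of check, here is a rough sketch of a before/after comparison of a directory tree such as .cache/mozilla. It is not what was actually run for this thread; the function names `snapshot` and `new_files` are made up for illustration, and it only notices files that appear, not files that change.

```python
import os
from pathlib import Path


def snapshot(root: Path) -> dict[str, int]:
    """Map each file path (relative to root) to its size in bytes."""
    files = {}
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = Path(dirpath) / name
            try:
                files[str(path.relative_to(root))] = path.stat().st_size
            except OSError:
                # File vanished or is unreadable; skip it.
                continue
    return files


def new_files(before: dict[str, int], after: dict[str, int]) -> list[tuple[str, int]]:
    """Files present in `after` but not in `before`, largest first."""
    added = [(path, size) for path, size in after.items() if path not in before]
    return sorted(added, key=lambda item: item[1], reverse=True)
```

Take one snapshot before starting the browser and one after visiting a foreign-language page; anything large in the difference is a candidate for a downloaded model.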

where they hide the LLM in the source code I don't know, wild!

@joeyh I guess it would maybe make sense to have the JS blob in something like non-free, and I guess the LLM translator should either have a clear opt-in in Firefox or be shipped in Debian.

@joeyh
Err, does any of that show LLM use for *detection*? Language classification alone turns out to be easy (trigram or ngram stats, I think? It was an "intro to NLP" problem set over a decade ago) so I'd expect they'd start with that instead...

@eichin yeah and Firefox does contain such ngrams (I think used by something else)... but according to their docs about this, it's using the fasttext LLM for classification.
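To make the "trigram stats" idea concrete, here is a toy sketch of character-trigram language identification. It is not Firefox's code: the two sample sentences, the cosine-similarity scoring, and the names `trigrams`, `cosine`, and `detect` are all assumptions picked for illustration, and a real classifier would be trained on far more text.

```python
import math
from collections import Counter


def trigrams(text: str) -> Counter:
    """Count overlapping character trigrams, padded with spaces at the edges."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two trigram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


# Tiny made-up training samples; hypothetical data, for the sketch only.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is what "
          "the people think about the weather",
    "es": "el rápido zorro marrón salta sobre el perro perezoso y esto es "
          "lo que la gente piensa del tiempo",
}
PROFILES = {lang: trigrams(text) for lang, text in SAMPLES.items()}


def detect(text: str) -> str:
    """Return the language whose trigram profile is closest to the text."""
    query = trigrams(text)
    return max(PROFILES, key=lambda lang: cosine(query, PROFILES[lang]))
```

The per-language "model" here is nothing but a table of trigram counts, small enough to read and regenerate from the sample text.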

Unless perhaps it falls back to the ngrams when fasttext can't be downloaded? Could be.

@eichin oho! browser.translations.languageIdentification.useFastText config exists and is false by default apparently. So maybe it is using the ngrams.
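A small sketch for checking whether that pref has been overridden in a given profile, assuming the usual prefs.js format of one `user_pref("name", value);` per line. The helper name `read_user_pref` is made up. Since prefs.js generally only records values changed from the default, a result of `None` would mean the built-in default applies.

```python
import re
from typing import Optional

PREF = "browser.translations.languageIdentification.useFastText"

# Matches lines of the form: user_pref("some.pref.name", value);
_LINE = re.compile(r'^\s*user_pref\("([^"]+)",\s*(.+?)\);\s*$')


def read_user_pref(prefs_text: str, name: str) -> Optional[str]:
    """Return the raw value text for `name`, or None if it is not set."""
    value = None
    for line in prefs_text.splitlines():
        match = _LINE.match(line)
        if match and match.group(1) == name:
            value = match.group(2)  # the last occurrence wins
    return value
```

Feed it the text of the profile's prefs.js and pass `PREF` as the name; the value comes back as the literal text from the file, such as `true` or `false`.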

@eichin what difference, if any, there is between ngrams and an LLM when it comes to providing source code is an interesting thing to ponder...

@joeyh
Training recognizers (specifically ones that could *not* produce any of their training material as output) was pretty well believed-by-lawyers to not infringe the copyright of a work 15 years ago - so the more interesting part to ponder is what you want to enable the users of your code to do, and what "preferred form for editing" even means. Today, disk is cheap, maybe just check in your training data :)