@chikim @ZBennoui what's this, more precisely? is this some kind of TTS? what does it sound like? I tried to understand something from the GitHub, but it's kinda flying over my head

@esoteric_programmer @chikim @ZBennoui
It converts the following text:
Every time I see someone light up, um, because of something I’ve made, it’s like, wow, a little piece of my inner child gets healed, you know? And, um, when...snip

To the attached speech.

[Attached audio: 26 seconds]

@esoteric_programmer @chikim @ZBennoui
You can easily plug this into an open-source LLM and get something akin to NotebookLM.
Totally free and open-source, with very high quality.

@mush42 @chikim @ZBennoui so, this does text completion and then generates speech using something like tts? is that correct so far? or do you attach audio of something, the model transcribes it and gets its meaning in whatever way that's considered meaning anyway, then concatenates your prompt text to that? that could create so, so many deepfakes, it's not even funny, if what I'm imagining is actually what's happening

Musharraf :verified:

@esoteric_programmer @chikim @ZBennoui
Other than the text completion part, you are almost correct.
You give it some text and an audio sample, and it tries to replicate the given voice's characteristics.
Research is active in the areas of speaker verification and audio deepfake detection to combat misuse.

@mush42 @chikim @ZBennoui aha, interesting! could I make it generate, say, something like a podcast? would I have to generate each part of the dialog in turn, then splice replies into the result?
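
The turn-by-turn approach described in the question above can be sketched roughly as follows. This is only a minimal illustration and not the API of the model discussed in this thread: `synthesize` is a hypothetical stand-in that emits a plain tone instead of cloned speech, and the 16 kHz sample rate, the per-speaker pitch mapping, and the silence gap are all assumptions made so the splicing step can run with only the Python standard library.

```python
import io
import math
import struct
import wave

SAMPLE_RATE = 16000  # assumed; a real model defines its own output rate


def synthesize(text: str, voice_hz: float) -> bytes:
    """Hypothetical stand-in for a voice-cloning TTS call.

    Returns 16-bit mono PCM: a tone whose length grows with the word
    count, so the splicing logic can be exercised without any model.
    """
    n_samples = (SAMPLE_RATE // 20) * len(text.split())
    return b"".join(
        struct.pack("<h", int(8000 * math.sin(2 * math.pi * voice_hz * i / SAMPLE_RATE)))
        for i in range(n_samples)
    )


def splice_dialog(turns, voices, gap_ms=300) -> bytes:
    """Generate each turn in order and join the clips with short silences."""
    gap = b"\x00\x00" * (SAMPLE_RATE * gap_ms // 1000)
    clips = [synthesize(text, voices[speaker]) for speaker, text in turns]
    return gap.join(clips)


def to_wav(pcm: bytes) -> bytes:
    """Wrap raw 16-bit mono PCM in a WAV container."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(SAMPLE_RATE)
        w.writeframes(pcm)
    return buf.getvalue()


# Two speakers, each mapped to a placeholder "voice" (here just a pitch).
voices = {"host": 220.0, "guest": 330.0}
turns = [("host", "welcome to the show"), ("guest", "thanks for having me")]
podcast_wav = to_wav(splice_dialog(turns, voices))
```

With a real model, `synthesize` would take the reference audio sample for each speaker instead of a pitch, and the clips would need to share one sample rate before being joined.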