Hachyderm @hachyderm

**Chi Kim** @chikim@mastodon.social · Oct 13, 2024

F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching! The quality is pretty impressive for open source, and it even supports mps for Mac! I was able to get it going on my Mac with no problem. #TTS #ML #AI
https://github.com/SWivid/F5-TTS
@ZBennoui

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching" - SWivid/F5-TTS
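The "supports mps for Mac" remark refers to PyTorch's Metal Performance Shaders backend on Apple hardware. As a rough sketch of how a PyTorch program typically chooses a device (the helper name `pick_device` is made up for illustration, and this is not taken from the F5-TTS code base):

```python
def pick_device() -> str:
    """Return "cuda", "mps", or "cpu" depending on what PyTorch reports.

    Hypothetical helper for illustration; F5-TTS may select its device differently.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch installed there is nothing to accelerate.
        return "cpu"

    if torch.cuda.is_available():
        return "cuda"

    # torch.backends.mps exists in PyTorch 1.12 and later; guard for older builds.
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"

    return "cpu"


if __name__ == "__main__":
    print(pick_device())
```

On an Apple Silicon Mac with a recent PyTorch build this would normally report `mps`; elsewhere it falls back to `cuda` or `cpu`.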

**the esoteric programmer** @esoteric_programmer@social.stealthy.club · Oct 16, 2024

@chikim @ZBennoui What is this, more precisely? Is it some kind of TTS? What does it sound like? I tried to make sense of the GitHub page, but it's going over my head.

**Musharraf** @mush42 · Oct 16, 2024

@esoteric_programmer @chikim @ZBennoui
It converts the following text:
Every time I see someone light up, um, because of something I’ve made, it’s like, wow, a little piece of my inner child gets healed, you know? And, um, when...snip

To the attached speech.

[Audio attachment, 0:26]

**Musharraf** @mush42 · Oct 16, 2024

@esoteric_programmer @chikim @ZBennoui
You can easily plug this into an open-source LLM and get something akin to NotebookLM.
Totally free and open-source, with very high quality.
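One way to picture the "plug this into an LLM" idea is a small pipeline: a language model writes a two-speaker script, and a TTS function voices each turn with a reference clip for that speaker. The sketch below is only an illustration under stated assumptions: `write_script` and `synthesize` are hypothetical callables standing in for whatever LLM and TTS wrappers you actually use, and nothing here is the F5-TTS API.

```python
from typing import Callable, Dict, List, Tuple


def parse_script(script: str) -> List[Tuple[str, str]]:
    """Split "Speaker: text" lines into (speaker, text) turns, skipping other lines."""
    turns = []
    for line in script.splitlines():
        speaker, sep, text = line.partition(":")
        if sep and speaker.strip() and text.strip():
            turns.append((speaker.strip(), text.strip()))
    return turns


def render_dialogue(
    topic: str,
    write_script: Callable[[str], str],       # hypothetical LLM wrapper
    synthesize: Callable[[str, str], bytes],  # hypothetical TTS wrapper: (text, reference clip) -> audio
    voices: Dict[str, str],                   # speaker name -> reference clip path
) -> List[bytes]:
    """Generate a script for `topic` and synthesize one audio clip per turn."""
    clips = []
    for speaker, text in parse_script(write_script(topic)):
        clips.append(synthesize(text, voices[speaker]))
    return clips
```

The two callables are injected so the same loop works with any model pairing; joining the returned clips into one file is left to whatever audio library you prefer.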

**the esoteric programmer** @esoteric_programmer@social.stealthy.club · Oct 16, 2024

@mush42 @chikim @ZBennoui so, this does text completion and then generates speech using something like tts? is that correct so far? or do you attach audio of something, the model transcribes it and gets its meaning in whatever way that's considered meaning anyway, then concatenates your prompt text to that? that could create so, so many deepfakes, it's not even funny, if what I'm imagining is actually what's happening

Musharraf @mush42@hachyderm.io

@esoteric_programmer @chikim @ZBennoui
But again, can't you say the same thing about ElevenLabs? And other voice conversion tech?

Oct 16, 2024, 10:43 PM · Tusky

**the esoteric programmer** @esoteric_programmer@social.stealthy.club · Oct 16, 2024

@mush42 @chikim @ZBennoui I don't know about ElevenLabs and such; I don't use those and don't intend to. But this seems like it would be used explicitly for that kind of thing, by a great many people, and much more easily than with ElevenLabs. Then again, I could just be imagining this wrong and blowing it out of proportion in my mind; that's always a possibility.

**Erion** @erion@tardis.pw · Oct 17, 2024

@esoteric_programmer @mush42 @chikim @ZBennoui The short answer is that anything can be misused. Someone could create fake clips, but someone could also create an audiobook narrated by their favorite narrator for personal use, or build an app that recreates a lost loved one's voice from only a few audio clips.
