Visionaire Studio Bugfix Update 5.0.5

  • #30, by afrlme, Tuesday, 05. June 2018, 15:01
    When you use your own lipsync solution, you need to set the frame pause time very low (e.g. 10 ms) so the animation can adapt quickly to the changed frame settings.
    I think I have each of my 10 viseme frames set at about 100 ms, which happens to be about right (most of the time).

    I cut them down to the Rhubarb 8 and re-ordered them to match the Rhubarb setup, but they don't look good; they don't appear to lip sync at all.
    As you guys have said that there's nothing more to the setup, I will investigate further and work out what's wrong.
    Can you post some screenshots please? Have you made sure to include the tsv file in the same folder as the speech audio file? The tsv file should have the same name as the audio file, including its extension. Let's say you have an audio file called "hello_world.ogg"; the tsv file should then be named "hello_world.ogg.tsv", with "ogg.tsv" being the format name and not the file name.
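A minimal sketch of the naming rule described above. The helper name `tsvNameFor` is hypothetical and not part of Visionaire's API; it only shows that the expected file name is the full audio file name with ".tsv" appended.

```lua
-- Hypothetical helper: builds the expected lip sync file name for a speech file.
-- Rule from the post: keep the audio extension and append ".tsv".
function tsvNameFor(audioFileName)
  return audioFileName .. ".tsv"
end

print(tsvNameFor("hello_world.ogg")) -- prints "hello_world.ogg.tsv"
```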

    Imperator

    7278 Posts


  • #31, by Machtnix, Tuesday, 05. June 2018, 16:01
    The lip sync is interesting, but I can't follow how it works from the English explanations.
    I need a frame for each phoneme, right? Do I need a text file like that, which translates between a mouth shape and a word in the sound folder?

    For example, I have the text "Hallo, mein Name ist Gregor". I have a sound file with these words, but of course the text and the audio don't take the same amount of time. Often the player skips ahead.
    I need three or five phonemes (not one for every word): perhaps a, o, m, i. Mostly it's an open mouth with maybe three shapes, plus a closed mouth. How do I fit this together? Is there a German tutorial?

    While the character is speaking, are other animations not possible (or do I have to create speech animations with a lot of gestures...)?

    Thread Captain

    1097 Posts

  • #32, by darren-beckett, Tuesday, 05. June 2018, 16:13
    @AFRLme:

    Pics attached.
    No idea what I'm missing.

    Speech.ogg.tsv:
    0.00	X
    0.03	C
    0.11	B
    0.18	F
    0.32	B
    0.74	D
    0.88	B
    1.37	A
    1.45	B
    1.87	C
    1.94	G
    2.08	C
    2.15	B
    2.36	G
    2.43	E
    2.50	B
    2.57	A
    2.65	B
    3.20	D
    3.27	A
    3.37	C
    3.54	B
    3.96	F
    4.10	C
    4.24	X
    4.28	X
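To make the file format above easier to reason about, here is a rough sketch of how such a file could be read. It assumes each line is a timestamp in seconds, a tab, and a mouth shape, and that a shape stays active until the next timestamp. The function names `parseCues` and `shapeAt` are made up for this example; this is not how Visionaire reads the file internally.

```lua
-- Parses TSV text with one "<seconds><TAB><shape>" cue per line.
-- Blank lines, indentation and trailing spaces are tolerated.
function parseCues(text)
  local cues = {}
  for line in text:gmatch("[^\r\n]+") do
    local time, shape = line:match("^%s*([%d%.]+)%s+(%S+)")
    local seconds = time and tonumber(time)
    if seconds then
      cues[#cues + 1] = { time = seconds, shape = shape }
    end
  end
  return cues
end

-- Returns the shape active at `seconds`: the last cue at or before that time.
-- Assumes the cues are already sorted by time, as in the posted file.
function shapeAt(cues, seconds)
  local current = nil
  for _, cue in ipairs(cues) do
    if cue.time > seconds then break end
    current = cue.shape
  end
  return current
end

local cues = parseCues("0.00\tX\n\n0.03\tC\n\n0.11\tB\n\n0.18\tF \n")
print(#cues)               -- 4
print(shapeAt(cues, 0.05)) -- C
print(shapeAt(cues, 0.20)) -- F
```

Running the posted file through something like this is a quick way to check that every line parses and that the shapes change at the times you expect.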

    Great Poster

    384 Posts

  • #33, by afrlme, Tuesday, 05. June 2018, 17:17
    What's it doing when the display text is displayed? Is it not forcing the animation frames which should be played or pausing when the speech file is silent?

    Hmm would it be possible for you to record a short video of the character talking?


  • #34, by afrlme, Tuesday, 05. June 2018, 17:25
    @Seb:
    I used this website LINK to convert words into phonemes.
    I then have lookup tables to convert phonemes into visemes, and then display each corresponding talk animation frame in sequence (using actions after each frame to look up the next viseme).

    It works well, but has no timing to the speech.
    -- Translation of phonemes into visemes (mouth shapes)
    PhonemeViseme = {}
    PhonemeViseme["PAUSE"] = "PAUSE"
    PhonemeViseme["AA"] = "A"
    PhonemeViseme["AE"] = "A"
    PhonemeViseme["AH"] = "A"
    PhonemeViseme["AO"] = "W"
    PhonemeViseme["AW"] = "W"
    PhonemeViseme["AY"] = "A"
    PhonemeViseme["B"] = "M"
    PhonemeViseme["CH"] = "U"
    PhonemeViseme["D"] = "U"
    PhonemeViseme["DH"] = "TH"
    PhonemeViseme["EH"] = "A"
    PhonemeViseme["ER"] = "O"
    PhonemeViseme["EY"] = "A"
    PhonemeViseme["F"] = "F"
    PhonemeViseme["G"] = "U"
    PhonemeViseme["HH"] = "E"
    PhonemeViseme["IH"] = "E"
    PhonemeViseme["IY"] = "E"
    PhonemeViseme["JH"] = "U"
    PhonemeViseme["K"] = "U"
    PhonemeViseme["L"] = "L"
    PhonemeViseme["M"] = "M"
    PhonemeViseme["N"] = "TH"
    PhonemeViseme["NG"] = "M"
    PhonemeViseme["OW"] = "W"
    PhonemeViseme["OY"] = "TH"
    PhonemeViseme["P"] = "M"
    PhonemeViseme["R"] = "R"
    PhonemeViseme["S"] = "Y"
    PhonemeViseme["SH"] = "Y"
    PhonemeViseme["T"] = "Y"
    PhonemeViseme["TH"] = "TH"
    PhonemeViseme["UH"] = "W"
    PhonemeViseme["UW"] = "W"
    PhonemeViseme["V"] = "F"
    PhonemeViseme["W"] = "W"
    PhonemeViseme["Y"] = "U"
    PhonemeViseme["Z"] = "U"
    PhonemeViseme["ZH"] = "U"

    -- Visemes (mouth shapes) to frame numbers
    VisemeFrame = {}
    VisemeFrame["BLANK"] = 1
    VisemeFrame["PAUSE"] = 1
    VisemeFrame["A"] = 2
    VisemeFrame["E"] = 3
    VisemeFrame["F"] = 4
    VisemeFrame["L"] = 5
    VisemeFrame["M"] = 6
    VisemeFrame["O"] = 7
    VisemeFrame["R"] = 8
    VisemeFrame["TH"] = 9
    VisemeFrame["U"] = 10
    VisemeFrame["W"] = 11

    SpeechPhonemes = {}
    SpeechPhonemes["A"] = { "AH" }
    SpeechPhonemes["AGAIN"] = { "AH","G","EH","N" }
    SpeechPhonemes["AM"] = { "AE","M" }
    SpeechPhonemes["AN"] = { "AE","N" }
    SpeechPhonemes["ANOTHER"] = { "AH","N","AH","DH","ER" }
    SpeechPhonemes["ARE"] = { "AA","R" }
    SpeechPhonemes["AWAY"] = { "AH","W","EY" }
    SpeechPhonemes["BELIEVE"] = { "B","IH","L","IY","V" }
    SpeechPhonemes["BONES"] = { "B","OW","N","Z" }
    SpeechPhonemes["BOOKCASE"] = { "B","UH","K","K","EY","S" }
    SpeechPhonemes["BUT"] = { "B","AH","T" }
    SpeechPhonemes["BUTTONS"] = { "B","AH","T","AH","N","Z" }
    SpeechPhonemes["BYE"] = { "B","AY" }
    ...
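For anyone following along, here is a rough sketch of how the three tables above could be chained to turn one word into a list of frame numbers. The function `framesForWord` is hypothetical and only small excerpts of the tables are repeated so the snippet runs on its own; it is not the actual lookup code from the post. One thing it makes visible: as far as the tables are shown, "S", "SH" and "T" map to the viseme "Y", but VisemeFrame has no "Y" entry, so the sketch falls back to the BLANK frame in that case.

```lua
-- Small excerpts of the posted tables, repeated so the sketch is self-contained.
local PhonemeViseme  = { AH = "A", AY = "A", B = "M", T = "Y" }
local VisemeFrame    = { BLANK = 1, PAUSE = 1, A = 2, M = 6 }
local SpeechPhonemes = { BUT = { "B", "AH", "T" }, BYE = { "B", "AY" } }

-- Hypothetical helper: returns the frame numbers for one word.
-- Unknown words, phonemes or visemes fall back to the BLANK frame.
function framesForWord(word)
  local phonemes = SpeechPhonemes[string.upper(word)]
  if not phonemes then
    return { VisemeFrame["BLANK"] }
  end
  local frames = {}
  for _, phoneme in ipairs(phonemes) do
    local viseme = PhonemeViseme[phoneme]
    frames[#frames + 1] = (viseme and VisemeFrame[viseme]) or VisemeFrame["BLANK"]
  end
  return frames
end

print(table.concat(framesForWord("bye"), ",")) -- 6,2
print(table.concat(framesForWord("but"), ",")) -- 6,2,1 ("T" -> "Y" has no frame)
```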
    @Sebastian: that's what I was talking about, or trying to talk about, for text-to-phonetic lip syncing. It makes sense that you would check the words and tell it what to use based on each word. Lot of bloody work. Bet that took you ages, @darren-beckett?


  • #35, by darren-beckett, Tuesday, 05. June 2018, 17:34
    It took some working out.
    But now I just export the text, run it through the website, paste it into Excel to format the Lua script for me, and paste it into VS.


  • #36, by afrlme, Tuesday, 05. June 2018, 17:40
    Why through Excel?

    As for the screenshots you posted, they all look ok to me. Only thing I can think of is that your script is interfering somehow.


  • #37, by sebastian, Tuesday, 05. June 2018, 19:09
    He uses his own lipsync script based on visemes from that dictionary... and that's maybe what's causing the issues (?)

    Thread Captain

    2346 Posts

  • #38, by afrlme, Tuesday, 05. June 2018, 19:29
    Aye, that's what I just said. :D


  • #39, by sebastian, Tuesday, 05. June 2018, 19:44
    Arghh, I should read until the end. :P

    @machtnix I'll write later and describe the steps in German when I'm home. Not much to do. :)


  • #40, by afrlme, Tuesday, 05. June 2018, 20:12
    @machtnix you could create lots of animations for the speech if you wanted, or outfits that have the character talking with gestures, or you could layer the mouth as a different character to allow the character to gesture while talking.

    As for the mouth shapes, did you check out the linked Rhubarb website? If you scroll down the page there is a section about mouth shapes. Technically you don't have to use Rhubarb to generate the lip sync frames or timestamps; you can create them manually or edit the files Rhubarb generates. You can also replace the letters with frame numbers, which means you could insert a few additional timestamps if you wanted the character to keep the same mouth shape while animating a gesture or mood change.

    Um, I believe the creator of Rhubarb Lip Sync is actually German. You could send him a message via Twitter and see if he has a German version of the instructions lying around?
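As a rough illustration of swapping the letters for frame numbers, the sketch below rewrites tsv text line by line. The `shapeToFrame` mapping and the function name `lettersToFrames` are assumptions made up for this example; the actual frame numbers depend on how your own talk animation is ordered.

```lua
-- Hypothetical mapping from shape letters to animation frame numbers;
-- adjust it to the frame order of your own talk animation.
local shapeToFrame = { X = 1, A = 2, B = 3, C = 4, D = 5, E = 6, F = 7, G = 8, H = 9 }

-- Rewrites "<seconds><TAB><letter>" lines as "<seconds><TAB><frame number>".
-- Lines whose shape is not in the mapping (e.g. already numeric) are kept as-is.
-- Blank lines are dropped.
function lettersToFrames(text)
  local out = {}
  for line in text:gmatch("[^\r\n]+") do
    local time, shape = line:match("^%s*([%d%.]+)%s+(%S+)")
    local frame = shape and shapeToFrame[shape]
    if frame then
      out[#out + 1] = time .. "\t" .. frame
    else
      out[#out + 1] = line
    end
  end
  return table.concat(out, "\n")
end

print(lettersToFrames("0.00\tX\n0.03\tC\n0.11\t3"))
```

Once the file uses frame numbers, extra timestamps pointing at gesture or mood frames can be added by hand between the generated lines.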
