
Thread: HLS and Face Animation

  1. #1
    Join Date
    Dec 2011
    Location
    Key West, FL
    Posts
    3,633

    Default HLS and Face Animation

    I've fallen in love with Halloween and Christmas animation - and I would like to share where my current thoughts are on HLS supporting that activity.

    As of last night, HLS can accept text vocals and produce the series of mouth positions required to animate speech (the same results many of you obtained by using Papagayo).

    HLS will enable the building of a Word dictionary that will hold "words" and their associated series of mouth movements required to animate speech.

    HLS will have several new channel types added to its capabilities, which will greatly improve animation sequencing.

    New channel types: "Word", "Mouth", and "Eye".

    A Word channel will allow easy placement and sizing of a "word effect" on to a Word channel. The user will be able to select a word from the vocals and position/size it just like any other effect in HLS.

    Once the vocal's words are placed, the user will tell HLS how many mouth positions their specific display requires (some of the Halloween faces that can be purchased use 3, 5, or 7 different mouth positions). The user will create a translation map, unique to that Word channel, that tells HLS how to go from the 10 mouth positions derived from phonetic analysis of the vocal's words to those required for their display. Assume we have a Halloween face that uses 5 unique mouth formations.
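    A minimal sketch of what such a translation map could look like for the 5-position face. This is not HLS code: the grouping of the 10 positions into 5, the use of 0 for silence, and the names TEN_TO_FIVE and translate are all assumptions made for illustration.

```python
# Hypothetical translation map: analysis positions 1..10 -> display positions 1..5.
# 0 is assumed to mean silence (mouth closed / lights off) and maps to itself.
TEN_TO_FIVE = {0: 0, 1: 1, 2: 1, 3: 2, 4: 2, 5: 3, 6: 3, 7: 4, 8: 4, 9: 5, 10: 5}


def translate(positions, mapping):
    """Map each analysis position to a display position.

    Adjacent positions that land on the same display position are merged,
    since the display cannot show the difference between them.
    """
    out = []
    for p in positions:
        d = mapping[p]
        if not out or out[-1] != d:
            out.append(d)
    return out


if __name__ == "__main__":
    print(translate([0, 1, 2, 5, 6, 9, 0], TEN_TO_FIVE))
```

    A face with 3 or 7 positions would only need a different dictionary; the merge step stays the same.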

    HLS will utilize a "Mouth" channel for each mouth position associated with the Word channel. If you have not looked into animation, a single word will require multiple different mouth positions to animate speaking that word.

    So ... the Word channel defines the start and duration of the word .... a long spooky "ooooooo" can be as long as needed. HLS then drops "mouth" effects on to the associated Mouth channels. The user can then position the placement and duration of each mouth movement within the time frame of the Word.

    Looking at a real scenario ... take a display utilizing 3 singing faces .... a lead and two backup singers. The lead singer has 5 mouth positions and the backups each have 3 mouth positions.

    The sequencing would go like this ...

    1 Word channel for Lead's Only vocals ... leading to 5 mouth channels.
    1 Word channel for Left backup Only vocals ... leading to 3 mouth channels.
    1 Word channel for Right backup Only vocals ... leading to 3 mouth channels.
    1 Word channel for vocals where right and left backups sing in harmony ... leading to 3 mouth channels.

    HLS will then provide a mechanism where physical illumination channels are assigned to one or more Mouth channels. HLS will then automatically populate the physical illumination channels with Level effects driven from the multitude of Mouth channels stated above.
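    One way to picture that last step, as a rough sketch only. The data layout (effects as start/end times in milliseconds), the channel names, and the function populate_physical are assumptions for illustration and say nothing about how HLS stores or processes effects internally.

```python
def populate_physical(mouth_effects, assignment):
    """Build on-intervals for each physical channel from its Mouth channels.

    mouth_effects: {mouth_channel: [(start_ms, end_ms), ...]}
    assignment:    {physical_channel: [mouth_channel, ...]}
    Returns {physical_channel: [(start_ms, end_ms), ...]} with touching or
    overlapping intervals merged into one longer interval.
    """
    result = {}
    for phys, mouths in assignment.items():
        intervals = sorted(iv for m in mouths for iv in mouth_effects.get(m, []))
        merged = []
        for start, end in intervals:
            if merged and start <= merged[-1][1]:
                # Extend the previous interval instead of starting a new one.
                merged[-1] = (merged[-1][0], max(merged[-1][1], end))
            else:
                merged.append((start, end))
        result[phys] = merged
    return result


if __name__ == "__main__":
    effects = {"closed": [(0, 100)], "half": [(100, 200)], "open": [(200, 350)]}
    wiring = {"top_lip": ["closed", "half", "open"], "jaw_low": ["open"]}
    print(populate_physical(effects, wiring))
```

    In this example the hypothetical "top_lip" light is shared by all three mouth shapes, so it stays lit for the whole word, while "jaw_low" only lights for the open mouth.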

    In summary - the process flow will be:

    Vocal Text
    Text to 10 animation mouth positions
    Position vocal's words on to a Word channel
    Translate mouth position from the 10 to the number required for the display and create that number of Mouth channels.
    Position your mouth effects onto the associated mouth channel.
    Tell HLS how to map your mouth channels into physical channels.
    HLS automatically populates the physical illumination channels as required.

    All comments and suggestions are welcomed as I'm still in development.

    I would like to thank "timon" for the long discussion on this topic last night ... he helped solidify a number of items.

    Joe

    Here are the 10 mouth positions that HLS currently extracts by phonetically analyzing spoken text.

    Viseme and Monster.png
    Last edited by JHinkle; 10-08-2012 at 09:09 AM.
    Link to my DownLoad Site: http://www.joehinkle.com/HLS

  2. #2
    Join Date
    Dec 2010
    Location
    Tustin, CA
    Posts
    2,143

    Default Re: HLS and Face Animation

    The potential of what Joe is doing is enormous. To date no sequencing program has been able to handle face animation assistance without the need of an external program. Having the song words right on the sequencing screen will make it so much easier to sequence.

    Joe and I talked about other helper functions, such as gross word placement when you start a sequence and the ability to click on a word in the lyrics window so the sequencing pointer jumps to that location.

    Way to go Joe.

    Side note:

    Joe, I forgot to tell you something last night. Since you can now handle MP3s, you may want to look at pulling the ID3v2 lyrics field out of the MP3 if lyrics have been included. It's not a big deal, but it would make it a bit quicker to load the lyrics. Also, now that you handle MP3, think about AAC. That puts you on your way to adding video, which uses it.

  3. #3
    Join Date
    Dec 2011
    Location
    Key West, FL
    Posts
    3,633

    Default Re: HLS and Face Animation

    Many times when I'm doing development and I have not solidified my implementation, I will build test boxes to play in.

    The latest release has such a box available.

    Far right Menu - says - "Mouth Movement"

    Play with it if you like.

    Enter a word - or phrase and click OK - then WAIT!

    HLS will speak your word or phrase and then display the sequence of mouth positions required to animate speech - based on the phonemes extracted during phonetic analysis.

    Value ZERO is silence - before and after the conversion.

    The non-zero numbers represent an index into an array of mouth positions (the 10 shown in a previous post).

    Type in your name and see how many positions are returned. Ask yourself how many milliseconds it took to say the name. Compare that time with the fact that each movement requires a minimum of 25 msecs .... many times the mouth movements add up to longer than the total word time.

    I am thinking about some sort of grammar rule that will look at the size of the sequencing time slice, the total time the word is active - the number of animation movements required - and try to collapse it some more - even above and beyond your limiting movements to 3, or 5, or 7 positions.
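    To make the problem concrete, here is one possible collapsing rule as a sketch. It is not the rule HLS uses (none has been decided yet); the even-spacing strategy and the function name collapse are assumptions. It simply works out how many movements fit in the word at 25 msecs each and keeps an evenly spaced subset, always including the first and last movement.

```python
def collapse(positions, word_ms, min_ms=25):
    """Reduce a list of mouth positions so it fits inside a word.

    positions: mouth position indexes in speaking order
    word_ms:   total time the word is active, in milliseconds
    min_ms:    shortest time a single movement may take
    """
    max_count = max(1, word_ms // min_ms)
    n = len(positions)
    if n <= max_count:
        return list(positions)
    if max_count == 1:
        return [positions[0]]
    # Pick evenly spaced indexes from 0 to n-1 using integer arithmetic.
    return [positions[i * (n - 1) // (max_count - 1)] for i in range(max_count)]


if __name__ == "__main__":
    # Six movements, but a 100 msec word only has room for four.
    print(collapse([3, 7, 2, 9, 4, 1], word_ms=100))
```

    A smarter rule would probably favor the vowel shapes over the consonants rather than spacing evenly, but that needs knowledge of which position is which.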

    If you have thoughts or experience in this area - please speak up.

    Thanks.

    Joe

  4. #4
    Join Date
    Nov 2011
    Location
    Las Vegas, Nevada, U.S.
    Posts
    2,155

    Default Re: HLS and Face Animation

    Joe... stop making me purchase more stuff for your "free" program.

    Also, very impressive that you can get a prototype like this up and running so quickly.
    "Jack of all trades, master of none,

    Certainly better than a master of one"
    Projects:
    http://hackaday.io/project/1092-Raptor12-AC-light-controller

    Videos
    http://www.youtube.com/user/KingOfKYA/videos?flow=grid&view=0

  5. #5
    Join Date
    Dec 2011
    Location
    Key West, FL
    Posts
    3,633

    Default Re: HLS and Face Animation

    When I started HLS last January - I put in some simple first-order DSP filters.

    They were fine for bringing out the Beat for the Beat track .... but I recently implemented a similar Bandpass - used to bring out the vocals.

    I was totally dissatisfied!!!

    Version 6Y now has some hi-powered DSP filters.

    You can select from first-order filtering all the way up to 10th order.

    The walls of these filters at 10th order are almost vertical.

    Below is a shot of audio from "Thriller" - where Michael is singing.

    First picture is normal audio - vocals and instruments.

    The second picture is my version of "Vocal Enhance" plus a 2nd order Band Pass filter with a Gain of 1, a Center Freq of 3000hz and a BandWidth of 1500hz.

    The results make it easy to listen and identify where each word in the vocals occurs.

    Enjoy.

    Joe

    ThrillerNormal.jpg

    ThrillerBandPass.jpg

  6. #6
    Join Date
    Oct 2011
    Location
    Oakley, CA
    Posts
    611

    Default Re: HLS and Face Animation

    Joe, I think this might have solidified my decision for software this year. Thank you many times over for what you do, and continue to do!!
    Nick

  7. #7
    Join Date
    Nov 2010
    Location
    Charlotte NC
    Posts
    646

    Default Re: HLS and Face Animation

    Joe or anybody,
    Can you provide or suggest a reference (basic or layman level) to help me understand how to use the DSP filters in your tool?
    In the military I was a sonar technician with a basic understanding of audio properties. That being said (it was over 15 years ago), I have a rough time understanding how best to use the tool you have provided.

    People have said this a million times, but it still is not enough: thanks for the great program and all the time and effort you put into it.

    Last edited by rfallatt; 10-09-2012 at 10:46 AM. Reason: Added expanded audience
    3rd Year - 640 channels - 18,706 lights - 1 Lost Mind
    FACEBOOK: Fallatt's Charlotte Christmas

  8. #8
    Join Date
    Dec 2011
    Posts
    6,016

    Default Re: HLS and Face Animation

    This is perfect Joe . Couldn't have asked for better. now i can use a vocals beat track.

    Thanks much !

  9. #9
    Join Date
    Dec 2011
    Location
    Key West, FL
    Posts
    3,633

    Default Re: HLS and Face Animation

    A filter will attenuate (decrease) the audio volume within a specific range of frequencies.

    A Low Pass filter will attenuate frequencies Above a specific Corner frequency --- hence the name Low Pass, meaning it passes frequencies lower than the Corner frequency unaltered.

    A High Pass filter will attenuate frequencies Below a specific Corner frequency --- hence the name High Pass, meaning it passes frequencies above the Corner frequency unaltered.

    A Band Pass is a window - inside the window no attenuation takes place - outside of the window - the audio IS attenuated.
    With a Band Pass - you have a Center frequency and a window whose size is the BandWidth. This window is centered on the Center frequency.
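    As a quick worked example of that definition, the two corner points follow directly from the Center frequency and the BandWidth. The helper name band_corners is made up for this sketch.

```python
def band_corners(center_hz, bandwidth_hz):
    """Return (low_corner_hz, high_corner_hz) for a window centered on center_hz."""
    half = bandwidth_hz / 2.0
    return center_hz - half, center_hz + half


if __name__ == "__main__":
    # Center Freq 3000hz with a BandWidth of 1500hz.
    print(band_corners(3000, 1500))
```

    So a Center of 3000hz with a BandWidth of 1500hz passes roughly 2250hz to 3750hz.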

    A simple filter is a first-order filter. It is about 3db down at the Corner frequency and rolls off at roughly 6db per octave beyond it ( attenuation does not occur like a wall but has a slope starting at the knee/Corner frequency - think ski slope).

    If you place two of the same simple filters in series - the slope increases. The more you have the steeper it gets --- almost to a vertical wall.

    The term ORDER defines how many of these filters are in series - hence how steep the attenuation slope is.

    The telephone system treats speech within a band of frequencies from roughly 300hz to 3000hz.
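    The effect of Order can be put into numbers with a small sketch. It assumes the textbook model of identical first-order low-pass sections placed in series, each about 3db down at its corner; the actual HLS filter design may differ (many higher-order designs are tuned so the overall response stays 3db down at the corner). The function name is made up.

```python
import math


def cascade_attenuation_db(freq_hz, corner_hz, order):
    """Attenuation in db of `order` identical first-order low-pass sections in series."""
    one_section = 10.0 * math.log10(1.0 + (freq_hz / corner_hz) ** 2)
    return order * one_section


if __name__ == "__main__":
    # How much a 3000hz tone is reduced by a 300hz Low Pass at several orders.
    for order in (1, 2, 5, 10):
        print(order, round(cascade_attenuation_db(3000, 300, order), 1))
```

    Each added section stacks another copy of the same slope on top, which is why the skirt looks more and more like a wall as the Order goes up.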

    Your BASS - or Beat in the music is usually less than 300hz.

    So ... as some examples ....

    If you wanted to set a Beat Track in your sequence - let your eyes help.

    Use a Low Pass filter set to 300hz - order depending on the response you want. Watch the Gain - you don't want Clipping - as shown by audio line segments hitting the top or the bottom of the audio track display. Clipping will produce clicks/pops in the audio.

    Process the Low Pass .... most of your voice will be gone as well as a lot of instrumentals. The Bass Notes will appear - so you can set your Beat effects by both listening to them and seeing them.
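    For anyone who wants to see the idea outside of HLS, here is a sketch of the simplest possible Low Pass: a first-order filter in plain Python, plus a clipping check. It is not the HLS filter code. The function names, the test tones, and the assumption that samples are floats in the range -1.0 to 1.0 are all for illustration.

```python
import math


def low_pass(samples, sample_rate_hz, corner_hz, gain=1.0):
    """First-order (RC style) low-pass filter over a list of float samples."""
    rc = 1.0 / (2.0 * math.pi * corner_hz)
    dt = 1.0 / sample_rate_hz
    alpha = dt / (rc + dt)
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)  # move part of the way toward the new sample
        out.append(gain * y)
    return out


def is_clipping(samples, limit=1.0):
    """True if any sample reaches the top or bottom of the display range."""
    return any(abs(s) >= limit for s in samples)


def sine(freq_hz, sample_rate_hz, count):
    """Generate `count` samples of a full-scale sine tone."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate_hz) for i in range(count)]


if __name__ == "__main__":
    rate = 44100
    bass = low_pass(sine(100, rate, rate), rate, corner_hz=300)
    bell = low_pass(sine(4000, rate, rate), rate, corner_hz=300)
    print("bass peak:", round(max(abs(s) for s in bass[rate // 2:]), 3))
    print("bell peak:", round(max(abs(s) for s in bell[rate // 2:]), 3))
    print("clipping:", is_clipping(bass))
```

    The 100hz tone comes through nearly untouched while the 4000hz tone is cut to a small fraction of its level, which is what makes the beat stand out in the audio display.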

    Let's say you want to have a lighting effect whenever a hi-pitch bell rings.

    Use a Hi-Pass --- find the Corner Frequency that best gets rid of the vocals and audio with frequencies lower than the Bell's.

    Use a Band Pass when you want audio in the middle of the frequency spectrum while ignoring those above and below corner points.

    When doing Mouth animation - HLS wants you to place a WORD effect that shows the position and duration of the word.

    Using a Band Pass filter will help you SEE the word as well as hear it.

    Again - speech is primarily in the 300 to 3000hz range - so first try a Band Pass at 2000 with a BandWidth of 2000.

    That will give you corner frequencies at 1000hz and 3000hz. Play with the frequency inputs and Order to See the words as well as still be able to hear them.

    Please note - these are powerful DSP filters - they WILL attenuate your audio to nothing if not used properly.

    Also note --- The DSP filters are being used in HLS to make certain aspects of the audio recognizable - NOT without distortion.

    Odds are --- when you apply the BandPass to extract the Vocals - the result will sound very tinny - BUT - you will be able to see the words.

    HLS also has a check box at the bottom of the DSP Filter Dialog - called "Audio Based on Vocals".

    If this box is checked - I move from processing the audio as two channel audio to one channel MONO.

    I do this by adding the two channels together (you may need a Gain of .9 or .8 if Clipping occurs).

    A lot of music has the Vocals "In The Center" - meaning the vocals are in both channels (left and right) and are in-phase. By adding the channels together --- the vocals reinforce each other and get louder, while the instrumentals (which are not in-phase in both channels) add up less strongly and can partially cancel.

    Hence - the vocals are somewhat lifted above the music.
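    A tiny sketch of the channel sum, using an idealized case. It is not HLS code: the function name to_mono is made up, and the example deliberately puts the instrument exactly out of phase between left and right, which real music rarely does, so expect only partial reduction in practice.

```python
def to_mono(left, right, gain=1.0):
    """Add the left and right channels sample by sample, then apply gain."""
    return [gain * (l + r) for l, r in zip(left, right)]


if __name__ == "__main__":
    vocal = [0.3, -0.2, 0.4]       # identical in both channels (centered)
    instrument = [0.1, 0.1, -0.1]  # opposite sign in the two channels
    left = [v + i for v, i in zip(vocal, instrument)]
    right = [v - i for v, i in zip(vocal, instrument)]
    # The instrument drops out and the vocal doubles; gain of .9 leaves headroom.
    print(to_mono(left, right, gain=0.9))
```

    Because the centered material doubles in level, this is where the Gain of .9 or .8 comes in to keep the result from Clipping.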

    Please note ... I have found that the BASS tends to be centered also --- so the BASS will also be elevated. Make sure your BandPass is of sufficient Order and the low Corner is such to remove most of the Bass.

    I hope that helps.

    Enjoy.

    Joe

  10. #10
    Join Date
    Nov 2010
    Location
    Charlotte NC
    Posts
    646

    Default Re: HLS and Face Animation

    Thanks Joe... It does help

    Now off to messing with it!
