Lip-sync, Spine and automating animations

chrisj

I know that there has been some work done with Papagayo and lip-sync, but we are considering using the commercial solution from Annosoft. It does phoneme extraction as well as mouth positions, but its standard integrations are for 3D models, so we will have to roll our own for Spine. What I am wondering is how best to do the transitions from mouth position to mouth position. (Our style is cartoon-y, not realistic, so there is some leeway here.)

Our animator will start by creating 10 or so mouth positions with rigged lips/teeth, and we can map each phoneme to a mouth position. The question is: is there a programmatic way of interpolating between mouth positions? With no interpolation at all we could just use the 10 mouth positions as separate attachments and switch attachments as needed, but surely we can do better. The animator could also create 10*9 animations, one from each mouth position to every other, but that seems time consuming. I was thinking of generating those animations programmatically and adding them to the exported JSON. I have not looked into it in detail, but it seems doable. That would also mean a post-process step on every export, which is acceptable but not desirable. Am I missing something in the tool itself that would make this straightforward?
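To make the no-interpolation baseline concrete, here is a rough Python sketch of the phoneme-to-mouth mapping and attachment switching described above. The phoneme labels, attachment names, and the `(time, phoneme)` input shape are all made up for illustration; they are not Annosoft's actual output format.

```python
# Hypothetical phoneme -> mouth attachment mapping; all names are illustrative.
PHONEME_TO_MOUTH = {
    "AA": "mouth_aah",
    "OW": "mouth_oh",
    "M": "mouth_closed",
    "B": "mouth_closed",
    "P": "mouth_closed",
    "F": "mouth_fv",
    "V": "mouth_fv",
}
DEFAULT_MOUTH = "mouth_rest"  # used for any phoneme not in the table


def attachment_keys(timed_phonemes):
    """Turn (time_seconds, phoneme) pairs into attachment switch keys.

    Consecutive phonemes that map to the same mouth produce a single key,
    so the attachment is only switched when the mouth actually changes.
    """
    keys = []
    for time, phoneme in sorted(timed_phonemes):
        mouth = PHONEME_TO_MOUTH.get(phoneme, DEFAULT_MOUTH)
        if keys and keys[-1]["name"] == mouth:
            continue
        keys.append({"time": round(time, 3), "name": mouth})
    return keys


if __name__ == "__main__":
    track = [(0.0, "M"), (0.1, "B"), (0.25, "AA"), (0.5, "OW")]
    for key in attachment_keys(track):
        print(key)
```

Each resulting key is a point where the mouth slot would switch attachments. Any interpolation scheme would sit between those keys.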

Nate

How would you interpolate the mouths? Are they images? If so, maybe you can scale the current mouth toward the size of the next mouth. Matching it exactly might stretch some mouths too far, but scaling it x% of the way might be nice. One way to do this would be to make each mouth a mesh with 4 vertices, then move the vertices x% of the distance from the original current mouth position to the new mouth position of the same vertex. I'm not sure what other kind of transform you could do between mouths.
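As a minimal sketch of the "move the vertices x% of the distance" idea, assuming both mouths are meshes with the same number of vertices in the same order (the function name and the sample coordinates are hypothetical):

```python
def lerp_vertices(current, target, fraction):
    """Move each vertex `fraction` of the way from `current` to `target`.

    `current` and `target` are lists of (x, y) tuples with matching order.
    fraction=0.0 returns the current shape, fraction=1.0 the target shape.
    """
    if len(current) != len(target):
        raise ValueError("meshes must share the same vertex count")
    return [
        (cx + (tx - cx) * fraction, cy + (ty - cy) * fraction)
        for (cx, cy), (tx, ty) in zip(current, target)
    ]


if __name__ == "__main__":
    # Two 4-vertex quads: a small mouth and one twice as large.
    small = [(0, 0), (10, 0), (10, 4), (0, 4)]
    large = [(0, 0), (20, 0), (20, 8), (0, 8)]
    print(lerp_vertices(small, large, 0.5))
```

Going only part of the way (say 0.3 to 0.5) before swapping the image is what keeps a mouth from being stretched too far.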

amcleod

I've had some success with this using some rigging & weights, in a manner similar to what Nate described. It's a bit of a curious setup, but it looks something like this:

Starting with an "oh" phoneme shape, create a mesh that maps the lips of the mouth.
Create a copy of this mesh for every other mouth shape (duplicate the mesh and change the image path).
Reposition the vertices on each mesh to be in the appropriate positions for the mouth pose. The idea is to have all of your mouth shapes share a common set of vertices.
Create a set of bones to control parts of the mouth.
For each mouth shape, position the bones in the appropriate locations, and bind them to the mesh. Be sure to position the bones before binding them to the mesh.
After you've bound the bones to all the mouth shapes, restore the bones to their original position for the "oh" phoneme (or whichever phoneme you want for your setup pose).

This will give you a setup where you can move the mouth bones around, and each mouth shape will distort to match the bone position as closely as possible. This way you don't have to create specific animations to transition from one mouth shape to another. You just move the bones, and change the visible attachment at the correct point in time. So for example, to transition from "oh" to "aah", stretch the edges of the mouth wide, and narrow the center gap for the lips.

The key is that the mouth bones must be in the proper position for the mouth shape when they are bound to the mesh. You can use weighting to smooth out how each particular shape distorts when the bone positions move around. Some shapes don't always distort well, but with the right weighting, I found this setup surprisingly effective.
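If you wanted to generate these transitions in a post-process step rather than key them by hand, something like the following Python sketch could work. It moves the mouth bones from one shape's pose to the next and switches the visible attachment partway through. The bone names, slot name, and pose values are hypothetical. The dictionary layout (slots with attachment keys, bones with translate keys relative to the setup pose) is my understanding of Spine's JSON animation export, so treat it as an assumption and check it against the format documentation for your Spine version.

```python
def transition(slot, bone_poses, from_mouth, to_mouth, start, end, switch_at=0.5):
    """Build one animation fragment going from `from_mouth` to `to_mouth`.

    `bone_poses` maps mouth name -> {bone name: (x, y)}, where (x, y) is
    assumed to be the bone's offset from the setup pose for that mouth.
    `switch_at` is the fraction of the transition at which the visible
    attachment changes.
    """
    switch_time = start + (end - start) * switch_at
    bones = {}
    for bone, (x1, y1) in bone_poses[to_mouth].items():
        x0, y0 = bone_poses[from_mouth][bone]
        bones[bone] = {
            "translate": [
                {"time": start, "x": x0, "y": y0},
                {"time": end, "x": x1, "y": y1},
            ]
        }
    slots = {
        slot: {
            "attachment": [
                {"time": start, "name": from_mouth},
                {"time": switch_time, "name": to_mouth},
            ]
        }
    }
    return {"slots": slots, "bones": bones}


if __name__ == "__main__":
    # Hypothetical poses: "aah" pulls the mouth corners outward.
    poses = {
        "oh": {"lip_left": (0, 0), "lip_right": (0, 0)},
        "aah": {"lip_left": (-6, 0), "lip_right": (6, 0)},
    }
    print(transition("mouth", poses, "oh", "aah", 0.0, 0.2))
```

Because every mouth mesh is bound to the same bones, only the bone keys and one attachment switch are needed per transition, not a separate hand-made animation for each pair of shapes.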