[CVPR 2026] Generate talking face videos with the desired emotion driven by speech — no reference image needed.
Note: The source image and driving video must be pre-cropped to face crops of 256×256 pixels. See crop_image2.py for preprocessing.
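
For a rough idea of what the cropping step involves, here is a minimal sketch of computing a square crop region around a detected face. It is not the logic of crop_image2.py. The function name `square_crop_box`, the `margin` parameter, and the example coordinates are all hypothetical, and the face box is assumed to come from whatever face detector you already use.

```python
def square_crop_box(face_box, image_size, margin=0.3):
    """Return a square (left, top, right, bottom) box around a face box.

    face_box: (left, top, right, bottom) in pixels, from any face detector.
    image_size: (width, height) of the full frame.
    margin: fraction of the face size added as padding (assumed default).
    """
    left, top, right, bottom = face_box
    width, height = image_size

    # Side of the square: the larger face dimension plus padding,
    # capped so the square never exceeds the frame.
    side = max(right - left, bottom - top) * (1 + margin)
    side = int(round(min(side, width, height)))

    # Center the square on the face, then shift it back inside the frame.
    cx = (left + right) / 2
    cy = (top + bottom) / 2
    x0 = int(round(cx - side / 2))
    y0 = int(round(cy - side / 2))
    x0 = max(0, min(x0, width - side))
    y0 = max(0, min(y0, height - side))

    return (x0, y0, x0 + side, y0 + side)


# Resolution required by the note above.
TARGET_SIZE = (256, 256)

# Hypothetical face box inside a 640x480 frame.
box = square_crop_box((100, 80, 300, 320), (640, 480))
print(box)
```

The resulting box can then be cropped and resized to `TARGET_SIZE` with an image library, for example `Image.open(path).crop(box).resize(TARGET_SIZE)` in Pillow. Using a square box before resizing avoids stretching the face when scaling to 256×256.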