With interactive converter you can change any parameter of any frame and see the result in real time.
Converter: added motion_blur_power param.
Motion blur is applied by precomputed motion vectors.
So the moving face will look more realistic.
RecycleGAN model is removed.
Added experimental AVATAR model. Minimum required VRAM is 6GB (NVIDIA), 12GB (AMD)
Usage:
1) place data_src.mp4 10-20min square resolution video of news reporter sitting at the table with static background,
other faces should not appear in frames.
2) process "extract images from video data_src.bat" with FULL fps
3) place data_dst.mp4 video of face who will control the src face
4) process "extract images from video data_dst FULL FPS.bat"
5) process "data_src mark faces S3FD best GPU.bat"
6) process "data_dst extract unaligned faces S3FD best GPU.bat"
7) train AVATAR.bat stage 1, tune batch size to maximum for your card (32 for 6GB), train to 50k+ iters.
8) train AVATAR.bat stage 2, tune batch size to maximum for your card (4 for 6GB), train to decent sharpness.
9) convert AVATAR.bat
10) converted to mp4.bat
updated versions of modules
Enable autobackup? (y/n ?:help skip:%s) :
Autobackup model files with preview every hour for last 15 hours. Latest backup located in model/<>_autobackups/01
SAE: added option only for CUDA builds:
Enable gradient clipping? (y/n, ?:help skip:%s) :
Gradient clipping reduces chance of model collapse, sacrificing speed of training.
Pretrain the model with large amount of various faces. This technique may help to train the fake with overly different face shapes and light conditions of src/dst data. Face will be look more like a morphed. To reduce the morph effect, some model files will be initialized but not be updated after pretrain: LIAE: inter_AB.h5 DF: both decoders.h5. The longer you pretrain the model the more morphed face will look. After that, save and run the training again.
Fixed bug when SAE can be collapsed during a time.
SAE: removed CA weights and encoder/decoder dims.
added new options:
Encoder dims per channel (21-85 ?:help skip:%d)
More encoder dims help to recognize more facial features, but require more VRAM. You can fine-tune model size to fit your GPU.
Decoder dims per channel (11-85 ?:help skip:%d)
More decoder dims help to get better details, but require more VRAM. You can fine-tune model size to fit your GPU.
Add residual blocks to decoder? (y/n, ?:help skip:n) :
These blocks help to get better details, but require more computing time.
Remove gray border? (y/n, ?:help skip:n) :
Removes gray border of predicted face, but requires more computing resources.