This commit is contained in:
iperov 2018-06-04 17:12:43 +04:00
parent 73de93b4f1
commit 6bd5a44264
71 changed files with 8448 additions and 0 deletions

16
.github/ISSUE_TEMPLATE.md vendored Normal file
View file

@ -0,0 +1,16 @@
## Expected behavior
*Describe, in some detail, what you are trying to do and what the output is that you expect from the program.*
## Actual behavior
*Describe, in some detail, what the program does instead. Be sure to include any error message or screenshots.*
## Steps to reproduce
*Describe, in some detail, the steps you tried that resulted in the behavior described above.*
## Other relevant information
- **Command line used (if not specified in steps to reproduce)**: main.py ...
- **Operating system and version:** Windows, macOS, Linux
- **Python version:** 3.5, 3.6.4, ...

15
.gitignore vendored Normal file
View file

@ -0,0 +1,15 @@
*
!*.py
!*.md
!*.txt
!*.jpg
!requirements*
!doc
!facelib
!gpufmkmgr
!localization
!mainscripts
!mathlib
!models
!nnlib
!utils

5
CODEGUIDELINES Normal file
View file

@ -0,0 +1,5 @@
Please don't ruin the code or this (in my opinion, good) architecture.
Please follow the same logic and brevity/pithiness.
Don't abstract the code into huge classes if you only save a few lines of code in one place, because this can prevent programmers from understanding it quickly.

116
README.md Normal file
View file

@ -0,0 +1,116 @@
## **DeepFaceLab** is a tool that utilizes deep learning to recognize and swap faces in pictures and videos.
Based on the original FaceSwap repo. **Facesets** from FaceSwap or FakeApp are **not compatible** with this repo; you will need to run the extractor again.
### **Features**:
- new models
- new architecture, easy to experiment with models
- works on old 2GB cards, such as the GT730. Example of a fake trained in 18 hours on a 2GB GTX 850M notebook: https://www.youtube.com/watch?v=bprVuRxBA34
- face data embedded into PNG files
- automatic GPU manager: chooses the best GPU(s) and supports --multi-gpu
- new preview window
- parallelized extractor
- parallelized converter
- added a **--debug** option for all stages
- added an **MTCNN extractor**, which produces less jittery aligned faces than the dlib CNN but can produce more false positives. Comparison of dlib (left) vs MTCNN on a hard case:
![](https://i.imgur.com/5qLiiOV.gif)
MTCNN produces less jitter.
- added a **Manual extractor**. You can fix missed faces manually or do a fully manual extract; click on the video:
[![Watch the video](https://i.imgur.com/BDrPKR2.jpg)](https://webm.video/i/ogL0DL.mp4)
![Result](https://user-images.githubusercontent.com/8076202/38454756-0fa7a86c-3a7e-11e8-9065-182b4a8a7a43.gif)
- standalone, zero-dependency, ready-to-use prebuilt binary for all Windows versions, see below
### **Model types**:
- **H64 (2GB+)** - half-face model at 64px resolution. Like the original FakeApp or FaceSwap, but with the new TensorFlow 1.8 DSSIM loss function, a separate mask decoder, and a better ConverterMasked. On 2GB and 3GB VRAM the model works in reduced mode.
* H64 Robert Downey Jr.:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/H64_Downey_0.jpg)
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/H64_Downey_1.jpg)
- **H128 (3GB+)** - like H64, but at 128px resolution. Better face details. On 3GB and 4GB VRAM the model works in reduced mode.
* H128 Cage:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/H128_Cage_0.jpg)
* H128 Asian face on a blurry target:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/H128_Asian_0.jpg)
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/H128_Asian_1.jpg)
- **DF (5GB+)** - @dfaker's model. Like H128, but a full-face model.
* DF example - later
- **LIAEF128 (5GB+)** - new model: the result of combining DF, IAE, and further experiments. The model tries to morph the src face to dst while keeping the facial features of the src face, with less aggressive morphing. The model has problems recognizing closed eyes.
* LIAEF128 Cage:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/LIAEF128_Cage_0.jpg)
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/LIAEF128_Cage_1.jpg)
* LIAEF128 Cage video:
* [![Watch the video](https://img.youtube.com/vi/mRsexePEVco/0.jpg)](https://www.youtube.com/watch?v=mRsexePEVco)
- **LIAEF128YAW (5GB+)** - currently in testing. Useful when your src faceset has too many side faces compared to the dst faceset. It feeds the NN samples sorted by yaw.
- **MIAEF128 (5GB+)** - like LIAEF128, but it also tries to match brightness/color features.
* MIAEF128 model diagram:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/MIAEF128_diagramm.png)
* MIAEF128 Ford success case:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/MIAEF128_Ford_0.jpg)
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/MIAEF128_Ford_1.jpg)
* MIAEF128 Cage fail case:
* ![](https://github.com/iperov/OpenDeepFaceSwap/blob/master/doc/MIAEF128_Cage_fail.jpg)
- **AVATAR (4GB+)** - face-controlling model. Usage:
* src - the controllable face (Cage)
* dst - the controller face (your face)
* the converter's --input-dir contains the sequence of aligned dst faces to be converted; this means you can train on 1500 dst faces but use only 100 of them for conversion.
### **Sort tool**:
`hist` groups images by similar content
`hist-dissim` places the images most similar to each other at the end
`hist-blur` sorts by blur within groups of similar content
`brightness`
`hue`
`face` and `face-dissim` are currently useless
Best practice for gathering a src faceset (example commands below):
1) first, delete whatever unsorted aligned groups of images you can. Don't touch the target face where it is mixed in with others.
2) `blur` -> delete roughly half of them
3) `hist` -> delete groups of similar images and leave only the target face
4) `hist-blur` -> delete the blurred images at the end of each group of similar images
5) `hist-dissim` -> keep only the first **1000-1500 faces**, because the number of src faces can affect the result. For the YAW feeder model, skip this step.
6) `face-yaw` -> just to finalize the faceset
Best practice for dst faces:
1) first, delete whatever unsorted aligned groups of images you can. Don't touch the target face where it is mixed in with others.
2) `hist` -> delete groups of similar images and leave only the target face
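
Example `sort` invocations for the steps above. Directory names are placeholders; the `--by` options are the ones defined in main.py in this commit:

```
python main.py sort --input-dir workspace/data_src/aligned --by blur
python main.py sort --input-dir workspace/data_src/aligned --by hist
python main.py sort --input-dir workspace/data_src/aligned --by hist-blur
python main.py sort --input-dir workspace/data_src/aligned --by hist-dissim
python main.py sort --input-dir workspace/data_src/aligned --by face-yaw
```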
### **Prebuilt binary**:
A zero-dependency binary (except for NVIDIA video drivers) for Windows 7, 8, 8.1, and 10 can be downloaded via torrent.
Torrent page: https://rutracker.org/forum/viewtopic.php?p=75318742 (magnet link inside)
### **Facesets**:
- Nicolas Cage.
- Cage/Trump workspace
Download from here: https://mega.nz/#F!y1ERHDaL!PPwg01PQZk0FhWLVo5_MaQ
### **Pull requesting**:
I understand that some people want to help. But the result of mass contribution can be seen in deepfakes\faceswap.
There is a high chance I will decline a PR. So, to save your time, better to ask me before a PR what you want to change or add.

BIN
doc/H128_Asian_0.jpg Normal file

Binary file not shown (83 KiB).

BIN
doc/H128_Asian_1.jpg Normal file

Binary file not shown (86 KiB).

BIN
doc/H128_Cage_0.jpg Normal file

Binary file not shown (92 KiB).

BIN
doc/H64_Downey_0.jpg Normal file

Binary file not shown (42 KiB).

BIN
doc/H64_Downey_1.jpg Normal file

Binary file not shown (42 KiB).

BIN
doc/LIAEF128_Cage_0.jpg Normal file

Binary file not shown (112 KiB).

BIN
doc/LIAEF128_Cage_1.jpg Normal file

Binary file not shown (112 KiB).

BIN
doc/MIAEF128_Cage_fail.jpg Normal file

Binary file not shown (92 KiB).

BIN
doc/MIAEF128_Ford_0.jpg Normal file

Binary file not shown (77 KiB).

BIN
doc/MIAEF128_Ford_1.jpg Normal file

Binary file not shown (75 KiB).

BIN
doc/MIAEF128_diagramm.png Normal file

Binary file not shown (31 KiB).

BIN
doc/landmarks.jpg Normal file

Binary file not shown (450 KiB).

BIN
facelib/2DFAN-4.h5 Normal file

Binary file not shown.

40
facelib/DLIBExtractor.py Normal file
View file

@ -0,0 +1,40 @@
import numpy as np
import os
import cv2
from pathlib import Path
class DLIBExtractor(object):
def __init__(self, dlib):
self.scale_to = 1850
#3100 eats ~1.687GB VRAM on a 2GB 730 desktop card, but >4GB on a 6GB card,
#and 3100 doesn't work on a 2GB 850M notebook card; I can't explain this behaviour.
#1850 works on the 2GB 850M notebook card, runs faster than 3100, and produces a good result.
self.dlib = dlib
def __enter__(self):
self.dlib_cnn_face_detector = self.dlib.cnn_face_detection_model_v1( str(Path(__file__).parent / "mmod_human_face_detector.dat") )
self.dlib_cnn_face_detector ( np.zeros ( (self.scale_to, self.scale_to, 3), dtype=np.uint8), 0 ) #warm-up run: pre-allocates VRAM at the working resolution
return self
def __exit__(self, exc_type=None, exc_value=None, traceback=None):
del self.dlib_cnn_face_detector
return False #pass exceptions between __enter__ and __exit__ to the outer level
def extract_from_bgr (self, input_image):
input_image = input_image[:,:,::-1].copy() #BGR -> RGB, since dlib expects RGB input
(h, w, ch) = input_image.shape
detected_faces = []
input_scale = self.scale_to / (w if w > h else h)
input_image = cv2.resize (input_image, ( int(w*input_scale), int(h*input_scale) ), interpolation=cv2.INTER_LINEAR)
detected_faces = self.dlib_cnn_face_detector(input_image, 0)
result = []
for d_rect in detected_faces:
if type(d_rect) == self.dlib.mmod_rectangle:
d_rect = d_rect.rect
left, top, right, bottom = d_rect.left(), d_rect.top(), d_rect.right(), d_rect.bottom()
result.append ( (int(left/input_scale), int(top/input_scale), int(right/input_scale), int(bottom/input_scale)) )
return result
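
A minimal usage sketch of the class above. The `dlib` module is assumed to come from `gpufmkmgr.import_dlib` (defined elsewhere in this commit), and `frame.jpg` is a placeholder path:

```python
import cv2
import gpufmkmgr
from facelib import DLIBExtractor

dlib = gpufmkmgr.import_dlib(0)                # bind dlib to GPU 0 (assumption)
image = cv2.imread('frame.jpg')                # OpenCV loads images as BGR
with DLIBExtractor(dlib) as extractor:         # __enter__ loads and warms up the detector
    rects = extractor.extract_from_bgr(image)  # [(left, top, right, bottom), ...] in source coordinates
print(rects)
```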

34
facelib/FaceType.py Normal file
View file

@ -0,0 +1,34 @@
from enum import IntEnum
class FaceType(IntEnum):
HALF = 0
FULL = 1
HEAD = 2
AVATAR = 3 #centered nose only
MARK_ONLY = 4 #no align at all, just embedded faceinfo
QTY = 5
@staticmethod
def fromString (s):
r = from_string_dict.get (s.lower())
if r is None:
raise Exception ('FaceType.fromString value error')
return r
@staticmethod
def toString (face_type):
return to_string_list[face_type]
from_string_dict = {'half_face': FaceType.HALF,
'full_face': FaceType.FULL,
'head' : FaceType.HEAD,
'avatar' : FaceType.AVATAR,
'mark_only' : FaceType.MARK_ONLY,
}
to_string_list = [ 'half_face',
'full_face',
'head',
'avatar',
'mark_only'
]
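
A short round-trip example of the string helpers above:

```python
from facelib import FaceType

ft = FaceType.fromString('full_face')        # -> FaceType.FULL
assert FaceType.toString(ft) == 'full_face'
assert int(ft) == 1                          # IntEnum members compare as plain ints
```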

133
facelib/LandmarksExtractor.py Normal file
View file

@ -0,0 +1,133 @@
import numpy as np
import os
import cv2
from pathlib import Path
from utils import std_utils
def transform(point, center, scale, resolution):
#maps a point from the (resolution x resolution) crop/heatmap space back to original-image coordinates
pt = np.array ( [point[0], point[1], 1.0] )
h = 200.0 * scale
m = np.eye(3)
m[0,0] = resolution / h
m[1,1] = resolution / h
m[0,2] = resolution * ( -center[0] / h + 0.5 )
m[1,2] = resolution * ( -center[1] / h + 0.5 )
m = np.linalg.inv(m)
return np.matmul (m, pt)[0:2]
def crop(image, center, scale, resolution=256.0):
#cuts a square patch around `center` at `scale` and resizes it to `resolution`, zero-padding where the crop leaves the image
ul = transform([1, 1], center, scale, resolution).astype( np.int )
br = transform([resolution, resolution], center, scale, resolution).astype( np.int )
if image.ndim > 2:
newDim = np.array([br[1] - ul[1], br[0] - ul[0], image.shape[2]], dtype=np.int32)
newImg = np.zeros(newDim, dtype=np.uint8)
else:
newDim = np.array([br[1] - ul[1], br[0] - ul[0]], dtype=np.int)
newImg = np.zeros(newDim, dtype=np.uint8)
ht = image.shape[0]
wd = image.shape[1]
newX = np.array([max(1, -ul[0] + 1), min(br[0], wd) - ul[0]], dtype=np.int32)
newY = np.array([max(1, -ul[1] + 1), min(br[1], ht) - ul[1]], dtype=np.int32)
oldX = np.array([max(1, ul[0] + 1), min(br[0], wd)], dtype=np.int32)
oldY = np.array([max(1, ul[1] + 1), min(br[1], ht)], dtype=np.int32)
newImg[newY[0] - 1:newY[1], newX[0] - 1:newX[1] ] = image[oldY[0] - 1:oldY[1], oldX[0] - 1:oldX[1], :]
newImg = cv2.resize(newImg, dsize=(int(resolution), int(resolution)), interpolation=cv2.INTER_LINEAR)
return newImg
def get_pts_from_predict(a, center, scale):
#decodes landmark positions from the model's heatmaps (argmax per map, plus a 0.25px shift toward the gradient)
b = a.reshape ( (a.shape[0], a.shape[1]*a.shape[2]) )
c = b.argmax(1).reshape ( (a.shape[0], 1) ).repeat(2, axis=1).astype(np.float)
c[:,0] %= a.shape[2]
c[:,1] = np.apply_along_axis ( lambda x: np.floor(x / a.shape[2]), 0, c[:,1] )
for i in range(a.shape[0]):
pX, pY = int(c[i,0]), int(c[i,1])
if pX > 0 and pX < 63 and pY > 0 and pY < 63:
diff = np.array ( [a[i,pY,pX+1]-a[i,pY,pX-1], a[i,pY+1,pX]-a[i,pY-1,pX]] )
c[i] += np.sign(diff)*0.25
c += 0.5
return [ transform (c[i], center, scale, a.shape[2]) for i in range(a.shape[0]) ]
class LandmarksExtractor(object):
def __init__ (self, keras):
self.keras = keras
K = self.keras.backend
class TorchBatchNorm2D(self.keras.engine.topology.Layer):
def __init__(self, axis=-1, momentum=0.99, epsilon=1e-3, **kwargs):
super(TorchBatchNorm2D, self).__init__(**kwargs)
self.supports_masking = True
self.axis = axis
self.momentum = momentum
self.epsilon = epsilon
def build(self, input_shape):
dim = input_shape[self.axis]
if dim is None:
raise ValueError('Axis ' + str(self.axis) + ' of ' 'input tensor should have a defined dimension ' 'but the layer received an input with shape ' + str(input_shape) + '.')
shape = (dim,)
self.gamma = self.add_weight(shape=shape, name='gamma', initializer='ones', regularizer=None, constraint=None)
self.beta = self.add_weight(shape=shape, name='beta', initializer='zeros', regularizer=None, constraint=None)
self.moving_mean = self.add_weight(shape=shape, name='moving_mean', initializer='zeros', trainable=False)
self.moving_variance = self.add_weight(shape=shape, name='moving_variance', initializer='ones', trainable=False)
self.built = True
def call(self, inputs, training=None):
input_shape = K.int_shape(inputs)
broadcast_shape = [1] * len(input_shape)
broadcast_shape[self.axis] = input_shape[self.axis]
broadcast_moving_mean = K.reshape(self.moving_mean, broadcast_shape)
broadcast_moving_variance = K.reshape(self.moving_variance, broadcast_shape)
broadcast_gamma = K.reshape(self.gamma, broadcast_shape)
broadcast_beta = K.reshape(self.beta, broadcast_shape)
invstd = K.ones (shape=broadcast_shape, dtype='float32') / K.sqrt(broadcast_moving_variance + K.constant(self.epsilon, dtype='float32'))
return (inputs - broadcast_moving_mean) * invstd * broadcast_gamma + broadcast_beta
def get_config(self):
config = { 'axis': self.axis, 'momentum': self.momentum, 'epsilon': self.epsilon }
base_config = super(TorchBatchNorm2D, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
self.TorchBatchNorm2D = TorchBatchNorm2D
def __enter__(self):
keras_model_path = Path(__file__).parent / "2DFAN-4.h5"
if not keras_model_path.exists():
return None
self.keras_model = self.keras.models.load_model ( str(keras_model_path), custom_objects={'TorchBatchNorm2D': self.TorchBatchNorm2D} )
return self
def __exit__(self, exc_type=None, exc_value=None, traceback=None):
del self.keras_model
return False #pass exceptions between __enter__ and __exit__ to the outer level
def extract_from_bgr (self, input_image, rects):
input_image = input_image[:,:,::-1].copy() #BGR -> RGB for the landmark model
(h, w, ch) = input_image.shape
landmarks = []
for (left, top, right, bottom) in rects:
center = np.array( [ (left + right) / 2.0, (top + bottom) / 2.0] )
center[1] -= (bottom - top) * 0.12
scale = (right - left + bottom - top) / 195.0
image = crop(input_image, center, scale).transpose ( (2,0,1) ).astype(np.float32) / 255.0
image = np.expand_dims(image, 0)
with std_utils.suppress_stdout_stderr():
predicted = self.keras_model.predict (image)
pts_img = get_pts_from_predict ( predicted[-1][0], center, scale)
pts_img = [ ( int(pt[0]), int(pt[1]) ) for pt in pts_img ]
landmarks.append ( ( (left, top, right, bottom),pts_img ) )
return landmarks
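
A hedged sketch of the full detection -> landmarks chain. The framework modules are assumed to come from gpufmkmgr (TF must be imported before Keras), device 0 and 'frame.jpg' are placeholders, and the real extractor scripts manage devices and subprocesses themselves:

```python
import cv2
import gpufmkmgr
from facelib import DLIBExtractor, LandmarksExtractor

tf = gpufmkmgr.import_tf([0], allow_growth=False)
keras = gpufmkmgr.import_keras()
dlib = gpufmkmgr.import_dlib(0)

image = cv2.imread('frame.jpg')  # BGR, as OpenCV loads it
with DLIBExtractor(dlib) as face_ex, LandmarksExtractor(keras) as lm_ex:
    rects = face_ex.extract_from_bgr(image)
    # -> [ ((left, top, right, bottom), [(x, y) * 68]), ... ]
    landmarks = lm_ex.extract_from_bgr(image, rects)
```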

193
facelib/LandmarksProcessor.py Normal file
View file

@ -0,0 +1,193 @@
import colorsys
import cv2
import numpy as np
from enum import IntEnum
from mathlib.umeyama import umeyama
from utils import image_utils
from facelib import FaceType
import math
mean_face_x = np.array([
0.000213256, 0.0752622, 0.18113, 0.29077, 0.393397, 0.586856, 0.689483, 0.799124,
0.904991, 0.98004, 0.490127, 0.490127, 0.490127, 0.490127, 0.36688, 0.426036,
0.490127, 0.554217, 0.613373, 0.121737, 0.187122, 0.265825, 0.334606, 0.260918,
0.182743, 0.645647, 0.714428, 0.793132, 0.858516, 0.79751, 0.719335, 0.254149,
0.340985, 0.428858, 0.490127, 0.551395, 0.639268, 0.726104, 0.642159, 0.556721,
0.490127, 0.423532, 0.338094, 0.290379, 0.428096, 0.490127, 0.552157, 0.689874,
0.553364, 0.490127, 0.42689 ])
mean_face_y = np.array([
0.106454, 0.038915, 0.0187482, 0.0344891, 0.0773906, 0.0773906, 0.0344891,
0.0187482, 0.038915, 0.106454, 0.203352, 0.307009, 0.409805, 0.515625, 0.587326,
0.609345, 0.628106, 0.609345, 0.587326, 0.216423, 0.178758, 0.179852, 0.231733,
0.245099, 0.244077, 0.231733, 0.179852, 0.178758, 0.216423, 0.244077, 0.245099,
0.780233, 0.745405, 0.727388, 0.742578, 0.727388, 0.745405, 0.780233, 0.864805,
0.902192, 0.909281, 0.902192, 0.864805, 0.784792, 0.778746, 0.785343, 0.778746,
0.784792, 0.824182, 0.831803, 0.824182 ])
landmarks_2D = np.stack( [ mean_face_x, mean_face_y ], axis=1 )
def get_transform_mat (image_landmarks, output_size, face_type):
if not isinstance(image_landmarks, np.ndarray):
image_landmarks = np.array (image_landmarks)
if face_type == FaceType.AVATAR:
centroid = np.mean (image_landmarks, axis=0)
mat = umeyama(image_landmarks[17:], landmarks_2D, True)[0:2]
a, c = mat[0,0], mat[1,0]
scale = math.sqrt((a * a) + (c * c))
padding = (output_size / 64) * 32
mat = np.eye ( 2,3 )
mat[0,2] = -centroid[0]
mat[1,2] = -centroid[1]
mat = mat * scale * (output_size / 3)
mat[:,2] += output_size / 2
else:
if face_type == FaceType.HALF:
padding = 0
elif face_type == FaceType.FULL:
padding = (output_size / 64) * 12
elif face_type == FaceType.HEAD:
padding = (output_size / 64) * 24
else:
raise ValueError ('wrong face_type')
mat = umeyama(image_landmarks[17:], landmarks_2D, True)[0:2]
mat = mat * (output_size - 2 * padding)
mat[:,2] += padding
return mat
def transform_points(points, mat, invert=False):
if invert:
mat = cv2.invertAffineTransform (mat)
points = np.expand_dims(points, axis=1)
points = cv2.transform(points, mat, points.shape)
points = np.squeeze(points)
return points
def get_image_hull_mask (image, image_landmarks):
if len(image_landmarks) != 68:
raise Exception('get_image_hull_mask works only with 68 landmarks')
hull_mask = np.zeros(image.shape[0:2]+(1,),dtype=np.float32)
cv2.fillConvexPoly( hull_mask, cv2.convexHull( np.concatenate ( (image_landmarks[0:17], image_landmarks[48:], [image_landmarks[0]], [image_landmarks[8]], [image_landmarks[16]])) ), (1,) )
cv2.fillConvexPoly( hull_mask, cv2.convexHull( np.concatenate ( (image_landmarks[27:31], [image_landmarks[33]]) ) ), (1,) )
cv2.fillConvexPoly( hull_mask, cv2.convexHull( np.concatenate ( (image_landmarks[17:27], [image_landmarks[0]], [image_landmarks[27]], [image_landmarks[16]], [image_landmarks[33]])) ), (1,) )
return hull_mask
def get_image_eye_mask (image, image_landmarks):
if len(image_landmarks) != 68:
raise Exception('get_image_eye_mask works only with 68 landmarks')
hull_mask = np.zeros(image.shape[0:2]+(1,),dtype=np.float32)
cv2.fillConvexPoly( hull_mask, cv2.convexHull( image_landmarks[36:42]), (1,) )
cv2.fillConvexPoly( hull_mask, cv2.convexHull( image_landmarks[42:48]), (1,) )
return hull_mask
def get_image_hull_mask_3D (image, image_landmarks):
result = get_image_hull_mask(image, image_landmarks)
return np.repeat ( result, (3,), -1 )
def blur_image_hull_mask (hull_mask):
maxregion = np.argwhere(hull_mask==1.0)
miny,minx = maxregion.min(axis=0)[:2]
maxy,maxx = maxregion.max(axis=0)[:2]
lenx = maxx - minx
leny = maxy - miny
masky = int(minx+(lenx//2))
maskx = int(miny+(leny//2))
lowest_len = min (lenx, leny)
ero = int( lowest_len * 0.085 )
blur = int( lowest_len * 0.10 )
hull_mask = cv2.erode(hull_mask, cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(ero,ero)), iterations = 1 )
hull_mask = cv2.blur(hull_mask, (blur, blur) )
hull_mask = np.expand_dims (hull_mask,-1)
return hull_mask
def get_blurred_image_hull_mask(image, image_landmarks):
return blur_image_hull_mask ( get_image_hull_mask(image, image_landmarks) )
mirror_idxs = [
[0,16],
[1,15],
[2,14],
[3,13],
[4,12],
[5,11],
[6,10],
[7,9],
[17,26],
[18,25],
[19,24],
[20,23],
[21,22],
[36,45],
[37,44],
[38,43],
[39,42],
[40,47],
[41,46],
[31,35],
[32,34],
[50,52],
[49,53],
[48,54],
[59,55],
[58,56],
[67,65],
[60,64],
[61,63] ]
def mirror_landmarks (landmarks, val):
result = landmarks.copy()
for idx in mirror_idxs:
result [ idx ] = result [ idx[::-1] ]
result[:,0] = val - result[:,0] - 1
return result
def draw_landmarks (image, image_landmarks, color):
for i, (x, y) in enumerate(image_landmarks):
cv2.circle(image, (x, y), 2, color, -1)
#text_color = colorsys.hsv_to_rgb ( (i%4) * (0.25), 1.0, 1.0 )
#cv2.putText(image, str(i), (x, y), cv2.FONT_HERSHEY_SIMPLEX, 0.1,text_color,1)
def draw_rect_landmarks (image, rect, image_landmarks, face_size, face_type):
image_utils.draw_rect (image, rect, (255,0,0), 2 )
draw_landmarks(image, image_landmarks, (0,255,0) )
image_to_face_mat = get_transform_mat (image_landmarks, face_size, face_type)
points = transform_points ( [ (0,0), (0,face_size-1), (face_size-1, face_size-1), (face_size-1,0) ], image_to_face_mat, True)
image_utils.draw_polygon (image, points, (0,0,255), 2)
def calc_face_pitch(landmarks):
if not isinstance(landmarks, np.ndarray):
landmarks = np.array (landmarks)
t = ( (landmarks[6][1]-landmarks[8][1]) + (landmarks[10][1]-landmarks[8][1]) ) / 2.0
b = landmarks[8][1]
return float(b-t)
def calc_face_yaw(landmarks):
if not isinstance(landmarks, np.ndarray):
landmarks = np.array (landmarks)
l = ( (landmarks[27][0]-landmarks[0][0]) + (landmarks[28][0]-landmarks[1][0]) + (landmarks[29][0]-landmarks[2][0]) ) / 3.0
r = ( (landmarks[16][0]-landmarks[27][0]) + (landmarks[15][0]-landmarks[28][0]) + (landmarks[14][0]-landmarks[29][0]) ) / 3.0
return float(r-l)
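
A sketch of the typical use of `get_transform_mat` and `transform_points`: warp the source image to an aligned face crop, then map crop-space points back. The image path is a placeholder, and the random `landmarks` array is only a stand-in for the 68 points that LandmarksExtractor would return:

```python
import cv2
import numpy as np
from facelib import FaceType
from facelib import LandmarksProcessor

image = cv2.imread('frame.jpg')                      # placeholder path (BGR)
landmarks = np.random.rand(68, 2) * image.shape[1]   # stand-in for real landmarks

output_size = 128
mat = LandmarksProcessor.get_transform_mat(landmarks, output_size, FaceType.FULL)
aligned_face = cv2.warpAffine(image, mat, (output_size, output_size))

# Map the aligned crop's corners back to source-image coordinates.
corners = LandmarksProcessor.transform_points(
    np.float32([ (0,0), (0,output_size-1), (output_size-1,output_size-1), (output_size-1,0) ]),
    mat, invert=True)
```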

66
facelib/MTCExtractor.py Normal file
View file

@ -0,0 +1,66 @@
import numpy as np
import os
import cv2
from pathlib import Path
from .mtcnn import *
class MTCExtractor(object):
def __init__(self, keras, tf, tf_session):
self.scale_to = 1920
self.keras = keras
self.tf = tf
self.tf_session = tf_session
self.min_face_size = self.scale_to * 0.042
self.thresh1 = 0.7
self.thresh2 = 0.85
self.thresh3 = 0.6
self.scale_factor = 0.95
'''
self.min_face_size = self.scale_to * 0.042
self.thresh1 = 7
self.thresh2 = 85
self.thresh3 = 6
self.scale_factor = 0.95
'''
def __enter__(self):
with self.tf.variable_scope('pnet2'):
data = self.tf.placeholder(self.tf.float32, (None,None,None,3), 'input')
pnet2 = PNet(self.tf, {'data':data})
pnet2.load(str(Path(__file__).parent/'det1.npy'), self.tf_session)
with self.tf.variable_scope('rnet2'):
data = self.tf.placeholder(self.tf.float32, (None,24,24,3), 'input')
rnet2 = RNet(self.tf, {'data':data})
rnet2.load(str(Path(__file__).parent/'det2.npy'), self.tf_session)
with self.tf.variable_scope('onet2'):
data = self.tf.placeholder(self.tf.float32, (None,48,48,3), 'input')
onet2 = ONet(self.tf, {'data':data})
onet2.load(str(Path(__file__).parent/'det3.npy'), self.tf_session)
self.pnet_fun = self.keras.backend.function([pnet2.layers['data']],[pnet2.layers['conv4-2'], pnet2.layers['prob1']])
self.rnet_fun = self.keras.backend.function([rnet2.layers['data']],[rnet2.layers['conv5-2'], rnet2.layers['prob1']])
self.onet_fun = self.keras.backend.function([onet2.layers['data']],[onet2.layers['conv6-2'], onet2.layers['conv6-3'], onet2.layers['prob1']])
faces, pnts = detect_face ( np.zeros ( (self.scale_to, self.scale_to, 3)), self.min_face_size, self.pnet_fun, self.rnet_fun, self.onet_fun, [ self.thresh1, self.thresh2, self.thresh3 ], self.scale_factor ) #warm-up run on a blank image
return self
def __exit__(self, exc_type=None, exc_value=None, traceback=None):
return False #pass exceptions between __enter__ and __exit__ to the outer level
def extract_from_bgr (self, input_image):
input_image = input_image[:,:,::-1].copy() #BGR -> RGB for the MTCNN networks
(h, w, ch) = input_image.shape
input_scale = self.scale_to / (w if w > h else h)
input_image = cv2.resize (input_image, ( int(w*input_scale), int(h*input_scale) ), interpolation=cv2.INTER_LINEAR)
detected_faces, pnts = detect_face ( input_image, self.min_face_size, self.pnet_fun, self.rnet_fun, self.onet_fun, [ self.thresh1, self.thresh2, self.thresh3 ], self.scale_factor )
detected_faces = [ ( int(face[0]/input_scale), int(face[1]/input_scale), int(face[2]/input_scale), int(face[3]/input_scale)) for face in detected_faces ]
return detected_faces
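
As with the dlib extractor, a hedged usage sketch; `keras`, `tf`, and the session are assumed to come from gpufmkmgr, and the image path is a placeholder:

```python
import cv2
import gpufmkmgr
from facelib import MTCExtractor

tf = gpufmkmgr.import_tf([0], allow_growth=False)
keras = gpufmkmgr.import_keras()
image = cv2.imread('frame.jpg')  # BGR
with MTCExtractor(keras, tf, gpufmkmgr.get_tf_session()) as extractor:
    rects = extractor.extract_from_bgr(image)  # [(left, top, right, bottom), ...]
```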

5
facelib/__init__.py Normal file
View file

@ -0,0 +1,5 @@
from .FaceType import FaceType
from .DLIBExtractor import DLIBExtractor
from .MTCExtractor import MTCExtractor
from .LandmarksExtractor import LandmarksExtractor
from .LandmarksProcessor import *

BIN
facelib/det1.npy Normal file

Binary file not shown.

BIN
facelib/det2.npy Normal file

Binary file not shown.

BIN
facelib/det3.npy Normal file

Binary file not shown.

BIN
facelib/mmod_human_face_detector.dat Normal file

Binary file not shown.

761
facelib/mtcnn.py Normal file
View file

@ -0,0 +1,761 @@
# Source: https://github.com/davidsandberg/facenet/blob/master/src/align/
""" Tensorflow implementation of the face detection / alignment algorithm found at
https://github.com/kpzhang93/MTCNN_face_detection_alignment
"""
# MIT License
#
# Copyright (c) 2016 David Sandberg
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
from six import string_types, iteritems
import numpy as np
#from math import floor
import cv2
import os
def layer(op):
"""Decorator for composable network layers."""
def layer_decorated(self, *args, **kwargs):
# Automatically set a name if not provided.
name = kwargs.setdefault('name', self.get_unique_name(op.__name__))
# Figure out the layer inputs.
if len(self.terminals) == 0:
raise RuntimeError('No input variables found for layer %s.' % name)
elif len(self.terminals) == 1:
layer_input = self.terminals[0]
else:
layer_input = list(self.terminals)
# Perform the operation and get the output.
layer_output = op(self, layer_input, *args, **kwargs)
# Add to layer LUT.
self.layers[name] = layer_output
# This output is now the input for the next layer.
self.feed(layer_output)
# Return self for chained calls.
return self
return layer_decorated
class Network(object):
def __init__(self, tf, inputs, trainable=True):
# The input nodes for this network
self.tf = tf
self.inputs = inputs
# The current list of terminal nodes
self.terminals = []
# Mapping from layer names to layers
self.layers = dict(inputs)
# If true, the resulting variables are set as trainable
self.trainable = trainable
self.setup()
def setup(self):
"""Construct the network. """
raise NotImplementedError('Must be implemented by the subclass.')
def load(self, data_path, session, ignore_missing=False):
"""Load network weights.
data_path: The path to the numpy-serialized network weights
session: The current TensorFlow session
ignore_missing: If true, serialized weights for missing layers are ignored.
"""
data_dict = np.load(data_path, encoding='latin1').item() #pylint: disable=no-member
for op_name in data_dict:
with self.tf.variable_scope(op_name, reuse=True):
for param_name, data in iteritems(data_dict[op_name]):
try:
var = self.tf.get_variable(param_name)
session.run(var.assign(data))
except ValueError:
if not ignore_missing:
raise
def feed(self, *args):
"""Set the input(s) for the next operation by replacing the terminal nodes.
The arguments can be either layer names or the actual layers.
"""
assert len(args) != 0
self.terminals = []
for fed_layer in args:
if isinstance(fed_layer, string_types):
try:
fed_layer = self.layers[fed_layer]
except KeyError:
raise KeyError('Unknown layer name fed: %s' % fed_layer)
self.terminals.append(fed_layer)
return self
def get_output(self):
"""Returns the current network output."""
return self.terminals[-1]
def get_unique_name(self, prefix):
"""Returns an index-suffixed unique name for the given prefix.
This is used for auto-generating layer names based on the type-prefix.
"""
ident = sum(t.startswith(prefix) for t, _ in self.layers.items()) + 1
return '%s_%d' % (prefix, ident)
def make_var(self, name, shape):
"""Creates a new TensorFlow variable."""
return self.tf.get_variable(name, shape, trainable=self.trainable)
def validate_padding(self, padding):
"""Verifies that the padding is one of the supported ones."""
assert padding in ('SAME', 'VALID')
@layer
def conv(self,
inp,
k_h,
k_w,
c_o,
s_h,
s_w,
name,
relu=True,
padding='SAME',
group=1,
biased=True):
# Verify that the padding is acceptable
self.validate_padding(padding)
# Get the number of channels in the input
c_i = int(inp.get_shape()[-1])
# Verify that the grouping parameter is valid
assert c_i % group == 0
assert c_o % group == 0
# Convolution for a given input and kernel
convolve = lambda i, k: self.tf.nn.conv2d(i, k, [1, s_h, s_w, 1], padding=padding)
with self.tf.variable_scope(name) as scope:
kernel = self.make_var('weights', shape=[k_h, k_w, c_i // group, c_o])
# This is the common-case. Convolve the input without any further complications.
output = convolve(inp, kernel)
# Add the biases
if biased:
biases = self.make_var('biases', [c_o])
output = self.tf.nn.bias_add(output, biases)
if relu:
# ReLU non-linearity
output = self.tf.nn.relu(output, name=scope.name)
return output
@layer
def prelu(self, inp, name):
with self.tf.variable_scope(name):
i = int(inp.get_shape()[-1])
alpha = self.make_var('alpha', shape=(i,))
output = self.tf.nn.relu(inp) + self.tf.multiply(alpha, -self.tf.nn.relu(-inp))
return output
@layer
def max_pool(self, inp, k_h, k_w, s_h, s_w, name, padding='SAME'):
self.validate_padding(padding)
return self.tf.nn.max_pool(inp,
ksize=[1, k_h, k_w, 1],
strides=[1, s_h, s_w, 1],
padding=padding,
name=name)
@layer
def fc(self, inp, num_out, name, relu=True):
with self.tf.variable_scope(name):
input_shape = inp.get_shape()
if input_shape.ndims == 4:
# The input is spatial. Vectorize it first.
dim = 1
for d in input_shape[1:].as_list():
dim *= int(d)
feed_in = self.tf.reshape(inp, [-1, dim])
else:
feed_in, dim = (inp, input_shape[-1].value)
weights = self.make_var('weights', shape=[dim, num_out])
biases = self.make_var('biases', [num_out])
op = self.tf.nn.relu_layer if relu else self.tf.nn.xw_plus_b
fc = op(feed_in, weights, biases, name=name)
return fc
"""
Multi dimensional softmax,
refer to https://github.com/tensorflow/tensorflow/issues/210
compute softmax along the dimension of target
the native softmax only supports batch_size x dimension
"""
@layer
def softmax(self, target, axis, name=None):
max_axis = self.tf.reduce_max(target, axis, keepdims=True)
target_exp = self.tf.exp(target-max_axis)
normalize = self.tf.reduce_sum(target_exp, axis, keepdims=True)
softmax = self.tf.div(target_exp, normalize, name)
return softmax
class PNet(Network):
def setup(self):
(self.feed('data') #pylint: disable=no-value-for-parameter, no-member
.conv(3, 3, 10, 1, 1, padding='VALID', relu=False, name='conv1')
.prelu(name='PReLU1')
.max_pool(2, 2, 2, 2, name='pool1')
.conv(3, 3, 16, 1, 1, padding='VALID', relu=False, name='conv2')
.prelu(name='PReLU2')
.conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv3')
.prelu(name='PReLU3')
.conv(1, 1, 2, 1, 1, relu=False, name='conv4-1')
.softmax(3,name='prob1'))
(self.feed('PReLU3') #pylint: disable=no-value-for-parameter
.conv(1, 1, 4, 1, 1, relu=False, name='conv4-2'))
class RNet(Network):
def setup(self):
(self.feed('data') #pylint: disable=no-value-for-parameter, no-member
.conv(3, 3, 28, 1, 1, padding='VALID', relu=False, name='conv1')
.prelu(name='prelu1')
.max_pool(3, 3, 2, 2, name='pool1')
.conv(3, 3, 48, 1, 1, padding='VALID', relu=False, name='conv2')
.prelu(name='prelu2')
.max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
.conv(2, 2, 64, 1, 1, padding='VALID', relu=False, name='conv3')
.prelu(name='prelu3')
.fc(128, relu=False, name='conv4')
.prelu(name='prelu4')
.fc(2, relu=False, name='conv5-1')
.softmax(1,name='prob1'))
(self.feed('prelu4') #pylint: disable=no-value-for-parameter
.fc(4, relu=False, name='conv5-2'))
class ONet(Network):
def setup(self):
(self.feed('data') #pylint: disable=no-value-for-parameter, no-member
.conv(3, 3, 32, 1, 1, padding='VALID', relu=False, name='conv1')
.prelu(name='prelu1')
.max_pool(3, 3, 2, 2, name='pool1')
.conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv2')
.prelu(name='prelu2')
.max_pool(3, 3, 2, 2, padding='VALID', name='pool2')
.conv(3, 3, 64, 1, 1, padding='VALID', relu=False, name='conv3')
.prelu(name='prelu3')
.max_pool(2, 2, 2, 2, name='pool3')
.conv(2, 2, 128, 1, 1, padding='VALID', relu=False, name='conv4')
.prelu(name='prelu4')
.fc(256, relu=False, name='conv5')
.prelu(name='prelu5')
.fc(2, relu=False, name='conv6-1')
.softmax(1, name='prob1'))
(self.feed('prelu5') #pylint: disable=no-value-for-parameter
.fc(4, relu=False, name='conv6-2'))
(self.feed('prelu5') #pylint: disable=no-value-for-parameter
.fc(10, relu=False, name='conv6-3'))
def detect_face(img, minsize, pnet, rnet, onet, threshold, factor):
"""Detects faces in an image, and returns bounding boxes and points for them.
img: input image
minsize: minimum faces' size
pnet, rnet, onet: caffemodel
threshold: threshold=[th1, th2, th3], th1-3 are three steps's threshold
factor: the factor used to create a scaling pyramid of face sizes to detect in the image.
"""
factor_count=0
total_boxes=np.empty((0,9))
points=np.empty(0)
h=img.shape[0]
w=img.shape[1]
minl=np.amin([h, w])
m=12.0/minsize
minl=minl*m
# create scale pyramid
scales=[]
while minl>=12:
scales += [m*np.power(factor, factor_count)]
minl = minl*factor
factor_count += 1
# first stage
for scale in scales:
hs=int(np.ceil(h*scale))
ws=int(np.ceil(w*scale))
#print ('scale %f %d %d' % (scale, ws,hs))
im_data = imresample(img, (hs, ws))
im_data = (im_data-127.5)*0.0078125
img_x = np.expand_dims(im_data, 0)
img_y = np.transpose(img_x, (0,2,1,3))
out = pnet([img_y])
out0 = np.transpose(out[0], (0,2,1,3))
out1 = np.transpose(out[1], (0,2,1,3))
boxes, _ = generateBoundingBox(out1[0,:,:,1].copy(), out0[0,:,:,:].copy(), scale, threshold[0])
# inter-scale nms
pick = nms(boxes.copy(), 0.5, 'Union')
if boxes.size>0 and pick.size>0:
boxes = boxes[pick,:]
total_boxes = np.append(total_boxes, boxes, axis=0)
numbox = total_boxes.shape[0]
if numbox>0:
pick = nms(total_boxes.copy(), 0.7, 'Union')
total_boxes = total_boxes[pick,:]
regw = total_boxes[:,2]-total_boxes[:,0]
regh = total_boxes[:,3]-total_boxes[:,1]
qq1 = total_boxes[:,0]+total_boxes[:,5]*regw
qq2 = total_boxes[:,1]+total_boxes[:,6]*regh
qq3 = total_boxes[:,2]+total_boxes[:,7]*regw
qq4 = total_boxes[:,3]+total_boxes[:,8]*regh
total_boxes = np.transpose(np.vstack([qq1, qq2, qq3, qq4, total_boxes[:,4]]))
total_boxes = rerec(total_boxes.copy())
total_boxes[:,0:4] = np.fix(total_boxes[:,0:4]).astype(np.int32)
dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)
numbox = total_boxes.shape[0]
if numbox>0:
# second stage
tempimg = np.zeros((24,24,3,numbox))
for k in range(0,numbox):
tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3))
tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:]
if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0:
tempimg[:,:,:,k] = imresample(tmp, (24, 24))
else:
return np.empty((0,5)), np.empty(0) #np.empty() without a shape is a TypeError; return empty boxes and points
tempimg = (tempimg-127.5)*0.0078125
tempimg1 = np.transpose(tempimg, (3,1,0,2))
out = rnet([tempimg1])
out0 = np.transpose(out[0])
out1 = np.transpose(out[1])
score = out1[1,:]
ipass = np.where(score>threshold[1])
total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)])
mv = out0[:,ipass[0]]
if total_boxes.shape[0]>0:
pick = nms(total_boxes, 0.7, 'Union')
total_boxes = total_boxes[pick,:]
total_boxes = bbreg(total_boxes.copy(), np.transpose(mv[:,pick]))
total_boxes = rerec(total_boxes.copy())
numbox = total_boxes.shape[0]
if numbox>0:
# third stage
total_boxes = np.fix(total_boxes).astype(np.int32)
dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(total_boxes.copy(), w, h)
tempimg = np.zeros((48,48,3,numbox))
for k in range(0,numbox):
tmp = np.zeros((int(tmph[k]),int(tmpw[k]),3))
tmp[dy[k]-1:edy[k],dx[k]-1:edx[k],:] = img[y[k]-1:ey[k],x[k]-1:ex[k],:]
if tmp.shape[0]>0 and tmp.shape[1]>0 or tmp.shape[0]==0 and tmp.shape[1]==0:
tempimg[:,:,:,k] = imresample(tmp, (48, 48))
else:
return np.empty((0,5)), np.empty(0) #see the note in the second stage: np.empty() needs a shape
tempimg = (tempimg-127.5)*0.0078125
tempimg1 = np.transpose(tempimg, (3,1,0,2))
out = onet([tempimg1])
out0 = np.transpose(out[0])
out1 = np.transpose(out[1])
out2 = np.transpose(out[2])
score = out2[1,:]
points = out1
ipass = np.where(score>threshold[2])
points = points[:,ipass[0]]
total_boxes = np.hstack([total_boxes[ipass[0],0:4].copy(), np.expand_dims(score[ipass].copy(),1)])
mv = out0[:,ipass[0]]
w = total_boxes[:,2]-total_boxes[:,0]+1
h = total_boxes[:,3]-total_boxes[:,1]+1
points[0:5,:] = np.tile(w,(5, 1))*points[0:5,:] + np.tile(total_boxes[:,0],(5, 1))-1
points[5:10,:] = np.tile(h,(5, 1))*points[5:10,:] + np.tile(total_boxes[:,1],(5, 1))-1
if total_boxes.shape[0]>0:
total_boxes = bbreg(total_boxes.copy(), np.transpose(mv))
pick = nms(total_boxes.copy(), 0.7, 'Min')
total_boxes = total_boxes[pick,:]
points = points[:,pick]
return total_boxes, points
def bulk_detect_face(images, detection_window_size_ratio, pnet, rnet, onet, threshold, factor):
"""Detects faces in a list of images
images: list containing input images
detection_window_size_ratio: ratio of minimum face size to smallest image dimension
pnet, rnet, onet: caffemodel
threshold: threshold=[th1 th2 th3], th1-3 are three steps's threshold [0-1]
factor: the factor used to create a scaling pyramid of face sizes to detect in the image.
"""
all_scales = [None] * len(images)
images_with_boxes = [None] * len(images)
for i in range(len(images)):
images_with_boxes[i] = {'total_boxes': np.empty((0, 9))}
# create scale pyramid
for index, img in enumerate(images):
all_scales[index] = []
h = img.shape[0]
w = img.shape[1]
minsize = int(detection_window_size_ratio * np.minimum(w, h))
factor_count = 0
minl = np.amin([h, w])
if minsize <= 12:
minsize = 12
m = 12.0 / minsize
minl = minl * m
while minl >= 12:
all_scales[index].append(m * np.power(factor, factor_count))
minl = minl * factor
factor_count += 1
# # # # # # # # # # # # #
# first stage - fast proposal network (pnet) to obtain face candidates
# # # # # # # # # # # # #
images_obj_per_resolution = {}
# TODO: use some type of rounding to number module 8 to increase probability that pyramid images will have the same resolution across input images
for index, scales in enumerate(all_scales):
h = images[index].shape[0]
w = images[index].shape[1]
for scale in scales:
hs = int(np.ceil(h * scale))
ws = int(np.ceil(w * scale))
if (ws, hs) not in images_obj_per_resolution:
images_obj_per_resolution[(ws, hs)] = []
im_data = imresample(images[index], (hs, ws))
im_data = (im_data - 127.5) * 0.0078125
img_y = np.transpose(im_data, (1, 0, 2)) # caffe uses different dimensions ordering
images_obj_per_resolution[(ws, hs)].append({'scale': scale, 'image': img_y, 'index': index})
for resolution in images_obj_per_resolution:
images_per_resolution = [i['image'] for i in images_obj_per_resolution[resolution]]
outs = pnet(images_per_resolution)
for index in range(len(outs[0])):
scale = images_obj_per_resolution[resolution][index]['scale']
image_index = images_obj_per_resolution[resolution][index]['index']
out0 = np.transpose(outs[0][index], (1, 0, 2))
out1 = np.transpose(outs[1][index], (1, 0, 2))
boxes, _ = generateBoundingBox(out1[:, :, 1].copy(), out0[:, :, :].copy(), scale, threshold[0])
# inter-scale nms
pick = nms(boxes.copy(), 0.5, 'Union')
if boxes.size > 0 and pick.size > 0:
boxes = boxes[pick, :]
images_with_boxes[image_index]['total_boxes'] = np.append(images_with_boxes[image_index]['total_boxes'],
boxes,
axis=0)
for index, image_obj in enumerate(images_with_boxes):
numbox = image_obj['total_boxes'].shape[0]
if numbox > 0:
h = images[index].shape[0]
w = images[index].shape[1]
pick = nms(image_obj['total_boxes'].copy(), 0.7, 'Union')
image_obj['total_boxes'] = image_obj['total_boxes'][pick, :]
regw = image_obj['total_boxes'][:, 2] - image_obj['total_boxes'][:, 0]
regh = image_obj['total_boxes'][:, 3] - image_obj['total_boxes'][:, 1]
qq1 = image_obj['total_boxes'][:, 0] + image_obj['total_boxes'][:, 5] * regw
qq2 = image_obj['total_boxes'][:, 1] + image_obj['total_boxes'][:, 6] * regh
qq3 = image_obj['total_boxes'][:, 2] + image_obj['total_boxes'][:, 7] * regw
qq4 = image_obj['total_boxes'][:, 3] + image_obj['total_boxes'][:, 8] * regh
image_obj['total_boxes'] = np.transpose(np.vstack([qq1, qq2, qq3, qq4, image_obj['total_boxes'][:, 4]]))
image_obj['total_boxes'] = rerec(image_obj['total_boxes'].copy())
image_obj['total_boxes'][:, 0:4] = np.fix(image_obj['total_boxes'][:, 0:4]).astype(np.int32)
dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(image_obj['total_boxes'].copy(), w, h)
numbox = image_obj['total_boxes'].shape[0]
tempimg = np.zeros((24, 24, 3, numbox))
if numbox > 0:
for k in range(0, numbox):
tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3))
tmp[dy[k] - 1:edy[k], dx[k] - 1:edx[k], :] = images[index][y[k] - 1:ey[k], x[k] - 1:ex[k], :]
if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0:
tempimg[:, :, :, k] = imresample(tmp, (24, 24))
else:
return np.empty(0) #originally np.empty(), which raises a TypeError
tempimg = (tempimg - 127.5) * 0.0078125
image_obj['rnet_input'] = np.transpose(tempimg, (3, 1, 0, 2))
# # # # # # # # # # # # #
# second stage - refinement of face candidates with rnet
# # # # # # # # # # # # #
bulk_rnet_input = np.empty((0, 24, 24, 3))
for index, image_obj in enumerate(images_with_boxes):
if 'rnet_input' in image_obj:
bulk_rnet_input = np.append(bulk_rnet_input, image_obj['rnet_input'], axis=0)
out = rnet(bulk_rnet_input)
out0 = np.transpose(out[0])
out1 = np.transpose(out[1])
score = out1[1, :]
i = 0
for index, image_obj in enumerate(images_with_boxes):
if 'rnet_input' not in image_obj:
continue
rnet_input_count = image_obj['rnet_input'].shape[0]
score_per_image = score[i:i + rnet_input_count]
out0_per_image = out0[:, i:i + rnet_input_count]
ipass = np.where(score_per_image > threshold[1])
image_obj['total_boxes'] = np.hstack([image_obj['total_boxes'][ipass[0], 0:4].copy(),
np.expand_dims(score_per_image[ipass].copy(), 1)])
mv = out0_per_image[:, ipass[0]]
if image_obj['total_boxes'].shape[0] > 0:
h = images[index].shape[0]
w = images[index].shape[1]
pick = nms(image_obj['total_boxes'], 0.7, 'Union')
image_obj['total_boxes'] = image_obj['total_boxes'][pick, :]
image_obj['total_boxes'] = bbreg(image_obj['total_boxes'].copy(), np.transpose(mv[:, pick]))
image_obj['total_boxes'] = rerec(image_obj['total_boxes'].copy())
numbox = image_obj['total_boxes'].shape[0]
if numbox > 0:
tempimg = np.zeros((48, 48, 3, numbox))
image_obj['total_boxes'] = np.fix(image_obj['total_boxes']).astype(np.int32)
dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph = pad(image_obj['total_boxes'].copy(), w, h)
for k in range(0, numbox):
tmp = np.zeros((int(tmph[k]), int(tmpw[k]), 3))
tmp[dy[k] - 1:edy[k], dx[k] - 1:edx[k], :] = images[index][y[k] - 1:ey[k], x[k] - 1:ex[k], :]
if tmp.shape[0] > 0 and tmp.shape[1] > 0 or tmp.shape[0] == 0 and tmp.shape[1] == 0:
tempimg[:, :, :, k] = imresample(tmp, (48, 48))
else:
return np.empty(0) #originally np.empty(), which raises a TypeError
tempimg = (tempimg - 127.5) * 0.0078125
image_obj['onet_input'] = np.transpose(tempimg, (3, 1, 0, 2))
i += rnet_input_count
# # # # # # # # # # # # #
# third stage - further refinement and facial landmarks positions with onet
# # # # # # # # # # # # #
bulk_onet_input = np.empty((0, 48, 48, 3))
for index, image_obj in enumerate(images_with_boxes):
if 'onet_input' in image_obj:
bulk_onet_input = np.append(bulk_onet_input, image_obj['onet_input'], axis=0)
out = onet(bulk_onet_input)
out0 = np.transpose(out[0])
out1 = np.transpose(out[1])
out2 = np.transpose(out[2])
score = out2[1, :]
points = out1
i = 0
ret = []
for index, image_obj in enumerate(images_with_boxes):
if 'onet_input' not in image_obj:
ret.append(None)
continue
onet_input_count = image_obj['onet_input'].shape[0]
out0_per_image = out0[:, i:i + onet_input_count]
score_per_image = score[i:i + onet_input_count]
points_per_image = points[:, i:i + onet_input_count]
ipass = np.where(score_per_image > threshold[2])
points_per_image = points_per_image[:, ipass[0]]
image_obj['total_boxes'] = np.hstack([image_obj['total_boxes'][ipass[0], 0:4].copy(),
np.expand_dims(score_per_image[ipass].copy(), 1)])
mv = out0_per_image[:, ipass[0]]
w = image_obj['total_boxes'][:, 2] - image_obj['total_boxes'][:, 0] + 1
h = image_obj['total_boxes'][:, 3] - image_obj['total_boxes'][:, 1] + 1
points_per_image[0:5, :] = np.tile(w, (5, 1)) * points_per_image[0:5, :] + np.tile(
image_obj['total_boxes'][:, 0], (5, 1)) - 1
points_per_image[5:10, :] = np.tile(h, (5, 1)) * points_per_image[5:10, :] + np.tile(
image_obj['total_boxes'][:, 1], (5, 1)) - 1
if image_obj['total_boxes'].shape[0] > 0:
image_obj['total_boxes'] = bbreg(image_obj['total_boxes'].copy(), np.transpose(mv))
pick = nms(image_obj['total_boxes'].copy(), 0.7, 'Min')
image_obj['total_boxes'] = image_obj['total_boxes'][pick, :]
points_per_image = points_per_image[:, pick]
ret.append((image_obj['total_boxes'], points_per_image))
else:
ret.append(None)
i += onet_input_count
return ret
# function [boundingbox] = bbreg(boundingbox,reg)
def bbreg(boundingbox,reg):
"""Calibrate bounding boxes"""
if reg.shape[1]==1:
reg = np.reshape(reg, (reg.shape[2], reg.shape[3]))
w = boundingbox[:,2]-boundingbox[:,0]+1
h = boundingbox[:,3]-boundingbox[:,1]+1
b1 = boundingbox[:,0]+reg[:,0]*w
b2 = boundingbox[:,1]+reg[:,1]*h
b3 = boundingbox[:,2]+reg[:,2]*w
b4 = boundingbox[:,3]+reg[:,3]*h
boundingbox[:,0:4] = np.transpose(np.vstack([b1, b2, b3, b4 ]))
return boundingbox
def generateBoundingBox(imap, reg, scale, t):
"""Use heatmap to generate bounding boxes"""
stride=2
cellsize=12
imap = np.transpose(imap)
dx1 = np.transpose(reg[:,:,0])
dy1 = np.transpose(reg[:,:,1])
dx2 = np.transpose(reg[:,:,2])
dy2 = np.transpose(reg[:,:,3])
y, x = np.where(imap >= t)
if y.shape[0]==1:
dx1 = np.flipud(dx1)
dy1 = np.flipud(dy1)
dx2 = np.flipud(dx2)
dy2 = np.flipud(dy2)
score = imap[(y,x)]
reg = np.transpose(np.vstack([ dx1[(y,x)], dy1[(y,x)], dx2[(y,x)], dy2[(y,x)] ]))
if reg.size==0:
reg = np.empty((0,3))
bb = np.transpose(np.vstack([y,x]))
q1 = np.fix((stride*bb+1)/scale)
q2 = np.fix((stride*bb+cellsize-1+1)/scale)
boundingbox = np.hstack([q1, q2, np.expand_dims(score,1), reg])
return boundingbox, reg
# function pick = nms(boxes,threshold,type)
def nms(boxes, threshold, method):
if boxes.size==0:
return np.empty((0,3))
x1 = boxes[:,0]
y1 = boxes[:,1]
x2 = boxes[:,2]
y2 = boxes[:,3]
s = boxes[:,4]
area = (x2-x1+1) * (y2-y1+1)
I = np.argsort(s)
pick = np.zeros_like(s, dtype=np.int16)
counter = 0
while I.size>0:
i = I[-1]
pick[counter] = i
counter += 1
idx = I[0:-1]
xx1 = np.maximum(x1[i], x1[idx])
yy1 = np.maximum(y1[i], y1[idx])
xx2 = np.minimum(x2[i], x2[idx])
yy2 = np.minimum(y2[i], y2[idx])
w = np.maximum(0.0, xx2-xx1+1)
h = np.maximum(0.0, yy2-yy1+1)
inter = w * h
if method == 'Min': #'is' would compare identity, not string equality
o = inter / np.minimum(area[i], area[idx])
else:
o = inter / (area[i] + area[idx] - inter)
I = I[np.where(o<=threshold)]
pick = pick[0:counter]
return pick
# function [dy edy dx edx y ey x ex tmpw tmph] = pad(total_boxes,w,h)
def pad(total_boxes, w, h):
"""Compute the padding coordinates (pad the bounding boxes to square)"""
tmpw = (total_boxes[:,2]-total_boxes[:,0]+1).astype(np.int32)
tmph = (total_boxes[:,3]-total_boxes[:,1]+1).astype(np.int32)
numbox = total_boxes.shape[0]
dx = np.ones((numbox), dtype=np.int32)
dy = np.ones((numbox), dtype=np.int32)
edx = tmpw.copy().astype(np.int32)
edy = tmph.copy().astype(np.int32)
x = total_boxes[:,0].copy().astype(np.int32)
y = total_boxes[:,1].copy().astype(np.int32)
ex = total_boxes[:,2].copy().astype(np.int32)
ey = total_boxes[:,3].copy().astype(np.int32)
tmp = np.where(ex>w)
edx.flat[tmp] = np.expand_dims(-ex[tmp]+w+tmpw[tmp],1)
ex[tmp] = w
tmp = np.where(ey>h)
edy.flat[tmp] = np.expand_dims(-ey[tmp]+h+tmph[tmp],1)
ey[tmp] = h
tmp = np.where(x<1)
dx.flat[tmp] = np.expand_dims(2-x[tmp],1)
x[tmp] = 1
tmp = np.where(y<1)
dy.flat[tmp] = np.expand_dims(2-y[tmp],1)
y[tmp] = 1
return dy, edy, dx, edx, y, ey, x, ex, tmpw, tmph
# function [bboxA] = rerec(bboxA)
def rerec(bboxA):
"""Convert bboxA to square."""
h = bboxA[:,3]-bboxA[:,1]
w = bboxA[:,2]-bboxA[:,0]
l = np.maximum(w, h)
bboxA[:,0] = bboxA[:,0]+w*0.5-l*0.5
bboxA[:,1] = bboxA[:,1]+h*0.5-l*0.5
bboxA[:,2:4] = bboxA[:,0:2] + np.transpose(np.tile(l,(2,1)))
return bboxA
def imresample(img, sz):
im_data = cv2.resize(img, (sz[1], sz[0]), interpolation=cv2.INTER_LINEAR) #@UndefinedVariable
return im_data
# This method is kept for debugging purpose
# h=img.shape[0]
# w=img.shape[1]
# hs, ws = sz
# dx = float(w) / ws
# dy = float(h) / hs
# im_data = np.zeros((hs,ws,3))
# for a1 in range(0,hs):
# for a2 in range(0,ws):
# for a3 in range(0,3):
# im_data[a1,a2,a3] = img[int(floor(a1*dy)),int(floor(a2*dx)),a3]
# return im_data
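
A small, self-contained check of the `nms()` function above (importable as `from facelib.mtcnn import nms`): two heavily overlapping boxes in `[x1, y1, x2, y2, score]` form plus one distant box; the lower-scoring overlap is suppressed:

```python
import numpy as np
from facelib.mtcnn import nms

boxes = np.array([[ 10.,  10.,  50.,  50., 0.9],
                  [ 12.,  12.,  52.,  52., 0.8],   # IoU with the first box is ~0.83
                  [100., 100., 140., 140., 0.7]])
print(nms(boxes, 0.5, 'Union'))  # -> [0 2]; the second box is suppressed
```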

1
gpufmkmgr/__init__.py Normal file
View file

@ -0,0 +1 @@
from .gpufmkmgr import *

244
gpufmkmgr/gpufmkmgr.py Normal file
View file

@ -0,0 +1,244 @@
import os
import sys
import contextlib
from utils import std_utils
from .pynvml import *
dlib_module = None
def import_dlib(device_idx):
global dlib_module
if dlib_module is not None:
raise Exception ('Multiple import of dlib is not allowed, reorganize your program.')
import dlib
dlib_module = dlib
dlib_module.cuda.set_device(device_idx)
return dlib_module
tf_module = None
tf_session = None
keras_module = None
keras_contrib_module = None
keras_vggface_module = None
def get_tf_session():
global tf_session
return tf_session
#allow_growth=False for keras model
#allow_growth=True for tf only model
def import_tf( device_idxs_list, allow_growth ):
global tf_module
global tf_session
if tf_module is not None:
raise Exception ('Multiple import of tf is not allowed, reorganize your program.')
if 'TF_SUPPRESS_STD' in os.environ.keys() and os.environ['TF_SUPPRESS_STD'] == '1':
suppressor = std_utils.suppress_stdout_stderr().__enter__()
else:
suppressor = None
if 'CUDA_VISIBLE_DEVICES' in os.environ.keys():
os.environ.pop('CUDA_VISIBLE_DEVICES')
os.environ['TF_MIN_GPU_MULTIPROCESSOR_COUNT'] = '2'
import tensorflow as tf
tf_module = tf
visible_device_list = ','.join(str(idx) for idx in device_idxs_list)
config = tf_module.ConfigProto()
config.gpu_options.allow_growth = allow_growth
config.gpu_options.visible_device_list=visible_device_list
config.gpu_options.force_gpu_compatible = True
tf_session = tf_module.Session(config=config)
if suppressor is not None:
suppressor.__exit__()
return tf_module
def finalize_tf():
global tf_module
global tf_session
tf_session.close()
tf_session = None
tf_module = None
def import_keras():
global keras_module
if keras_module is not None:
raise Exception ('Multiple import of keras is not allowed, reorganize your program.')
sess = get_tf_session()
if sess is None:
raise Exception ('No TF session found. Import TF first.')
if 'TF_SUPPRESS_STD' in os.environ.keys() and os.environ['TF_SUPPRESS_STD'] == '1':
suppressor = std_utils.suppress_stdout_stderr().__enter__()
else:
suppressor = None
import keras
keras.backend.tensorflow_backend.set_session(sess)
if suppressor is not None:
suppressor.__exit__()
keras_module = keras
return keras_module
def finalize_keras():
global keras_module
keras_module.backend.clear_session()
keras_module = None
def import_keras_contrib():
global keras_contrib_module
if keras_contrib_module is not None:
raise Exception ('Multiple import of keras_contrib is not allowed, reorganize your program.')
import keras_contrib
keras_contrib_module = keras_contrib
return keras_contrib_module
def finalize_keras_contrib():
global keras_contrib_module
keras_contrib_module = None
def import_keras_vggface(optional=False):
global keras_vggface_module
if keras_vggface_module is not None:
raise Exception ('Multiple import of keras_vggface_module is not allowed, reorganize your program.')
try:
import keras_vggface
except:
if optional:
print ("Unable to import keras_vggface. It will not be used.")
keras_vggface = None #define the name so the assignment below still works
else:
raise Exception ("Unable to import keras_vggface.")
keras_vggface_module = keras_vggface
return keras_vggface_module
def finalize_keras_vggface():
global keras_vggface_module
keras_vggface_module = None
#returns [ device_idx, ... ] of devices with at least `freememsize` bytes of free VRAM
def getDevicesWithAtLeastFreeMemory(freememsize):
result = []
nvmlInit()
for i in range(0, nvmlDeviceGetCount() ):
handle = nvmlDeviceGetHandleByIndex(i)
memInfo = nvmlDeviceGetMemoryInfo( handle )
if (memInfo.total - memInfo.used) >= freememsize:
result.append (i)
nvmlShutdown()
return result
def getDevicesWithAtLeastTotalMemoryGB(totalmemsize_gb):
result = []
nvmlInit()
for i in range(0, nvmlDeviceGetCount() ):
handle = nvmlDeviceGetHandleByIndex(i)
memInfo = nvmlDeviceGetMemoryInfo( handle )
if (memInfo.total) >= totalmemsize_gb*1024*1024*1024:
result.append (i)
nvmlShutdown()
return result
def getAllDevicesIdxsList ():
nvmlInit()
result = [ i for i in range(0, nvmlDeviceGetCount() ) ]
nvmlShutdown()
return result
def getDeviceVRAMFree (idx):
result = 0
nvmlInit()
if idx < nvmlDeviceGetCount():
handle = nvmlDeviceGetHandleByIndex(idx)
memInfo = nvmlDeviceGetMemoryInfo( handle )
result = (memInfo.total - memInfo.used)
nvmlShutdown()
return result
def getDeviceVRAMTotalGb (idx):
result = 0
nvmlInit()
if idx < nvmlDeviceGetCount():
handle = nvmlDeviceGetHandleByIndex(idx)
memInfo = nvmlDeviceGetMemoryInfo( handle )
result = memInfo.total / (1024*1024*1024)
nvmlShutdown()
return result
def getBestDeviceIdx():
nvmlInit()
idx = -1
idx_mem = 0
for i in range(0, nvmlDeviceGetCount() ):
handle = nvmlDeviceGetHandleByIndex(i)
memInfo = nvmlDeviceGetMemoryInfo( handle )
if memInfo.total > idx_mem:
idx = i
idx_mem = memInfo.total
nvmlShutdown()
return idx
def getWorstDeviceIdx():
nvmlInit()
idx = -1
idx_mem = sys.maxsize
for i in range(0, nvmlDeviceGetCount() ):
handle = nvmlDeviceGetHandleByIndex(i)
memInfo = nvmlDeviceGetMemoryInfo( handle )
if memInfo.total < idx_mem:
idx = i
idx_mem = memInfo.total
nvmlShutdown()
return idx
def isValidDeviceIdx(idx):
nvmlInit()
result = (idx < nvmlDeviceGetCount())
nvmlShutdown()
return result
def getDeviceIdxsEqualModel(idx):
result = []
nvmlInit()
idx_name = nvmlDeviceGetName(nvmlDeviceGetHandleByIndex(idx)).decode()
for i in range(0, nvmlDeviceGetCount() ):
if nvmlDeviceGetName(nvmlDeviceGetHandleByIndex(i)).decode() == idx_name:
result.append (i)
nvmlShutdown()
return result
def getDeviceName (idx):
result = ''
nvmlInit()
if idx < nvmlDeviceGetCount():
result = nvmlDeviceGetName(nvmlDeviceGetHandleByIndex(idx)).decode()
nvmlShutdown()
return result
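
A sketch of the import order implied by the code above: TF first (it creates the session), then Keras, which binds to that session; finalize in reverse. The device choice is an assumption:

```python
import gpufmkmgr

device_idx = gpufmkmgr.getBestDeviceIdx()                   # GPU with the most total VRAM
tf = gpufmkmgr.import_tf([device_idx], allow_growth=False)  # False for Keras models, per the comment above
keras = gpufmkmgr.import_keras()                            # raises if TF was not imported first
# ... build and train models ...
gpufmkmgr.finalize_keras()
gpufmkmgr.finalize_tf()
```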

1701
gpufmkmgr/pynvml.py Normal file

File diff suppressed because it is too large.

2
localization/__init__.py Normal file
View file

@ -0,0 +1,2 @@
from .localization import get_default_ttf_font_name

29
localization/localization.py Normal file
View file

@ -0,0 +1,29 @@
import sys
import locale
system_locale = locale.getdefaultlocale()[0]
#getdefaultlocale() can return (None, None); fall back to English in that case
system_language = system_locale[0:2] if system_locale is not None else 'en'
#'zh' is the ISO 639-1 code for Chinese; the original 'zn' key could never match
windows_font_name_map = {
'en' : 'cour',
'ru' : 'cour',
'zh' : 'simsun_01'
}
darwin_font_name_map = {
'en' : 'cour',
'ru' : 'cour',
'zh' : 'Apple LiSung Light'
}
linux_font_name_map = {
'en' : 'cour',
'ru' : 'cour',
'zh' : 'cour'
}
def get_default_ttf_font_name():
platform = sys.platform
if platform == 'win32': return windows_font_name_map.get(system_language, 'cour')
elif platform == 'darwin': return darwin_font_name_map.get(system_language, 'cour')
else: return linux_font_name_map.get(system_language, 'cour')

188
main.py Normal file
View file

@ -0,0 +1,188 @@
import os
import sys
import argparse
from utils import Path_utils
from utils import os_utils
from pathlib import Path
import numpy as np
if sys.version_info[0] < 3 or (sys.version_info[0] == 3 and sys.version_info[1] < 2):
raise Exception("This program requires at least Python 3.2")
class fixPathAction(argparse.Action):
def __call__(self, parser, namespace, values, option_string=None):
setattr(namespace, self.dest, os.path.abspath(os.path.expanduser(values)))
def str2bool(v):
if v.lower() in ('yes', 'true', 't', 'y', '1'):
return True
elif v.lower() in ('no', 'false', 'f', 'n', '0'):
return False
else:
raise argparse.ArgumentTypeError('Boolean value expected.')
if __name__ == "__main__":
os_utils.set_process_lowest_prio()
parser = argparse.ArgumentParser()
parser.add_argument('--tf-suppress-std', action="store_true", dest="tf_suppress_std", default=False, help="Suppress tensorflow initialization info. May not work on some python builds such as anaconda python 3.6.4. If you can fix it, you are welcome.")
subparsers = parser.add_subparsers()
def process_extract(arguments):
from mainscripts import Extractor
Extractor.main (
input_dir=arguments.input_dir,
output_dir=arguments.output_dir,
debug=arguments.debug,
face_type=arguments.face_type,
detector=arguments.detector,
multi_gpu=arguments.multi_gpu,
manual_fix=arguments.manual_fix,
manual_window_size=arguments.manual_window_size)
extract_parser = subparsers.add_parser( "extract", help="Extract faces from pictures.")
extract_parser.add_argument('--input-dir', required=True, action=fixPathAction, dest="input_dir", help="Input directory. A directory containing the files you wish to process.")
extract_parser.add_argument('--output-dir', required=True, action=fixPathAction, dest="output_dir", help="Output directory. This is where the extracted files will be stored.")
extract_parser.add_argument('--debug', action="store_true", dest="debug", default=False, help="Writes debug images to [output_dir]_debug\ directory.")
extract_parser.add_argument('--face-type', dest="face_type", choices=['half_face', 'full_face', 'head', 'avatar', 'mark_only'], default='full_face', help="Default 'full_face'. Don't change this option, currently all models use 'full_face'")
extract_parser.add_argument('--detector', dest="detector", choices=['dlib','mt','manual'], default='dlib', help="Type of detector. Default 'dlib'. 'mt' (MTCNNv1) - faster, better, almost no jitter, perfect for gathering thousands of faces for the src-set. It is also good for the dst-set, but can generate false faces in frames where the main face is not recognized! In that case, for the dst-set use either 'dlib' with '--manual-fix' or '--detector manual'. The manual detector is suitable only for the dst-set.")
extract_parser.add_argument('--multi-gpu', action="store_true", dest="multi_gpu", default=False, help="Enables multi GPU.")
extract_parser.add_argument('--manual-fix', action="store_true", dest="manual_fix", default=False, help="Enables manual extract only frames where faces were not recognized.")
extract_parser.add_argument('--manual-window-size', type=int, dest="manual_window_size", default=0, help="Manual fix window size. Example: 1368. Default: frame size.")
extract_parser.set_defaults (func=process_extract)
def process_sort(arguments):
from mainscripts import Sorter
Sorter.main (input_path=arguments.input_dir, sort_by_method=arguments.sort_by_method)
sort_parser = subparsers.add_parser( "sort", help="Sort faces in a directory.")
sort_parser.add_argument('--input-dir', required=True, action=fixPathAction, dest="input_dir", help="Input directory. A directory containing the files you wish to process.")
sort_parser.add_argument('--by', required=True, dest="sort_by_method", choices=("blur", "face", "face-dissim", "face-yaw", "hist", "hist-dissim", "hist-blur", "ssim", "brightness", "hue", "origname"), help="Method of sorting. 'origname' sort by original filename to recover original sequence." )
sort_parser.set_defaults (func=process_sort)
def process_train(arguments):
if 'DFL_TARGET_EPOCH' in os.environ.keys():
arguments.target_epoch = int ( os.environ['DFL_TARGET_EPOCH'] )
if 'DFL_BATCH_SIZE' in os.environ.keys():
arguments.batch_size = int ( os.environ['DFL_BATCH_SIZE'] )
from mainscripts import Trainer
Trainer.main (
training_data_src_dir=arguments.training_data_src_dir,
training_data_dst_dir=arguments.training_data_dst_dir,
model_path=arguments.model_dir,
model_name=arguments.model_name,
debug = arguments.debug,
#**options
batch_size = arguments.batch_size,
write_preview_history = arguments.write_preview_history,
target_epoch = arguments.target_epoch,
save_interval_min = arguments.save_interval_min,
choose_worst_gpu = arguments.choose_worst_gpu,
force_best_gpu_idx = arguments.force_best_gpu_idx,
multi_gpu = arguments.multi_gpu,
force_gpu_idxs = arguments.force_gpu_idxs,
)
train_parser = subparsers.add_parser( "train", help="Trainer")
train_parser.add_argument('--training-data-src-dir', required=True, action=fixPathAction, dest="training_data_src_dir", help="Dir of src-set.")
train_parser.add_argument('--training-data-dst-dir', required=True, action=fixPathAction, dest="training_data_dst_dir", help="Dir of dst-set.")
train_parser.add_argument('--model-dir', required=True, action=fixPathAction, dest="model_dir", help="Model dir.")
train_parser.add_argument('--model', required=True, dest="model_name", choices=Path_utils.get_all_dir_names_startswith ( Path(__file__).parent / 'models' , 'Model_'), help="Type of model")
train_parser.add_argument('--write-preview-history', action="store_true", dest="write_preview_history", default=False, help="Enable write preview history.")
train_parser.add_argument('--debug', action="store_true", dest="debug", default=False, help="Debug training.")
train_parser.add_argument('--batch-size', type=int, dest="batch_size", default=0, help="Model batch size. Default - auto. Environment variable: DFL_BATCH_SIZE.")
train_parser.add_argument('--target-epoch', type=int, dest="target_epoch", default=0, help="Train until target epoch. Default - unlimited. Environment variable: DFL_TARGET_EPOCH.")
train_parser.add_argument('--save-interval-min', type=int, dest="save_interval_min", default=10, help="Save interval in minutes. Default 10.")
train_parser.add_argument('--choose-worst-gpu', action="store_true", dest="choose_worst_gpu", default=False, help="Choose worst GPU instead of best.")
train_parser.add_argument('--force-best-gpu-idx', type=int, dest="force_best_gpu_idx", default=-1, help="Force to choose this GPU idx as best(worst).")
train_parser.add_argument('--multi-gpu', action="store_true", dest="multi_gpu", default=False, help="MultiGPU option. It will select only same best(worst) GPU models.")
train_parser.add_argument('--force-gpu-idxs', type=str, dest="force_gpu_idxs", default=None, help="Override final GPU idxs. Example: 0,1,2.")
train_parser.set_defaults (func=process_train)
def process_convert(arguments):
if arguments.ask_for_params:
try:
mode = int ( input ("Choose mode: (1) hist match, (2) hist match bw, (3) seamless (default), (4) seamless hist match : ") )
except:
mode = 3
if mode == 1:
arguments.mode = 'hist-match'
elif mode == 2:
arguments.mode = 'hist-match-bw'
elif mode == 3:
arguments.mode = 'seamless'
elif mode == 4:
arguments.mode = 'seamless-hist-match'
if arguments.mode == 'hist-match' or arguments.mode == 'hist-match-bw':
try:
choice = int ( input ("Masked hist match? [0..1] (default - model choice) : ") )
arguments.masked_hist_match = (choice != 0)
except:
arguments.masked_hist_match = None
try:
arguments.erode_mask_modifier = int ( input ("Choose erode mask modifier [-100..100] (default 0) : ") )
except:
arguments.erode_mask_modifier = 0
try:
arguments.blur_mask_modifier = int ( input ("Choose blur mask modifier [-100..200] (default 0) : ") )
except:
arguments.blur_mask_modifier = 0
arguments.erode_mask_modifier = np.clip ( int(arguments.erode_mask_modifier), -100, 100)
arguments.blur_mask_modifier = np.clip ( int(arguments.blur_mask_modifier), -100, 200)
from mainscripts import Converter
Converter.main (
input_dir=arguments.input_dir,
output_dir=arguments.output_dir,
aligned_dir=arguments.aligned_dir,
model_dir=arguments.model_dir,
model_name=arguments.model_name,
debug = arguments.debug,
mode = arguments.mode,
masked_hist_match = arguments.masked_hist_match,
erode_mask_modifier = arguments.erode_mask_modifier,
blur_mask_modifier = arguments.blur_mask_modifier,
force_best_gpu_idx = arguments.force_best_gpu_idx
)
convert_parser = subparsers.add_parser( "convert", help="Converter")
convert_parser.add_argument('--input-dir', required=True, action=fixPathAction, dest="input_dir", help="Input directory. A directory containing the files you wish to process.")
convert_parser.add_argument('--output-dir', required=True, action=fixPathAction, dest="output_dir", help="Output directory. This is where the converted files will be stored.")
convert_parser.add_argument('--aligned-dir', action=fixPathAction, dest="aligned_dir", help="Aligned directory. This is where the aligned files stored. Not used in AVATAR model.")
convert_parser.add_argument('--model-dir', required=True, action=fixPathAction, dest="model_dir", help="Model dir.")
convert_parser.add_argument('--model', required=True, dest="model_name", choices=Path_utils.get_all_dir_names_startswith ( Path(__file__).parent / 'models' , 'Model_'), help="Type of model")
convert_parser.add_argument('--ask-for-params', action="store_true", dest="ask_for_params", default=False, help="Ask for params.")
convert_parser.add_argument('--mode', dest="mode", choices=['seamless','hist-match', 'hist-match-bw','seamless-hist-match'], default='seamless', help="Face overlaying mode. Seriously affects result.")
convert_parser.add_argument('--masked-hist-match', type=str2bool, nargs='?', const=True, default=None, help="True or False. Excludes background for hist match. Default - model decide.")
convert_parser.add_argument('--erode-mask-modifier', type=int, dest="erode_mask_modifier", default=0, help="Automatic erode mask modifier. Valid range [-100..100].")
convert_parser.add_argument('--blur-mask-modifier', type=int, dest="blur_mask_modifier", default=0, help="Automatic blur mask modifier. Valid range [-100..200].")
convert_parser.add_argument('--debug', action="store_true", dest="debug", default=False, help="Debug converter.")
convert_parser.add_argument('--force-best-gpu-idx', type=int, dest="force_best_gpu_idx", default=-1, help="Force to choose this GPU idx as best.")
convert_parser.set_defaults(func=process_convert)
def bad_args(arguments):
parser.print_help()
exit(0)
parser.set_defaults(func=bad_args)
arguments = parser.parse_args()
if arguments.tf_suppress_std:
os.environ['TF_SUPPRESS_STD'] = '1'
arguments.func(arguments)
'''
import code
code.interact(local=dict(globals(), **locals()))
'''
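Typical invocations of the three subcommands defined above (directory layout and the H128 model choice are illustrative):

python main.py extract --input-dir workspace/data_src --output-dir workspace/data_src/aligned --detector mt
python main.py train --training-data-src-dir workspace/data_src/aligned --training-data-dst-dir workspace/data_dst/aligned --model-dir workspace/model --model H128
python main.py convert --input-dir workspace/data_dst --output-dir workspace/result --aligned-dir workspace/data_dst/aligned --model-dir workspace/model --model H128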

283
mainscripts/Converter.py Normal file

@ -0,0 +1,283 @@
import traceback
from pathlib import Path
from utils import Path_utils
import cv2
from tqdm import tqdm
from utils.AlignedPNG import AlignedPNG
from utils import image_utils
import shutil
import numpy as np
import time
import multiprocessing
from models import ConverterBase
class model_process_predictor(object):
def __init__(self, sq, cq, lock):
self.sq = sq
self.cq = cq
self.lock = lock
def __call__(self, face):
self.lock.acquire()
self.sq.put ( {'op': 'predict', 'face' : face} )
while True:
if not self.cq.empty():
obj = self.cq.get()
obj_op = obj['op']
if obj_op == 'predict_result':
self.lock.release()
return obj['result']
time.sleep(0.005)
def model_process(model_name, model_dir, in_options, sq, cq):
try:
model_path = Path(model_dir)
import models
model = models.import_model(model_name)(model_path, **in_options)
converter = model.get_converter(**in_options)
converter.dummy_predict()
cq.put ( {'op':'init', 'converter' : converter.copy_and_set_predictor( None ) } )
closing = False
while not closing:
while not sq.empty():
obj = sq.get()
obj_op = obj['op']
if obj_op == 'predict':
result = converter.predictor ( obj['face'] )
cq.put ( {'op':'predict_result', 'result':result} )
elif obj_op == 'close':
closing = True
break
time.sleep(0.005)
model.finalize()
except Exception as e:
print ( 'Error: %s' % (str(e)))
traceback.print_exc()
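model_process_predictor and model_process together form a simple request/response protocol over multiprocessing queues: the model (and its TensorFlow session) lives in exactly one process, while many CPU conversion workers share it, the lock serializing one in-flight 'predict' request at a time. A stripped-down sketch of the same pattern (hypothetical names, independent of the code above):

#minimal sketch of the queue protocol, assuming a server loop like model_process above
def remote_predict(sq, cq, lock, face):
    with lock: #only one outstanding request at a time
        sq.put ( {'op':'predict', 'face':face} )
        while True:
            if not cq.empty():
                return cq.get()['result']
            time.sleep(0.005)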
from utils.SubprocessorBase import SubprocessorBase
class ConvertSubprocessor(SubprocessorBase):
#override
def __init__(self, converter, input_path_image_paths, output_path, alignments, debug):
super().__init__('Converter')
self.converter = converter
self.input_path_image_paths = input_path_image_paths
self.output_path = output_path
self.alignments = alignments
self.debug = debug
self.input_data = self.input_path_image_paths
self.files_processed = 0
self.faces_processed = 0
#override
def process_info_generator(self):
r = [0] if self.debug else range(multiprocessing.cpu_count())
for i in r:
yield 'CPU%d' % (i), {}, {'device_idx': i,
'device_name': 'CPU%d' % (i),
'converter' : self.converter,
'output_dir' : str(self.output_path),
'alignments' : self.alignments,
'debug': self.debug }
#override
def get_no_process_started_message(self):
return 'Unable to start CPU processes.'
#override
def onHostGetProgressBarDesc(self):
return "Converting"
#override
def onHostGetProgressBarLen(self):
return len (self.input_data)
#override
def onHostGetData(self):
if len (self.input_data) > 0:
return self.input_data.pop(0)
return None
#override
def onHostDataReturn (self, data):
self.input_data.insert(0, data)
#override
def onClientInitialize(self, client_dict):
print ('Running on %s.' % (client_dict['device_name']) )
self.device_idx = client_dict['device_idx']
self.device_name = client_dict['device_name']
self.converter = client_dict['converter']
self.output_path = Path(client_dict['output_dir']) if 'output_dir' in client_dict.keys() else None
self.alignments = client_dict['alignments']
self.debug = client_dict['debug']
return None
#override
def onClientFinalize(self):
pass
#override
def onClientProcessData(self, data):
filename_path = Path(data)
files_processed = 1
faces_processed = 0
output_filename_path = self.output_path / filename_path.name
if self.converter.get_mode() == ConverterBase.MODE_FACE and filename_path.stem not in self.alignments.keys():
if not self.debug:
print ( 'no faces found for %s, copying without faces' % (filename_path.name) )
shutil.copy ( str(filename_path), str(output_filename_path) )
else:
image = (cv2.imread(str(filename_path)) / 255.0).astype(np.float32)
if self.converter.get_mode() == ConverterBase.MODE_IMAGE:
image_landmarks = None
a_png = AlignedPNG.load( str(filename_path) )
if a_png is not None:
d = a_png.getFaceswapDictData()
if d is not None and 'landmarks' in d.keys():
image_landmarks = np.array(d['landmarks'])
image = self.converter.convert_image(image, image_landmarks, self.debug)
if self.debug:
for img in image:
cv2.imshow ('Debug convert', img )
cv2.waitKey(0)
faces_processed = 1
elif self.converter.get_mode() == ConverterBase.MODE_FACE:
faces = self.alignments[filename_path.stem]
for image_landmarks in faces:
image = self.converter.convert_face(image, image_landmarks, self.debug)
if self.debug:
for img in image:
cv2.imshow ('Debug convert', img )
cv2.waitKey(0)
faces_processed = len(faces)
if not self.debug:
cv2.imwrite (str(output_filename_path), (image*255).astype(np.uint8) )
return (files_processed, faces_processed)
#override
def onHostResult (self, data, result):
self.files_processed += result[0]
self.faces_processed += result[1]
return 1
#override
def get_start_return(self):
return self.files_processed, self.faces_processed
def main (input_dir, output_dir, aligned_dir, model_dir, model_name, **in_options):
print ("Running converter.\r\n")
debug = in_options['debug']
try:
input_path = Path(input_dir)
output_path = Path(output_dir)
aligned_path = Path(aligned_dir) if aligned_dir is not None else None
model_path = Path(model_dir)
if not input_path.exists():
print('Input directory not found. Please ensure it exists.')
return
if output_path.exists():
for filename in Path_utils.get_image_paths(output_path):
Path(filename).unlink()
else:
output_path.mkdir(parents=True, exist_ok=True)
if aligned_path is not None and not aligned_path.exists():
print('Aligned directory not found. Please ensure it exists.')
return
if not model_path.exists():
print('Model directory not found. Please ensure it exists.')
return
model_sq = multiprocessing.Queue()
model_cq = multiprocessing.Queue()
model_lock = multiprocessing.Lock()
model_p = multiprocessing.Process(target=model_process, args=(model_name, model_dir, in_options, model_sq, model_cq))
model_p.start()
while True:
if not model_cq.empty():
obj = model_cq.get()
obj_op = obj['op']
if obj_op == 'init':
converter = obj['converter']
break
alignments = {}
if converter.get_mode() == ConverterBase.MODE_FACE:
aligned_path_image_paths = Path_utils.get_image_paths(aligned_path)
for filename in tqdm(aligned_path_image_paths, desc= "Collecting alignments" ):
a_png = AlignedPNG.load( str(filename) )
if a_png is None:
print ( "%s - no embedded data found." % (filename) )
continue
d = a_png.getFaceswapDictData()
if d is None or d['source_filename'] is None or d['source_rect'] is None or d['source_landmarks'] is None:
print ( "%s - no embedded data found." % (filename) )
continue
source_filename_stem = Path(d['source_filename']).stem
if source_filename_stem not in alignments.keys():
alignments[ source_filename_stem ] = []
alignments[ source_filename_stem ].append ( np.array(d['source_landmarks']) )
files_processed, faces_processed = ConvertSubprocessor (
converter = converter.copy_and_set_predictor( model_process_predictor(model_sq,model_cq,model_lock) ),
input_path_image_paths = Path_utils.get_image_paths(input_path),
output_path = output_path,
alignments = alignments,
debug = debug ).process()
model_sq.put ( {'op':'close'} )
model_p.join()
'''
if model_name == 'AVATAR':
output_path_image_paths = Path_utils.get_image_paths(output_path)
last_ok_frame = -1
for filename in output_path_image_paths:
filename_path = Path(filename)
stem = Path(filename).stem
try:
frame = int(stem)
except:
raise Exception ('Aligned avatars must be created from indexed sequence files.')
if frame-last_ok_frame > 1:
start = last_ok_frame + 1
end = frame - 1
print ("Filling gaps: [%d...%d]" % (start, end) )
for i in range (start, end+1):
shutil.copy ( str(filename), str( output_path / ('%.5d%s' % (i, filename_path.suffix )) ) )
last_ok_frame = frame
'''
except Exception as e:
print ( 'Error: %s' % (str(e)))
traceback.print_exc()

378
mainscripts/Extractor.py Normal file

@ -0,0 +1,378 @@
import traceback
import os
import sys
import time
import multiprocessing
from tqdm import tqdm
from pathlib import Path
import numpy as np
import cv2
from utils import Path_utils
from utils.AlignedPNG import AlignedPNG
from utils import image_utils
from facelib import FaceType
import facelib
import gpufmkmgr
from utils.SubprocessorBase import SubprocessorBase
class ExtractSubprocessor(SubprocessorBase):
#override
def __init__(self, input_data, type, image_size, face_type, debug, multi_gpu=False, manual=False, manual_window_size=0, detector=None, output_path=None ):
self.input_data = input_data
self.type = type
self.image_size = image_size
self.face_type = face_type
self.debug = debug
self.multi_gpu = multi_gpu
self.detector = detector
self.output_path = output_path
self.manual = manual
self.manual_window_size = manual_window_size
self.result = []
no_response_time_sec = 60 if not self.manual else 999999
super().__init__('Extractor', no_response_time_sec)
#override
def onHostClientsInitialized(self):
if self.manual:
self.wnd_name = 'Manual pass'
cv2.namedWindow(self.wnd_name)
self.landmarks = None
self.param_x = -1
self.param_y = -1
self.param_rect_size = -1
self.param = {'x': 0, 'y': 0, 'rect_size' : 5}
def onMouse(event, x, y, flags, param):
if event == cv2.EVENT_MOUSEWHEEL:
mod = 1 if flags > 0 else -1
param['rect_size'] = max (5, param['rect_size'] + 10*mod)
else:
param['x'] = x
param['y'] = y
cv2.setMouseCallback(self.wnd_name, onMouse, self.param)
def get_devices_for_type (self, type, multi_gpu):
if (type == 'rects' or type == 'landmarks'):
if not multi_gpu:
devices = [gpufmkmgr.getBestDeviceIdx()]
else:
devices = gpufmkmgr.getDevicesWithAtLeastTotalMemoryGB(2)
devices = [ (idx, gpufmkmgr.getDeviceName(idx), gpufmkmgr.getDeviceVRAMTotalGb(idx) ) for idx in devices]
elif type == 'final':
devices = [ (i, 'CPU%d' % (i), 0 ) for i in range(0, multiprocessing.cpu_count()) ]
return devices
#override
def process_info_generator(self):
for (device_idx, device_name, device_total_vram_gb) in self.get_devices_for_type(self.type, self.multi_gpu):
num_processes = 1
if not self.manual and self.type == 'rects' and self.detector == 'mt':
num_processes = int ( max (1, device_total_vram_gb / 2) )
for i in range(0, num_processes ):
device_name_for_process = device_name if num_processes == 1 else '%s #%d' % (device_name,i)
yield device_name_for_process, {}, {'type' : self.type,
'device_idx' : device_idx,
'device_name' : device_name_for_process,
'image_size': self.image_size,
'face_type': self.face_type,
'debug': self.debug,
'output_dir': str(self.output_path),
'detector': self.detector}
#override
def get_no_process_started_message(self):
if (self.type == 'rects' or self.type == 'landmarks'):
print ( 'You have no capable GPUs. Try closing programs that consume VRAM, then run again.')
elif self.type == 'final':
print ( 'Unable to start CPU processes.')
#override
def onHostGetProgressBarDesc(self):
return None
#override
def onHostGetProgressBarLen(self):
return len (self.input_data)
#override
def onHostGetData(self):
if not self.manual:
if len (self.input_data) > 0:
return self.input_data.pop(0)
else:
while len (self.input_data) > 0:
data = self.input_data[0]
filename, faces = data
is_frame_done = False
if len(faces) == 0:
self.original_image = cv2.imread(filename)
(h,w,c) = self.original_image.shape
self.view_scale = 1.0 if self.manual_window_size == 0 else self.manual_window_size / (w if w > h else h)
self.original_image = cv2.resize (self.original_image, ( int(w*self.view_scale), int(h*self.view_scale) ), interpolation=cv2.INTER_LINEAR)
self.text_lines_img = (image_utils.get_draw_text_lines ( self.original_image, (0,0, self.original_image.shape[1], min(100, self.original_image.shape[0]) ),
[ 'Match landmarks with face exactly.',
'[Enter] - confirm frame',
'[Space] - skip frame',
'[Mouse wheel] - change rect'
], (1, 1, 1) )*255).astype(np.uint8)
while True:
key = cv2.waitKey(1) & 0xFF
if key == ord('\r') or key == ord('\n'):
faces.append ( [(self.rect), self.landmarks] )
is_frame_done = True
break
elif key == ord(' '):
is_frame_done = True
break
if self.param_x != self.param['x'] / self.view_scale or \
self.param_y != self.param['y'] / self.view_scale or \
self.param_rect_size != self.param['rect_size']:
self.param_x = self.param['x'] / self.view_scale
self.param_y = self.param['y'] / self.view_scale
self.param_rect_size = self.param['rect_size']
self.rect = (self.param_x-self.param_rect_size, self.param_y-self.param_rect_size, self.param_x+self.param_rect_size, self.param_y+self.param_rect_size)
return [filename, [self.rect]]
else:
is_frame_done = True
if is_frame_done:
self.result.append ( data )
self.input_data.pop(0)
self.inc_progress_bar(1)
return None
#override
def onHostDataReturn (self, data):
if not self.manual:
self.input_data.insert(0, data)
#override
def onClientInitialize(self, client_dict):
self.safe_print ('Running on %s.' % (client_dict['device_name']) )
self.type = client_dict['type']
self.image_size = client_dict['image_size']
self.face_type = client_dict['face_type']
self.device_idx = client_dict['device_idx']
self.output_path = Path(client_dict['output_dir']) if 'output_dir' in client_dict.keys() else None
self.debug = client_dict['debug']
self.detector = client_dict['detector']
self.keras = None
self.tf = None
self.tf_session = None
self.e = None
if self.type == 'rects':
if self.detector is not None:
if self.detector == 'mt':
self.tf = gpufmkmgr.import_tf ([self.device_idx], allow_growth=True)
self.tf_session = gpufmkmgr.get_tf_session()
self.keras = gpufmkmgr.import_keras()
self.e = facelib.MTCExtractor(self.keras, self.tf, self.tf_session)
elif self.detector == 'dlib':
self.dlib = gpufmkmgr.import_dlib( self.device_idx )
self.e = facelib.DLIBExtractor(self.dlib)
self.e.__enter__()
elif self.type == 'landmarks':
self.tf = gpufmkmgr.import_tf([self.device_idx], allow_growth=True)
self.tf_session = gpufmkmgr.get_tf_session()
self.keras = gpufmkmgr.import_keras()
self.e = facelib.LandmarksExtractor(self.keras)
self.e.__enter__()
elif self.type == 'final':
pass
return None
#override
def onClientFinalize(self):
if self.e is not None:
self.e.__exit__()
#override
def onClientProcessData(self, data):
filename_path = Path( data[0] )
image = cv2.imread( str(filename_path) )
if image is None:
print ( 'Failed to extract %s, reason: cv2.imread() failed.' % ( str(filename_path) ) )
else:
if self.type == 'rects':
rects = self.e.extract_from_bgr (image)
return [str(filename_path), rects]
elif self.type == 'landmarks':
rects = data[1]
landmarks = self.e.extract_from_bgr (image, rects)
return [str(filename_path), landmarks]
elif self.type == 'final':
result = []
faces = data[1]
if self.debug:
debug_output_file = '{}_{}'.format( str(Path(str(self.output_path) + '_debug') / filename_path.stem), 'debug.png')
debug_image = image.copy()
for (face_idx, face) in enumerate(faces):
output_file = '{}_{}{}'.format(str(self.output_path / filename_path.stem), str(face_idx), '.png')
rect = face[0]
image_landmarks = np.array(face[1])
if self.debug:
facelib.LandmarksProcessor.draw_rect_landmarks (debug_image, rect, image_landmarks, self.image_size, self.face_type)
if self.face_type == FaceType.MARK_ONLY:
face_image = image
face_image_landmarks = image_landmarks
else:
image_to_face_mat = facelib.LandmarksProcessor.get_transform_mat (image_landmarks, self.image_size, self.face_type)
face_image = cv2.warpAffine(image, image_to_face_mat, (self.image_size, self.image_size), flags=cv2.INTER_LANCZOS4)
face_image_landmarks = facelib.LandmarksProcessor.transform_points (image_landmarks, image_to_face_mat)
cv2.imwrite(output_file, face_image)
a_png = AlignedPNG.load (output_file)
d = {
'face_type': FaceType.toString(self.face_type),
'landmarks': face_image_landmarks.tolist(),
'yaw_value': facelib.LandmarksProcessor.calc_face_yaw (face_image_landmarks),
'pitch_value': facelib.LandmarksProcessor.calc_face_pitch (face_image_landmarks),
'source_filename': filename_path.name,
'source_rect': rect,
'source_landmarks': image_landmarks.tolist()
}
a_png.setFaceswapDictData (d)
a_png.save(output_file)
result.append (output_file)
if self.debug:
cv2.imwrite(debug_output_file, debug_image )
return result
return None
#overridable
def onClientGetDataName (self, data):
#return string identificator of your data
return data[0]
#override
def onHostResult (self, data, result):
if self.manual:
self.landmarks = result[1][0][1]
image = cv2.addWeighted (self.original_image,1.0,self.text_lines_img,1.0,0)
view_rect = (np.array(self.rect) * self.view_scale).astype(np.int).tolist()
view_landmarks = (np.array(self.landmarks) * self.view_scale).astype(np.int).tolist()
facelib.LandmarksProcessor.draw_rect_landmarks (image, view_rect, view_landmarks, self.image_size, self.face_type)
cv2.imshow (self.wnd_name, image)
return 0
else:
if self.type == 'rects':
self.result.append ( result )
elif self.type == 'landmarks':
self.result.append ( result )
elif self.type == 'final':
self.result += result
return 1
#override
def onHostProcessEnd(self):
if self.manual:
cv2.destroyAllWindows()
#override
def get_start_return(self):
return self.result
'''
detector
'dlib'
'mt'
'manual'
face_type
'full_face'
'avatar'
'''
def main (input_dir, output_dir, debug, detector='mt', multi_gpu=True, manual_fix=False, manual_window_size=0, image_size=256, face_type='full_face'):
print ("Running extractor.\r\n")
input_path = Path(input_dir)
output_path = Path(output_dir)
face_type = FaceType.fromString(face_type)
if not input_path.exists():
print('Input directory not found. Please ensure it exists.')
return
if output_path.exists():
for filename in Path_utils.get_image_paths(output_path):
Path(filename).unlink()
else:
output_path.mkdir(parents=True, exist_ok=True)
if debug:
debug_output_path = Path(str(output_path) + '_debug')
if debug_output_path.exists():
for filename in Path_utils.get_image_paths(debug_output_path):
Path(filename).unlink()
else:
debug_output_path.mkdir(parents=True, exist_ok=True)
input_path_image_paths = Path_utils.get_image_unique_filestem_paths(input_path, verbose=True)
images_found = len(input_path_image_paths)
faces_detected = 0
if images_found != 0:
if detector == 'manual':
print ('Performing manual extract...')
extracted_faces = ExtractSubprocessor ([ (filename,[]) for filename in input_path_image_paths ], 'landmarks', image_size, face_type, debug, manual=True, manual_window_size=manual_window_size).process()
else:
print ('Performing 1st pass...')
extracted_rects = ExtractSubprocessor ([ (x,) for x in input_path_image_paths ], 'rects', image_size, face_type, debug, multi_gpu=multi_gpu, manual=False, detector=detector).process()
print ('Performing 2nd pass...')
extracted_faces = ExtractSubprocessor (extracted_rects, 'landmarks', image_size, face_type, debug, multi_gpu=multi_gpu, manual=False).process()
if manual_fix:
print ('Performing manual fix...')
if all ( len(data[1]) > 0 for data in extracted_faces ):
print ('All faces are detected, manual fix not needed.')
else:
extracted_faces = ExtractSubprocessor (extracted_faces, 'landmarks', image_size, face_type, debug, manual=True, manual_window_size=manual_window_size).process()
if len(extracted_faces) > 0:
print ('Performing 3rd pass...')
final_imgs_paths = ExtractSubprocessor (extracted_faces, 'final', image_size, face_type, debug, multi_gpu=multi_gpu, manual=False, output_path=output_path).process()
faces_detected = len(final_imgs_paths)
print('-------------------------')
print('Images found: %d' % (images_found) )
print('Faces detected: %d' % (faces_detected) )
print('-------------------------')

351
mainscripts/Sorter.py Normal file

@ -0,0 +1,351 @@
import os
import sys
import operator
import numpy as np
import cv2
from tqdm import tqdm
from shutil import copyfile
from pathlib import Path
from utils import Path_utils
from utils.AlignedPNG import AlignedPNG
from facelib import LandmarksProcessor
def estimate_blur(image):
if image.ndim == 3:
image = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
blur_map = cv2.Laplacian(image, cv2.CV_64F)
score = np.var(blur_map)
return score
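estimate_blur() is the classic variance-of-Laplacian sharpness measure: the Laplacian responds to edges, so a sharp image yields a high-variance response and a blurry one a low-variance response. A quick sanity check (file names are hypothetical):

#illustrative check, not part of this commit
sharp_score = estimate_blur ( cv2.imread('sharp_face.png') )
blurry_score = estimate_blur ( cv2.imread('blurry_face.png') )
print (sharp_score > blurry_score) #normally True for the genuinely sharper image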
def sort_by_brightness(input_path):
print ("Sorting by brightness...")
img_list = [ [x, np.mean ( cv2.cvtColor(cv2.imread(x), cv2.COLOR_BGR2HSV)[...,2].flatten() )] for x in tqdm( Path_utils.get_image_paths(input_path), desc="Loading") ]
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(1), reverse=True)
return img_list
def sort_by_hue(input_path):
print ("Sorting by hue...")
img_list = [ [x, np.mean ( cv2.cvtColor(cv2.imread(x), cv2.COLOR_BGR2HSV)[...,0].flatten() )] for x in tqdm( Path_utils.get_image_paths(input_path), desc="Loading") ]
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(1), reverse=True)
return img_list
def sort_by_blur(input_path):
img_list = []
print ("Sorting by blur...")
for filepath in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
#never mask it by face hull, it worse than whole image blur estimate
img_list.append ( [filepath, estimate_blur (cv2.imread( filepath ))] )
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(1), reverse=True)
return img_list
def sort_by_face(input_path):
print ("Sorting by face similarity...")
img_list = []
for filepath in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
filepath = Path(filepath)
if filepath.suffix != '.png':
print ("%s is not a png file required for sort_by_face" % (filepath.name) )
continue
a_png = AlignedPNG.load (str(filepath))
if a_png is None:
print ("%s failed to load" % (filepath.name) )
continue
d = a_png.getFaceswapDictData()
if d is None or d['landmarks'] is None:
print ("%s - no embedded data found required for sort_by_face" % (filepath.name) )
continue
img_list.append( [str(filepath), np.array(d['landmarks']) ] )
img_list_len = len(img_list)
for i in tqdm ( range(0, img_list_len-1), desc="Sorting"):
min_score = float("inf")
j_min_score = i+1
for j in range(i+1,len(img_list)):
fl1 = img_list[i][1]
fl2 = img_list[j][1]
score = np.sum ( np.absolute ( (fl2 - fl1).flatten() ) )
if score < min_score:
min_score = score
j_min_score = j
img_list[i+1], img_list[j_min_score] = img_list[j_min_score], img_list[i+1]
return img_list
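sort_by_face orders images greedily: for each position it swaps in the most similar remaining image (by summed absolute landmark distance), producing a chain in which neighbours look alike. The same selection-style chaining reappears in sort_by_hist and sort_by_hist_blur below with a histogram distance. The pattern on plain numbers (illustrative):

#the chaining pattern used by sort_by_face / sort_by_hist, on plain numbers
items = [5, 1, 9, 2, 8]
for i in range(len(items)-1):
    j_best = min(range(i+1, len(items)), key=lambda j: abs(items[j]-items[i]))
    items[i+1], items[j_best] = items[j_best], items[i+1]
print(items) #[5, 2, 1, 8, 9] - each element is close to its neighbours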
def sort_by_face_dissim(input_path):
print ("Sorting by face dissimilarity...")
img_list = []
for filepath in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
filepath = Path(filepath)
if filepath.suffix != '.png':
print ("%s is not a png file required for sort_by_face_dissim" % (filepath.name) )
continue
a_png = AlignedPNG.load (str(filepath))
if a_png is None:
print ("%s failed to load" % (filepath.name) )
continue
d = a_png.getFaceswapDictData()
if d is None or d['landmarks'] is None:
print ("%s - no embedded data found required for sort_by_face_dissim" % (filepath.name) )
continue
img_list.append( [str(filepath), np.array(d['landmarks']), 0 ] )
img_list_len = len(img_list)
for i in tqdm( range(0, img_list_len-1), desc="Sorting"):
score_total = 0
for j in range(i+1,len(img_list)):
if i == j:
continue
fl1 = img_list[i][1]
fl2 = img_list[j][1]
score_total += np.sum ( np.absolute ( (fl2 - fl1).flatten() ) )
img_list[i][2] = score_total
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(2), reverse=True)
return img_list
def sort_by_face_yaw(input_path):
print ("Sorting by face yaw...")
img_list = []
for filepath in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
filepath = Path(filepath)
if filepath.suffix != '.png':
print ("%s is not a png file required for sort_by_face_dissim" % (filepath.name) )
continue
a_png = AlignedPNG.load (str(filepath))
if a_png is None:
print ("%s failed to load" % (filepath.name) )
continue
d = a_png.getFaceswapDictData()
if d is None or d['yaw_value'] is None:
print ("%s - no embedded data found required for sort_by_face_dissim" % (filepath.name) )
continue
img_list.append( [str(filepath), np.array(d['yaw_value']) ] )
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(1), reverse=True)
return img_list
def sort_by_hist_blur(input_path):
print ("Sorting by histogram similarity and blur...")
img_list = []
for x in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
img = cv2.imread(x)
img_list.append ([x, cv2.calcHist([img], [0], None, [256], [0, 256]),
cv2.calcHist([img], [1], None, [256], [0, 256]),
cv2.calcHist([img], [2], None, [256], [0, 256]),
estimate_blur(img)
])
img_list_len = len(img_list)
for i in tqdm( range(0, img_list_len-1), desc="Sorting"):
min_score = float("inf")
j_min_score = i+1
for j in range(i+1,len(img_list)):
score = cv2.compareHist(img_list[i][1], img_list[j][1], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][2], img_list[j][2], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][3], img_list[j][3], cv2.HISTCMP_BHATTACHARYYA)
if score < min_score:
min_score = score
j_min_score = j
img_list[i+1], img_list[j_min_score] = img_list[j_min_score], img_list[i+1]
l = []
for i in range(0, img_list_len-1):
score = cv2.compareHist(img_list[i][1], img_list[i+1][1], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][2], img_list[i+1][2], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][3], img_list[i+1][3], cv2.HISTCMP_BHATTACHARYYA)
l += [score]
l = np.array(l)
v = np.mean(l)
if v*2 < np.max(l):
v *= 2
new_img_list = []
start_group_i = 0
odd_counter = 0
for i in tqdm( range(0, img_list_len), desc="Sorting"):
end_group_i = -1
if i < img_list_len-1:
score = cv2.compareHist(img_list[i][1], img_list[i+1][1], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][2], img_list[i+1][2], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][3], img_list[i+1][3], cv2.HISTCMP_BHATTACHARYYA)
if score >= v:
end_group_i = i
elif i == img_list_len-1:
end_group_i = i
if end_group_i >= start_group_i:
odd_counter += 1
s = sorted(img_list[start_group_i:end_group_i+1] , key=operator.itemgetter(4), reverse=True)
if odd_counter % 2 == 0:
new_img_list = new_img_list + s
else:
new_img_list = s + new_img_list
start_group_i = i + 1
return new_img_list
def sort_by_hist(input_path):
print ("Sorting by histogram similarity...")
img_list = []
for x in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
img = cv2.imread(x)
img_list.append ([x, cv2.calcHist([img], [0], None, [256], [0, 256]),
cv2.calcHist([img], [1], None, [256], [0, 256]),
cv2.calcHist([img], [2], None, [256], [0, 256])
])
img_list_len = len(img_list)
for i in tqdm( range(0, img_list_len-1), desc="Sorting"):
min_score = float("inf")
j_min_score = i+1
for j in range(i+1,len(img_list)):
score = cv2.compareHist(img_list[i][1], img_list[j][1], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][2], img_list[j][2], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][3], img_list[j][3], cv2.HISTCMP_BHATTACHARYYA)
if score < min_score:
min_score = score
j_min_score = j
img_list[i+1], img_list[j_min_score] = img_list[j_min_score], img_list[i+1]
return img_list
def sort_by_hist_dissim(input_path):
print ("Sorting by histogram dissimilarity...")
img_list = []
for x in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
img = cv2.imread(x)
img_list.append ([x, cv2.calcHist([img], [0], None, [256], [0, 256]),
cv2.calcHist([img], [1], None, [256], [0, 256]),
cv2.calcHist([img], [2], None, [256], [0, 256]), 0
])
img_list_len = len(img_list)
for i in tqdm ( range(0, img_list_len), desc="Sorting"):
score_total = 0
for j in range( 0, img_list_len):
if i == j:
continue
score_total += cv2.compareHist(img_list[i][1], img_list[j][1], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][2], img_list[j][2], cv2.HISTCMP_BHATTACHARYYA) + \
cv2.compareHist(img_list[i][3], img_list[j][3], cv2.HISTCMP_BHATTACHARYYA)
img_list[i][4] = score_total
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(4), reverse=True)
return img_list
def final_rename(input_path, img_list):
for i in tqdm( range(0,len(img_list)), desc="Renaming" , leave=False):
src = Path (img_list[i][0])
dst = input_path / ('%.5d_%s' % (i, src.name ))
try:
src.rename (dst)
except:
print ('failed to rename %s' % (src.name) )
for i in tqdm( range(0,len(img_list)) , desc="Renaming" ):
src = Path (img_list[i][0])
src = input_path / ('%.5d_%s' % (i, src.name))
dst = input_path / ('%.5d%s' % (i, src.suffix))
try:
src.rename (dst)
except:
print ('failed to rename %s' % (src.name) )
def sort_by_origname(input_path):
print ("Sort by original filename...")
img_list = []
for filepath in tqdm( Path_utils.get_image_paths(input_path), desc="Loading"):
filepath = Path(filepath)
if filepath.suffix != '.png':
print ("%s is not a png file required for sort_by_origname" % (filepath.name) )
continue
a_png = AlignedPNG.load (str(filepath))
if a_png is None:
print ("%s failed to load" % (filepath.name) )
continue
d = a_png.getFaceswapDictData()
if d is None or d['source_filename'] is None:
print ("%s - no embedded data found required for sort_by_origname" % (filepath.name) )
continue
img_list.append( [str(filepath), d['source_filename']] )
print ("Sorting...")
img_list = sorted(img_list, key=operator.itemgetter(1))
return img_list
def main (input_path, sort_by_method):
input_path = Path(input_path)
sort_by_method = sort_by_method.lower()
print ("Running sort tool.\r\n")
img_list = []
if sort_by_method == 'blur': img_list = sort_by_blur (input_path)
elif sort_by_method == 'face': img_list = sort_by_face (input_path)
elif sort_by_method == 'face-dissim': img_list = sort_by_face_dissim (input_path)
elif sort_by_method == 'face-yaw': img_list = sort_by_face_yaw (input_path)
elif sort_by_method == 'hist': img_list = sort_by_hist (input_path)
elif sort_by_method == 'hist-dissim': img_list = sort_by_hist_dissim (input_path)
elif sort_by_method == 'hist-blur': img_list = sort_by_hist_blur (input_path)
elif sort_by_method == 'brightness': img_list = sort_by_brightness (input_path)
elif sort_by_method == 'hue': img_list = sort_by_hue (input_path)
elif sort_by_method == 'origname': img_list = sort_by_origname (input_path)
final_rename (input_path, img_list)

289
mainscripts/Trainer.py Normal file

@ -0,0 +1,289 @@
import sys
import traceback
import queue
import colorsys
import time
import numpy as np
import itertools
from pathlib import Path
from utils import Path_utils
from utils import image_utils
import cv2
def trainerThread (input_queue, output_queue, training_data_src_dir, training_data_dst_dir, model_path, model_name, save_interval_min=10, debug=False, target_epoch=0, **in_options):
while True:
try:
training_data_src_path = Path(training_data_src_dir)
training_data_dst_path = Path(training_data_dst_dir)
model_path = Path(model_path)
if not training_data_src_path.exists():
print( 'Training data src directory does not exist.')
return
if not training_data_dst_path.exists():
print( 'Training data dst directory does not exist.')
return
if not model_path.exists():
model_path.mkdir(exist_ok=True)
import models
model = models.import_model(model_name)(
model_path,
training_data_src_path=training_data_src_path,
training_data_dst_path=training_data_dst_path,
debug=debug,
**in_options)
is_reached_goal = (target_epoch > 0 and model.get_epoch() >= target_epoch)
def model_save():
if not debug and not is_reached_goal:
model.save()
def send_preview():
if not debug:
previews = model.get_previews()
output_queue.put ( {'op':'show', 'previews': previews, 'epoch':model.get_epoch(), 'loss_history': model.get_loss_history().copy() } )
else:
previews = [( 'debug, press update for new', model.debug_one_epoch())]
output_queue.put ( {'op':'show', 'previews': previews} )
if model.is_first_run():
model_save()
if target_epoch != 0:
if is_reached_goal:
print ('Model already trained to target epoch. You can use preview.')
else:
print('Starting. Target epoch: %d. Press "Enter" to stop training and save model.' % (target_epoch) )
else:
print('Starting. Press "Enter" to stop training and save model.')
last_save_time = time.time()
for i in itertools.count(0,1):
if not debug:
if not is_reached_goal:
loss_string = model.train_one_epoch()
print (loss_string, end='\r')
if target_epoch != 0 and model.get_epoch() >= target_epoch:
print ('Reached target epoch.')
model_save()
is_reached_goal = True
print ('You can use preview now.')
if not is_reached_goal and (time.time() - last_save_time) >= save_interval_min*60:
last_save_time = time.time()
model_save()
send_preview()
if i==0:
if is_reached_goal:
model.pass_one_epoch()
send_preview()
if debug:
time.sleep(0.005)
while not input_queue.empty():
input = input_queue.get()
op = input['op']
if op == 'save':
model_save()
elif op == 'preview':
if is_reached_goal:
model.pass_one_epoch()
send_preview()
elif op == 'close':
model_save()
i = -1
break
if i == -1:
break
model.finalize()
except Exception as e:
print ('Error: %s' % (str(e)))
traceback.print_exc()
break
output_queue.put ( {'op':'close'} )
def previewThread (input_queue, output_queue):
previews = None
loss_history = None
selected_preview = 0
update_preview = False
is_showing = False
is_waiting_preview = False
epoch = 0
while True:
if not input_queue.empty():
input = input_queue.get()
op = input['op']
if op == 'show':
is_waiting_preview = False
loss_history = input['loss_history'] if 'loss_history' in input.keys() else None
previews = input['previews'] if 'previews' in input.keys() else None
epoch = input['epoch'] if 'epoch' in input.keys() else 0
if previews is not None:
max_w = 0
max_h = 0
for (preview_name, preview_rgb) in previews:
(h, w, c) = preview_rgb.shape
max_h = max (max_h, h)
max_w = max (max_w, w)
max_size = 800
if max_h > max_size:
max_w = int( max_w / (max_h / max_size) )
max_h = max_size
#make all previews size equal
for preview in previews[:]:
(preview_name, preview_rgb) = preview
(h, w, c) = preview_rgb.shape
if h != max_h or w != max_w:
previews.remove(preview)
previews.append ( (preview_name, cv2.resize(preview_rgb, (max_w, max_h))) )
selected_preview = selected_preview % len(previews)
update_preview = True
elif op == 'close':
break
if update_preview:
update_preview = False
(h,w,c) = previews[0][1].shape
selected_preview_name = previews[selected_preview][0]
selected_preview_rgb = previews[selected_preview][1]
# HEAD
head_text_color = [0.8]*c
head_lines = [
'[s]:save [enter]:exit',
'[p]:update [space]:next preview',
'Preview: "%s" [%d/%d]' % (selected_preview_name,selected_preview+1, len(previews) )
]
head_line_height = 15
head_height = len(head_lines) * head_line_height
head = np.ones ( (head_height,w,c) ) * 0.1
for i in range(0, len(head_lines)):
t = i*head_line_height
b = (i+1)*head_line_height
head[t:b, 0:w] += image_utils.get_text_image ( (w,head_line_height,c) , head_lines[i], color=head_text_color )
final = head
if loss_history is not None:
# LOSS HISTORY
loss_history = np.array (loss_history)
lh_height = 100
lh_img = np.ones ( (lh_height,w,c) ) * 0.1
loss_count = len(loss_history[0])
lh_len = len(loss_history)
l_per_col = lh_len / w
plist_max = [ [ max (0.0, 0.0, *[ loss_history[i_ab][p]
for i_ab in range( int(col*l_per_col), int((col+1)*l_per_col) )
]
)
for p in range(0,loss_count)
]
for col in range(0, w)
]
plist_min = [ [ min (plist_max[col][p],
plist_max[col][p],
*[ loss_history[i_ab][p]
for i_ab in range( int(col*l_per_col), int((col+1)*l_per_col) )
]
)
for p in range(0,loss_count)
]
for col in range(0, w)
]
plist_abs_max = np.mean(loss_history[ len(loss_history) // 5 : ]) * 2
if l_per_col >= 1.0:
for col in range(0, w):
for p in range(0,loss_count):
point_color = [1.0]*c
point_color[0:3] = colorsys.hsv_to_rgb ( p * (1.0/loss_count), 1.0, 1.0 )
ph_max = int ( (plist_max[col][p] / plist_abs_max) * (lh_height-1) )
ph_max = np.clip( ph_max, 0, lh_height-1 )
ph_min = int ( (plist_min[col][p] / plist_abs_max) * (lh_height-1) )
ph_min = np.clip( ph_min, 0, lh_height-1 )
for ph in range(ph_min, ph_max+1):
lh_img[ (lh_height-ph-1), col ] = point_color
lh_lines = 5
lh_line_height = (lh_height-1)/lh_lines
for i in range(0,lh_lines+1):
lh_img[ int(i*lh_line_height), : ] = (0.8,)*c
last_line_t = int((lh_lines-1)*lh_line_height)
last_line_b = int(lh_lines*lh_line_height)
if epoch != 0:
lh_text = 'Loss history. Epoch: %d' % (epoch)
else:
lh_text = 'Loss history.'
lh_img[last_line_t:last_line_b, 0:w] += image_utils.get_text_image ( (w,last_line_b-last_line_t,c), lh_text, color=head_text_color )
final = np.concatenate ( [final, lh_img], axis=0 )
final = np.concatenate ( [final, selected_preview_rgb], axis=0 )
cv2.imshow ( 'Training preview', final)
is_showing = True
if is_showing:
key = cv2.waitKey(100)
else:
time.sleep(0.1)
key = 0
if key == ord('\n') or key == ord('\r'):
output_queue.put ( {'op': 'close'} )
elif key == ord('s'):
output_queue.put ( {'op': 'save'} )
elif key == ord('p'):
if not is_waiting_preview:
is_waiting_preview = True
output_queue.put ( {'op': 'preview'} )
elif key == ord(' '):
selected_preview = (selected_preview + 1) % len(previews)
update_preview = True
cv2.destroyAllWindows()
def main (training_data_src_dir, training_data_dst_dir, model_path, model_name, **in_options):
print ("Running trainer.\r\n")
output_queue = queue.Queue()
input_queue = queue.Queue()
import threading
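#note: the queues are intentionally cross-wired - trainerThread receives
#(output_queue, input_queue), so it reads commands from this output_queue and
#posts previews to input_queue, which previewThread below then consumes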
thread = threading.Thread(target=trainerThread, args=(output_queue, input_queue, training_data_src_dir, training_data_dst_dir, model_path, model_name), kwargs=in_options )
thread.start()
previewThread (input_queue, output_queue)

71
mathlib/umeyama.py Normal file

@ -0,0 +1,71 @@
import numpy as np
def umeyama(src, dst, estimate_scale):
"""Estimate N-D similarity transformation with or without scaling.
Parameters
----------
src : (M, N) array
Source coordinates.
dst : (M, N) array
Destination coordinates.
estimate_scale : bool
Whether to estimate scaling factor.
Returns
-------
T : (N + 1, N + 1)
The homogeneous similarity transformation matrix. The matrix contains
NaN values only if the problem is not well-conditioned.
References
----------
.. [1] "Least-squares estimation of transformation parameters between two
point patterns", Shinji Umeyama, PAMI 1991, DOI: 10.1109/34.88573
"""
num = src.shape[0]
dim = src.shape[1]
# Compute mean of src and dst.
src_mean = src.mean(axis=0)
dst_mean = dst.mean(axis=0)
# Subtract mean from src and dst.
src_demean = src - src_mean
dst_demean = dst - dst_mean
# Eq. (38).
A = np.dot(dst_demean.T, src_demean) / num
# Eq. (39).
d = np.ones((dim,), dtype=np.double)
if np.linalg.det(A) < 0:
d[dim - 1] = -1
T = np.eye(dim + 1, dtype=np.double)
U, S, V = np.linalg.svd(A)
# Eq. (40) and (43).
rank = np.linalg.matrix_rank(A)
if rank == 0:
return np.nan * T
elif rank == dim - 1:
if np.linalg.det(U) * np.linalg.det(V) > 0:
T[:dim, :dim] = np.dot(U, V)
else:
s = d[dim - 1]
d[dim - 1] = -1
T[:dim, :dim] = np.dot(U, np.dot(np.diag(d), V))
d[dim - 1] = s
else:
#note: np.linalg.svd returns V already transposed (V^T in the paper's notation)
T[:dim, :dim] = np.dot(U, np.dot(np.diag(d), V))
if estimate_scale:
# Eq. (41) and (42).
scale = 1.0 / src_demean.var(axis=0).sum() * np.dot(S, d)
else:
scale = 1.0
T[:dim, dim] = dst_mean - scale * np.dot(T[:dim, :dim], src_mean.T)
T[:dim, :dim] *= scale
return T
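A small numerical check of the estimator (synthetic points; assumes the full-rank branch uses V exactly as returned by np.linalg.svd, i.e. V^T in Umeyama's notation), recovering a 30-degree rotation, a scale of 2 and a translation of (3, 5):

#illustrative check, not part of this commit
src = np.array([[1.,0.],[0.,1.],[-1.,0.],[0.,-1.]])
c, s = np.cos(np.deg2rad(30)), np.sin(np.deg2rad(30))
R = np.array([[c,-s],[s,c]])
dst = 2.0 * src.dot(R.T) + np.array([3.,5.])
T = umeyama(src, dst, True)
src_h = np.concatenate([src, np.ones((4,1))], axis=1)
print ( np.allclose(src_h.dot(T.T)[:, :2], dst) ) #True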

50
models/BaseTypes.py Normal file

@ -0,0 +1,50 @@
from enum import IntEnum
import cv2
import numpy as np
from random import randint
from facelib import FaceType
class TrainingDataType(IntEnum):
IMAGE = 0 #raw image
FACE_BEGIN = 1
FACE = 1 #aligned face unsorted
FACE_YAW_SORTED = 2 #sorted by yaw
FACE_YAW_SORTED_AS_TARGET = 3 #sorted by yaw and included only yaws which exist in TARGET also automatic mirrored
FACE_END = 3
QTY = 4
class TrainingDataSample(object):
def __init__(self, filename=None, face_type=None, shape=None, landmarks=None, yaw=None, mirror=None, nearest_target_list=None):
self.filename = filename
self.face_type = face_type
self.shape = shape
self.landmarks = np.array(landmarks) if landmarks is not None else None
self.yaw = yaw
self.mirror = mirror
self.nearest_target_list = nearest_target_list
def copy_and_set(self, filename=None, face_type=None, shape=None, landmarks=None, yaw=None, mirror=None, nearest_target_list=None):
return TrainingDataSample(
filename=filename if filename is not None else self.filename,
face_type=face_type if face_type is not None else self.face_type,
shape=shape if shape is not None else self.shape,
landmarks=landmarks if landmarks is not None else self.landmarks.copy(),
yaw=yaw if yaw is not None else self.yaw,
mirror=mirror if mirror is not None else self.mirror,
nearest_target_list=nearest_target_list if nearest_target_list is not None else self.nearest_target_list)
def load_bgr(self):
img = cv2.imread (self.filename).astype(np.float32) / 255.0
if self.mirror:
img = img[:,::-1].copy()
return img
def get_random_nearest_target_sample(self):
if self.nearest_target_list is None:
return None
return self.nearest_target_list[randint (0, len(self.nearest_target_list)-1)]

44
models/ConverterBase.py Normal file

@ -0,0 +1,44 @@
import copy
'''
You can implement your own Converter, check example ConverterMasked.py
'''
class ConverterBase(object):
MODE_FACE = 0
MODE_IMAGE = 1
#overridable
def __init__(self, predictor):
self.predictor = predictor
#overridable
def get_mode(self):
#MODE_FACE calls convert_face
#MODE_IMAGE calls convert_image
return ConverterBase.MODE_FACE
#overridable
def convert_face (self, img_bgr, img_face_landmarks, debug):
#return float32 image
#if debug , return tuple ( images of any size and channels, ...)
raise NotImplementedError
#overridable
def convert_image (self, img_bgr, img_landmarks, debug):
#img_landmarks not None, if input image is png with embedded data
#return float32 image
#if debug , return tuple ( images of any size and channels, ...)
raise NotImplementedError
#overridable
def dummy_predict(self):
#do dummy predict here
pass
def copy(self):
return copy.copy(self)
def copy_and_set_predictor(self, predictor):
result = self.copy()
result.predictor = predictor
return result
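As the module docstring suggests, a custom converter subclasses ConverterBase and overrides the relevant hooks. A minimal illustrative face converter that leaves frames untouched (not part of this commit):

#illustrative sketch, not part of this commit
class ConverterPassthrough(ConverterBase):
    def __init__(self, predictor):
        super().__init__(predictor)

    def get_mode(self):
        return ConverterBase.MODE_FACE

    def convert_face (self, img_bgr, img_face_landmarks, debug):
        #a real converter would warp, predict and paste the face here
        return (img_bgr,) if debug else img_bgr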

46
models/ConverterImage.py Normal file

@ -0,0 +1,46 @@
from models import ConverterBase
from facelib import LandmarksProcessor
from facelib import FaceType
import cv2
import numpy as np
from utils import image_utils
'''
predictor:
input: [predictor_input_size, predictor_input_size, BGR]
output: [predictor_input_size, predictor_input_size, BGR]
'''
class ConverterImage(ConverterBase):
#override
def __init__(self, predictor,
predictor_input_size=0,
output_size=0,
**in_options):
super().__init__(predictor)
self.predictor_input_size = predictor_input_size
self.output_size = output_size
#override
def get_mode(self):
return ConverterBase.MODE_IMAGE
#override
def dummy_predict(self):
self.predictor ( np.zeros ( (self.predictor_input_size, self.predictor_input_size,3), dtype=np.float32) )
#override
def convert_image (self, img_bgr, img_landmarks, debug):
img_size = img_bgr.shape[1], img_bgr.shape[0]
predictor_input_bgr = cv2.resize ( img_bgr, (self.predictor_input_size, self.predictor_input_size), interpolation=cv2.INTER_LANCZOS4 )
predicted_bgr = self.predictor ( predictor_input_bgr )
output = cv2.resize ( predicted_bgr, (self.output_size, self.output_size), interpolation=cv2.INTER_LANCZOS4 )
if debug:
return (img_bgr,output,)
return output

194
models/ConverterMasked.py Normal file

@ -0,0 +1,194 @@
from models import ConverterBase
from facelib import LandmarksProcessor
from facelib import FaceType
import cv2
import numpy as np
from utils import image_utils
'''
predictor:
input: [predictor_input_size, predictor_input_size, BGRA]
output: [predictor_input_size, predictor_input_size, BGRA]
'''
class ConverterMasked(ConverterBase):
#override
def __init__(self, predictor,
predictor_input_size=0,
output_size=0,
face_type=FaceType.FULL,
erode_mask = True,
blur_mask = True,
clip_border_mask_per = 0,
masked_hist_match = False,
mode='seamless',
erode_mask_modifier=0,
blur_mask_modifier=0,
**in_options):
super().__init__(predictor)
self.predictor_input_size = predictor_input_size
self.output_size = output_size
self.face_type = face_type
self.erode_mask = erode_mask
self.blur_mask = blur_mask
self.clip_border_mask_per = clip_border_mask_per
self.masked_hist_match = masked_hist_match
self.mode = mode
self.erode_mask_modifier = erode_mask_modifier
self.blur_mask_modifier = blur_mask_modifier
if self.erode_mask_modifier != 0 and not self.erode_mask:
print ("Erode mask modifier not used in this model.")
if self.blur_mask_modifier != 0 and not self.blur_mask:
print ("Blur modifier not used in this model.")
#override
def get_mode(self):
return ConverterBase.MODE_FACE
#override
def dummy_predict(self):
self.predictor ( np.zeros ( (self.predictor_input_size,self.predictor_input_size,4), dtype=np.float32 ) )
#override
def convert_face (self, img_bgr, img_face_landmarks, debug):
if debug:
debugs = [img_bgr.copy()]
img_size = img_bgr.shape[1], img_bgr.shape[0]
img_face_mask_a = LandmarksProcessor.get_image_hull_mask (img_bgr, img_face_landmarks)
face_mat = LandmarksProcessor.get_transform_mat (img_face_landmarks, self.output_size, face_type=self.face_type)
dst_face_bgr = cv2.warpAffine( img_bgr , face_mat, (self.output_size, self.output_size), flags=cv2.INTER_LANCZOS4 )
dst_face_mask_a_0 = cv2.warpAffine( img_face_mask_a, face_mat, (self.output_size, self.output_size), flags=cv2.INTER_LANCZOS4 )
predictor_input_bgr = cv2.resize (dst_face_bgr, (self.predictor_input_size,self.predictor_input_size))
predictor_input_mask_a_0 = cv2.resize (dst_face_mask_a_0, (self.predictor_input_size,self.predictor_input_size))
predictor_input_mask_a = np.expand_dims (predictor_input_mask_a_0, -1)
predicted_bgra = self.predictor ( np.concatenate( (predictor_input_bgr, predictor_input_mask_a), -1) )
prd_face_bgr = np.clip (predicted_bgra[:,:,0:3], 0, 1.0 )
prd_face_mask_a_0 = np.clip (predicted_bgra[:,:,3], 0.0, 1.0)
prd_face_mask_a_0[ prd_face_mask_a_0 < 0.001 ] = 0.0
prd_face_mask_a = np.expand_dims (prd_face_mask_a_0, axis=-1)
prd_face_mask_aaa = np.repeat (prd_face_mask_a, (3,), axis=-1)
img_prd_face_mask_aaa = cv2.warpAffine( prd_face_mask_aaa, face_mat, img_size, np.zeros(img_bgr.shape, dtype=float), flags=cv2.WARP_INVERSE_MAP | cv2.INTER_LANCZOS4 )
img_prd_face_mask_aaa = np.clip (img_prd_face_mask_aaa, 0.0, 1.0)
img_face_mask_aaa = img_prd_face_mask_aaa
if debug:
debugs += [img_face_mask_aaa.copy()]
img_face_mask_aaa [ img_face_mask_aaa <= 0.1 ] = 0.0
img_face_mask_flatten_aaa = img_face_mask_aaa.copy()
img_face_mask_flatten_aaa[img_face_mask_flatten_aaa > 0.9] = 1.0
maxregion = np.argwhere(img_face_mask_flatten_aaa==1.0)
out_img = img_bgr.copy()
if maxregion.size != 0:
miny,minx = maxregion.min(axis=0)[:2]
maxy,maxx = maxregion.max(axis=0)[:2]
lenx = maxx - minx
leny = maxy - miny
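#note: despite the names, masky holds the x coordinate of the mask center and
#maskx the y coordinate; cv2.seamlessClone below receives them as (masky, maskx),
#i.e. an (x, y) point, so the call is consistent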
masky = int(minx+(lenx//2))
maskx = int(miny+(leny//2))
lowest_len = min (lenx, leny)
if debug:
print ("lowest_len = %f" % (lowest_len) )
ero = int( lowest_len * ( 0.126 - lowest_len * 0.00004551365 ) * 0.01*self.erode_mask_modifier )
blur = int( lowest_len * 0.10 * 0.01*self.blur_mask_modifier )
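#heuristic: erode/blur kernel sizes scale with the detected face size (lowest_len); the *_mask_modifier options act as percentage adjustments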
if debug:
print ("ero = %d, blur = %d" % (ero, blur) )
img_mask_blurry_aaa = img_face_mask_aaa
if self.erode_mask:
if ero > 0:
img_mask_blurry_aaa = cv2.erode(img_mask_blurry_aaa, cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(ero,ero)), iterations = 1 )
elif ero < 0:
img_mask_blurry_aaa = cv2.dilate(img_mask_blurry_aaa, cv2.getStructuringElement(cv2.MORPH_ELLIPSE,(-ero,-ero)), iterations = 1 )
if self.blur_mask and blur > 0:
img_mask_blurry_aaa = cv2.blur(img_mask_blurry_aaa, (blur, blur) )
img_mask_blurry_aaa = np.clip( img_mask_blurry_aaa, 0, 1.0 )
if self.clip_border_mask_per > 0:
prd_border_rect_mask_a = np.ones ( prd_face_mask_a.shape, dtype=prd_face_mask_a.dtype)
prd_border_size = int ( prd_border_rect_mask_a.shape[1] * self.clip_border_mask_per )
prd_border_rect_mask_a[0:prd_border_size,:,:] = 0
prd_border_rect_mask_a[-prd_border_size:,:,:] = 0
prd_border_rect_mask_a[:,0:prd_border_size,:] = 0
prd_border_rect_mask_a[:,-prd_border_size:,:] = 0
prd_border_rect_mask_a = np.expand_dims(cv2.blur(prd_border_rect_mask_a, (prd_border_size, prd_border_size) ),-1)
if self.mode == 'hist-match-bw':
prd_face_bgr = cv2.cvtColor(prd_face_bgr, cv2.COLOR_BGR2GRAY)
prd_face_bgr = np.repeat( np.expand_dims (prd_face_bgr, -1), (3,), -1 )
if self.mode == 'hist-match' or self.mode == 'hist-match-bw':
if debug:
debugs += [ cv2.warpAffine( prd_face_bgr, face_mat, img_size, np.zeros(img_bgr.shape, dtype=np.float32), cv2.WARP_INVERSE_MAP | cv2.INTER_LANCZOS4, cv2.BORDER_TRANSPARENT ) ]
hist_mask_a = np.ones ( prd_face_bgr.shape[:2] + (1,) , dtype=prd_face_bgr.dtype)
if self.masked_hist_match:
hist_mask_a *= prd_face_mask_a
new_prd_face_bgr = image_utils.color_hist_match(prd_face_bgr*hist_mask_a, dst_face_bgr*hist_mask_a )
prd_face_bgr = new_prd_face_bgr
if self.mode == 'hist-match-bw':
prd_face_bgr = prd_face_bgr.astype(np.float32)
out_img = cv2.warpAffine( prd_face_bgr, face_mat, img_size, out_img, cv2.WARP_INVERSE_MAP | cv2.INTER_LANCZOS4, cv2.BORDER_TRANSPARENT )
if debug:
debugs += [out_img.copy()]
debugs += [img_mask_blurry_aaa.copy()]
if self.mode == 'seamless' or self.mode == 'seamless-hist-match':
out_img = np.clip( img_bgr*(1-img_face_mask_aaa) + (out_img*img_face_mask_aaa) , 0, 1.0 )
if debug:
debugs += [out_img.copy()]
out_img = cv2.seamlessClone( (out_img*255).astype(np.uint8), (img_bgr*255).astype(np.uint8), (img_face_mask_flatten_aaa*255).astype(np.uint8), (maskx,masky) , cv2.NORMAL_CLONE )
out_img = out_img.astype(np.float32) / 255.0
if debug:
debugs += [out_img.copy()]
if self.clip_border_mask_per > 0:
img_prd_border_rect_mask_a = cv2.warpAffine( prd_border_rect_mask_a, face_mat, img_size, np.zeros(img_bgr.shape, dtype=np.float32), cv2.WARP_INVERSE_MAP | cv2.INTER_LANCZOS4, cv2.BORDER_TRANSPARENT )
img_prd_border_rect_mask_a = np.expand_dims (img_prd_border_rect_mask_a, -1)
out_img = out_img * img_prd_border_rect_mask_a + img_bgr * (1.0 - img_prd_border_rect_mask_a)
img_mask_blurry_aaa *= img_prd_border_rect_mask_a
out_img = np.clip( img_bgr*(1-img_mask_blurry_aaa) + (out_img*img_mask_blurry_aaa) , 0, 1.0 )
if self.mode == 'seamless-hist-match':
out_face_bgr = cv2.warpAffine( out_img, face_mat, (self.output_size, self.output_size) )
new_out_face_bgr = image_utils.color_hist_match(out_face_bgr, dst_face_bgr )
new_out = cv2.warpAffine( new_out_face_bgr, face_mat, img_size, img_bgr.copy(), cv2.WARP_INVERSE_MAP | cv2.INTER_LANCZOS4, cv2.BORDER_TRANSPARENT )
out_img = np.clip( img_bgr*(1-img_mask_blurry_aaa) + (new_out*img_mask_blurry_aaa) , 0, 1.0 )
if debug:
debugs += [out_img.copy()]
return debugs if debug else out_img
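For orientation, here is a minimal, hypothetical driver for ConverterMasked. The identity stub predictor and the random landmark array are illustration-only assumptions; in the real pipeline the predictor and options come from a trained model's get_converter().

import numpy as np
from models import ConverterMasked

def stub_predictor(bgra):
    #identity stub honoring the BGRA-in / BGRA-out contract documented above
    return bgra

converter = ConverterMasked(stub_predictor,
                            predictor_input_size=128,
                            output_size=128,
                            mode='seamless')
converter.dummy_predict()                                 #warm-up call, as the framework does
img = np.random.rand(480, 640, 3).astype(np.float32)     #BGR frame in [0,1]
landmarks = np.random.rand(68, 2) * 256 + 100             #hypothetical 68-point landmark array
out_img = converter.convert_face(img, landmarks, False)   #blended frame, same shape as img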

332
models/ModelBase.py Normal file
View file

@ -0,0 +1,332 @@
import os
import time
import inspect
import operator
import pickle
from pathlib import Path
from utils import Path_utils
from utils import std_utils
from utils import image_utils
import numpy as np
import cv2
import gpufmkmgr
from .TrainingDataGeneratorBase import TrainingDataGeneratorBase
'''
You can implement your own model. Check examples.
'''
class ModelBase(object):
#DONT OVERRIDE
def __init__(self, model_path, training_data_src_path=None, training_data_dst_path=None,
batch_size=0,
multi_gpu = False,
choose_worst_gpu = False,
force_best_gpu_idx = -1,
force_gpu_idxs = None,
write_preview_history = False,
debug = False, **in_options
):
print ("Loading model...")
self.model_path = model_path
self.model_data_path = Path( self.get_strpath_storage_for_file('data.dat') )
self.training_data_src_path = training_data_src_path
self.training_data_dst_path = training_data_dst_path
self.src_images_paths = None
self.dst_images_paths = None
self.src_yaw_images_paths = None
self.dst_yaw_images_paths = None
self.src_data_generator = None
self.dst_data_generator = None
self.generator_list = None
self.is_training_mode = (training_data_src_path is not None and training_data_dst_path is not None)
self.batch_size = batch_size
self.write_preview_history = write_preview_history
self.debug = debug
self.supress_std_once = ('TF_SUPPRESS_STD' in os.environ.keys() and os.environ['TF_SUPPRESS_STD'] == '1')
if self.model_data_path.exists():
model_data = pickle.loads ( self.model_data_path.read_bytes() )
self.epoch = model_data['epoch']
self.options = model_data['options']
self.loss_history = model_data['loss_history'] if 'loss_history' in model_data.keys() else []
self.generator_dict_states = model_data['generator_dict_states'] if 'generator_dict_states' in model_data.keys() else None
self.sample_for_preview = model_data['sample_for_preview'] if 'sample_for_preview' in model_data.keys() else None
else:
self.epoch = 0
self.options = {}
self.loss_history = []
self.generator_dict_states = None
self.sample_for_preview = None
if self.write_preview_history:
self.preview_history_path = self.model_path / ( '%s_history' % (self.get_model_name()) )
if not self.preview_history_path.exists():
self.preview_history_path.mkdir(exist_ok=True)
else:
if self.epoch == 0:
for filename in Path_utils.get_image_paths(self.preview_history_path):
Path(filename).unlink()
self.multi_gpu = multi_gpu
if force_best_gpu_idx >= 0 and gpufmkmgr.isValidDeviceIdx(force_best_gpu_idx):
    gpu_idx = force_best_gpu_idx
elif choose_worst_gpu:
    gpu_idx = gpufmkmgr.getWorstDeviceIdx()
else:
    gpu_idx = gpufmkmgr.getBestDeviceIdx()
gpu_total_vram_gb = gpufmkmgr.getDeviceVRAMTotalGb (gpu_idx)
is_gpu_low_mem = (gpu_total_vram_gb < 4)
self.gpu_total_vram_gb = gpu_total_vram_gb
if self.epoch == 0:
#first run
self.options['created_vram_gb'] = gpu_total_vram_gb
self.created_vram_gb = gpu_total_vram_gb
else:
#not first run
if 'created_vram_gb' in self.options.keys():
self.created_vram_gb = self.options['created_vram_gb']
else:
self.options['created_vram_gb'] = gpu_total_vram_gb
self.created_vram_gb = gpu_total_vram_gb
if force_gpu_idxs is not None:
self.gpu_idxs = [ int(x) for x in force_gpu_idxs.split(',') ]
else:
if self.multi_gpu:
self.gpu_idxs = gpufmkmgr.getDeviceIdxsEqualModel( gpu_idx )
if len(self.gpu_idxs) <= 1:
self.multi_gpu = False
else:
self.gpu_idxs = [gpu_idx]
self.tf = gpufmkmgr.import_tf(self.gpu_idxs,allow_growth=False)
self.tf_sess = gpufmkmgr.get_tf_session()
self.keras = gpufmkmgr.import_keras()
self.keras_contrib = gpufmkmgr.import_keras_contrib()
self.onInitialize(**in_options)
if self.debug or self.batch_size == 0:
self.batch_size = 1
if self.is_training_mode:
if self.generator_list is None:
raise Exception( "You didn't call set_training_data_generators()" )
else:
for i, generator in enumerate(self.generator_list):
if not isinstance(generator, TrainingDataGeneratorBase):
raise Exception('training data generator is not a subclass of TrainingDataGeneratorBase')
if self.generator_dict_states is not None and i < len(self.generator_dict_states):
generator.set_dict_state ( self.generator_dict_states[i] )
if self.sample_for_preview is None:
self.sample_for_preview = self.generate_next_sample()
print ("===== Model summary =====")
print ("== Model name: " + self.get_model_name())
print ("==")
print ("== Current epoch: " + str(self.epoch) )
print ("==")
print ("== Options:")
print ("== |== batch_size : %s " % (self.batch_size) )
print ("== |== multi_gpu : %s " % (self.multi_gpu) )
for key in self.options.keys():
print ("== |== %s : %s" % (key, self.options[key]) )
print ("== Running on:")
for idx in self.gpu_idxs:
print ("== |== [%d : %s]" % (idx, gpufmkmgr.getDeviceName(idx)) )
if self.gpu_total_vram_gb == 2:
print ("==")
print ("== WARNING: You are using 2GB GPU. Result quality may be significantly decreased.")
print ("== If training does not start, close all programs and try again.")
print ("== Also you can disable Windows Aero Desktop to get extra free VRAM.")
print ("==")
print ("=========================")
#overridable
def onInitialize(self, **in_options):
'''
initialize your keras models
store and retrieve your model options in self.options['']
check example
'''
pass
#overridable
def onSave(self):
#save your keras models here
pass
#overridable
def onTrainOneEpoch(self, sample):
#train your keras models here
#return array of losses
return ( ('loss_src', 0), ('loss_dst', 0) )
#overridable
def onGetPreview(self, sample):
#you can return multiple previews
#return [ ('preview_name',preview_rgb), ... ]
return []
#overridable if you want the model name to differ from the folder name
def get_model_name(self):
return Path(inspect.getmodule(self).__file__).parent.name.rsplit("_", 1)[1]
#overridable
def get_converter(self, **in_options):
#return existing or your own converter which derived from base
from .ConverterBase import ConverterBase
return ConverterBase(self, **in_options)
def to_multi_gpu_model_if_possible (self, models_list):
if len(self.gpu_idxs) > 1:
#make batch_size divisible by the GPU count
self.batch_size = int( self.batch_size / len(self.gpu_idxs) )
if self.batch_size == 0:
self.batch_size = 1
self.batch_size *= len(self.gpu_idxs)
result = []
for model in models_list:
for i in range( len(model.output_names) ):
model.output_names[i] = 'output_%d' % (i)
result += [ self.keras.utils.multi_gpu_model( model, self.gpu_idxs ) ]
return result
else:
return models_list
def get_previews(self):
return self.onGetPreview ( self.last_sample )
def get_static_preview(self):
return self.onGetPreview (self.sample_for_preview)[0][1] #first preview's BGR image
def save(self):
print ("Saving...")
if self.supress_std_once:
supressor = std_utils.suppress_stdout_stderr()
supressor.__enter__()
self.onSave()
if self.supress_std_once:
supressor.__exit__()
model_data = {
'epoch': self.epoch,
'options': self.options,
'loss_history': self.loss_history,
'generator_dict_states' : [generator.get_dict_state() for generator in self.generator_list],
'sample_for_preview' : self.sample_for_preview
}
self.model_data_path.write_bytes( pickle.dumps(model_data) )
def save_weights_safe(self, model_filename_list):
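#two-phase save: write every model's weights to a .tmp file first, then swap the files in, so an interrupted save cannot corrupt the previous weights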
for model, filename in model_filename_list:
model.save_weights( filename + '.tmp' )
for model, filename in model_filename_list:
source_filename = Path(filename+'.tmp')
target_filename = Path(filename)
if target_filename.exists():
target_filename.unlink()
source_filename.rename ( str(target_filename) )
def debug_one_epoch(self):
images = []
for generator in self.generator_list:
for i,batch in enumerate(next(generator)):
images.append( batch[0] )
return image_utils.equalize_and_stack_square (images)
def generate_next_sample(self):
return [next(generator) for generator in self.generator_list]
def train_one_epoch(self):
if self.supress_std_once:
supressor = std_utils.suppress_stdout_stderr()
supressor.__enter__()
self.last_sample = self.generate_next_sample()
epoch_time = time.time()
losses = self.onTrainOneEpoch(self.last_sample)
epoch_time = time.time() - epoch_time
self.loss_history.append ( [float(loss[1]) for loss in losses] )
if self.supress_std_once:
supressor.__exit__()
self.supress_std_once = False
if self.write_preview_history:
if self.epoch % 10 == 0:
img = (self.get_static_preview() * 255).astype(np.uint8)
cv2.imwrite ( str (self.preview_history_path / ('%.6d.jpg' %( self.epoch) )), img )
self.epoch += 1
#the loss string below is printed on the same status line as "Saving..."
loss_string = "Training [#{0:06d}][{1:04d}ms]".format ( self.epoch, int(epoch_time*1000) % 10000 )
for (loss_name, loss_value) in losses:
loss_string += " %s:%.3f" % (loss_name, loss_value)
return loss_string
def pass_one_epoch(self):
self.last_sample = self.generate_next_sample()
def finalize(self):
gpufmkmgr.finalize_keras()
def is_first_run(self):
return self.epoch == 0
def is_debug(self):
return self.debug
def get_epoch(self):
return self.epoch
def get_loss_history(self):
return self.loss_history
def set_training_data_generators (self, generator_list):
self.generator_list = generator_list
def get_training_data_generators (self):
return self.generator_list
def get_strpath_storage_for_file(self, filename):
return str( self.model_path / (self.get_model_name() + '_' + filename) )
def set_vram_batch_requirements (self, d):
#example d = {2:2,3:4,4:8,5:16,6:32,7:32,8:32,9:48}
keys = [x for x in d.keys()]
if self.gpu_total_vram_gb < keys[0]:
raise Exception ('Sorry, this model works only on a %dGB+ GPU' % ( keys[0] ) )
if self.batch_size == 0:
for x in keys:
if self.gpu_total_vram_gb <= x:
self.batch_size = d[x]
break
if self.batch_size == 0:
self.batch_size = d[ keys[-1] ]
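As the docstring at the top of ModelBase says, a custom model is just a subclass that overrides the on* hooks. A minimal sketch under that contract (the layer sizes, weights file name, and sample layout are illustrative assumptions, not code from this repo):

from models import ModelBase

class Model(ModelBase):
    #override
    def onInitialize(self, **in_options):
        self.set_vram_batch_requirements( {2:2, 4:8, 6:16} )  #illustrative VRAM(GB)->batch map
        keras = self.keras
        inp = keras.layers.Input( (64, 64, 3) )
        x = keras.layers.Conv2D(32, kernel_size=5, padding='same', activation='relu')(inp)
        out = keras.layers.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
        self.ae = keras.models.Model(inp, out)
        self.ae.compile(optimizer=keras.optimizers.Adam(lr=5e-5), loss='mae')
        #in training mode, generators must be registered here via
        #set_training_data_generators([...]), as the concrete models below do

    #override
    def onSave(self):
        self.save_weights_safe( [[self.ae, self.get_strpath_storage_for_file('ae.h5')]] )

    #override
    def onTrainOneEpoch(self, sample):
        warped, target = sample[0][0], sample[0][1]  #assumes two sample types per generator
        loss = self.ae.train_on_batch(warped, target)
        return ( ('loss', loss), )

    #override
    def onGetPreview(self, sample):
        return [ ('example', self.ae.predict(sample[0][1][0:4])[0]) ]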

223
models/Model_AVATAR/Model.py Normal file
View file

@ -0,0 +1,223 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
import cv2
from nnlib import tf_dssim
from nnlib import conv
from nnlib import upscale
class Model(ModelBase):
encoder64H5 = 'encoder64.h5'
decoder64_srcH5 = 'decoder64_src.h5'
decoder64_dstH5 = 'decoder64_dst.h5'
encoder128H5 = 'encoder128.h5'
decoder128_srcH5 = 'decoder128_src.h5'
#override
def onInitialize(self, **in_options):
tf = self.tf
keras = self.keras
K = keras.backend
self.set_vram_batch_requirements( {4:8,5:16,6:20,7:24,8:32,9:48} )
self.encoder64, self.decoder64_src, self.decoder64_dst, self.encoder128, self.decoder128_src = self.BuildAE()
img_shape64 = (64,64,1)
img_shape128 = (256,256,3)
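#note: despite the '128' names, this second stage maps a 64px grayscale face to a 256x256 BGR output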
if not self.is_first_run():
self.encoder64.load_weights (self.get_strpath_storage_for_file(self.encoder64H5))
self.decoder64_src.load_weights (self.get_strpath_storage_for_file(self.decoder64_srcH5))
self.decoder64_dst.load_weights (self.get_strpath_storage_for_file(self.decoder64_dstH5))
self.encoder128.load_weights (self.get_strpath_storage_for_file(self.encoder128H5))
self.decoder128_src.load_weights (self.get_strpath_storage_for_file(self.decoder128_srcH5))
if self.is_training_mode:
self.encoder64, self.decoder64_src, self.decoder64_dst, self.encoder128, self.decoder128_src = self.to_multi_gpu_model_if_possible ( [self.encoder64, self.decoder64_src, self.decoder64_dst, self.encoder128, self.decoder128_src] )
input_src_64 = keras.layers.Input(img_shape64)
input_src_target64 = keras.layers.Input(img_shape64)
input_src_target128 = keras.layers.Input(img_shape128)
input_dst_64 = keras.layers.Input(img_shape64)
input_dst_target64 = keras.layers.Input(img_shape64)
src_code64 = self.encoder64(input_src_64)
dst_code64 = self.encoder64(input_dst_64)
rec_src64 = self.decoder64_src(src_code64)
rec_dst64 = self.decoder64_dst(dst_code64)
src64_loss = tf_dssim(tf, input_src_target64, rec_src64)
dst64_loss = tf_dssim(tf, input_dst_target64, rec_dst64)
total64_loss = src64_loss + dst64_loss
self.ed64_train = K.function ([input_src_64, input_src_target64, input_dst_64, input_dst_target64],[K.mean(total64_loss)],
self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999).get_updates(total64_loss, self.encoder64.trainable_weights + self.decoder64_src.trainable_weights + self.decoder64_dst.trainable_weights)
)
src_code128 = self.encoder128(input_src_64)
rec_src128 = self.decoder128_src(src_code128)
src128_loss = tf_dssim(tf, input_src_target128, rec_src128)
self.ed128_train = K.function ([input_src_64, input_src_target128],[K.mean(src128_loss)],
self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999).get_updates(src128_loss, self.encoder128.trainable_weights + self.decoder128_src.trainable_weights)
)
src_code128 = self.encoder128(rec_src64)
rec_src128 = self.decoder128_src(src_code128)
self.src128_view = K.function ([input_src_64], [rec_src128])
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_src_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.HALF_FACE | f.MODE_G, 64],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_G, 64],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 256],
[f.SOURCE | f.HALF_FACE | f.MODE_G, 64],
[f.SOURCE | f.HALF_FACE | f.MODE_GGG, 256] ] ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.HALF_FACE | f.MODE_G, 64],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_G, 64],
[f.SOURCE | f.HALF_FACE | f.MODE_G, 64],
[f.SOURCE | f.HALF_FACE | f.MODE_GGG, 256] ] )
])
#override
def onSave(self):
self.save_weights_safe( [[self.encoder64, self.get_strpath_storage_for_file(self.encoder64H5)],
[self.decoder64_src, self.get_strpath_storage_for_file(self.decoder64_srcH5)],
[self.decoder64_dst, self.get_strpath_storage_for_file(self.decoder64_dstH5)],
[self.encoder128, self.get_strpath_storage_for_file(self.encoder128H5)],
[self.decoder128_src, self.get_strpath_storage_for_file(self.decoder128_srcH5)],
] )
#override
def onTrainOneEpoch(self, sample):
warped_src64, target_src64, target_src128, target_src_source64_G, target_src_source128_GGG = sample[0]
warped_dst64, target_dst64, target_dst_source64_G, target_dst_source128_GGG = sample[1]
loss64, = self.ed64_train ([warped_src64, target_src64, warped_dst64, target_dst64])
loss256, = self.ed128_train ([warped_src64, target_src128])
return ( ('loss64', loss64), ('loss256', loss256) )
#override
def onGetPreview(self, sample):
n_samples = 4
test_B = sample[1][2][0:n_samples]
test_B128 = sample[1][3][0:n_samples]
BB, = self.src128_view ([test_B])
st = []
for i in range(n_samples // 2):
st.append ( np.concatenate ( (
test_B128[i*2+0], BB[i*2+0], test_B128[i*2+1], BB[i*2+1],
), axis=1) )
return [ ('AVATAR', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, img):
x, = self.src128_view ([ np.expand_dims(img, 0) ])[0]
return x
#override
def get_converter(self, **in_options):
return ConverterAvatar(self.predictor_func, predictor_input_size=64, output_size=256, **in_options)
def BuildAE(self):
keras, K = self.keras, self.keras.backend
def Encoder(_input):
x = keras.layers.convolutional.Conv2D(90, kernel_size=5, strides=1, padding='same')(_input)
x = keras.layers.convolutional.Conv2D(90, kernel_size=5, strides=1, padding='same')(x)
x = keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')(x)
x = keras.layers.convolutional.Conv2D(180, kernel_size=3, strides=1, padding='same')(x)
x = keras.layers.convolutional.Conv2D(180, kernel_size=3, strides=1, padding='same')(x)
x = keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')(x)
x = keras.layers.convolutional.Conv2D(360, kernel_size=3, strides=1, padding='same')(x)
x = keras.layers.convolutional.Conv2D(360, kernel_size=3, strides=1, padding='same')(x)
x = keras.layers.MaxPooling2D(pool_size=(3, 3), strides=2, padding='same')(x)
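#the Dense layers below act position-wise on the 8x8 feature map; Flatten collapses it before the final 64-dim code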
x = keras.layers.Dense (1024)(x)
x = keras.layers.advanced_activations.LeakyReLU(0.1)(x)
x = keras.layers.Dropout(0.5)(x)
x = keras.layers.Dense (1024)(x)
x = keras.layers.advanced_activations.LeakyReLU(0.1)(x)
x = keras.layers.Dropout(0.5)(x)
x = keras.layers.Flatten()(x)
x = keras.layers.Dense (64)(x)
return keras.models.Model (_input, x)
encoder128 = Encoder( keras.layers.Input ( (64, 64, 1) ) )
encoder64 = Encoder( keras.layers.Input ( (64, 64, 1) ) )
def decoder128_3(encoder):
decoder_input = keras.layers.Input ( K.int_shape(encoder.outputs[0])[1:] )
x = decoder_input
x = self.keras.layers.Dense(16 * 16 * 720)(x)
x = keras.layers.Reshape ( (16, 16, 720) )(x)
x = upscale(keras, x, 720)
x = upscale(keras, x, 360)
x = upscale(keras, x, 180)
x = upscale(keras, x, 90)
x = keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
return keras.models.Model(decoder_input, x)
def decoder64_1(encoder):
decoder_input = keras.layers.Input ( K.int_shape(encoder.outputs[0])[1:] )
x = decoder_input
x = self.keras.layers.Dense(8 * 8 * 720)(x)
x = keras.layers.Reshape ( (8,8,720) )(x)
x = upscale(keras, x, 360)
x = upscale(keras, x, 180)
x = upscale(keras, x, 90)
x = keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid')(x)
return keras.models.Model(decoder_input, x)
return encoder64, decoder64_1(encoder64), decoder64_1(encoder64), encoder128, decoder128_3(encoder128)
from models import ConverterBase
from facelib import FaceType
from facelib import LandmarksProcessor
class ConverterAvatar(ConverterBase):
#override
def __init__(self, predictor,
predictor_input_size=0,
output_size=0,
**in_options):
super().__init__(predictor)
self.predictor_input_size = predictor_input_size
self.output_size = output_size
#override
def get_mode(self):
return ConverterBase.MODE_IMAGE
#override
def dummy_predict(self):
self.predictor ( np.zeros ( (self.predictor_input_size, self.predictor_input_size,1), dtype=np.float32) )
#override
def convert_image (self, img_bgr, img_face_landmarks, debug):
img_size = img_bgr.shape[1], img_bgr.shape[0]
face_mat = LandmarksProcessor.get_transform_mat (img_face_landmarks, self.predictor_input_size, face_type=FaceType.HALF )
predictor_input_bgr = cv2.warpAffine( img_bgr, face_mat, (self.predictor_input_size, self.predictor_input_size), flags=cv2.INTER_LANCZOS4 )
predictor_input_g = np.expand_dims(cv2.cvtColor(predictor_input_bgr, cv2.COLOR_BGR2GRAY),-1)
predicted_bgr = self.predictor ( predictor_input_g )
output = cv2.resize ( predicted_bgr, (self.output_size, self.output_size), interpolation=cv2.INTER_LANCZOS4 )
if debug:
return (img_bgr,output,)
return output

1
models/Model_AVATAR/__init__.py Normal file
View file

@ -0,0 +1 @@
from .Model import Model

153
models/Model_DF/Model.py Normal file
View file

@ -0,0 +1,153 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
import cv2
from nnlib import DSSIMMaskLossClass
from nnlib import conv
from nnlib import upscale
from facelib import FaceType
class Model(ModelBase):
encoderH5 = 'encoder.h5'
decoder_srcH5 = 'decoder_src.h5'
decoder_dstH5 = 'decoder_dst.h5'
#override
def onInitialize(self, **in_options):
self.set_vram_batch_requirements( {5:16,6:16,7:16,8:24,9:24,10:32,11:32,12:32,13:48} )
ae_input_layer = self.keras.layers.Input(shape=(128, 128, 3))
mask_layer = self.keras.layers.Input(shape=(128, 128, 1)) #same as output
self.encoder = self.Encoder(ae_input_layer)
self.decoder_src = self.Decoder()
self.decoder_dst = self.Decoder()
if not self.is_first_run():
self.encoder.load_weights (self.get_strpath_storage_for_file(self.encoderH5))
self.decoder_src.load_weights (self.get_strpath_storage_for_file(self.decoder_srcH5))
self.decoder_dst.load_weights (self.get_strpath_storage_for_file(self.decoder_dstH5))
self.autoencoder_src = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder_src(self.encoder(ae_input_layer)))
self.autoencoder_dst = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder_dst(self.encoder(ae_input_layer)))
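#src and dst autoencoders share one encoder and differ only in their decoders, so pushing a dst face through autoencoder_src (see predictor_func) performs the swap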
if self.is_training_mode:
self.autoencoder_src, self.autoencoder_dst = self.to_multi_gpu_model_if_possible ( [self.autoencoder_src, self.autoencoder_dst] )
optimizer = self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999)
dssimloss = DSSIMMaskLossClass(self.tf)([mask_layer])
self.autoencoder_src.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
self.autoencoder_dst.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_src_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True )
])
#override
def onSave(self):
self.save_weights_safe( [[self.encoder, self.get_strpath_storage_for_file(self.encoderH5)],
[self.decoder_src, self.get_strpath_storage_for_file(self.decoder_srcH5)],
[self.decoder_dst, self.get_strpath_storage_for_file(self.decoder_dstH5)]] )
#override
def onTrainOneEpoch(self, sample):
warped_src, target_src, target_src_mask = sample[0]
warped_dst, target_dst, target_dst_mask = sample[1]
loss_src = self.autoencoder_src.train_on_batch( [warped_src, target_src_mask], [target_src, target_src_mask] )
loss_dst = self.autoencoder_dst.train_on_batch( [warped_dst, target_dst_mask], [target_dst, target_dst_mask] )
return ( ('loss_src', loss_src[0]), ('loss_dst', loss_dst[0]) )
#override
def onGetPreview(self, sample):
test_A = sample[0][1][0:4] #first 4 samples
test_A_m = sample[0][2][0:4] #first 4 samples
test_B = sample[1][1][0:4]
test_B_m = sample[1][2][0:4]
AA, mAA = self.autoencoder_src.predict([test_A, test_A_m])
AB, mAB = self.autoencoder_src.predict([test_B, test_B_m])
BB, mBB = self.autoencoder_dst.predict([test_B, test_B_m])
mAA = np.repeat ( mAA, (3,), -1)
mAB = np.repeat ( mAB, (3,), -1)
mBB = np.repeat ( mBB, (3,), -1)
st = []
for i in range(0, len(test_A)):
st.append ( np.concatenate ( (
test_A[i,:,:,0:3],
AA[i],
#mAA[i],
test_B[i,:,:,0:3],
BB[i],
#mBB[i],
AB[i],
#mAB[i]
), axis=1) )
return [ ('DF', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, face):
face_128_bgr = face[...,0:3]
face_128_mask = np.expand_dims(face[...,3],-1)
x, mx = self.autoencoder_src.predict ( [ np.expand_dims(face_128_bgr,0), np.expand_dims(face_128_mask,0) ] )
x, mx = x[0], mx[0]
return np.concatenate ( (x,mx), -1 )
#override
def get_converter(self, **in_options):
from models import ConverterMasked
if 'masked_hist_match' not in in_options.keys() or in_options['masked_hist_match'] is None:
in_options['masked_hist_match'] = True
if 'erode_mask_modifier' not in in_options.keys():
in_options['erode_mask_modifier'] = 0
in_options['erode_mask_modifier'] += 30
if 'blur_mask_modifier' not in in_options.keys():
in_options['blur_mask_modifier'] = 0
return ConverterMasked(self.predictor_func, predictor_input_size=128, output_size=128, face_type=FaceType.FULL, clip_border_mask_per=0.046875, **in_options)
def Encoder(self, input_layer):
x = input_layer
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Dense(512)(self.keras.layers.Flatten()(x))
x = self.keras.layers.Dense(8 * 8 * 512)(x)
x = self.keras.layers.Reshape((8, 8, 512))(x)
x = upscale(self.keras, x, 512)
return self.keras.models.Model(input_layer, x)
def Decoder(self):
input_ = self.keras.layers.Input(shape=(16, 16, 512))
x = input_
x = upscale(self.keras, x, 512)
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
y = input_ #mask decoder
y = upscale(self.keras, y, 512)
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
x = self.keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
y = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid')(y)
return self.keras.models.Model(input_, [x,y])

1
models/Model_DF/__init__.py Normal file
View file

@ -0,0 +1 @@
from .Model import Model

174
models/Model_H128/Model.py Normal file
View file

@ -0,0 +1,174 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
from nnlib import DSSIMMaskLossClass
from nnlib import conv
from nnlib import upscale
from facelib import FaceType
import cv2
class Model(ModelBase):
encoderH5 = 'encoder.h5'
decoder_srcH5 = 'decoder_src.h5'
decoder_dstH5 = 'decoder_dst.h5'
#override
def onInitialize(self, **in_options):
self.set_vram_batch_requirements( {3:2,4:4,5:8,6:8,7:16,8:16,9:24,10:24,11:32,12:32,13:48} )
ae_input_layer = self.keras.layers.Input(shape=(128, 128, 3))
mask_layer = self.keras.layers.Input(shape=(128, 128, 1)) #same as output
self.encoder = self.Encoder(ae_input_layer, self.created_vram_gb)
self.decoder_src = self.Decoder(self.created_vram_gb)
self.decoder_dst = self.Decoder(self.created_vram_gb)
if not self.is_first_run():
self.encoder.load_weights (self.get_strpath_storage_for_file(self.encoderH5))
self.decoder_src.load_weights (self.get_strpath_storage_for_file(self.decoder_srcH5))
self.decoder_dst.load_weights (self.get_strpath_storage_for_file(self.decoder_dstH5))
self.autoencoder_src = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder_src(self.encoder(ae_input_layer)))
self.autoencoder_dst = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder_dst(self.encoder(ae_input_layer)))
if self.is_training_mode:
self.autoencoder_src, self.autoencoder_dst = self.to_multi_gpu_model_if_possible ( [self.autoencoder_src, self.autoencoder_dst] )
optimizer = self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999)
dssimloss = DSSIMMaskLossClass(self.tf)([mask_layer])
self.autoencoder_src.compile(optimizer=optimizer, loss=[dssimloss, 'mae'])
self.autoencoder_dst.compile(optimizer=optimizer, loss=[dssimloss, 'mae'])
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_src_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True )
])
#override
def onSave(self):
self.save_weights_safe( [[self.encoder, self.get_strpath_storage_for_file(self.encoderH5)],
[self.decoder_src, self.get_strpath_storage_for_file(self.decoder_srcH5)],
[self.decoder_dst, self.get_strpath_storage_for_file(self.decoder_dstH5)]])
#override
def onTrainOneEpoch(self, sample):
warped_src, target_src, target_src_mask = sample[0]
warped_dst, target_dst, target_dst_mask = sample[1]
loss_src = self.autoencoder_src.train_on_batch( [warped_src, target_src_mask], [target_src, target_src_mask] )
loss_dst = self.autoencoder_dst.train_on_batch( [warped_dst, target_dst_mask], [target_dst, target_dst_mask] )
return ( ('loss_src', loss_src[0]), ('loss_dst', loss_dst[0]) )
#override
def onGetPreview(self, sample):
test_A = sample[0][1][0:4] #first 4 samples
test_A_m = sample[0][2][0:4] #first 4 samples
test_B = sample[1][1][0:4]
test_B_m = sample[1][2][0:4]
AA, mAA = self.autoencoder_src.predict([test_A, test_A_m])
AB, mAB = self.autoencoder_src.predict([test_B, test_B_m])
BB, mBB = self.autoencoder_dst.predict([test_B, test_B_m])
mAA = np.repeat ( mAA, (3,), -1)
mAB = np.repeat ( mAB, (3,), -1)
mBB = np.repeat ( mBB, (3,), -1)
st = []
for i in range(0, len(test_A)):
st.append ( np.concatenate ( (
test_A[i,:,:,0:3],
AA[i],
#mAA[i],
test_B[i,:,:,0:3],
BB[i],
#mBB[i],
AB[i],
#mAB[i]
), axis=1) )
return [ ('H128', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, face):
face_128_bgr = face[...,0:3]
face_128_mask = np.expand_dims(face[...,3],-1)
x, mx = self.autoencoder_src.predict ( [ np.expand_dims(face_128_bgr,0), np.expand_dims(face_128_mask,0) ] )
x, mx = x[0], mx[0]
return np.concatenate ( (x,mx), -1 )
#override
def get_converter(self, **in_options):
from models import ConverterMasked
if 'masked_hist_match' not in in_options.keys() or in_options['masked_hist_match'] is None:
in_options['masked_hist_match'] = True
if 'erode_mask_modifier' not in in_options.keys():
in_options['erode_mask_modifier'] = 0
in_options['erode_mask_modifier'] += 100
if 'blur_mask_modifier' not in in_options.keys():
in_options['blur_mask_modifier'] = 0
in_options['blur_mask_modifier'] += 100
return ConverterMasked(self.predictor_func, predictor_input_size=128, output_size=128, face_type=FaceType.HALF, **in_options)
def Encoder(self, input_layer, created_vram_gb):
x = input_layer
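#5GB+ cards get the full-size network below; smaller cards fall back to a reduced-capacity latent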
if created_vram_gb >= 5:
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Dense(512)(self.keras.layers.Flatten()(x))
x = self.keras.layers.Dense(8 * 8 * 512)(x)
x = self.keras.layers.Reshape((8, 8, 512))(x)
x = upscale(self.keras, x, 512)
else:
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Dense(256)(self.keras.layers.Flatten()(x))
x = self.keras.layers.Dense(8 * 8 * 256)(x)
x = self.keras.layers.Reshape((8, 8, 256))(x)
x = upscale(self.keras, x, 256)
return self.keras.models.Model(input_layer, x)
def Decoder(self, created_vram_gb):
if created_vram_gb >= 5:
input_ = self.keras.layers.Input(shape=(16, 16, 512))
x = input_
x = upscale(self.keras, x, 512)
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
y = input_ #mask decoder
y = upscale(self.keras, y, 512)
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
else:
input_ = self.keras.layers.Input(shape=(16, 16, 256))
x = input_
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
x = upscale(self.keras, x, 64)
y = input_ #mask decoder
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
y = upscale(self.keras, y, 64)
x = self.keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
y = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid')(y)
return self.keras.models.Model(input_, [x,y])

1
models/Model_H128/__init__.py Normal file
View file

@ -0,0 +1 @@
from .Model import Model

167
models/Model_H64/Model.py Normal file
View file

@ -0,0 +1,167 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
from nnlib import DSSIMMaskLossClass
from nnlib import conv
from nnlib import upscale
from facelib import FaceType
class Model(ModelBase):
encoderH5 = 'encoder.h5'
decoder_srcH5 = 'decoder_src.h5'
decoder_dstH5 = 'decoder_dst.h5'
#override
def onInitialize(self, **in_options):
self.set_vram_batch_requirements( {2:2,3:4,4:8,5:16,6:32,7:32,8:32,9:48} )
ae_input_layer = self.keras.layers.Input(shape=(64, 64, 3))
mask_layer = self.keras.layers.Input(shape=(64, 64, 1)) #same as output
self.encoder = self.Encoder(ae_input_layer, self.created_vram_gb)
self.decoder_src = self.Decoder(self.created_vram_gb)
self.decoder_dst = self.Decoder(self.created_vram_gb)
if not self.is_first_run():
self.encoder.load_weights (self.get_strpath_storage_for_file(self.encoderH5))
self.decoder_src.load_weights (self.get_strpath_storage_for_file(self.decoder_srcH5))
self.decoder_dst.load_weights (self.get_strpath_storage_for_file(self.decoder_dstH5))
self.autoencoder_src = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder_src(self.encoder(ae_input_layer)))
self.autoencoder_dst = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder_dst(self.encoder(ae_input_layer)))
if self.is_training_mode:
self.autoencoder_src, self.autoencoder_dst = self.to_multi_gpu_model_if_possible ( [self.autoencoder_src, self.autoencoder_dst] )
optimizer = self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999)
dssimloss = DSSIMMaskLossClass(self.tf)([mask_layer])
self.autoencoder_src.compile(optimizer=optimizer, loss=[dssimloss, 'mae'])
self.autoencoder_dst.compile(optimizer=optimizer, loss=[dssimloss, 'mae'])
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_src_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 64],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 64],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_M | f.MASK_FULL, 64] ], random_flip=True ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 64],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_BGR, 64],
[f.TRANSFORMED | f.HALF_FACE | f.MODE_M | f.MASK_FULL, 64] ], random_flip=True )
])
#override
def onSave(self):
self.save_weights_safe( [[self.encoder, self.get_strpath_storage_for_file(self.encoderH5)],
[self.decoder_src, self.get_strpath_storage_for_file(self.decoder_srcH5)],
[self.decoder_dst, self.get_strpath_storage_for_file(self.decoder_dstH5)]] )
#override
def onTrainOneEpoch(self, sample):
warped_src, target_src, target_src_full_mask = sample[0]
warped_dst, target_dst, target_dst_full_mask = sample[1]
loss_src = self.autoencoder_src.train_on_batch( [warped_src, target_src_full_mask], [target_src, target_src_full_mask] )
loss_dst = self.autoencoder_dst.train_on_batch( [warped_dst, target_dst_full_mask], [target_dst, target_dst_full_mask] )
return ( ('loss_src', loss_src[0]), ('loss_dst', loss_dst[0]) )
#override
def onGetPreview(self, sample):
test_A = sample[0][1][0:4] #first 4 samples
test_A_m = sample[0][2][0:4]
test_B = sample[1][1][0:4]
test_B_m = sample[1][2][0:4]
AA, mAA = self.autoencoder_src.predict([test_A, test_A_m])
AB, mAB = self.autoencoder_src.predict([test_B, test_B_m])
BB, mBB = self.autoencoder_dst.predict([test_B, test_B_m])
mAA = np.repeat ( mAA, (3,), -1)
mAB = np.repeat ( mAB, (3,), -1)
mBB = np.repeat ( mBB, (3,), -1)
st = []
for i in range(0, len(test_A)):
st.append ( np.concatenate ( (
test_A[i,:,:,0:3],
AA[i],
#mAA[i],
test_B[i,:,:,0:3],
BB[i],
#mBB[i],
AB[i],
#mAB[i]
), axis=1) )
return [ ('H64', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, face):
face_64_bgr = face[...,0:3]
face_64_mask = np.expand_dims(face[...,3],-1)
x, mx = self.autoencoder_src.predict ( [ np.expand_dims(face_64_bgr,0), np.expand_dims(face_64_mask,0) ] )
x, mx = x[0], mx[0]
return np.concatenate ( (x,mx), -1 )
#override
def get_converter(self, **in_options):
from models import ConverterMasked
if 'masked_hist_match' not in in_options.keys() or in_options['masked_hist_match'] is None:
in_options['masked_hist_match'] = True
if 'erode_mask_modifier' not in in_options.keys():
in_options['erode_mask_modifier'] = 0
in_options['erode_mask_modifier'] += 100
if 'blur_mask_modifier' not in in_options.keys():
in_options['blur_mask_modifier'] = 0
in_options['blur_mask_modifier'] += 100
return ConverterMasked(self.predictor_func, predictor_input_size=64, output_size=64, face_type=FaceType.HALF, **in_options)
def Encoder(self, input_layer, created_vram_gb):
x = input_layer
if created_vram_gb >= 4:
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Dense(1024)(self.keras.layers.Flatten()(x))
x = self.keras.layers.Dense(4 * 4 * 1024)(x)
x = self.keras.layers.Reshape((4, 4, 1024))(x)
x = upscale(self.keras, x, 512)
else:
x = conv(self.keras, x, 128 )
x = conv(self.keras, x, 256 )
x = conv(self.keras, x, 512 )
x = conv(self.keras, x, 768 )
x = self.keras.layers.Dense(512)(self.keras.layers.Flatten()(x))
x = self.keras.layers.Dense(4 * 4 * 512)(x)
x = self.keras.layers.Reshape((4, 4, 512))(x)
x = upscale(self.keras, x, 256)
return self.keras.models.Model(input_layer, x)
def Decoder(self, created_vram_gb):
if created_vram_gb >= 4:
input_ = self.keras.layers.Input(shape=(8, 8, 512))
else:
input_ = self.keras.layers.Input(shape=(8, 8, 256))
x = input_
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
x = upscale(self.keras, x, 64)
y = input_ #mask decoder
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
y = upscale(self.keras, y, 64)
x = self.keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
y = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid')(y)
return self.keras.models.Model(input_, [x,y])

1
models/Model_H64/__init__.py Normal file
View file

@ -0,0 +1 @@
from .Model import Model

164
models/Model_LIAEF128/Model.py Normal file
View file

@ -0,0 +1,164 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
import cv2
from nnlib import DSSIMMaskLossClass
from nnlib import conv
from nnlib import upscale
from facelib import FaceType
class Model(ModelBase):
encoderH5 = 'encoder.h5'
decoderH5 = 'decoder.h5'
inter_BH5 = 'inter_B.h5'
inter_ABH5 = 'inter_AB.h5'
#override
def onInitialize(self, **in_options):
self.set_vram_batch_requirements( {5:4,6:8,7:12,8:16,9:20,10:24,11:24,12:32,13:48} )
ae_input_layer = self.keras.layers.Input(shape=(128, 128, 3))
mask_layer = self.keras.layers.Input(shape=(128, 128, 1)) #same as output
self.encoder = self.Encoder(ae_input_layer)
self.decoder = self.Decoder()
self.inter_B = self.Intermediate ()
self.inter_AB = self.Intermediate ()
if not self.is_first_run():
self.encoder.load_weights (self.get_strpath_storage_for_file(self.encoderH5))
self.decoder.load_weights (self.get_strpath_storage_for_file(self.decoderH5))
self.inter_B.load_weights (self.get_strpath_storage_for_file(self.inter_BH5))
self.inter_AB.load_weights (self.get_strpath_storage_for_file(self.inter_ABH5))
code = self.encoder(ae_input_layer)
AB = self.inter_AB(code)
B = self.inter_B(code)
self.autoencoder_src = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder(self.keras.layers.Concatenate()([AB, AB])) )
self.autoencoder_dst = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder(self.keras.layers.Concatenate()([B, AB])) )
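#LIAE trick: src decodes Concatenate([AB, AB]) while dst decodes Concatenate([B, AB]); the shared inter_AB must represent both identities, which is what morphs dst toward src at convert time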
if self.is_training_mode:
self.autoencoder_src, self.autoencoder_dst = self.to_multi_gpu_model_if_possible ( [self.autoencoder_src, self.autoencoder_dst] )
optimizer = self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999)
dssimloss = DSSIMMaskLossClass(self.tf)([mask_layer])
self.autoencoder_src.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
self.autoencoder_dst.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_src_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True )
])
#override
def onSave(self):
self.save_weights_safe( [[self.encoder, self.get_strpath_storage_for_file(self.encoderH5)],
[self.decoder, self.get_strpath_storage_for_file(self.decoderH5)],
[self.inter_B, self.get_strpath_storage_for_file(self.inter_BH5)],
[self.inter_AB, self.get_strpath_storage_for_file(self.inter_ABH5)]] )
#override
def onTrainOneEpoch(self, sample):
warped_src, target_src, target_src_mask = sample[0]
warped_dst, target_dst, target_dst_mask = sample[1]
loss_src = self.autoencoder_src.train_on_batch( [warped_src, target_src_mask], [target_src, target_src_mask] )
loss_dst = self.autoencoder_dst.train_on_batch( [warped_dst, target_dst_mask], [target_dst, target_dst_mask] )
return ( ('loss_src', loss_src[0]), ('loss_dst', loss_dst[0]) )
#override
def onGetPreview(self, sample):
test_A = sample[0][1][0:4] #first 4 samples
test_A_m = sample[0][2][0:4] #first 4 samples
test_B = sample[1][1][0:4]
test_B_m = sample[1][2][0:4]
AA, mAA = self.autoencoder_src.predict([test_A, test_A_m])
AB, mAB = self.autoencoder_src.predict([test_B, test_B_m])
BB, mBB = self.autoencoder_dst.predict([test_B, test_B_m])
mAA = np.repeat ( mAA, (3,), -1)
mAB = np.repeat ( mAB, (3,), -1)
mBB = np.repeat ( mBB, (3,), -1)
st = []
for i in range(0, len(test_A)):
st.append ( np.concatenate ( (
test_A[i,:,:,0:3],
AA[i],
#mAA[i],
test_B[i,:,:,0:3],
BB[i],
#mBB[i],
AB[i],
#mAB[i]
), axis=1) )
return [ ('LIAEF128', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, face):
face_128_bgr = face[...,0:3]
face_128_mask = np.expand_dims(face[...,3],-1)
x, mx = self.autoencoder_src.predict ( [ np.expand_dims(face_128_bgr,0), np.expand_dims(face_128_mask,0) ] )
x, mx = x[0], mx[0]
return np.concatenate ( (x,mx), -1 )
#override
def get_converter(self, **in_options):
from models import ConverterMasked
if 'masked_hist_match' not in in_options.keys() or in_options['masked_hist_match'] is None:
in_options['masked_hist_match'] = True
if 'erode_mask_modifier' not in in_options.keys():
in_options['erode_mask_modifier'] = 0
in_options['erode_mask_modifier'] += 30
if 'blur_mask_modifier' not in in_options.keys():
in_options['blur_mask_modifier'] = 0
return ConverterMasked(self.predictor_func, predictor_input_size=128, output_size=128, face_type=FaceType.FULL, clip_border_mask_per=0.046875, **in_options)
def Encoder(self, input_layer,):
x = input_layer
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Flatten()(x)
return self.keras.models.Model(input_layer, x)
def Intermediate(self):
input_layer = self.keras.layers.Input(shape=(8 * 8 * 1024,)) #the encoder's Flatten() output is a 2D (batch, features) tensor
x = input_layer
x = self.keras.layers.Dense(256)(x)
x = self.keras.layers.Dense(8 * 8 * 512)(x)
x = self.keras.layers.Reshape((8, 8, 512))(x)
x = upscale(self.keras, x, 512)
return self.keras.models.Model(input_layer, x)
def Decoder(self):
input_ = self.keras.layers.Input(shape=(16, 16, 1024))
x = input_
x = upscale(self.keras, x, 512)
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
x = self.keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
y = input_ #mask decoder
y = upscale(self.keras, y, 512)
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
y = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid' )(y)
return self.keras.models.Model(input_, [x,y])

1
models/Model_LIAEF128/__init__.py Normal file
View file

@ -0,0 +1 @@
from .Model import Model

164
models/Model_LIAEF128YAW/Model.py Normal file
View file

@ -0,0 +1,164 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
import cv2
from nnlib import DSSIMMaskLossClass
from nnlib import conv
from nnlib import upscale
from facelib import FaceType
class Model(ModelBase):
encoderH5 = 'encoder.h5'
decoderH5 = 'decoder.h5'
inter_BH5 = 'inter_B.h5'
inter_ABH5 = 'inter_AB.h5'
#override
def onInitialize(self, **in_options):
self.set_vram_batch_requirements( {5:4,6:8,7:12,8:16,9:20,10:24,11:24,12:32,13:48} )
ae_input_layer = self.keras.layers.Input(shape=(128, 128, 3))
mask_layer = self.keras.layers.Input(shape=(128, 128, 1)) #same as output
self.encoder = self.Encoder(ae_input_layer)
self.decoder = self.Decoder()
self.inter_B = self.Intermediate ()
self.inter_AB = self.Intermediate ()
if not self.is_first_run():
self.encoder.load_weights (self.get_strpath_storage_for_file(self.encoderH5))
self.decoder.load_weights (self.get_strpath_storage_for_file(self.decoderH5))
self.inter_B.load_weights (self.get_strpath_storage_for_file(self.inter_BH5))
self.inter_AB.load_weights (self.get_strpath_storage_for_file(self.inter_ABH5))
code = self.encoder(ae_input_layer)
AB = self.inter_AB(code)
B = self.inter_B(code)
self.autoencoder_src = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder(self.keras.layers.Concatenate()([AB, AB])) )
self.autoencoder_dst = self.keras.models.Model([ae_input_layer,mask_layer], self.decoder(self.keras.layers.Concatenate()([B, AB])) )
if self.is_training_mode:
self.autoencoder_src, self.autoencoder_dst = self.to_multi_gpu_model_if_possible ( [self.autoencoder_src, self.autoencoder_dst] )
optimizer = self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999)
dssimloss = DSSIMMaskLossClass(self.tf)([mask_layer])
self.autoencoder_src.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
self.autoencoder_dst.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE_YAW_SORTED_AS_TARGET, self.training_data_src_path, target_training_data_path=self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True )
])
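#unlike LIAEF128, the src generator above uses FACE_YAW_SORTED_AS_TARGET so the src samples' yaw distribution matches the dst set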
#override
def onSave(self):
self.save_weights_safe( [[self.encoder, self.get_strpath_storage_for_file(self.encoderH5)],
[self.decoder, self.get_strpath_storage_for_file(self.decoderH5)],
[self.inter_B, self.get_strpath_storage_for_file(self.inter_BH5)],
[self.inter_AB, self.get_strpath_storage_for_file(self.inter_ABH5)]] )
#override
def onTrainOneEpoch(self, sample):
warped_src, target_src, target_src_mask = sample[0]
warped_dst, target_dst, target_dst_mask = sample[1]
loss_src = self.autoencoder_src.train_on_batch( [warped_src, target_src_mask], [target_src, target_src_mask] )
loss_dst = self.autoencoder_dst.train_on_batch( [warped_dst, target_dst_mask], [target_dst, target_dst_mask] )
return ( ('loss_src', loss_src[0]), ('loss_dst', loss_dst[0]) )
#override
def onGetPreview(self, sample):
test_A = sample[0][1][0:4] #first 4 samples
test_A_m = sample[0][2][0:4] #first 4 samples
test_B = sample[1][1][0:4]
test_B_m = sample[1][2][0:4]
AA, mAA = self.autoencoder_src.predict([test_A, test_A_m])
AB, mAB = self.autoencoder_src.predict([test_B, test_B_m])
BB, mBB = self.autoencoder_dst.predict([test_B, test_B_m])
mAA = np.repeat ( mAA, (3,), -1)
mAB = np.repeat ( mAB, (3,), -1)
mBB = np.repeat ( mBB, (3,), -1)
st = []
for i in range(0, len(test_A)):
st.append ( np.concatenate ( (
test_A[i,:,:,0:3],
AA[i],
#mAA[i],
test_B[i,:,:,0:3],
BB[i],
#mBB[i],
AB[i],
#mAB[i]
), axis=1) )
return [ ('LIAEF128YAW', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, face):
face_128_bgr = face[...,0:3]
face_128_mask = np.expand_dims(face[...,3],-1)
x, mx = self.autoencoder_src.predict ( [ np.expand_dims(face_128_bgr,0), np.expand_dims(face_128_mask,0) ] )
x, mx = x[0], mx[0]
return np.concatenate ( (x,mx), -1 )
#override
def get_converter(self, **in_options):
from models import ConverterMasked
if 'masked_hist_match' not in in_options.keys() or in_options['masked_hist_match'] is None:
in_options['masked_hist_match'] = True
if 'erode_mask_modifier' not in in_options.keys():
in_options['erode_mask_modifier'] = 0
in_options['erode_mask_modifier'] += 30
if 'blur_mask_modifier' not in in_options.keys():
in_options['blur_mask_modifier'] = 0
return ConverterMasked(self.predictor_func, predictor_input_size=128, output_size=128, face_type=FaceType.FULL, clip_border_mask_per=0.046875, **in_options)
def Encoder(self, input_layer,):
x = input_layer
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Flatten()(x)
return self.keras.models.Model(input_layer, x)
def Intermediate(self):
input_layer = self.keras.layers.Input(shape=(8 * 8 * 1024,)) #the encoder's Flatten() output is a 2D (batch, features) tensor
x = input_layer
x = self.keras.layers.Dense(256)(x)
x = self.keras.layers.Dense(8 * 8 * 512)(x)
x = self.keras.layers.Reshape((8, 8, 512))(x)
x = upscale(self.keras, x, 512)
return self.keras.models.Model(input_layer, x)
def Decoder(self):
input_ = self.keras.layers.Input(shape=(16, 16, 1024))
x = input_
x = upscale(self.keras, x, 512)
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
x = self.keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
y = input_ #mask decoder
y = upscale(self.keras, y, 512)
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
y = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid' )(y)
return self.keras.models.Model(input_, [x,y])

1
models/Model_LIAEF128YAW/__init__.py Normal file
View file

@ -0,0 +1 @@
from .Model import Model

217
models/Model_MIAEF128/Model.py Normal file
View file

@ -0,0 +1,217 @@
from models import ModelBase
from models import TrainingDataType
import numpy as np
import cv2
from nnlib import DSSIMMaskLossClass
from nnlib import conv
from nnlib import upscale
from facelib import FaceType
class Model(ModelBase):
encoderH5 = 'encoder.h5'
decoderMaskH5 = 'decoderMask.h5'
decoderCommonAH5 = 'decoderCommonA.h5'
decoderCommonBH5 = 'decoderCommonB.h5'
decoderRGBH5 = 'decoderRGB.h5'
decoderBWH5 = 'decoderBW.h5'
inter_BH5 = 'inter_B.h5'
inter_AH5 = 'inter_A.h5'
#override
def onInitialize(self, **in_options):
self.set_vram_batch_requirements( {5:4,6:8,7:12,8:16,9:20,10:24,11:24,12:32,13:48} )
ae_input_layer = self.keras.layers.Input(shape=(128, 128, 3))
mask_layer = self.keras.layers.Input(shape=(128, 128, 1)) #same as output
self.encoder = self.Encoder(ae_input_layer)
self.decoderMask = self.DecoderMask()
self.decoderCommonA = self.DecoderCommon()
self.decoderCommonB = self.DecoderCommon()
self.decoderRGB = self.DecoderRGB()
self.decoderBW = self.DecoderBW()
self.inter_A = self.Intermediate ()
self.inter_B = self.Intermediate ()
if not self.is_first_run():
self.encoder.load_weights (self.get_strpath_storage_for_file(self.encoderH5))
self.decoderMask.load_weights (self.get_strpath_storage_for_file(self.decoderMaskH5))
self.decoderCommonA.load_weights (self.get_strpath_storage_for_file(self.decoderCommonAH5))
self.decoderCommonB.load_weights (self.get_strpath_storage_for_file(self.decoderCommonBH5))
self.decoderRGB.load_weights (self.get_strpath_storage_for_file(self.decoderRGBH5))
self.decoderBW.load_weights (self.get_strpath_storage_for_file(self.decoderBWH5))
self.inter_A.load_weights (self.get_strpath_storage_for_file(self.inter_AH5))
self.inter_B.load_weights (self.get_strpath_storage_for_file(self.inter_BH5))
code = self.encoder(ae_input_layer)
A = self.inter_A(code)
B = self.inter_B(code)
inter_A_A = self.keras.layers.Concatenate()([A, A])
inter_B_A = self.keras.layers.Concatenate()([B, A])
x1,m1 = self.decoderCommonA (inter_A_A)
x2,m2 = self.decoderCommonA (inter_A_A)
self.autoencoder_src = self.keras.models.Model([ae_input_layer,mask_layer],
[ self.decoderBW (self.keras.layers.Concatenate()([x1,x2]) ),
self.decoderMask(self.keras.layers.Concatenate()([m1,m2]) )
])
x1,m1 = self.decoderCommonA (inter_A_A)
x2,m2 = self.decoderCommonB (inter_A_A)
self.autoencoder_src_RGB = self.keras.models.Model([ae_input_layer,mask_layer],
[ self.decoderRGB (self.keras.layers.Concatenate()([x1,x2]) ),
self.decoderMask (self.keras.layers.Concatenate()([m1,m2]) )
])
x1,m1 = self.decoderCommonA (inter_B_A)
x2,m2 = self.decoderCommonB (inter_B_A)
self.autoencoder_dst = self.keras.models.Model([ae_input_layer,mask_layer],
[ self.decoderRGB (self.keras.layers.Concatenate()([x1,x2]) ),
self.decoderMask (self.keras.layers.Concatenate()([m1,m2]) )
])
if self.is_training_mode:
self.autoencoder_src, self.autoencoder_dst = self.to_multi_gpu_model_if_possible ( [self.autoencoder_src, self.autoencoder_dst] )
optimizer = self.keras.optimizers.Adam(lr=5e-5, beta_1=0.5, beta_2=0.999)
dssimloss = DSSIMMaskLossClass(self.tf)([mask_layer])
self.autoencoder_src.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
self.autoencoder_dst.compile(optimizer=optimizer, loss=[dssimloss, 'mse'] )
if self.is_training_mode:
from models import TrainingDataGenerator
f = TrainingDataGenerator.SampleTypeFlags
self.set_training_data_generators ([
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_src_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_GGG, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_G, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_GGG, 128] ], random_flip=True ),
TrainingDataGenerator(TrainingDataType.FACE, self.training_data_dst_path, debug=self.is_debug(), batch_size=self.batch_size, output_sample_types=[
[f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_BGR, 128],
[f.TRANSFORMED | f.FULL_FACE | f.MODE_M | f.MASK_FULL, 128] ], random_flip=True )
])
#override
def onSave(self):
self.save_weights_safe( [[self.encoder, self.get_strpath_storage_for_file(self.encoderH5)],
[self.decoderMask, self.get_strpath_storage_for_file(self.decoderMaskH5)],
[self.decoderCommonA, self.get_strpath_storage_for_file(self.decoderCommonAH5)],
[self.decoderCommonB, self.get_strpath_storage_for_file(self.decoderCommonBH5)],
[self.decoderRGB, self.get_strpath_storage_for_file(self.decoderRGBH5)],
[self.decoderBW, self.get_strpath_storage_for_file(self.decoderBWH5)],
[self.inter_A, self.get_strpath_storage_for_file(self.inter_AH5)],
[self.inter_B, self.get_strpath_storage_for_file(self.inter_BH5)]] )
#override
def onTrainOneEpoch(self, sample):
warped_src, target_src, target_src_mask, target_src_GGG = sample[0]
warped_dst, target_dst, target_dst_mask = sample[1]
loss_src = self.autoencoder_src.train_on_batch( [ warped_src, target_src_mask], [ target_src, target_src_mask] )
loss_dst = self.autoencoder_dst.train_on_batch( [ warped_dst, target_dst_mask], [ target_dst, target_dst_mask] )
return ( ('loss_src', loss_src[0]), ('loss_dst', loss_dst[0]) )
#override
def onGetPreview(self, sample):
test_A = sample[0][3][0:4] #first 4 samples
test_A_m = sample[0][2][0:4] #first 4 samples
test_B = sample[1][1][0:4]
test_B_m = sample[1][2][0:4]
AA, mAA = self.autoencoder_src.predict([test_A, test_A_m])
AB, mAB = self.autoencoder_src_RGB.predict([test_B, test_B_m])
BB, mBB = self.autoencoder_dst.predict([test_B, test_B_m])
mAA = np.repeat ( mAA, (3,), -1)
mAB = np.repeat ( mAB, (3,), -1)
mBB = np.repeat ( mBB, (3,), -1)
st = []
for i in range(0, len(test_A)):
st.append ( np.concatenate ( (
np.repeat (np.expand_dims (test_A[i,:,:,0],-1), (3,), -1) ,
np.repeat (AA[i], (3,), -1),
#mAA[i],
test_B[i,:,:,0:3],
BB[i],
#mBB[i],
AB[i],
#mAB[i]
), axis=1) )
return [ ('MIAEF128', np.concatenate ( st, axis=0 ) ) ]
def predictor_func (self, face):
face_128_bgr = face[...,0:3]
face_128_mask = np.expand_dims(face[...,-1],-1)
x, mx = self.autoencoder_src_RGB.predict ( [ np.expand_dims(face_128_bgr,0), np.expand_dims(face_128_mask,0) ] )
x, mx = x[0], mx[0]
return np.concatenate ( (x,mx), -1 )
#override
def get_converter(self, **in_options):
from models import ConverterMasked
if 'masked_hist_match' not in in_options.keys() or in_options['masked_hist_match'] is None:
in_options['masked_hist_match'] = False
if 'erode_mask_modifier' not in in_options.keys():
in_options['erode_mask_modifier'] = 0
in_options['erode_mask_modifier'] += 30
if 'blur_mask_modifier' not in in_options.keys():
in_options['blur_mask_modifier'] = 0
return ConverterMasked(self.predictor_func, predictor_input_size=128, output_size=128, face_type=FaceType.FULL, clip_border_mask_per=0.046875, **in_options)
def Encoder(self, input_layer,):
x = input_layer
x = conv(self.keras, x, 128)
x = conv(self.keras, x, 256)
x = conv(self.keras, x, 512)
x = conv(self.keras, x, 1024)
x = self.keras.layers.Flatten()(x)
return self.keras.models.Model(input_layer, x)
def Intermediate(self):
input_layer = self.keras.layers.Input(shape=(None, 8 * 8 * 1024))
x = input_layer
x = self.keras.layers.Dense(256)(x)
x = self.keras.layers.Dense(8 * 8 * 512)(x)
x = self.keras.layers.Reshape((8, 8, 512))(x)
x = upscale(self.keras, x, 512)
return self.keras.models.Model(input_layer, x)
def DecoderCommon(self):
input_ = self.keras.layers.Input(shape=(16, 16, 1024))
x = input_
x = upscale(self.keras, x, 512)
x = upscale(self.keras, x, 256)
x = upscale(self.keras, x, 128)
y = input_
y = upscale(self.keras, y, 256)
y = upscale(self.keras, y, 128)
y = upscale(self.keras, y, 64)
return self.keras.models.Model(input_, [x,y])
def DecoderRGB(self):
input_ = self.keras.layers.Input(shape=(128, 128, 256))
x = input_
x = self.keras.layers.convolutional.Conv2D(3, kernel_size=5, padding='same', activation='sigmoid')(x)
return self.keras.models.Model(input_, [x])
def DecoderBW(self):
input_ = self.keras.layers.Input(shape=(128, 128, 256))
x = input_
x = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid')(x)
return self.keras.models.Model(input_, [x])
def DecoderMask(self):
input_ = self.keras.layers.Input(shape=(128, 128, 128))
y = input_
y = self.keras.layers.convolutional.Conv2D(1, kernel_size=5, padding='same', activation='sigmoid')(y)
return self.keras.models.Model(input_, [y])


@@ -0,0 +1 @@
from .Model import Model


@@ -0,0 +1,149 @@
from facelib import FaceType
from facelib import LandmarksProcessor
import cv2
import numpy as np
from models import TrainingDataGeneratorBase
from utils import image_utils
from utils import random_utils
from enum import IntEnum
from models import TrainingDataType
class TrainingDataGenerator(TrainingDataGeneratorBase):
class SampleTypeFlags(IntEnum):
SOURCE = 0x000001,
WARPED = 0x000002,
WARPED_TRANSFORMED = 0x000004,
TRANSFORMED = 0x000008,
HALF_FACE = 0x000010,
FULL_FACE = 0x000020,
HEAD_FACE = 0x000040,
AVATAR_FACE = 0x000080,
MARK_ONLY_FACE = 0x000100,
MODE_BGR = 0x001000, #BGR
MODE_G = 0x002000, #Grayscale
MODE_GGG = 0x004000, #3xGrayscale
MODE_M = 0x008000, #mask only
MODE_BGR_SHUFFLE = 0x010000, #BGR shuffle
MASK_FULL = 0x100000,
MASK_EYES = 0x200000,
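        # flags are OR'ed together per requested output, e.g.
        # f.WARPED_TRANSFORMED | f.FULL_FACE | f.MODE_GGG asks for a warped,
        # transformed full-face crop rendered as 3x grayscale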
    #overridden
def onInitialize(self, random_flip=False, normalize_tanh=False, rotation_range=[-10,10], scale_range=[-0.05, 0.05], tx_range=[-0.05, 0.05], ty_range=[-0.05, 0.05], output_sample_types=[], **kwargs):
self.random_flip = random_flip
self.normalize_tanh = normalize_tanh
self.output_sample_types = output_sample_types
self.rotation_range = rotation_range
self.scale_range = scale_range
self.tx_range = tx_range
self.ty_range = ty_range
    #overridden
def onProcessSample(self, sample, debug):
source = sample.load_bgr()
h,w,c = source.shape
is_face_sample = self.trainingdatatype >= TrainingDataType.FACE_BEGIN and self.trainingdatatype <= TrainingDataType.FACE_END
if debug and is_face_sample:
LandmarksProcessor.draw_landmarks (source, sample.landmarks, (0, 1, 0))
params = image_utils.gen_warp_params(source, self.random_flip, rotation_range=self.rotation_range, scale_range=self.scale_range, tx_range=self.tx_range, ty_range=self.ty_range )
images = [[None]*3 for _ in range(4)]
outputs = []
for t,size in self.output_sample_types:
if t & self.SampleTypeFlags.SOURCE != 0:
img_type = 0
elif t & self.SampleTypeFlags.WARPED != 0:
img_type = 1
elif t & self.SampleTypeFlags.WARPED_TRANSFORMED != 0:
img_type = 2
elif t & self.SampleTypeFlags.TRANSFORMED != 0:
img_type = 3
else:
raise ValueError ('expected SampleTypeFlags type')
mask_type = 0
if t & self.SampleTypeFlags.MASK_FULL != 0:
mask_type = 1
elif t & self.SampleTypeFlags.MASK_EYES != 0:
mask_type = 2
if images[img_type][mask_type] is None:
img = source
if is_face_sample:
if mask_type == 1:
img = np.concatenate( (img, LandmarksProcessor.get_image_hull_mask (source, sample.landmarks) ), -1 )
elif mask_type == 2:
mask = LandmarksProcessor.get_image_eye_mask (source, sample.landmarks)
mask = np.expand_dims (cv2.blur (mask, ( w // 32, w // 32 ) ), -1)
mask[mask > 0.0] = 1.0
img = np.concatenate( (img, mask ), -1 )
images[img_type][mask_type] = image_utils.warp_by_params (params, img, (img_type==1 or img_type==2), (img_type==2 or img_type==3), img_type != 0)
img = images[img_type][mask_type]
target_face_type = -1
if t & self.SampleTypeFlags.HALF_FACE != 0:
target_face_type = FaceType.HALF
elif t & self.SampleTypeFlags.FULL_FACE != 0:
target_face_type = FaceType.FULL
elif t & self.SampleTypeFlags.HEAD_FACE != 0:
target_face_type = FaceType.HEAD
elif t & self.SampleTypeFlags.AVATAR_FACE != 0:
target_face_type = FaceType.AVATAR
elif t & self.SampleTypeFlags.MARK_ONLY_FACE != 0:
target_face_type = FaceType.MARK_ONLY
if is_face_sample and target_face_type != -1 and target_face_type != FaceType.MARK_ONLY:
if target_face_type > sample.face_type:
                    raise Exception ('sample %s type %s does not match model requirement %s. Consider extracting the necessary type of faces.' % (sample.filename, sample.face_type, target_face_type) )
img = cv2.warpAffine( img, LandmarksProcessor.get_transform_mat (sample.landmarks, size, target_face_type), (size,size), flags=cv2.INTER_LANCZOS4 )
else:
                img = cv2.resize( img, (size,size), interpolation=cv2.INTER_LANCZOS4 )
img_bgr = img[...,0:3]
img_mask = img[...,3:4]
if t & self.SampleTypeFlags.MODE_BGR != 0:
img = img
elif t & self.SampleTypeFlags.MODE_BGR_SHUFFLE != 0:
img_bgr = np.take (img_bgr, np.random.permutation(img_bgr.shape[-1]), axis=-1)
img = np.concatenate ( (img_bgr,img_mask) , -1 )
elif t & self.SampleTypeFlags.MODE_G != 0:
img = np.concatenate ( (np.expand_dims(cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY),-1),img_mask) , -1 )
elif t & self.SampleTypeFlags.MODE_GGG != 0:
img = np.concatenate ( ( np.repeat ( np.expand_dims(cv2.cvtColor(img_bgr, cv2.COLOR_BGR2GRAY),-1), (3,), -1), img_mask), -1)
elif is_face_sample and t & self.SampleTypeFlags.MODE_M != 0:
                if mask_type == 0:
raise ValueError ('no mask mode defined')
img = img_mask
else:
raise ValueError ('expected SampleTypeFlags mode')
if not debug and self.normalize_tanh:
img = img * 2.0 - 1.0
outputs.append ( img )
if debug:
result = ()
for output in outputs:
if output.shape[2] < 4:
result += (output,)
elif output.shape[2] == 4:
result += (output[...,0:3]*output[...,3:4],)
return result
else:
return outputs


@@ -0,0 +1,245 @@
import traceback
import random
from pathlib import Path
from tqdm import tqdm
import numpy as np
import cv2
from utils.AlignedPNG import AlignedPNG
from utils import iter_utils
from utils import Path_utils
from .BaseTypes import TrainingDataType
from .BaseTypes import TrainingDataSample
from facelib import FaceType
from facelib import LandmarksProcessor
'''
You can implement your own TrainingDataGenerator
'''
class TrainingDataGeneratorBase(object):
cache = dict()
#DONT OVERRIDE
#use YourOwnTrainingDataGenerator (..., your_opt=1)
#and then this opt will be passed in YourOwnTrainingDataGenerator.onInitialize ( your_opt )
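    #
    # a minimal sketch of such a subclass (class and option names are hypothetical):
    #
    #   class MyGenerator(TrainingDataGeneratorBase):
    #       def onInitialize(self, my_opt=1, **kwargs):
    #           self.my_opt = my_opt
    #
    #   gen = MyGenerator(TrainingDataType.FACE, 'some/training/path', my_opt=2)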
def __init__ (self, trainingdatatype, training_data_path, target_training_data_path=None, debug=False, batch_size=1, **kwargs):
if not isinstance(trainingdatatype, TrainingDataType):
raise Exception('TrainingDataGeneratorBase() trainingdatatype is not TrainingDataType')
if training_data_path is None:
raise Exception('training_data_path is None')
self.training_data_path = Path(training_data_path)
self.target_training_data_path = Path(target_training_data_path) if target_training_data_path is not None else None
self.debug = debug
self.batch_size = 1 if self.debug else batch_size
self.trainingdatatype = trainingdatatype
self.data = TrainingDataGeneratorBase.load (trainingdatatype, self.training_data_path, self.target_training_data_path)
if self.debug:
self.generators = [iter_utils.ThisThreadGenerator ( self.batch_func, self.data)]
else:
if len(self.data) > 1:
self.generators = [iter_utils.SubprocessGenerator ( self.batch_func, self.data[0::2] ),
iter_utils.SubprocessGenerator ( self.batch_func, self.data[1::2] )]
else:
self.generators = [iter_utils.SubprocessGenerator ( self.batch_func, self.data )]
self.generator_counter = -1
self.onInitialize(**kwargs)
#overridable
def onInitialize(self, **kwargs):
#your TrainingDataGenerator initialization here
pass
#overridable
def onProcessSample(self, sample, debug):
#process sample and return tuple of images for your model in onTrainOneEpoch
return ( np.zeros( (64,64,4), dtype=np.float32 ), )
def __iter__(self):
return self
def __next__(self):
self.generator_counter += 1
generator = self.generators[self.generator_counter % len(self.generators) ]
x = next(generator)
return x
def batch_func(self, data):
data_len = len(data)
if data_len == 0:
raise ValueError('No training data provided.')
if self.trainingdatatype == TrainingDataType.FACE_YAW_SORTED or self.trainingdatatype == TrainingDataType.FACE_YAW_SORTED_AS_TARGET:
            if all (x is None for x in data):
raise ValueError('Not enough training data. Gather more faces!')
if self.trainingdatatype == TrainingDataType.IMAGE or self.trainingdatatype == TrainingDataType.FACE:
shuffle_idxs = []
elif self.trainingdatatype == TrainingDataType.FACE_YAW_SORTED or self.trainingdatatype == TrainingDataType.FACE_YAW_SORTED_AS_TARGET:
shuffle_idxs = []
shuffle_idxs_2D = [[]]*data_len
while True:
batches = None
for n_batch in range(0, self.batch_size):
while True:
sample = None
if self.trainingdatatype == TrainingDataType.IMAGE or self.trainingdatatype == TrainingDataType.FACE:
if len(shuffle_idxs) == 0:
shuffle_idxs = [ i for i in range(0, data_len) ]
random.shuffle(shuffle_idxs)
idx = shuffle_idxs.pop()
sample = data[ idx ]
elif self.trainingdatatype == TrainingDataType.FACE_YAW_SORTED or self.trainingdatatype == TrainingDataType.FACE_YAW_SORTED_AS_TARGET:
if len(shuffle_idxs) == 0:
shuffle_idxs = [ i for i in range(0, data_len) ]
random.shuffle(shuffle_idxs)
idx = shuffle_idxs.pop()
                        if data[idx] is not None:
if len(shuffle_idxs_2D[idx]) == 0:
shuffle_idxs_2D[idx] = [ i for i in range(0, len(data[idx])) ]
random.shuffle(shuffle_idxs_2D[idx])
idx2 = shuffle_idxs_2D[idx].pop()
sample = data[idx][idx2]
if sample is not None:
try:
x = self.onProcessSample (sample, self.debug)
except:
raise Exception ("Exception occured in sample %s. Error: %s" % (sample.filename, traceback.format_exc() ) )
if type(x) != tuple and type(x) != list:
raise Exception('TrainingDataGenerator.onProcessSample() returns NOT tuple/list')
x_len = len(x)
if batches is None:
batches = [ [] for _ in range(0,x_len) ]
for i in range(0,x_len):
batches[i].append ( x[i] )
break
yield [ np.array(batch) for batch in batches]
def get_dict_state(self):
return {}
def set_dict_state(self, state):
pass
@staticmethod
def load(trainingdatatype, training_data_path, target_training_data_path=None):
cache = TrainingDataGeneratorBase.cache
if str(training_data_path) not in cache.keys():
cache[str(training_data_path)] = [None]*TrainingDataType.QTY
if target_training_data_path is not None and str(target_training_data_path) not in cache.keys():
cache[str(target_training_data_path)] = [None]*TrainingDataType.QTY
datas = cache[str(training_data_path)]
if trainingdatatype == TrainingDataType.IMAGE:
if datas[trainingdatatype] is None:
datas[trainingdatatype] = [ TrainingDataSample(filename=filename) for filename in tqdm( Path_utils.get_image_paths(training_data_path), desc="Loading" ) ]
elif trainingdatatype == TrainingDataType.FACE:
if datas[trainingdatatype] is None:
datas[trainingdatatype] = X_LOAD( [ TrainingDataSample(filename=filename) for filename in Path_utils.get_image_paths(training_data_path) ] )
elif trainingdatatype == TrainingDataType.FACE_YAW_SORTED:
if datas[trainingdatatype] is None:
datas[trainingdatatype] = X_YAW_SORTED( TrainingDataGeneratorBase.load(TrainingDataType.FACE, training_data_path) )
elif trainingdatatype == TrainingDataType.FACE_YAW_SORTED_AS_TARGET:
if datas[trainingdatatype] is None:
if target_training_data_path is None:
raise Exception('target_training_data_path is None for FACE_YAW_SORTED_AS_TARGET')
datas[trainingdatatype] = X_YAW_AS_Y_SORTED( TrainingDataGeneratorBase.load(TrainingDataType.FACE_YAW_SORTED, training_data_path), TrainingDataGeneratorBase.load(TrainingDataType.FACE_YAW_SORTED, target_training_data_path) )
return datas[trainingdatatype]
def X_LOAD ( RAWS ):
sample_list = []
for s in tqdm( RAWS, desc="Loading" ):
s_filename_path = Path(s.filename)
if s_filename_path.suffix != '.png':
print ("%s is not a png file required for training" % (s_filename_path.name) )
continue
a_png = AlignedPNG.load ( str(s_filename_path) )
if a_png is None:
print ("%s failed to load" % (s_filename_path.name) )
continue
d = a_png.getFaceswapDictData()
if d is None or d['landmarks'] is None or d['yaw_value'] is None:
print ("%s - no embedded faceswap info found required for training" % (s_filename_path.name) )
continue
face_type = d['face_type'] if 'face_type' in d.keys() else 'full_face'
face_type = FaceType.fromString (face_type)
sample_list.append( s.copy_and_set(face_type=face_type, shape=a_png.get_shape(), landmarks=d['landmarks'], yaw=d['yaw_value']) )
return sample_list
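# buckets face samples into 64 yaw gradations spanning [-32, +32];
# gradations with no samples stay None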
def X_YAW_SORTED( YAW_RAWS ):
lowest_yaw, highest_yaw = -32, +32
gradations = 64
diff_rot_per_grad = abs(highest_yaw-lowest_yaw) / gradations
yaws_sample_list = [None]*gradations
for i in tqdm( range(0, gradations), desc="Sorting" ):
yaw = lowest_yaw + i*diff_rot_per_grad
next_yaw = lowest_yaw + (i+1)*diff_rot_per_grad
yaw_samples = []
for s in YAW_RAWS:
s_yaw = s.yaw
if (i == 0 and s_yaw < next_yaw) or \
(i < gradations-1 and s_yaw >= yaw and s_yaw < next_yaw) or \
(i == gradations-1 and s_yaw >= yaw):
yaw_samples.append ( s )
if len(yaw_samples) > 0:
yaws_sample_list[i] = yaw_samples
return yaws_sample_list
def X_YAW_AS_Y_SORTED (s, t):
l = len(s)
if l != len(t):
raise Exception('X_YAW_AS_Y_SORTED() s_len != t_len')
b = l // 2
s_idxs = np.argwhere ( np.array ( [ 1 if x != None else 0 for x in s] ) == 1 )[:,0]
t_idxs = np.argwhere ( np.array ( [ 1 if x != None else 0 for x in t] ) == 1 )[:,0]
new_s = [None]*l
for t_idx in t_idxs:
search_idxs = []
for i in range(0,l):
search_idxs += [t_idx - i, (l-t_idx-1) - i, t_idx + i, (l-t_idx-1) + i]
for search_idx in search_idxs:
if search_idx in s_idxs:
mirrored = ( t_idx != search_idx and ((t_idx < b and search_idx >= b) or (search_idx < b and t_idx >= b)) )
new_s[t_idx] = [ sample.copy_and_set(mirror=True, yaw=-sample.yaw, landmarks=LandmarksProcessor.mirror_landmarks (sample.landmarks, sample.shape[1] ))
for sample in s[search_idx]
] if mirrored else s[search_idx]
break
return new_s

13
models/__init__.py Normal file

@@ -0,0 +1,13 @@
from .BaseTypes import TrainingDataType
from .BaseTypes import TrainingDataSample
from .ModelBase import ModelBase
from .ConverterBase import ConverterBase
from .ConverterMasked import ConverterMasked
from .ConverterImage import ConverterImage
from .TrainingDataGeneratorBase import TrainingDataGeneratorBase
from .TrainingDataGenerator import TrainingDataGenerator
def import_model(name):
module = __import__('Model_'+name, globals(), locals(), [], 1)
return getattr(module, 'Model')

198
nnlib/__init__.py Normal file

@@ -0,0 +1,198 @@
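# differentiable 256-bin image histogram: each bin is counted with a steep
# clipped ramp instead of a hard threshold, so it stays usable inside a TF graph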
def tf_image_histogram (tf, input):
x = input
x += 1 / 255.0
output = []
for i in range(256, 0, -1):
v = i / 255.0
y = (x - v) * 1000
y = tf.clip_by_value (y, -1.0, 0.0) + 1
output.append ( tf.reduce_sum (y) )
x -= y*v
return tf.stack ( output[::-1] )
def tf_dssim(tf, t1, t2):
return (1.0 - tf.image.ssim (t1, t2, 1.0)) / 2.0
def tf_ssim(tf, t1, t2):
return tf.image.ssim (t1, t2, 1.0)
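# DSSIM = (1 - SSIM) / 2 over mask-multiplied images; the losses for all
# masks in mask_list are summed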
def DSSIMMaskLossClass(tf):
class DSSIMMaskLoss(object):
def __init__(self, mask_list, is_tanh=False):
self.mask_list = mask_list
self.is_tanh = is_tanh
def __call__(self,y_true, y_pred):
total_loss = None
for mask in self.mask_list:
if not self.is_tanh:
loss = (1.0 - tf.image.ssim (y_true*mask, y_pred*mask, 1.0)) / 2.0
else:
loss = (1.0 - tf.image.ssim ( (y_true/2+0.5)*(mask/2+0.5), (y_pred/2+0.5)*(mask/2+0.5), 1.0)) / 2.0
if total_loss is None:
total_loss = loss
else:
total_loss += loss
return total_loss
return DSSIMMaskLoss
def MSEMaskLossClass(keras):
class MSEMaskLoss(object):
def __init__(self, mask_list, is_tanh=False):
self.mask_list = mask_list
self.is_tanh = is_tanh
def __call__(self,y_true, y_pred):
K = keras.backend
total_loss = None
for mask in self.mask_list:
if not self.is_tanh:
loss = K.mean(K.square(y_true*mask - y_pred*mask))
else:
loss = K.mean(K.square( (y_true/2+0.5)*(mask/2+0.5) - (y_pred/2+0.5)*(mask/2+0.5) ))
if total_loss is None:
total_loss = loss
else:
total_loss += loss
return total_loss
return MSEMaskLoss
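# PixelShuffler (sub-pixel convolution): rearranges a (B, H, W, C*r^2) tensor
# into (B, H*r, W*r, C); upscale() below uses it as a learned 2x upsampler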
def PixelShufflerClass(keras):
class PixelShuffler(keras.engine.topology.Layer):
def __init__(self, size=(2, 2), data_format=None, **kwargs):
super(PixelShuffler, self).__init__(**kwargs)
self.data_format = keras.utils.conv_utils.normalize_data_format(data_format)
self.size = keras.utils.conv_utils.normalize_tuple(size, 2, 'size')
def call(self, inputs):
input_shape = keras.backend.int_shape(inputs)
if len(input_shape) != 4:
raise ValueError('Inputs should have rank ' +
str(4) +
'; Received input shape:', str(input_shape))
if self.data_format == 'channels_first':
batch_size, c, h, w = input_shape
if batch_size is None:
batch_size = -1
rh, rw = self.size
oh, ow = h * rh, w * rw
oc = c // (rh * rw)
out = keras.backend.reshape(inputs, (batch_size, rh, rw, oc, h, w))
out = keras.backend.permute_dimensions(out, (0, 3, 4, 1, 5, 2))
out = keras.backend.reshape(out, (batch_size, oc, oh, ow))
return out
elif self.data_format == 'channels_last':
batch_size, h, w, c = input_shape
if batch_size is None:
batch_size = -1
rh, rw = self.size
oh, ow = h * rh, w * rw
oc = c // (rh * rw)
out = keras.backend.reshape(inputs, (batch_size, h, w, rh, rw, oc))
out = keras.backend.permute_dimensions(out, (0, 1, 3, 2, 4, 5))
out = keras.backend.reshape(out, (batch_size, oh, ow, oc))
return out
def compute_output_shape(self, input_shape):
if len(input_shape) != 4:
raise ValueError('Inputs should have rank ' +
str(4) +
'; Received input shape:', str(input_shape))
if self.data_format == 'channels_first':
height = input_shape[2] * self.size[0] if input_shape[2] is not None else None
width = input_shape[3] * self.size[1] if input_shape[3] is not None else None
channels = input_shape[1] // self.size[0] // self.size[1]
if channels * self.size[0] * self.size[1] != input_shape[1]:
raise ValueError('channels of input and size are incompatible')
return (input_shape[0],
channels,
height,
width)
elif self.data_format == 'channels_last':
height = input_shape[1] * self.size[0] if input_shape[1] is not None else None
width = input_shape[2] * self.size[1] if input_shape[2] is not None else None
channels = input_shape[3] // self.size[0] // self.size[1]
if channels * self.size[0] * self.size[1] != input_shape[3]:
raise ValueError('channels of input and size are incompatible')
return (input_shape[0],
height,
width,
channels)
def get_config(self):
config = {'size': self.size,
'data_format': self.data_format}
base_config = super(PixelShuffler, self).get_config()
return dict(list(base_config.items()) + list(config.items()))
return PixelShuffler
def conv(keras, input_tensor, filters):
x = input_tensor
x = keras.layers.convolutional.Conv2D(filters, kernel_size=5, strides=2, padding='same')(x)
x = keras.layers.advanced_activations.LeakyReLU(0.1)(x)
return x
def upscale(keras, input_tensor, filters, k_size=3):
x = input_tensor
x = keras.layers.convolutional.Conv2D(filters * 4, kernel_size=k_size, padding='same')(x)
x = keras.layers.advanced_activations.LeakyReLU(0.1)(x)
x = PixelShufflerClass(keras)()(x)
return x
def upscale4(keras, input_tensor, filters):
x = input_tensor
x = keras.layers.convolutional.Conv2D(filters * 16, kernel_size=3, padding='same')(x)
x = keras.layers.advanced_activations.LeakyReLU(0.1)(x)
x = PixelShufflerClass(keras)(size=(4, 4))(x)
return x
def res(keras, input_tensor, filters):
x = input_tensor
x = keras.layers.convolutional.Conv2D(filters, kernel_size=3, kernel_initializer=keras.initializers.RandomNormal(0, 0.02), use_bias=False, padding="same")(x)
x = keras.layers.advanced_activations.LeakyReLU(alpha=0.2)(x)
x = keras.layers.convolutional.Conv2D(filters, kernel_size=3, kernel_initializer=keras.initializers.RandomNormal(0, 0.02), use_bias=False, padding="same")(x)
x = keras.layers.Add()([x, input_tensor])
x = keras.layers.advanced_activations.LeakyReLU(alpha=0.2)(x)
return x
def resize_like(tf, keras, ref_tensor, input_tensor):
def func(input_tensor, ref_tensor):
H, W = ref_tensor.get_shape()[1], ref_tensor.get_shape()[2]
return tf.image.resize_bilinear(input_tensor, [H.value, W.value])
return keras.layers.Lambda(func, arguments={'ref_tensor':ref_tensor})(input_tensor)
def total_variation_loss(keras, x):
K = keras.backend
assert K.ndim(x) == 4
B,H,W,C = K.int_shape(x)
a = K.square(x[:, :H - 1, :W - 1, :] - x[:, 1:, :W - 1, :])
b = K.square(x[:, :H - 1, :W - 1, :] - x[:, :H - 1, 1:, :])
return K.mean (a+b)


@@ -0,0 +1,10 @@
pathlib==1.0.1
scandir==1.6
h5py==2.7.1
Keras==2.1.6
opencv-python==3.4.0.12
tensorflow-gpu==1.8.0
scikit-image
dlib==19.10.0
tqdm
git+https://www.github.com/keras-team/keras-contrib.git

296
utils/AlignedPNG.py Normal file

@@ -0,0 +1,296 @@
PNG_HEADER = b"\x89PNG\r\n\x1a\n"
import string
import struct
import zlib
import pickle
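# PNG chunk layout: 4-byte big-endian length, 4-byte ASCII name,
# <length> bytes of data, then a 4-byte CRC32 over name + data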
class Chunk(object):
def __init__(self, name=None, data=None):
self.length = 0
self.crc = 0
self.name = name if name else "noNe"
self.data = data if data else b""
@classmethod
def load(cls, data):
"""Load a chunk including header and footer"""
inst = cls()
if len(data) < 12:
msg = "Chunk-data too small"
raise ValueError(msg)
# chunk header & data
(inst.length, raw_name) = struct.unpack("!I4s", data[0:8])
inst.data = data[8:-4]
inst.verify_length()
inst.name = raw_name.decode("ascii")
inst.verify_name()
# chunk crc
inst.crc = struct.unpack("!I", data[8+inst.length:8+inst.length+4])[0]
inst.verify_crc()
return inst
def dump(self, auto_crc=True, auto_length=True):
"""Return the chunk including header and footer"""
if auto_length: self.update_length()
if auto_crc: self.update_crc()
self.verify_name()
return struct.pack("!I", self.length) + self.get_raw_name() + self.data + struct.pack("!I", self.crc)
def verify_length(self):
if len(self.data) != self.length:
msg = "Data length ({}) does not match length in chunk header ({})".format(len(self.data), self.length)
raise ValueError(msg)
return True
def verify_name(self):
for c in self.name:
if c not in string.ascii_letters:
msg = "Invalid character in chunk name: {}".format(repr(self.name))
raise ValueError(msg)
return True
def verify_crc(self):
calculated_crc = self.get_crc()
if self.crc != calculated_crc:
msg = "CRC mismatch: {:08X} (header), {:08X} (calculated)".format(self.crc, calculated_crc)
raise ValueError(msg)
return True
def update_length(self):
self.length = len(self.data)
def update_crc(self):
self.crc = self.get_crc()
def get_crc(self):
return zlib.crc32(self.get_raw_name() + self.data)
def get_raw_name(self):
return self.name if isinstance(self.name, bytes) else self.name.encode("ascii")
# name helper methods
    def ancillary(self, set=None):
        """Set and get ancillary=True/critical=False bit"""
        # str is immutable, so rebuild the name rather than assigning to name[0]
        if set is True:
            self.name = self.name[0].lower() + self.name[1:]
        elif set is False:
            self.name = self.name[0].upper() + self.name[1:]
        return self.name[0].islower()
    def private(self, set=None):
        """Set and get private=True/public=False bit"""
        if set is True:
            self.name = self.name[:1] + self.name[1].lower() + self.name[2:]
        elif set is False:
            self.name = self.name[:1] + self.name[1].upper() + self.name[2:]
        return self.name[1].islower()
    def reserved(self, set=None):
        """Set and get reserved_valid=True/invalid=False bit"""
        if set is True:
            self.name = self.name[:2] + self.name[2].upper() + self.name[3:]
        elif set is False:
            self.name = self.name[:2] + self.name[2].lower() + self.name[3:]
        return self.name[2].isupper()
    def safe_to_copy(self, set=None):
        """Set and get safe_to_copy=True/unsafe=False bit"""
        if set is True:
            self.name = self.name[:3] + self.name[3].lower()
        elif set is False:
            self.name = self.name[:3] + self.name[3].upper()
        return self.name[3].islower()
def __str__(self):
return "<Chunk '{name}' length={length} crc={crc:08X}>".format(**self.__dict__)
class IHDR(Chunk):
"""IHDR Chunk
width, height, bit_depth, color_type, compression_method,
filter_method, interlace_method contain the data extracted
from the chunk. Modify those and use and build() to recreate
the chunk. Valid values for bit_depth depend on the color_type
and can be looked up in color_types or in the PNG specification
See:
http://www.libpng.org/pub/png/spec/1.2/PNG-Chunks.html#C.IHDR
"""
# color types with name & allowed bit depths
COLOR_TYPE_GRAY = 0
COLOR_TYPE_RGB = 2
COLOR_TYPE_PLTE = 3
COLOR_TYPE_GRAYA = 4
COLOR_TYPE_RGBA = 6
color_types = {
COLOR_TYPE_GRAY: ("Grayscale", (1,2,4,8,16)),
COLOR_TYPE_RGB: ("RGB", (8,16)),
COLOR_TYPE_PLTE: ("Palette", (1,2,4,8)),
COLOR_TYPE_GRAYA: ("Greyscale+Alpha", (8,16)),
COLOR_TYPE_RGBA: ("RGBA", (8,16)),
}
def __init__(self, width=0, height=0, bit_depth=8, color_type=2, \
compression_method=0, filter_method=0, interlace_method=0):
self.width = width
self.height = height
self.bit_depth = bit_depth
self.color_type = color_type
self.compression_method = compression_method
self.filter_method = filter_method
self.interlace_method = interlace_method
super().__init__("IHDR")
@classmethod
def load(cls, data):
inst = super().load(data)
fields = struct.unpack("!IIBBBBB", inst.data)
inst.width = fields[0]
inst.height = fields[1]
inst.bit_depth = fields[2] # per channel
inst.color_type = fields[3] # see specs
inst.compression_method = fields[4] # always 0(=deflate/inflate)
inst.filter_method = fields[5] # always 0(=adaptive filtering with 5 methods)
inst.interlace_method = fields[6] # 0(=no interlace) or 1(=Adam7 interlace)
return inst
def dump(self):
self.data = struct.pack("!IIBBBBB", \
self.width, self.height, self.bit_depth, self.color_type, \
self.compression_method, self.filter_method, self.interlace_method)
return super().dump()
def __str__(self):
return "<Chunk:IHDR geometry={width}x{height} bit_depth={bit_depth} color_type={}>" \
.format(self.color_types[self.color_type][0], **self.__dict__)
class IEND(Chunk):
def __init__(self):
super().__init__("IEND")
def dump(self):
if len(self.data) != 0:
msg = "IEND has data which is not allowed"
raise ValueError(msg)
if self.length != 0:
msg = "IEND data lenght is not 0 which is not allowed"
raise ValueError(msg)
return super().dump()
def __str__(self):
return "<Chunk:IEND>".format(**self.__dict__)
class FaceswapChunk(Chunk):
def __init__(self, dict_data=None):
super().__init__("fcWp")
self.dict_data = dict_data
def setDictData(self, dict_data):
self.dict_data = dict_data
def getDictData(self):
return self.dict_data
@classmethod
def load(cls, data):
inst = super().load(data)
inst.dict_data = pickle.loads( inst.data )
return inst
def dump(self):
self.data = pickle.dumps (self.dict_data)
return super().dump()
chunk_map = {
b"IHDR": IHDR,
b"fcWp": FaceswapChunk,
b"IEND": IEND
}
class AlignedPNG(object):
def __init__(self):
self.data = b""
self.length = 0
self.chunks = []
@staticmethod
def load(data):
try:
with open(data, "rb") as f:
data = f.read()
except:
raise FileNotFoundError(data)
inst = AlignedPNG()
inst.data = data
inst.length = len(data)
if data[0:8] != PNG_HEADER:
msg = "No Valid PNG header"
raise ValueError(msg)
chunk_start = 8
while chunk_start < inst.length:
(chunk_length, chunk_name) = struct.unpack("!I4s", data[chunk_start:chunk_start+8])
chunk_end = chunk_start + chunk_length + 12
chunk = chunk_map.get(chunk_name, Chunk).load(data[chunk_start:chunk_end])
inst.chunks.append(chunk)
chunk_start = chunk_end
return inst
def save(self, filename):
try:
with open(filename, "wb") as f:
f.write ( self.dump() )
except:
raise Exception( 'cannot save %s' % (filename) )
def dump(self):
data = PNG_HEADER
for chunk in self.chunks:
data += chunk.dump()
return data
def get_shape(self):
for chunk in self.chunks:
if type(chunk) == IHDR:
c = 3 if chunk.color_type == IHDR.COLOR_TYPE_RGB else 4
w = chunk.width
h = chunk.height
return (h,w,c)
return (0,0,0)
def get_height(self):
for chunk in self.chunks:
if type(chunk) == IHDR:
return chunk.height
return 0
def getFaceswapDictData(self):
for chunk in self.chunks:
if type(chunk) == FaceswapChunk:
return chunk.getDictData()
return None
def setFaceswapDictData (self, dict_data=None):
for chunk in self.chunks:
if type(chunk) == FaceswapChunk:
self.chunks.remove(chunk)
break
if not dict_data is None:
chunk = FaceswapChunk(dict_data)
self.chunks.insert(-1, chunk)
def __str__(self):
return "<PNG length={length} chunks={}>".format(len(self.chunks), **self.__dict__)

40
utils/Path_utils.py Normal file

@@ -0,0 +1,40 @@
from pathlib import Path
from scandir import scandir
image_extensions = [".jpg", ".jpeg", ".png", ".tif", ".tiff"]
def get_image_paths(dir_path):
dir_path = Path (dir_path)
result = []
if dir_path.exists():
for x in list(scandir(str(dir_path))):
if any([x.name.lower().endswith(ext) for ext in image_extensions]):
result.append(x.path)
return result
def get_image_unique_filestem_paths(dir_path, verbose=False):
result = get_image_paths(dir_path)
result_dup = set()
for f in result[:]:
f_stem = Path(f).stem
if f_stem in result_dup:
result.remove(f)
if verbose:
print ("Duplicate filenames are not allowed, skipping: %s" % Path(f).name )
continue
result_dup.add(f_stem)
return result
def get_all_dir_names_startswith (dir_path, startswith):
dir_path = Path (dir_path)
startswith = startswith.lower()
result = []
if dir_path.exists():
for x in list(scandir(str(dir_path))):
if x.name.lower().startswith(startswith):
result.append ( x.name[len(startswith):] )
return result

246
utils/SubprocessorBase.py Normal file

@@ -0,0 +1,246 @@
import traceback
from tqdm import tqdm
import multiprocessing
import time
import sys
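# host/worker scheme: the host spawns one process per process_info_generator()
# entry, sends {'op':'data'} jobs over sq and reads {'op':'success'} /
# {'op':'error'} replies from cq while driving a tqdm progress bar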
class SubprocessorBase(object):
#overridable
def __init__(self, name, no_response_time_sec = 60):
self.name = name
self.no_response_time_sec = no_response_time_sec
#overridable
def process_info_generator(self):
#yield name, host_dict, client_dict - per process
yield 'first process', {}, {}
#overridable
def get_no_process_started_message(self):
return "No process started."
#overridable
def onHostGetProgressBarDesc(self):
return "Processing"
#overridable
def onHostGetProgressBarLen(self):
return 0
#overridable
def onHostGetData(self):
#return data here
return None
#overridable
def onHostDataReturn (self, data):
#input_data.insert(0, obj['data'])
pass
#overridable
def onClientInitialize(self, client_dict):
#return fail message or None if ok
return None
#overridable
def onClientFinalize(self):
pass
#overridable
def onClientProcessData(self, data):
#return result object
return None
#overridable
def onClientGetDataName (self, data):
        #return a string identifier of your data
return "undefined"
#overridable
def onHostClientsInitialized(self):
pass
#overridable
def onHostResult (self, data, result):
#return count of progress bar update
return 1
#overridable
def onHostProcessEnd(self):
pass
#overridable
def get_start_return(self):
return None
def inc_progress_bar(self, c):
self.progress_bar.update(c)
def safe_print(self, msg):
self.print_lock.acquire()
print (msg)
self.print_lock.release()
def process(self):
#returns start_return
self.processes = []
self.print_lock = multiprocessing.Lock()
for name, host_dict, client_dict in self.process_info_generator():
sq = multiprocessing.Queue()
cq = multiprocessing.Queue()
client_dict.update ( {'print_lock' : self.print_lock} )
p = multiprocessing.Process(target=self.subprocess, args=(sq,cq,client_dict))
p.daemon = True
p.start()
self.processes.append ( { 'process' : p,
'sq' : sq,
'cq' : cq,
'state' : 'busy',
'sent_time': time.time(),
'name': name,
'host_dict' : host_dict
} )
while True:
for p in self.processes[:]:
while not p['cq'].empty():
obj = p['cq'].get()
obj_op = obj['op']
if obj_op == 'init_ok':
p['state'] = 'free'
elif obj_op == 'error':
if obj['close'] == True:
p['process'].terminate()
p['process'].join()
self.processes.remove(p)
break
if all ([ p['state'] == 'free' for p in self.processes ] ):
break
if len(self.processes) == 0:
print ( self.get_no_process_started_message() )
return self.get_start_return()
self.onHostClientsInitialized()
self.progress_bar = tqdm( total=self.onHostGetProgressBarLen(), desc=self.onHostGetProgressBarDesc() )
try:
while True:
for p in self.processes[:]:
while not p['cq'].empty():
obj = p['cq'].get()
obj_op = obj['op']
if obj_op == 'success':
data = obj['data']
result = obj['result']
c = self.onHostResult (data, result)
if c > 0:
self.progress_bar.update(c)
p['state'] = 'free'
elif obj_op == 'error':
if 'data' in obj.keys():
self.onHostDataReturn ( obj['data'] )
if obj['close'] == True:
p['sq'].put ( {'op': 'close'} )
p['process'].join()
self.processes.remove(p)
break
p['state'] = 'free'
for p in self.processes[:]:
if p['state'] == 'free':
data = self.onHostGetData()
if data is not None:
p['sq'].put ( {'op': 'data', 'data' : data} )
p['sent_time'] = time.time()
p['sent_data'] = data
p['state'] = 'busy'
elif p['state'] == 'busy':
if (time.time() - p['sent_time']) > self.no_response_time_sec:
                            print ( '%s does not respond, terminating it.' % (p['name']) )
self.onHostDataReturn ( p['sent_data'] )
p['sq'].put ( {'op': 'close'} )
p['process'].join()
self.processes.remove(p)
if all ([p['state'] == 'free' for p in self.processes]):
break
time.sleep(0.005)
except:
print ("Exception occured in Subprocessor.start(): %s" % (traceback.format_exc()) )
self.progress_bar.close()
for p in self.processes[:]:
p['sq'].put ( {'op': 'close'} )
while True:
for p in self.processes[:]:
while not p['cq'].empty():
obj = p['cq'].get()
obj_op = obj['op']
if obj_op == 'finalized':
p['state'] = 'finalized'
if all ([p['state'] == 'finalized' for p in self.processes]):
break
for p in self.processes[:]:
p['process'].terminate()
self.onHostProcessEnd()
return self.get_start_return()
def subprocess(self, sq, cq, client_dict):
self.print_lock = client_dict['print_lock']
try:
fail_message = self.onClientInitialize(client_dict)
except:
fail_message = 'Exception while initialization: %s' % (traceback.format_exc())
if fail_message is None:
cq.put ( {'op': 'init_ok'} )
else:
print (fail_message)
cq.put ( {'op': 'error', 'close': True} )
return
while True:
obj = sq.get()
obj_op = obj['op']
if obj_op == 'data':
data = obj['data']
try:
result = self.onClientProcessData (data)
cq.put ( {'op': 'success', 'data' : data, 'result' : result} )
except:
print ( 'Exception while process data [%s]: %s' % (self.onClientGetDataName(data), traceback.format_exc()) )
cq.put ( {'op': 'error', 'close': True, 'data' : data } )
elif obj_op == 'close':
break
time.sleep(0.005)
self.onClientFinalize()
cq.put ( {'op': 'finalized'} )
while True:
time.sleep(0.1)

264
utils/image_utils.py Normal file

@@ -0,0 +1,264 @@
import sys
from utils import random_utils
import numpy as np
import cv2
import localization
from scipy.spatial import Delaunay
from PIL import Image, ImageDraw, ImageFont
def channel_hist_match(source, template, mask=None):
# Code borrowed from:
# https://stackoverflow.com/questions/32655686/histogram-matching-of-two-images-in-python-2-x
masked_source = source
masked_template = template
if mask is not None:
masked_source = source * mask
masked_template = template * mask
oldshape = source.shape
source = source.ravel()
template = template.ravel()
masked_source = masked_source.ravel()
masked_template = masked_template.ravel()
s_values, bin_idx, s_counts = np.unique(source, return_inverse=True,
return_counts=True)
t_values, t_counts = np.unique(template, return_counts=True)
    ms_values, mbin_idx, ms_counts = np.unique(masked_source, return_inverse=True,
                                               return_counts=True)
    mt_values, mt_counts = np.unique(masked_template, return_counts=True)
s_quantiles = np.cumsum(s_counts).astype(np.float64)
s_quantiles /= s_quantiles[-1]
t_quantiles = np.cumsum(t_counts).astype(np.float64)
t_quantiles /= t_quantiles[-1]
interp_t_values = np.interp(s_quantiles, t_quantiles, t_values)
return interp_t_values[bin_idx].reshape(oldshape)
def color_hist_match(src_im, tar_im, mask=None):
h,w,c = src_im.shape
    # images are BGR here, so channel 0 is blue, not red
    matched_B = channel_hist_match(src_im[:,:,0], tar_im[:,:,0], mask)
    matched_G = channel_hist_match(src_im[:,:,1], tar_im[:,:,1], mask)
    matched_R = channel_hist_match(src_im[:,:,2], tar_im[:,:,2], mask)
    to_stack = (matched_B, matched_G, matched_R)
for i in range(3, c):
to_stack += ( src_im[:,:,i],)
matched = np.stack(to_stack, axis=-1).astype(src_im.dtype)
return matched
pil_fonts = {}
def _get_pil_font (font, size):
global pil_fonts
try:
font_str_id = '%s_%d' % (font, size)
if font_str_id not in pil_fonts.keys():
pil_fonts[font_str_id] = ImageFont.truetype(font + ".ttf", size=size, encoding="unic")
pil_font = pil_fonts[font_str_id]
return pil_font
except:
return ImageFont.load_default()
def get_text_image( shape, text, color=(1,1,1), border=0.2, font=None):
try:
size = shape[1]
pil_font = _get_pil_font( localization.get_default_ttf_font_name() , size)
text_width, text_height = pil_font.getsize(text)
canvas = Image.new('RGB', shape[0:2], (0,0,0) )
draw = ImageDraw.Draw(canvas)
offset = ( 0, 0)
draw.text(offset, text, font=pil_font, fill=tuple((np.array(color)*255).astype(np.int)) )
result = np.asarray(canvas) / 255
if shape[2] != 3:
result = np.concatenate ( (result, np.ones ( (shape[1],) + (shape[0],) + (shape[2]-3,)) ), axis=2 )
return result
except:
return np.zeros ( (shape[1], shape[0], shape[2]), dtype=np.float32 )
def draw_text( image, rect, text, color=(1,1,1), border=0.2, font=None):
h,w,c = image.shape
l,t,r,b = rect
l = np.clip (l, 0, w-1)
r = np.clip (r, 0, w-1)
t = np.clip (t, 0, h-1)
b = np.clip (b, 0, h-1)
image[t:b, l:r] += get_text_image ( (r-l,b-t,c) , text, color, border, font )
def draw_text_lines (image, rect, text_lines, color=(1,1,1), border=0.2, font=None):
text_lines_len = len(text_lines)
if text_lines_len == 0:
return
l,t,r,b = rect
h = b-t
h_per_line = h // text_lines_len
for i in range(0, text_lines_len):
draw_text (image, (l, i*h_per_line, r, (i+1)*h_per_line), text_lines[i], color, border, font)
def get_draw_text_lines ( image, rect, text_lines, color=(1,1,1), border=0.2, font=None):
image = np.zeros ( image.shape, dtype=np.float )
draw_text_lines ( image, rect, text_lines, color, border, font)
return image
def draw_polygon (image, points, color, thickness = 1):
points_len = len(points)
for i in range (0, points_len):
p0 = tuple( points[i] )
p1 = tuple( points[ (i+1) % points_len] )
cv2.line (image, p0, p1, color, thickness=thickness)
def draw_rect(image, rect, color, thickness=1):
l,t,r,b = rect
draw_polygon (image, [ (l,t), (r,t), (r,b), (l,b ) ], color, thickness)
def rectContains(rect, point) :
return not (point[0] < rect[0] or point[0] >= rect[2] or point[1] < rect[1] or point[1] >= rect[3])
def applyAffineTransform(src, srcTri, dstTri, size) :
warpMat = cv2.getAffineTransform( np.float32(srcTri), np.float32(dstTri) )
return cv2.warpAffine( src, warpMat, (size[0], size[1]), None, flags=cv2.INTER_LINEAR, borderMode=cv2.BORDER_REFLECT_101 )
def morphTriangle(dst_img, src_img, st, dt) :
(h,w,c) = dst_img.shape
sr = np.array( cv2.boundingRect(np.float32(st)) )
dr = np.array( cv2.boundingRect(np.float32(dt)) )
sRect = st - sr[0:2]
dRect = dt - dr[0:2]
d_mask = np.zeros((dr[3], dr[2], c), dtype = np.float32)
    cv2.fillConvexPoly(d_mask, np.int32(dRect), (1.0,)*c, 8, 0)
imgRect = src_img[sr[1]:sr[1] + sr[3], sr[0]:sr[0] + sr[2]]
size = (dr[2], dr[3])
warpImage1 = applyAffineTransform(imgRect, sRect, dRect, size)
dst_img[dr[1]:dr[1]+dr[3], dr[0]:dr[0]+dr[2]] = dst_img[dr[1]:dr[1]+dr[3], dr[0]:dr[0]+dr[2]]*(1-d_mask) + warpImage1 * d_mask
def morph_by_points (image, sp, dp):
if sp.shape != dp.shape:
raise ValueError ('morph_by_points() sp.shape != dp.shape')
(h,w,c) = image.shape
result_image = np.zeros(image.shape, dtype = image.dtype)
for tri in Delaunay(dp).simplices:
morphTriangle(result_image, image, sp[tri], dp[tri])
return result_image
def equalize_and_stack_square (images, axis=1):
max_c = max ([ 1 if len(image.shape) == 2 else image.shape[2] for image in images ] )
target_wh = 99999
for i,image in enumerate(images):
if len(image.shape) == 2:
h,w = image.shape
c = 1
else:
h,w,c = image.shape
if h < target_wh:
target_wh = h
if w < target_wh:
target_wh = w
for i,image in enumerate(images):
if len(image.shape) == 2:
h,w = image.shape
c = 1
else:
h,w,c = image.shape
if c < max_c:
if c == 1:
if len(image.shape) == 2:
image = np.expand_dims ( image, -1 )
image = np.concatenate ( (image,)*max_c, -1 )
elif c == 2: #GA
image = np.expand_dims ( image[...,0], -1 )
image = np.concatenate ( (image,)*max_c, -1 )
else:
image = np.concatenate ( (image, np.ones((h,w,max_c - c))), -1 )
if h != target_wh or w != target_wh:
image = cv2.resize ( image, (target_wh, target_wh) )
h,w,c = image.shape
images[i] = image
return np.concatenate ( images, axis = 1 )
def bgr2hsv (img):
return cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
def hsv2bgr (img):
return cv2.cvtColor(img, cv2.COLOR_HSV2BGR)
def bgra2hsva (img):
return np.concatenate ( (cv2.cvtColor(img[...,0:3], cv2.COLOR_BGR2HSV ), np.expand_dims (img[...,3], -1)), -1 )
def bgra2hsva_list (imgs):
return [ bgra2hsva(img) for img in imgs ]
def hsva2bgra (img):
return np.concatenate ( (cv2.cvtColor(img[...,0:3], cv2.COLOR_HSV2BGR ), np.expand_dims (img[...,3], -1)), -1 )
def hsva2bgra_list (imgs):
return [ hsva2bgra(img) for img in imgs ]
def gen_warp_params (source, flip, rotation_range=[-10,10], scale_range=[-0.5, 0.5], tx_range=[-0.05, 0.05], ty_range=[-0.05, 0.05] ):
h,w,c = source.shape
if (h != w) or (w != 64 and w != 128 and w != 256 and w != 512 and w != 1024):
raise ValueError ('TrainingDataGenerator accepts only square power of 2 images.')
rotation = np.random.uniform( rotation_range[0], rotation_range[1] )
scale = np.random.uniform(1 +scale_range[0], 1 +scale_range[1])
tx = np.random.uniform( tx_range[0], tx_range[1] )
ty = np.random.uniform( ty_range[0], ty_range[1] )
#random warp by grid
cell_size = [ w // (2**i) for i in range(1,4) ] [ np.random.randint(3) ]
cell_count = w // cell_size + 1
grid_points = np.linspace( 0, w, cell_count)
mapx = np.broadcast_to(grid_points, (cell_count, cell_count)).copy()
mapy = mapx.T
mapx[1:-1,1:-1] = mapx[1:-1,1:-1] + random_utils.random_normal( size=(cell_count-2, cell_count-2) )*(cell_size*0.24)
mapy[1:-1,1:-1] = mapy[1:-1,1:-1] + random_utils.random_normal( size=(cell_count-2, cell_count-2) )*(cell_size*0.24)
half_cell_size = cell_size // 2
mapx = cv2.resize(mapx, (w+cell_size,)*2 )[half_cell_size:-half_cell_size-1,half_cell_size:-half_cell_size-1].astype(np.float32)
mapy = cv2.resize(mapy, (w+cell_size,)*2 )[half_cell_size:-half_cell_size-1,half_cell_size:-half_cell_size-1].astype(np.float32)
#random transform
random_transform_mat = cv2.getRotationMatrix2D((w // 2, w // 2), rotation, scale)
random_transform_mat[:, 2] += (tx*w, ty*w)
params = dict()
params['mapx'] = mapx
params['mapy'] = mapy
params['rmat'] = random_transform_mat
params['w'] = w
params['flip'] = flip and np.random.randint(10) < 4
return params
def warp_by_params (params, img, warp, transform, flip):
if warp:
img = cv2.remap(img, params['mapx'], params['mapy'], cv2.INTER_LANCZOS4 )
if transform:
img = cv2.warpAffine( img, params['rmat'], (params['w'], params['w']), borderMode=cv2.BORDER_CONSTANT, flags=cv2.INTER_LANCZOS4 )
if flip and params['flip']:
img = img[:,::-1,:]
return img
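# typical flow: gen_warp_params() is called once per source image, then
# warp_by_params() is applied to each derived image so all outputs of one
# sample share the same random warp/transform/flip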

63
utils/iter_utils.py Normal file

@@ -0,0 +1,63 @@
import threading
import queue as Queue
import multiprocessing
import time
class ThisThreadGenerator(object):
def __init__(self, generator_func, user_param=None):
super().__init__()
self.generator_func = generator_func
self.user_param = user_param
self.initialized = False
def __iter__(self):
return self
def __next__(self):
if not self.initialized:
self.initialized = True
self.generator_func = self.generator_func(self.user_param)
return next(self.generator_func)
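# runs generator_func in a child process, keeping up to `prefetch` results
# queued; the consumer acknowledges each item via sc_queue so the child
# knows when to produce more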
class SubprocessGenerator(object):
def __init__(self, generator_func, user_param=None, prefetch=2):
super().__init__()
self.prefetch = prefetch
self.generator_func = generator_func
self.user_param = user_param
self.sc_queue = multiprocessing.Queue()
self.cs_queue = multiprocessing.Queue()
self.p = None
def process_func(self):
self.generator_func = self.generator_func(self.user_param)
while True:
while self.prefetch > -1:
try:
gen_data = next (self.generator_func)
except StopIteration:
self.cs_queue.put (None)
return
self.cs_queue.put (gen_data)
self.prefetch -= 1
self.sc_queue.get()
self.prefetch += 1
def __iter__(self):
return self
def __next__(self):
        if self.p is None:
self.p = multiprocessing.Process(target=self.process_func, args=())
self.p.daemon = True
self.p.start()
gen_data = self.cs_queue.get()
if gen_data is None:
self.p.terminate()
self.p.join()
raise StopIteration()
self.sc_queue.put (1)
return gen_data

18
utils/os_utils.py Normal file

@@ -0,0 +1,18 @@
import sys
if sys.platform[0:3] == 'win':
from ctypes import windll
from ctypes import wintypes
def set_process_lowest_prio():
if sys.platform[0:3] == 'win':
GetCurrentProcess = windll.kernel32.GetCurrentProcess
GetCurrentProcess.restype = wintypes.HANDLE
SetPriorityClass = windll.kernel32.SetPriorityClass
SetPriorityClass.argtypes = (wintypes.HANDLE, wintypes.DWORD)
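        # 0x00000040 == IDLE_PRIORITY_CLASS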
SetPriorityClass ( GetCurrentProcess(), 0x00000040 )
def set_process_dpi_aware():
if sys.platform[0:3] == 'win':
windll.user32.SetProcessDPIAware(True)

14
utils/random_utils.py Normal file

@@ -0,0 +1,14 @@
import numpy as np
def random_normal( size=(1,), trunc_val = 2.5 ):
    count = np.array(size).prod()
    result = np.empty ( (count,) , dtype=np.float32)
    for i in range (count):
while True:
x = np.random.normal()
if x >= -trunc_val and x <= trunc_val:
break
result[i] = (x / trunc_val)
return result.reshape ( size )

36
utils/std_utils.py Normal file

@@ -0,0 +1,36 @@
import os
import sys
class suppress_stdout_stderr(object):
def __enter__(self):
self.outnull_file = open(os.devnull, 'w')
self.errnull_file = open(os.devnull, 'w')
self.old_stdout_fileno_undup = sys.stdout.fileno()
self.old_stderr_fileno_undup = sys.stderr.fileno()
self.old_stdout_fileno = os.dup ( sys.stdout.fileno() )
self.old_stderr_fileno = os.dup ( sys.stderr.fileno() )
self.old_stdout = sys.stdout
self.old_stderr = sys.stderr
os.dup2 ( self.outnull_file.fileno(), self.old_stdout_fileno_undup )
os.dup2 ( self.errnull_file.fileno(), self.old_stderr_fileno_undup )
sys.stdout = self.outnull_file
sys.stderr = self.errnull_file
return self
def __exit__(self, *_):
sys.stdout = self.old_stdout
sys.stderr = self.old_stderr
os.dup2 ( self.old_stdout_fileno, self.old_stdout_fileno_undup )
os.dup2 ( self.old_stderr_fileno, self.old_stderr_fileno_undup )
os.close ( self.old_stdout_fileno )
os.close ( self.old_stderr_fileno )
self.outnull_file.close()
self.errnull_file.close()