fix bug with samples that were not clipped after tanh-untanh transformations, upd README.md

iperov 2019-02-10 10:45:51 +04:00
parent 51a13c90d1
commit 854ab11de3
3 changed files with 16 additions and 8 deletions
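The core of the fix: sample images are mapped into the tanh range with x * 2.0 - 1.0 and mapped back with x / 2.0 + 0.5, but warping and interpolation can leave values slightly outside the valid range, so both directions now clamp explicitly. A minimal sketch of the round trip in numpy (the helper names are illustrative; the project's actual lines are in the diffs below):

    import numpy as np

    def to_tanh_range(img):
        # [0, 1] -> [-1, 1]; clamp so overshoot from warping/interpolation cannot leak through
        return np.clip(img * 2.0 - 1.0, -1.0, 1.0)

    def from_tanh_range(img):
        # [-1, 1] -> [0, 1]; clamped for the same reason
        return np.clip(img / 2.0 + 0.5, 0.0, 1.0)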

View file

@@ -106,15 +106,17 @@ Every model is good for specific scenes and faces.
H64 - good for straight faces as a demo and for low VRAM.
H128 - good for straight faces, gives the highest resolution and details possible in 2019. Absolute best for Asian faces, because they are flat, similar and evenly lit with clear skin.
H128 - good for straight faces, gives higher resolution and details.
DF - good for side faces, but results in lower resolution and detail. Covers more area of the cheeks. Keeps the face unmorphed. Good for similar face shapes.
LIAE - can partially fix dissimilar face shapes, but results in a less recognizable face.
SAE - new flexible model. Absolute best in 2019.
SAE tips:
- SAE - actually contains all other models, but better due to the multiscale decoder + pixel loss. Just set style powers to 0.0 to work as a default (H128/DF/LIAE) model.
- SAE - actually contains all other models, but better due to a smooth DSSIM-MSE (pixel loss) transition (see the sketch below). Just set style powers to 0.0 to work as a default (H128/DF/LIAE) model.
- if the src faceset has more faces than the dst faceset, the model may not converge. In this case, try the 'Feed faces to network sorted by yaw' option.
@@ -124,6 +126,8 @@ SAE tips:
- if you have a lot of VRAM, you can choose between a larger batch size, which affects the quality of generalization, and larger enc/dec dims, which affect image quality.
- common training algorithm for a styled face: set the initial face and bg style values to 10.0, train to 15k-20k epochs, then overwrite the settings, set face style to 0.1 and bg style to 4.0, and train until the result is clean.
- how to train an extremely obstructed face model with SAE? There is no single best solution for that; it all depends on the scene. Experiment with style values on your own during training. Enable 'write preview history' and track the changes. Back up model files every 10k epochs. You can revert model files and change values if something goes wrong.
Improperly matched dst landmarks may significantly reduce fake quality:
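The 'smooth DSSIM-MSE (pixel loss) transition' mentioned in the SAE tip above can be pictured as a weight that slides from a structural DSSIM term toward a plain pixel (MSE) term as training progresses. A minimal sketch under that assumption; the schedule, names, and iteration count are illustrative, not the model's actual values:

    import numpy as np

    def blended_loss(dssim_term, mse_term, iteration, transition_iters=15000):
        # Weight slides linearly from pure DSSIM at iteration 0 to pure MSE after
        # transition_iters; both terms are assumed precomputed scalars per batch.
        t = np.clip(iteration / float(transition_iters), 0.0, 1.0)
        return (1.0 - t) * dssim_term + t * mse_term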

View file

@@ -345,10 +345,10 @@ class SAEModel(ModelBase):
        test_B_m = sample[1][2][0:4]
        if self.options['learn_mask']:
            S, D, SS, DD, SD, SDM = [ x / 2 + 0.5 for x in ([test_A,test_B] + self.AE_view ([test_A, test_B]) ) ]
            S, D, SS, DD, SD, SDM = [ np.clip(x / 2 + 0.5, 0.0, 1.0) for x in ([test_A,test_B] + self.AE_view ([test_A, test_B]) ) ]
            SDM, = [ np.repeat (x, (3,), -1) for x in [SDM] ]
        else:
            S, D, SS, DD, SD, = [ x / 2 + 0.5 for x in ([test_A,test_B] + self.AE_view ([test_A, test_B]) ) ]
            S, D, SS, DD, SD, = [ np.clip(x / 2 + 0.5, 0.0, 1.0) for x in ([test_A,test_B] + self.AE_view ([test_A, test_B]) ) ]
        st = []
        for i in range(0, len(test_A)):
@@ -360,7 +360,8 @@ class SAEModel(ModelBase):
        return [ ('SAE', np.concatenate (st, axis=0 )), ]
    def predictor_func (self, face):
        face_tanh = face * 2.0 - 1.0
        face_tanh = np.clip(face * 2.0 - 1.0, -1.0, 1.0)
        face_bgr = face_tanh[...,0:3]
        prd = [ (x[0] + 1.0) / 2.0 for x in self.AE_convert ( [ np.expand_dims(face_bgr,0) ] ) ]

View file

@@ -197,8 +197,11 @@ class SampleProcessor(object):
            else:
                raise ValueError ('expected SampleTypeFlags mode')
            if not debug and sample_process_options.normalize_tanh:
                img = img * 2.0 - 1.0
            if not debug:
                if sample_process_options.normalize_tanh:
                    img = np.clip (img * 2.0 - 1.0, -1.0, 1.0)
                else:
                    img = np.clip (img, 0.0, 1.0)
            outputs.append ( img )
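A quick standalone check of what the new SampleProcessor branch above does, assuming a warped sample can overshoot [0, 1] slightly:

    import numpy as np

    img = np.array([-0.02, 0.5, 1.03], dtype=np.float32)  # slight overshoot from warping

    # normalize_tanh enabled: map to [-1, 1] and clamp
    print(np.clip(img * 2.0 - 1.0, -1.0, 1.0))   # -> [-1.  0.  1.]

    # normalize_tanh disabled: stay in [0, 1] and clamp
    print(np.clip(img, 0.0, 1.0))                # -> [0.  0.5 1. ]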