Neural network: Using genetic algorithms to train and deploy neural networks: Contraction mapping - Lipschitz constant setting


(0 comments)

In inequality

d (f (Pu), f (Pv)) ≤ qd (Pv, Pu)

the coefficient q is called the Lipschitz constant. It determines the convergence speed of the contraction mapping



If t = 0



Let q = 0.99 we have



That means that the length of evolution is less or equal to 100 times the length of the first step of the algorithm. So Lipschitz constant lets us visualize the number of evolutionary generations of the population.


Lipschitz constant setting

We need to set the Lipschitz constant to satisfy the inequality of the contraction mapping. We will prove that inequality is equivalent to


d (Pt+1, Pt+2) ≤ qd (Pt, Pt+1)    (*)

∀ t ∈ ℕ

Inequality (*) states that at all times, the distance between two successive generations is always less or equal to q the distance between two previous successive generations.
Without losing generality, suppose u <v, we have

d (f (Pu), f (Pv)) = d (Pu+1, Pv+1) = Eu+1 - Ev+1 = Eu+1 - Eu+2 + Eu+2 - Eu+3 + ... + Ev - Ev+1

= (Eu+1 - Eu+2) + (Eu+2 - Eu+3) + ... + (Ev - Ev+1) = d (Pu+1, Pu+2) + d (Pu+2, Pu+3) + ... + d (Pv, Pv+1)

According to inequality (*) we have

d (Pu+1, Pu+2) ≤ qd (Pu, Pu+1)
d (Pu+2, Pu+3) ≤ qd (Pu+1, Pu+2)
...
d (Pv, Pv+1) ≤ qd (Pv-1, Pv)

Therefore

d (Pu+1, Pv+1) ≤ qd (Pu, Pu+1) + qd (Pu+1, Pu+2) + ... + qd (Pv-1, Pv) = qd (Pu, Pv)

Thus we only need to satisfy the inequality (*) instead of the inequality of the contraction mapping.

This is simply that we hold the two average error values ​​Et-2, Et-1 of the previous two generations, along with the current average error value Et to evaluate the condition of the contraction mapping

for (;;) {
    
    // ...
    
    if (Et == Et_1)
        break;
    
    if ( Et_1 - Et > q * (Et_2 - Et_1) ) {
        
        undo();
        continue;
    }
    
    backup();
    
    Et_2 = Et_1;
    Et_1 = Et;
    
    // ...

}



The above theory clearly shows the effectivity in experiment, the algorithm converges very fast, like the contraction nature of the mapping.

Note: The population needs to move freely before actually entering the contraction mapping.

Currently unrated

Comments

There are currently no comments

New Comment

required

required (not published)

optional

required


What is 2 - 1?

required