# Documentation


Viewing posts for the category Omarine User's Manual

## Neural network: Using genetic algorithms to train and deploy neural networks: Auto AI - Automatically determine the number of training steps according to the contraction mapping principle

The number of steps needed to train a neural network follows from the convergence of the genetic algorithm, according to the contraction mapping principle. Finding this number does not require hyperparameter optimization over a constrained domain; it takes whatever value each problem demands. As in unsupervised learning, the number of training steps is neither monitored nor adjusted, because it exists only as a consequence of the genetic algorithm converging. Convergence occurs when the network population maps back to itself after a generation. This point is called the fixed point.

Metric space

During evolution, the algorithm creates a sequence of network populations. Let $P_t$ denote the population at generation $t$; the initial population is $P_0$. We define a metric space as follows:

Let $S$ be the set of populations. A metric $d$ is defined by

$$d(P_u, P_v) = |E_u - E_v|$$

where $E_u$ and $E_v$ are the average errors of the populations $P_u$ and $P_v$ at generations $u$ and $v$.
We design the population to always improve after each generation, i.e. $E_v < E_u$ for all $P_u, P_v \in S$ with $u < v$.

$(S, d)$ is a metric space because, for all $P_u, P_v, P_t \in S$:

1. $d(P_u, P_v) \ge 0$
2. $d(P_u, P_v) = 0 \Leftrightarrow P_u = P_v$
3. $d(P_u, P_v) = d(P_v, P_u)$
4. $d(P_u, P_v) \le d(P_u, P_t) + d(P_t, P_v)$

For the fourth property, assume without loss of generality that $u \le v$. There are three cases.

If $t \le u$:

$$d(P_u, P_v) = E_u - E_v$$

$$d(P_t, P_v) = E_t - E_v$$

Since $t \le u$, we have $E_t \ge E_u$, and therefore $E_t - E_v \ge E_u - E_v$, i.e. $d(P_u, P_v) \le d(P_t, P_v)$. Because $d(P_u, P_t) \ge 0$, it follows that $d(P_u, P_v) \le d(P_u, P_t) + d(P_t, P_v)$.

The case $t \ge v$ is similar.

If $u < t < v$:

$$d(P_u, P_t) = E_u - E_t$$

$$d(P_t, P_v) = E_t - E_v$$

Therefore

$$d(P_u, P_t) + d(P_t, P_v) = (E_u - E_t) + (E_t - E_v) = E_u - E_v = d(P_u, P_v)$$
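The metric and its triangle inequality can also be checked numerically. The following is only a minimal sketch: it assumes the average errors are stored in a plain array `errors`, with `errors[t]` holding the average error of generation t, and the function names `metric_d` and `triangle_inequality_holds` are hypothetical helpers, not part of any Omarine source.

```c
#include <assert.h>
#include <stddef.h>

/* d(P_u, P_v) = |E_u - E_v|, where errors[t] is the average error E_t.
 * The array layout is an assumption made for this illustration. */
double metric_d(const double *errors, size_t u, size_t v)
{
    double diff = errors[u] - errors[v];
    return diff < 0.0 ? -diff : diff;
}

/* Returns 1 if d(P_u, P_v) <= d(P_u, P_t) + d(P_t, P_v) holds for every
 * triple (u, t, v) of recorded generations, and 0 otherwise.
 * A small tolerance absorbs floating-point rounding. */
int triangle_inequality_holds(const double *errors, size_t n)
{
    assert(errors != NULL);
    for (size_t u = 0; u < n; u++) {
        for (size_t v = 0; v < n; v++) {
            for (size_t t = 0; t < n; t++) {
                double direct = metric_d(errors, u, v);
                double detour = metric_d(errors, u, t) + metric_d(errors, t, v);
                if (direct > detour + 1e-12)
                    return 0;
            }
        }
    }
    return 1;
}
```

For a strictly decreasing error sequence such as 0.9, 0.5, 0.3, 0.2, 0.15, the check passes for every triple, in line with the three cases of the proof.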

The population sequence $\{P_t\}$ is in one-to-one correspondence with the average error sequence $\{E_t\}$. The sequence $\{E_t\}$ is a monotonically decreasing sequence of real numbers bounded below (by 0), so it converges in $\mathbb{R}$. Correspondingly, the sequence $\{P_t\}$ converges in $S$.
So the metric space $(S, d)$ is a complete metric space.

Note that the convergent population sequence above is only an example illustrating the completeness of the space. It iterates indefinitely toward a local minimum. We need another convergent sequence, "much shorter" thanks to the contractive nature of the mapping, that goes to the unique fixed point, which is the global optimum.

Contraction mapping

Convergence occurs at the end of evolution. At this point the optimal schema has formed and the algorithm preserves it, so the genetic operations change the population less and less. We can therefore easily ensure that the population's evolution slows down.

For generations $u < v$, we have

$$E_v - E_{v+1} < E_u - E_{u+1}$$

or, equivalently,

$$E_{u+1} - E_{v+1} < E_u - E_v$$

Consider the mapping $f: S \to S$ defined by $f(P_t) = P_{t+1}$. This is the evolutionary mapping: it acts on the population of generation $t$ and produces the population of generation $t + 1$.

$$d(f(P_u), f(P_v)) = d(P_{u+1}, P_{v+1}) = E_{u+1} - E_{v+1} < E_u - E_v = d(P_u, P_v)$$

We can also set a coefficient $q \in [0, 1)$ such that, for all $P_u, P_v \in S$,

$$d(f(P_u), f(P_v)) \le q \, d(P_u, P_v)$$

So f is a contraction mapping.
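To make the coefficient q concrete, it can be estimated from a recorded run. The sketch below is an illustration only: it assumes a strictly decreasing array `errors` of average errors, and `estimate_contraction_coefficient` is a hypothetical helper name. It returns the largest observed ratio of the distance after one generation to the distance before, so a result below 1 is consistent with a contraction on the recorded generations.

```c
#include <assert.h>
#include <stddef.h>

/* Largest ratio d(f(P_u), f(P_v)) / d(P_u, P_v) over all u < v,
 * where errors[t] = E_t and f(P_t) = P_{t+1}.
 * Assumes a strictly decreasing sequence with at least 3 entries. */
double estimate_contraction_coefficient(const double *errors, size_t n)
{
    assert(errors != NULL && n >= 3);
    double q = 0.0;
    for (size_t u = 0; u + 2 < n; u++) {
        for (size_t v = u + 1; v + 1 < n; v++) {
            double before = errors[u] - errors[v];          /* d(P_u, P_v) */
            double after = errors[u + 1] - errors[v + 1];   /* d(P_u+1, P_v+1) */
            assert(before > 0.0);
            double ratio = after / before;
            if (ratio > q)
                q = ratio;
        }
    }
    return q;
}
```

For the error sequence 8, 4, 2, 1, 0.5, where the error halves every generation, every ratio equals 0.5. A sequence whose improvement accelerates, such as 1.0, 0.9, 0.5, yields a ratio above 1 and therefore does not satisfy the contraction condition.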

The algorithm will converge to the fixed point. We just need to check whether

$$f(P_{t^*-1}) = P_{t^*} = P_{t^*-1}$$

or, equivalently,

$$E_{t^*} - E_{t^*-1} = 0$$

and then stop.
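The stopping test above can be sketched in a few lines of C. This is a hedged illustration rather than the actual training loop: it scans an array `errors` of recorded average errors, and both the helper name `first_fixed_point_generation` and the tolerance parameter `tol` are assumptions. The text checks for an exact difference of zero, which corresponds to `tol = 0.0`; a small positive tolerance may be more practical with floating-point errors.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the first generation t* with |E_t* - E_(t*-1)| <= tol,
 * or n if no such generation exists in the recorded sequence.
 * errors[t] is assumed to hold the average error of generation t. */
size_t first_fixed_point_generation(const double *errors, size_t n, double tol)
{
    assert(errors != NULL && tol >= 0.0);
    for (size_t t = 1; t < n; t++) {
        double change = errors[t] - errors[t - 1];
        if (change < 0.0)
            change = -change;
        if (change <= tol)
            return t;   /* the population has mapped back to itself */
    }
    return n;           /* not yet converged */
}
```

In a real run the same comparison would be made once per generation inside the evolution loop, so the number of generations is discovered when the test first succeeds rather than fixed in advance.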

Interestingly, the stopping time of the algorithm determines the number of evolutionary generations, which differs greatly from problem to problem. The problems share only the common feature of convergence. Traditional optimization methods struggle when the parameter ranges over a wide interval. That is where genetic algorithms are superior.
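How strongly the generation count depends on the problem can be illustrated with the standard a priori estimate of the Banach fixed-point theorem, which is not stated in the text above: after t applications of a contraction with coefficient q, the distance to the fixed point is at most q^t / (1 - q) times d(P0, P1). The sketch below assumes q and d(P0, P1) are known or estimated; `generations_bound` and its parameters are hypothetical names.

```c
#include <assert.h>

/* Smallest t such that q^t / (1 - q) * d01 <= eps, where d01 = d(P_0, P_1).
 * This is an upper bound on the generations needed to come within eps
 * of the fixed point, assuming f is a contraction with coefficient q. */
unsigned generations_bound(double q, double d01, double eps)
{
    assert(q >= 0.0 && q < 1.0 && d01 >= 0.0 && eps > 0.0);
    double bound = d01 / (1.0 - q);   /* value of the estimate for t = 0 */
    unsigned t = 0;
    while (bound > eps) {
        bound *= q;                   /* one more generation shrinks it by q */
        t++;
    }
    return t;
}
```

With d(P0, P1) = 1 and eps = 0.01, a coefficient of q = 0.5 gives a bound of 8 generations, while q = 0.9 gives 66. The same stopping rule thus produces very different generation counts depending on how strongly the problem contracts.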

## Neural network: Using genetic algorithms to train and deploy neural networks: Multithreading

## Neural network: Using genetic algorithms to train and deploy neural networks: Embedding

Unlike ordinary data mining, embedding lies in the overlap between data mining and machine learning. It both mines knowledge to feed into machine learning and is trained by the network in how to mine the data. The mining acts like a pipe through which the flow of knowledge develops over time as training proceeds. This means the network has to learn the knowledge and also adjust the source of that knowledge.

Embedded work has two effects:

## Neural network: Using genetic algorithms to train and deploy neural networks: Selecting the class

I would like to say that the AI era is just beginning, and that there is plenty of room for you to be creative, especially young people. This is for the two reasons below:

## Neural network: Using genetic algorithms to train and deploy neural networks: The probability distributions

The uniform distribution is fundamental and is used in most cases, for example to assign crossover rates and mutation rates in genetic algorithms. The distribution can be applied directly like that, or serve as the basis for a subsequent process, such as traversing a tree starting from the root. One specific application is in embedding techniques for neural networks.
Besides that, the standard normal distribution, especially the truncated standard normal distribution, is useful for initializing neural networks.
We will study each of them in turn, with full source code.

Uniform distribution

The uniform distribution is the most common: every value within the range is equally likely. Fortunately, many functions are already available to generate this distribution, including erand48(). This function returns nonnegative double-precision floating-point values uniformly distributed over the interval [0.0, 1.0). Transforming to an arbitrary range is very simple: for example, mapping each returned value x to 2x - 1 gives a uniform distribution within [-1.0, 1.0).
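The original listing for this step is not available here, so the following is only a minimal sketch of the range transformation. It uses the POSIX function `erand48()`, which takes a caller-supplied 48-bit state of three `unsigned short` values; the helper name `uniform_in_range` and the seed values in the usage note are assumptions chosen for illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600   /* asks the C library to declare erand48() */
#endif
#include <assert.h>
#include <stdlib.h>

/* Uniform value in [lo, hi), obtained by scaling and shifting the
 * [0.0, 1.0) output of erand48(). The state array is updated in place. */
double uniform_in_range(unsigned short state[3], double lo, double hi)
{
    assert(lo < hi);
    return lo + (hi - lo) * erand48(state);
}
```

Calling `uniform_in_range(state, -1.0, 1.0)` with, for example, `unsigned short state[3] = {1, 2, 3};` produces values in [-1.0, 1.0). The same call with bounds 0.0 and 1.0 can be compared against a crossover or mutation rate to decide whether the operation is applied.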
