Many data scientists have trouble distinguishing between a classification neural network and a regression neural network. There are several reasons:
Currently the most popular method for encoding output classes is one-hot: each class corresponds to a vector in which only the bit for that class is turned on (set to 1), while all other bits are 0. A widely used implementation of this is TensorFlow's one_hot() function.
The output is a 4 x 4 square matrix, with each row representing one class. Suppose those classes have the following labels:
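As a minimal sketch of the encoding described above, the following plain-Python helper builds one-hot vectors for four classes. The name one_hot and its parameters indices and depth are chosen here for illustration only; this is not TensorFlow's implementation, just an assumed equivalent for small inputs.

```python
def one_hot(indices, depth):
    """Return one row per class index; each row has a single 1 at that index.

    Illustrative helper (hypothetical, not the TensorFlow function).
    """
    return [[1 if col == idx else 0 for col in range(depth)] for idx in indices]


# Four classes, indexed 0..3 (assumed for this example).
encoded = one_hot([0, 1, 2, 3], depth=4)
for row in encoded:
    print(row)
```

With four class indices and a depth of 4, each row contains exactly one 1, so the rows together form a 4 x 4 identity matrix.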
Local minima are also a controversial issue. A theoretical proof (Hornik 1989), made under strong assumptions, concluded that the neural network has no local minima, but this conclusion is contradicted by experiments on real models with finite sample sets. In this article we analyze the problem from a different direction and reach a definite conclusion: the neural network may have local minima, but they are not serious.
In the previous article I talked about the limit of mathematics. So what is that limit? It is the limit set by its axioms.
PRINCIPLE
"You cannot stay inside and prove the outside".
Mathematics is limited by its axioms; its scope is only a small special case within the general space of problems. For example, a vector space over ℝ must satisfy its 10 axioms. What would you do if, say, only 9 of them were satisfied? You cannot use mathematics to prove things on the outside that are not bound by those axioms. If you try to put every problem inside, you will be misled. We are entering the era of AI, of cognitive programs that simulate the activity of the human brain. Human awareness is very rich and cannot be calculated. The same holds for AI: the capacity of a neural network is not the same as that of a normal program.
LOOK AT THE PARAMETERS, DO NOT CONSIDER THE WEIGHTS
Instead of proving, we observe from the outside. We do not ask about the local minima of the error function in the weight space, because doing so already reduces the problem to the minima problem of a function. Instead, we consider the weights as the network parameters.
For simplicity, we consider a network that has only one input node, one output node, no hidden nodes, and no transfer function (often called an activation function). With only two weights, a_{1} and a_{0}, the output of the network is simply a linear function
y = a_{1}x + a_{0}
We also use only one sample (Xs, Ys).
For a_{0} = 0, we have y = a_{1}x
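As a small illustration of this setup, the sketch below evaluates the two-weight network and the special case a_{0} = 0. The function name network_output and the sample values for (Xs, Ys) are assumptions made for the example; they do not come from the article. With a_{0} = 0 the line always passes through the origin, so the only slope that reproduces the single sample is Ys / Xs (assuming Xs is not zero).

```python
def network_output(x, a1, a0=0.0):
    """Output of the one-input, one-output network: y = a1 * x + a0."""
    return a1 * x + a0


# Hypothetical single sample (Xs, Ys); the values are chosen only for illustration.
xs, ys = 2.0, 3.0

# With a0 = 0, the line passes through the origin and through the sample
# exactly when the slope is Ys / Xs (requires Xs != 0).
a1 = ys / xs

print(network_output(xs, a1))   # output at the sample input
print(network_output(0.0, a1))  # output at x = 0
```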
Building a neural network without a test set is a legitimate wish of neural network researchers, because the examples taken away to form the test set are examples the network never gets to learn. Earlier approaches in this direction were proposed by John Moody, David MacKay, and Vladimir Vapnik. However, those proposals have not produced a solution that we can use today.
So the question "Do we need a test set?" or "Does a test set have to exist at all?" is left open, and we will answer it here.
The test set problem becomes important when the example set is too small: giving up some rare examples for testing is a high price to pay in the learning quality of the network. So is there a way to avoid using a test set and still meet the requirements of the network? To answer this question, let us first see what the test set is for and what it checks. If we can achieve what the test demands without treating the test set as a mandatory element, then we do not have to pay that price.
OVERFITTING
Overfitting is the phenomenon in which a network fits the examples it has learned well but produces large errors on examples it has not learned. In other words, it is not capable of generalization.
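The definition above can be made concrete with a small numerical sketch. To keep it short and dependency-free, it uses a polynomial that interpolates every training point as a stand-in for an overfitted network; this is a substitute chosen for illustration, not a neural network. The data are hypothetical: the underlying relation is assumed to be y = x, and three of the five training labels carry a fixed perturbation.

```python
def lagrange_predict(train_x, train_y, x):
    """Evaluate the polynomial that passes through every training point."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(train_x, train_y)):
        basis = 1.0
        for m, xm in enumerate(train_x):
            if m != j:
                basis *= (x - xm) / (xj - xm)
        total += yj * basis
    return total


def mean_squared_error(train_x, train_y, eval_x, eval_y):
    """Mean squared error of the interpolating model on (eval_x, eval_y)."""
    errors = [(lagrange_predict(train_x, train_y, x) - y) ** 2
              for x, y in zip(eval_x, eval_y)]
    return sum(errors) / len(errors)


# Hypothetical data: true relation y = x, with perturbed labels at x = 1, 2, 3.
train_x = [0.0, 1.0, 2.0, 3.0, 4.0]
train_y = [0.0, 1.5, 1.5, 3.5, 4.0]

# Examples the model has not learned, taken from the assumed relation y = x.
unseen_x = [0.5, 1.5, 2.5, 3.5]
unseen_y = [0.5, 1.5, 2.5, 3.5]

train_mse = mean_squared_error(train_x, train_y, train_x, train_y)
unseen_mse = mean_squared_error(train_x, train_y, unseen_x, unseen_y)

print("error on learned examples:  ", train_mse)
print("error on unlearned examples:", unseen_mse)
```

The model reproduces the learned examples, perturbations included, but its error on the unlearned examples is much larger, which is the gap the definition describes.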