I would like to say that the AI era is just beginning and that there is plenty of room for creativity, especially for young people, for the two reasons below:
Are you a student? Could you create a technology that surpasses tech giants like Google? Entirely possible, because you have the power of youth.
Output data can be represented by a vector whose elements correspond to the classes. Suppose there are four classes: 0, 1, 2, 3. If you have the vector (0.05, 0.1, 0.2, 0.25), which class would you choose?
Let's see what the argmax() function of Google's TensorFlow selects.
It selects class 3, i.e. the class whose index is that of the largest element in the vector (0.25). In other words, it selects the fourth class, corresponding to the fourth node, which has the highest output value.
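As a minimal sketch of that selection rule in plain Python (this is not TensorFlow's own implementation, and the helper name `argmax` below is only illustrative), the index of the largest element can be found like this:

```python
def argmax(values):
    """Return the index of the largest element (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


output = [0.05, 0.1, 0.2, 0.25]  # the example vector from the text
chosen = argmax(output)
print(chosen)  # prints 3
```

The rule looks only at which element is largest. How close the other elements are to it plays no role.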
This method deserves a closer look.
Assume that, in a perfect network, the output of a node reflects the probability that the sample belongs to the corresponding class. Then the result is simply an event in which a sample belongs to a class, and the class with the greatest probability is chosen. This is only valid for one-hot encoding, where the classes are plain labels with no value attached. In vector-space terms, the classes are all equally far apart, so the encoding cannot describe different levels among the classes.
Random events are only the simplest starting point of probability. In general, a problem needs to handle random variables together with their values.
Imagine we play a dice game. There are 6 reward levels corresponding to the 6 sides of the die: level 1 pays $1, level 2 pays $2, and so on up to level 6, which pays $6, so the reward equals the number of spots you throw. Suppose you throw the die n times and side 1 turns out to have the highest probability of appearing, but side 6 is only slightly less likely. If you are awarded level 1 based on its probability alone, is that fair? A better reward level should therefore be calculated from the expected value.
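To make the dice argument concrete, here is a small sketch. The probabilities are hypothetical numbers chosen only for illustration (side 1 slightly more likely than side 6); they do not come from any real experiment.

```python
# Hypothetical probabilities for sides 1..6 (assumed for illustration only).
probabilities = [0.22, 0.12, 0.13, 0.15, 0.17, 0.21]
rewards = [1, 2, 3, 4, 5, 6]  # level k pays k dollars

# Rule 1: reward the single most likely side.
most_likely_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
most_likely_side = rewards[most_likely_index]

# Rule 2: reward the expected value over all sides.
expected_reward = sum(r * p for r, p in zip(rewards, probabilities))

print(most_likely_side)
print(round(expected_reward, 2))
```

Under these assumed numbers the most likely side pays only $1, while the expected reward is well above $3, which is the unfairness the example points at.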
In the output example above, suppose each class is assigned a value equal to its index. Since the outputs do not sum to 1, we divide by their sum, and the expected value is
(0 × 0.05 + 1 × 0.1 + 2 × 0.2 + 3 × 0.25) / (0.05 + 0.1 + 0.2 + 0.25) ≈ 2
We see that the expected value or the mean value is close to class 2, not class 3!
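The same calculation can be sketched in a few lines of Python. The function name `expected_class_value` is hypothetical, and the sketch assumes, as in the example, that each class is valued by its index and that the outputs are normalized by their sum.

```python
def expected_class_value(outputs, class_values=None):
    """Weighted mean of the class values, using the outputs as weights."""
    if class_values is None:
        class_values = range(len(outputs))  # assumption: value = class index
    total = sum(outputs)
    if total == 0:
        raise ValueError("outputs must not sum to zero")
    weighted = sum(v * p for v, p in zip(class_values, outputs))
    return weighted / total  # normalize, since the outputs need not sum to 1


outputs = [0.05, 0.1, 0.2, 0.25]
ev = expected_class_value(outputs)
nearest = round(ev)  # class whose value is closest to the expected value
print(round(ev, 2), nearest)  # prints 2.08 2
```

Rounding to the nearest index is only one possible way to turn the expected value back into a class. It makes sense only when the classes are ordered levels rather than unrelated labels.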
So the sample needs to be classified in another way: not necessarily by probability, because that is generally not guaranteed, but by the distance between vectors. Of course, this requires re-representing the class vectors.
See also: "Is it possible to do defuzzication a discrete fuzzy set?"
All the techniques introduced in this series have been applied in Football Predictions 4.0.