Neural network: Using genetic algorithms to train and deploy neural networks: Data mining with id3


We already know that although id3 is a legacy algorithm with limited capabilities, it has a strong theoretical basis and is still the No. 1 candidate for distinguishing important features. So why not use it to remove irrelevant features?


Regularization is a way of transforming a problem into a basic form that can be solved by known methods; it can be applied when certain assumptions are satisfied. An example is the linearization of data in linear regression.
In machine learning (deep learning) there is no specific way to do that. Instead, it is desirable to remove less relevant or irrelevant features to simplify the problem, hence the name of the method. A typical example is Google's "L1 Regularization" method, which is detailed in the article "Neural network: Using genetic algorithms to train and deploy neural networks: Need the test set?", so I won't repeat it here. One thing is immediately apparent: the "L1 Regularization" method eliminates the less relevant features only after they have been put into the network. What do you think about this? Put noise into the network and then find a way to remove it! How many features are less relevant? Which ones?

If there are indeed less relevant features, then id3 is a great way to remove them. This is done in the data mining step, i.e. before the data is put into the network for training. You can identify these less relevant features either programmatically or with the tool fpp. Removed features will not be present in id3's output.
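To make the idea concrete, here is a minimal sketch of identifying removable features programmatically. It is not the fpp tool and not necessarily how fpp works; it is a plain textbook id3 (information gain, categorical features) written for illustration. The toy data set and the names `build_tree`, `used_features`, `outlook`, `windy`, `noise` and `play` are all assumptions made up for this example. A feature that never appears in the resulting tree is treated as a candidate for removal.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    """Shannon entropy of the target column over the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, feature, target):
    """Entropy reduction obtained by splitting the rows on one feature."""
    total = len(rows)
    remainder = 0.0
    for value in {row[feature] for row in rows}:
        subset = [row for row in rows if row[feature] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder


def build_tree(rows, features, target):
    """Return a class label (leaf) or a nested dict {feature: {value: subtree}}."""
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:          # pure node: stop splitting
        return labels[0]
    if not features:                   # nothing left to split on: majority class
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, f, target))
    rest = [f for f in features if f != best]
    branches = {}
    for value in sorted({row[best] for row in rows}):
        subset = [row for row in rows if row[best] == value]
        branches[value] = build_tree(subset, rest, target)
    return {best: branches}


def used_features(tree):
    """Collect every feature that appears as a split somewhere in the tree."""
    if not isinstance(tree, dict):
        return set()
    (feature, branches), = tree.items()
    found = {feature}
    for subtree in branches.values():
        found |= used_features(subtree)
    return found


# Hypothetical toy data: "play" depends on "outlook" and "windy" only;
# "noise" is deliberately irrelevant.
features = ["outlook", "windy", "noise"]
rows = [
    {"outlook": o, "windy": w, "noise": n,
     "play": "yes" if (o == "sunny" and w == "no") else "no"}
    for o in ("sunny", "rainy")
    for w in ("yes", "no")
    for n in ("a", "b")
]

tree = build_tree(rows, features, "play")
kept = used_features(tree)
removed = [f for f in features if f not in kept]
print("kept:", sorted(kept))
print("removed:", removed)
```

On this toy data the `noise` column never becomes a split, so it would be dropped before the data reaches the network. On real data, a feature can also be missing from the tree simply because another, correlated feature was chosen first, so the result is best read as a suggestion rather than a proof of irrelevance.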


id3 can also be used in the following cases:

1) Creating a test set
Use id3 to simulate a test set.

2) Finding the most important feature
The most important feature will be the feature at the root of the decision tree.

3) Checking the data
If the decision tree built from the training data has many "No Data" leaves, this confirms that the training data lacks samples.
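Cases 2) and 3) can be sketched in a few lines. This is an illustrative textbook id3, not the output format of any particular tool: the `domains` dictionary (the declared list of possible values per feature), the toy rows and the function names are assumptions made up for the example. A branch whose declared value has no training rows becomes a "No Data" leaf.

```python
from collections import Counter
from math import log2


def entropy(rows, target):
    """Shannon entropy of the target column over the given rows."""
    counts = Counter(row[target] for row in rows)
    total = len(rows)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, feature, target):
    """Entropy reduction obtained by splitting the rows on one feature."""
    total = len(rows)
    remainder = 0.0
    for value in {row[feature] for row in rows}:
        subset = [row for row in rows if row[feature] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(rows, target) - remainder


def build_tree(rows, features, target, domains):
    """id3 over declared value domains; empty branches become "No Data" leaves."""
    if not rows:
        return "No Data"
    labels = [row[target] for row in rows]
    if len(set(labels)) == 1:
        return labels[0]
    if not features:
        return Counter(labels).most_common(1)[0][0]
    best = max(features, key=lambda f: information_gain(rows, f, target))
    rest = [f for f in features if f != best]
    return {best: {
        value: build_tree([r for r in rows if r[best] == value],
                          rest, target, domains)
        for value in domains[best]
    }}


def count_no_data(tree):
    """Count the leaves labelled "No Data" in a nested-dict tree."""
    if not isinstance(tree, dict):
        return int(tree == "No Data")
    (_, branches), = tree.items()
    return sum(count_no_data(subtree) for subtree in branches.values())


# Hypothetical declared domains: "overcast" is possible but never sampled.
domains = {"outlook": ["sunny", "overcast", "rainy"], "windy": ["yes", "no"]}
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
]

tree = build_tree(rows, list(domains), "play", domains)
root = next(iter(tree))          # case 2: the feature at the root
no_data = count_no_data(tree)    # case 3: branches with no samples
print("root feature:", root)
print("No Data leaves:", no_data)
```

Here `outlook` ends up at the root because it has the highest information gain, and the single "No Data" leaf points at the `overcast` value, for which the training data has no samples.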
