Decision Trees Demystified


Decision trees were born roughly 70 years ago, and they have become one of the most powerful machine learning tools available today. Their main advantage is that they are a "white box" method: people can easily interpret the decisions they make, whereas neural networks are usually far too complex for that.

They can also be expressed as sets of if-then rules to improve readability. The purpose of this article is to introduce the basic theoretical foundations of decision tree learning and to present the ID3 (Iterative Dichotomiser 3) algorithm.

Decision trees are used for both classification and regression problems; here we will discuss classification.

What is a decision tree?

A decision tree looks like an inverted tree, with its root at the top. It classifies an instance by sorting it down the tree from the root to some leaf node (the decision), and that leaf node provides the classification of the instance. Each node in the tree specifies a test of some attribute of the instance, and each branch descending from that node corresponds to one of the possible values of that attribute.

[Figure: A learned decision tree for the concept PlayTennis]

An instance is classified by sorting it through the tree to the appropriate leaf node, then returning the classification associated with that leaf (Yes or No). This tree classifies days according to whether they are suitable for playing tennis.

For example, the instance (Outlook = Sunny, Temperature = Hot, Humidity = High, Wind = Strong) would be classified as a negative instance (that is, the tree predicts PlayTennis = No).
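To make the sorting process concrete, here is a minimal Python sketch (ours, not the article's) that represents the tree in the figure as nested dictionaries and classifies exactly this instance; the branch tests match the tree the article derives later:

```python
# A minimal sketch of the PlayTennis tree from the figure above.
# Inner nodes are dicts of the form {attribute: {value: subtree}};
# leaves are the classification strings "Yes" / "No".
tree = {
    "Outlook": {
        "Sunny": {"Humidity": {"High": "No", "Normal": "Yes"}},
        "Overcast": "Yes",
        "Rain": {"Wind": {"Strong": "No", "Weak": "Yes"}},
    }
}

def classify(node, instance):
    """Sort an instance down the tree from the root to a leaf."""
    while isinstance(node, dict):                 # inner node: test an attribute
        attribute, branches = next(iter(node.items()))
        node = branches[instance[attribute]]      # follow the matching branch
    return node                                   # leaf: the classification

instance = {"Outlook": "Sunny", "Temperature": "Hot",
            "Humidity": "High", "Wind": "Strong"}
print(classify(tree, instance))                   # -> No, i.e. PlayTennis = No
```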

What makes it special?

Unlike many other tools, it works intuitively, making one decision after another, the way a person would. It is also nonparametric, fast, and efficient.

So how is it built?

There are several algorithms for building a decision tree, some of which are:

CART (Classification and Regression Trees) → uses the Gini index (for classification) as its metric.

ID3 (Iterative Dichotomiser 3) → uses the entropy function and information gain as its metrics.
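For comparison, both impurity measures fit in a few lines of Python (a sketch of ours; the article itself focuses on entropy from here on):

```python
from math import log2

def gini(proportions):
    """Gini index (CART): 1 minus the sum of squared class proportions."""
    return 1.0 - sum(p * p for p in proportions)

def entropy(proportions):
    """Entropy (ID3): sum of -p * log2(p) over the classes present."""
    return sum(-p * log2(p) for p in proportions if p > 0)

print(gini([1.0, 0.0]), entropy([1.0, 0.0]))  # 0.0 0.0 -> a pure node
print(gini([0.5, 0.5]), entropy([0.5, 0.5]))  # 0.5 1.0 -> maximally mixed
```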

Here, we focus on the ID3 algorithm.

What is the ID3 algorithm?

ID3 begins with the question: "Which attribute should be tested at the root of the tree?" To answer it, each instance attribute is evaluated using a statistical test called information gain, which measures how well the attribute separates the training examples according to their target classification. The best attribute is selected and used as the test at the root node of the tree.

This process is repeated for each branch. In other words, we perform a greedy top-down search through the space of possible decision trees, in which the algorithm never backtracks to reconsider earlier choices.

What is entropy?

Entropy is a commonly used measure in information theory that characterizes the (im)purity of a collection of examples. If the target attribute can take on c different values, then the entropy of S relative to this c-wise classification is defined as:

Entropy(S) = Σ_{i=1}^{c} −p_i log2(p_i)

where p_i is the proportion of S belonging to class i. Note also that if the target attribute can take on c possible values, the entropy can be as large as log2(c).

An example with a Boolean target classification:

Suppose S is a collection of 14 examples, with 9 positive and 5 negative examples [9+, 5-] (c = 2). Then the entropy of S is:

Entropy([9+, 5-]) = −(9/14) log2(9/14) − (5/14) log2(5/14) = 0.940

Note that the entropy is 0 if all members of S belong to the same class. If all members are positive (p+ = 1), then p- = 0, and no new information can be gained: the receiver of a drawn example already knows it will be positive, so no message needs to be sent, and the entropy is 0. On the other hand, the entropy reaches its maximum value of 1 when p+ = p- = 0.5.
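These properties are easy to check in Python (a quick sketch; entropy_from_counts is our helper name, not the article's):

```python
from math import log2

def entropy_from_counts(counts):
    """Entropy of a collection, given the number of examples in each class."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

print(round(entropy_from_counts([9, 5]), 3))  # 0.94 -> the [9+, 5-] example above
print(entropy_from_counts([14, 0]))           # 0.0  -> all members in one class
print(entropy_from_counts([7, 7]))            # 1.0  -> p+ = p- = 0.5, the maximum
```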

[Figure: The entropy function for a Boolean classification, as the proportion p+ of positive examples varies between 0 and 1.]

What is information gain?

Information gain is the expected reduction in entropy caused by partitioning the examples according to a given attribute. More precisely, the information gain Gain(S, A) of an attribute A, relative to a collection of examples S, is defined as:

Gain(S, A) = Entropy(S) − Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v)

Gain(S, A) is the information provided about the value of the target function, given the value of some other attribute A. Here Values(A) is the set of all possible values of attribute A, and S_v is the subset of S for which A has value v.

To illustrate: suppose S is a collection of training-example days described by attributes including Wind, which can take the values Weak or Strong. S contains 9 positive and 5 negative examples, 14 in total, i.e. [9+, 5-]. Suppose 6 of the positive and 2 of the negative examples have Wind = Weak, and the remainder have Wind = Strong. The information gain obtained by sorting the original 14 examples by the attribute Wind can then be calculated as:

Values(Wind) = {Weak, Strong}
S = [9+, 5-], S_Weak = [6+, 2-], S_Strong = [3+, 3-]

Gain(S, Wind) = Entropy(S) − (8/14) Entropy(S_Weak) − (6/14) Entropy(S_Strong)
              = 0.940 − (8/14)(0.811) − (6/14)(1.00)
              = 0.048

Information gain is precisely the measure ID3 uses to select the best attribute at each step while growing the tree.

[Figure: Humidity provides greater information gain than Wind, relative to the target classification.]

Here E stands for entropy and S for the original collection of examples. Given the initial collection of 9 positive and 5 negative examples, [9+, 5-], sorting them by Humidity produces the collections [3+, 4-] (Humidity = High) and [6+, 1-] (Humidity = Normal). The information gained by this partitioning is 0.151, compared with a gain of only 0.048 for the attribute Wind.
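The same arithmetic in Python (a sketch of ours; the partition counts come straight from the text above):

```python
from math import log2

def entropy(counts):
    """Entropy from per-class example counts, e.g. [9, 5] for [9+, 5-]."""
    total = sum(counts)
    return sum(-(c / total) * log2(c / total) for c in counts if c > 0)

def information_gain(parent, partitions):
    """Entropy reduction from splitting `parent` into `partitions` (all counts)."""
    total = sum(parent)
    remainder = sum(sum(part) / total * entropy(part) for part in partitions)
    return entropy(parent) - remainder

# Wind splits [9+, 5-] into Weak = [6+, 2-] and Strong = [3+, 3-]:
print(round(information_gain([9, 5], [[6, 2], [3, 3]]), 3))  # 0.048
# Humidity splits [9+, 5-] into High = [3+, 4-] and Normal = [6+, 1-]:
print(round(information_gain([9, 5], [[3, 4], [6, 1]]), 3))  # 0.152 (truncated to .151 above)
```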

A worked example of the ID3 algorithm

In the example here, the target attribute is PlayTennis, whose values are Yes or No and which must be predicted from the attributes Outlook, Temperature, Humidity, and Wind.

[Table: Training examples for the target concept PlayTennis]

The first step of the algorithm is to select the attribute for the root node. ID3 determines the information gain of every attribute and chooses the one with the highest gain.

Gain(S, Outlook) = 0.246
Gain(S, Humidity) = 0.151
Gain(S, Wind) = 0.048
Gain(S, Temperature) = 0.029

Accordingly, Outlook is selected as the decision attribute for the root node, and branches are created below the root for each of its possible values (namely Sunny, Overcast, and Rain). Note that all of the Overcast examples are positive, so that node becomes a leaf node with target classification Yes. In contrast, for Outlook = Sunny and Outlook = Rain the entropy is nonzero, and the decision tree will be elaborated further below those nodes.
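The root selection can be reproduced in Python. This sketch assumes the standard PlayTennis training set (the table above); every count it produces matches the figures quoted in this article:

```python
from math import log2
from collections import Counter

ATTRS = ["Outlook", "Temperature", "Humidity", "Wind"]
# The 14 training days, as (Outlook, Temperature, Humidity, Wind, PlayTennis).
ROWS = [dict(zip(ATTRS + ["PlayTennis"], values)) for values in [
    ("Sunny", "Hot", "High", "Weak", "No"),          ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),      ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),       ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"), ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),      ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),    ("Rain", "Mild", "High", "Strong", "No"),
]]

def entropy(rows):
    """Entropy of the PlayTennis labels among `rows`."""
    total = len(rows)
    counts = Counter(r["PlayTennis"] for r in rows)
    return sum(-(n / total) * log2(n / total) for n in counts.values())

def gain(rows, attr):
    """Information gain from partitioning `rows` by attribute `attr`."""
    subsets = {}
    for r in rows:
        subsets.setdefault(r[attr], []).append(r)
    remainder = sum(len(s) / len(rows) * entropy(s) for s in subsets.values())
    return entropy(rows) - remainder

for attr in sorted(ATTRS, key=lambda a: -gain(ROWS, a)):
    print(attr, round(gain(ROWS, attr), 3))
# Outlook 0.247, Humidity 0.152, Wind 0.048, Temperature 0.029
# (truncated to .246 / .151 / .048 / .029 above) -> Outlook becomes the root.
```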

The process of selecting a new attribute and partitioning the training examples is now repeated for each non-terminal descendant node, this time using only the training examples associated with that node. Attributes that have already been incorporated higher in the tree are excluded, so any given attribute can appear at most once along any path through the tree. This process continues for each new leaf node until either of two conditions is met: (1) every attribute has already been included along this path through the tree, or (2) the entropy is zero.

[Figure: The partially learned decision tree resulting from the first step of ID3. The training examples are sorted to the corresponding descendant nodes.]
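Continuing the sketch above (same imports, ROWS, ATTRS, entropy, and gain; id3 is our name for the function), the full recursion with both stopping conditions looks roughly like this:

```python
def id3(rows, attrs):
    """Grow a decision tree top-down, greedily choosing the best attribute."""
    labels = [r["PlayTennis"] for r in rows]
    if len(set(labels)) == 1:       # condition (2): entropy is 0 -> pure leaf
        return labels[0]
    if not attrs:                   # condition (1): no attributes left on this path
        return Counter(labels).most_common(1)[0][0]   # majority-label leaf
    best = max(attrs, key=lambda a: gain(rows, a))    # highest information gain
    remaining = [a for a in attrs if a != best]       # each attribute used once per path
    return {best: {v: id3([r for r in rows if r[best] == v], remaining)
                   for v in sorted({r[best] for r in rows})}}

print(id3(ROWS, ATTRS))
# {'Outlook': {'Overcast': 'Yes',
#              'Rain': {'Wind': {'Strong': 'No', 'Weak': 'Yes'}},
#              'Sunny': {'Humidity': {'High': 'No', 'Normal': 'Yes'}}}}
```

The nested structure matches the figure: Overcast is already a pure leaf, while the Sunny and Rain branches are grown one more level.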

From this we can observe ID3's preferences (its inductive bias):

Shorter trees are preferred over longer trees. Trees that place high-information-gain attributes close to the root are preferred over those that do not.

Conclusion

The conclusions we arrive at are:

Learned decision trees can be expressed as sets of if-then rules to improve readability.

The information gain measure is the evaluation function that guides ID3's hill-climbing search through the space of decision trees.

ID3 infers a decision tree by growing it downward from the root, greedily selecting the next best attribute for each new decision branch it adds.

ID3's inductive bias includes a preference for smaller decision trees; it grows a tree only as large as needed to classify the available training examples.
