[null,null,["最后更新时间 (UTC):2024-08-16。"],[[["\u003cp\u003eThis document explores multi-class classification models, which predict from multiple possibilities instead of just two, like binary classification models.\u003c/p\u003e\n"],["\u003cp\u003eMulti-class classification can be achieved through two main approaches: one-vs.-all and one-vs.-one (softmax).\u003c/p\u003e\n"],["\u003cp\u003eOne-vs.-all uses multiple binary classifiers, one for each possible outcome, to determine the probability of each class independently.\u003c/p\u003e\n"],["\u003cp\u003eOne-vs.-one (softmax) predicts probabilities of each class relative to all other classes, ensuring all probabilities sum to 1 using the softmax function.\u003c/p\u003e\n"],["\u003cp\u003eSoftmax is efficient for fewer classes but can become computationally expensive with many classes; candidate sampling offers an alternative for increased efficiency.\u003c/p\u003e\n"]]],[],null,["Earlier, you encountered\n[**binary classification**](/machine-learning/glossary#binary-classification)\nmodels that could pick between one of *two* possible choices, such as whether:\n\n- A given email is spam or not spam.\n- A given tumor is malignant or benign.\n\nIn this section, we'll investigate\n[**multi-class classification**](/machine-learning/glossary#multi-class)\nmodels, which can pick from *multiple* possibilities. For example:\n\n- Is this dog a beagle, a basset hound, or a bloodhound?\n- Is this flower a Siberian Iris, Dutch Iris, Blue Flag Iris, or Dwarf Bearded Iris?\n- Is that plane a Boeing 747, Airbus 320, Boeing 777, or Embraer 190?\n- Is this an image of an apple, bear, candy, dog, or egg?\n\nSome real-world multi-class problems entail choosing from *millions*\nof separate classes. For example, consider a multi-class classification\nmodel that can identify the image of just about anything.\n\nThis section details the two main variants of multi-class classification:\n\n- [**one-vs.-all**](/machine-learning/glossary#one-vs.-all)\n- **one-vs.-one** , which is usually known as [**softmax**](/machine-learning/glossary#softmax)\n\nOne versus all\n\n*One-vs.-all* provides a way to use binary classification\nfor a series of yes or no predictions across multiple possible labels.\n\nGiven a classification problem with N possible solutions, a one-vs.-all\nsolution consists of N separate binary classifiers---one binary\nclassifier for each possible outcome. During training, the model runs\nthrough a sequence of binary classifiers, training each to answer a separate\nclassification question.\n\nFor example, given a picture of a piece of fruit, four\ndifferent recognizers might be trained, each answering a different yes/no\nquestion:\n\n1. Is this image an apple?\n2. Is this image an orange?\n3. Is this image a banana?\n4. Is this image a grape?\n\nThe following image illustrates how this works in practice.\n**Figure 7. An image of a pear being passed as input to four different\nbinary classifiers. The first, second, and fourth models (predicting\nwhether or not the image is an apple, orange, or grape, respectively)\npredict the negative class. 
## One versus one (softmax)

You may have noticed that the probability values in the output layer of Figure 8 don't sum to 1.0 (or 100%). (In fact, they sum to 1.43.) In a one-vs.-all approach, the probability of each binary set of outcomes is determined independently of all the other sets. That is, we're determining the probability of "apple" versus "not apple" without considering the likelihood of our other fruit options: "orange", "pear", or "grape".

But what if we want to predict the probabilities of each fruit relative to each other? In this case, instead of predicting "apple" versus "not apple", we want to predict "apple" versus "orange" versus "pear" versus "grape". This type of multi-class classification is called *one-vs.-one classification*.

We can implement one-vs.-one classification using the same type of neural network architecture used for one-vs.-all classification, with one key change: we need to apply a different transform to the output layer.

For one-vs.-all, we applied the sigmoid activation function to each output node independently, which resulted in an output value between 0 and 1 for each node, but did not guarantee that these values summed to exactly 1.

For one-vs.-one, we can instead apply a function called *softmax*, which assigns decimal probabilities to each class in a multi-class problem such that all probabilities add up to 1.0. This additional constraint helps training converge more quickly than it otherwise would.

The softmax equation is as follows:

$$p(y = j|\textbf{x}) = \frac{e^{(\textbf{w}_j^{T}\textbf{x} + b_j)}}{\sum_{k \in K} e^{(\textbf{w}_k^{T}\textbf{x} + b_k)}}$$

Note that this formula essentially extends the formula for logistic regression to multiple classes.

The following image re-implements our one-vs.-all multi-class classification task as a one-vs.-one task. Note that in order to perform softmax, the hidden layer directly preceding the output layer (called the softmax layer) must have the same number of nodes as the output layer.

**Figure 9. Neural net implementation of one-vs.-one classification, using a softmax layer. Each output value represents the probability that the input image is the specified fruit and not any of the other three fruits (all probabilities sum to 1.0). This model predicts that there is a 63% chance that the image is a pear.**
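To see the difference numerically, here is a small sketch. The logit values are made up to roughly mirror Figure 8; none of these numbers come from an actual trained model. Independent sigmoids can sum to anything (here, about 1.43, like Figure 8), while softmax couples the outputs so they sum to exactly 1.0:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax(logits):
    """Numerically stable softmax: shift, exponentiate, normalize to sum to 1."""
    exps = np.exp(logits - np.max(logits))   # shifting guards against overflow
    return exps / exps.sum()

# Hypothetical output-layer logits for apple, orange, pear, grape.
logits = np.array([-0.4, -2.0, 1.7, -2.6])

# One-vs.-all: each sigmoid is computed independently, so the
# "probabilities" need not sum to 1.
print(sigmoid(logits))        # -> approximately [0.40, 0.12, 0.85, 0.07]
print(sigmoid(logits).sum())  # -> approximately 1.43

# One-vs.-one: softmax renormalizes across all classes at once.
print(softmax(logits))        # relative probabilities across the four fruits
print(softmax(logits).sum())  # -> 1.0
```

Note that a real one-vs.-one model is trained with softmax in the loop, not by renormalizing sigmoid outputs after the fact, which is why Figure 9's probabilities are not a simple rescaling of Figure 8's.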
## Softmax options

Consider the following variants of softmax:

- **Full softmax** is the softmax we've been discussing; that is, softmax calculates a probability for every possible class.

- **Candidate sampling** means that softmax calculates a probability for all the positive labels but only for a random sample of the negative labels. For example, if we are interested in determining whether an input image is a beagle or a bloodhound, we don't have to provide probabilities for every non-doggy example.

Full softmax is fairly cheap when the number of classes is small but becomes prohibitively expensive when the number of classes climbs. Candidate sampling can improve efficiency in problems having a large number of classes.

## One label versus many labels

Softmax assumes that each example is a member of exactly one class. Some examples, however, can simultaneously be members of multiple classes. For such examples:

- You may not use softmax.
- You must rely on multiple logistic regressions.

For example, the one-vs.-one model in Figure 9 above assumes that each input image will depict exactly one type of fruit: an apple, an orange, a pear, or a grape. However, if an input image might contain multiple types of fruit (a bowl of both apples and oranges, say), you'll have to use multiple logistic regressions instead.

> **Key terms:**
>
> - [Binary classification](/machine-learning/glossary#binary-classification)
> - [Multi-class classification](/machine-learning/glossary#multi-class-classification)
> - [One-vs.-all classification](/machine-learning/glossary#one-vs.-all)
> - [Softmax (one-vs.-one classification)](/machine-learning/glossary#softmax)