[[["わかりやすい","easyToUnderstand","thumb-up"],["問題の解決に役立った","solvedMyProblem","thumb-up"],["その他","otherUp","thumb-up"]],[["必要な情報がない","missingTheInformationINeed","thumb-down"],["複雑すぎる / 手順が多すぎる","tooComplicatedTooManySteps","thumb-down"],["最新ではない","outOfDate","thumb-down"],["翻訳に関する問題","translationIssue","thumb-down"],["サンプル / コードに問題がある","samplesCodeIssue","thumb-down"],["その他","otherDown","thumb-down"]],["最終更新日 2025-07-27 UTC。"],[[["\u003cp\u003eL2 regularization is a technique used to reduce model complexity and prevent overfitting by penalizing large weights.\u003c/p\u003e\n"],["\u003cp\u003eA regularization rate (lambda) controls the strength of regularization, with higher values leading to simpler models and lower values increasing the risk of overfitting.\u003c/p\u003e\n"],["\u003cp\u003eEarly stopping is an alternative regularization method that involves ending training before the model fully converges to prevent overfitting.\u003c/p\u003e\n"],["\u003cp\u003eFinding the right balance between learning rate and regularization rate is crucial for optimal model performance, as they influence weights in opposite directions.\u003c/p\u003e\n"]]],[],null,["[**L~2~ regularization**](/machine-learning/glossary#l2-regularization)\nis a popular regularization metric, which uses the following formula: \n$$L_2\\\\text{ regularization } = {w_1\\^2 + w_2\\^2 + ... + w_n\\^2}$$\n\nFor example, the following table shows the calculation of L~2~\nregularization for a model with six weights:\n\n| | Value | Squared value |\n|------|-------|-------------------|\n| w~1~ | 0.2 | 0.04 |\n| w~2~ | -0.5 | 0.25 |\n| w~3~ | 5.0 | 25.0 |\n| w~4~ | -1.2 | 1.44 |\n| w~5~ | 0.3 | 0.09 |\n| w~6~ | -0.1 | 0.01 |\n| | | **26.83** = total |\n\nNotice that weights close to zero don't affect L~2~ regularization\nmuch, but large weights can have a huge impact. For example, in the\npreceding calculation:\n\n- A single weight (w~3~) contributes about 93% of the total complexity.\n- The other five weights collectively contribute only about 7% of the total complexity.\n\nL~2~ regularization encourages weights *toward* 0, but never pushes\nweights all the way to zero.\n\nExercises: Check your understanding \nIf you use L~2~ regularization while training a model, what will typically happen to the overall complexity of the model? \nThe overall complexity of the system will probably drop. \nSince L~2~ regularization encourages weights towards 0, the overall complexity will probably drop. \nThe overall complexity of the model will probably stay constant. \nThis is very unlikely. \nThe overall complexity of the model will probably increase. \nThis is unlikely. Remember that L~2~ regularization encourages weights towards 0. \nIf you use L~2~ regularization while training a model, some features will be removed from the model. \nTrue \nAlthough L~2~ regularization may make some weights very small, it will never push any weights all the way to zero. Consequently, all features will still contribute something to the model. 
## Exercises: Check your understanding

**If you use L~2~ regularization while training a model, what will typically happen to the overall complexity of the model?**

- **The overall complexity of the model will probably drop.** Correct: since L~2~ regularization encourages weights towards 0, the overall complexity will probably drop.
- **The overall complexity of the model will probably stay constant.** This is very unlikely.
- **The overall complexity of the model will probably increase.** This is unlikely. Remember that L~2~ regularization encourages weights towards 0.

**True or false: If you use L~2~ regularization while training a model, some features will be removed from the model.**

- **True.** Incorrect: although L~2~ regularization may make some weights very small, it never pushes any weights all the way to zero. Consequently, all features will still contribute something to the model.
- **False.** Correct: L~2~ regularization never pushes weights all the way to zero.

## Regularization rate (lambda)

As noted, training attempts to minimize some combination of loss and complexity:

$$\text{minimize(loss + complexity)}$$

Model developers tune the overall impact of complexity on model training by multiplying its value by a scalar called the [**regularization rate**](/machine-learning/glossary#regularization-rate). The Greek character lambda (λ) typically symbolizes the regularization rate.

That is, model developers aim to do the following:

$$\text{minimize(loss} + \lambda \text{ complexity)}$$

A high regularization rate:

- Strengthens the influence of regularization, thereby reducing the chances of overfitting.
- Tends to produce a histogram of model weights with the following characteristics:
  - a normal distribution
  - a mean weight of 0

A low regularization rate:

- Lowers the influence of regularization, thereby increasing the chances of overfitting.
- Tends to produce a histogram of model weights with a flat distribution.

For example, the histogram of model weights for a high regularization rate might look as shown in Figure 18.

**Figure 18.** Weight histogram for a high regularization rate. Mean is zero. Normal distribution.

In contrast, a low regularization rate tends to yield a flatter histogram, as shown in Figure 19.

**Figure 19.** Weight histogram for a low regularization rate. Mean may or may not be zero.

> **Note:** Setting the regularization rate to zero removes regularization completely. In this case, training focuses exclusively on minimizing loss, which poses the highest possible overfitting risk.
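To show how the regularization rate enters training, here is a minimal sketch, assuming a small linear model trained by plain gradient descent on synthetic data. The function name `gradient_step`, the generated data, and the chosen `learning_rate` and `lambda_` values are all illustrative assumptions, not part of the course.

```python
import numpy as np

def gradient_step(weights, features, labels, learning_rate, lambda_):
    """One gradient-descent update on: mean squared error + lambda * sum(w^2)."""
    predictions = features @ weights
    errors = predictions - labels
    # Gradient of the mean squared error (the "loss" term).
    loss_grad = 2.0 * features.T @ errors / len(labels)
    # Gradient of the L2 penalty (the "complexity" term): d/dw of lambda * w^2 is 2 * lambda * w.
    penalty_grad = 2.0 * lambda_ * weights
    return weights - learning_rate * (loss_grad + penalty_grad)

# Synthetic data whose true weights match the earlier table.
rng = np.random.default_rng(0)
features = rng.normal(size=(100, 6))
labels = features @ np.array([0.2, -0.5, 5.0, -1.2, 0.3, -0.1]) + rng.normal(scale=0.1, size=100)

weights = np.zeros(6)
for _ in range(1000):
    weights = gradient_step(weights, features, labels, learning_rate=0.05, lambda_=0.5)

# A higher lambda_ shrinks the learned weights toward zero (but never exactly to zero).
print(np.round(weights, 2))
```

Raising `lambda_` strengthens the pull toward zero in every update, which is the "high regularization rate" behavior described above; setting `lambda_` to zero reproduces plain, unregularized gradient descent.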
### Picking the regularization rate

The ideal regularization rate produces a model that generalizes well to new, previously unseen data. Unfortunately, that ideal value is data-dependent, so you must do some tuning.

## Early stopping: an alternative to complexity-based regularization

[**Early stopping**](/machine-learning/glossary#early-stopping) is a regularization method that doesn't involve a calculation of complexity. Instead, early stopping simply means ending training before the model fully converges. For example, you end training when the loss curve for the validation set starts to increase (slope becomes positive).

Although early stopping usually increases training loss, it can decrease test loss.

Early stopping is a quick, but rarely optimal, form of regularization. The resulting model is very unlikely to be as good as a model trained thoroughly on the ideal regularization rate.

## Finding equilibrium between learning rate and regularization rate

[**Learning rate**](/machine-learning/glossary#learning-rate) and regularization rate tend to pull weights in opposite directions. A high learning rate often pulls weights *away from* zero; a high regularization rate pulls weights *towards* zero.

If the regularization rate is high with respect to the learning rate, the weak weights tend to produce a model that makes poor predictions. Conversely, if the learning rate is high with respect to the regularization rate, the strong weights tend to produce an overfit model.

Your goal is to find the equilibrium between learning rate and regularization rate. This can be challenging. Worst of all, once you find that elusive balance, you may ultimately have to change the learning rate. And when you change the learning rate, you'll again have to find the ideal regularization rate.

**Key terms:**

- [Early stopping](/machine-learning/glossary#early-stopping)
- [L~1~ regularization](/machine-learning/glossary#L1_regularization)
- [L~2~ regularization](/machine-learning/glossary#L2_regularization)
- [Learning rate](/machine-learning/glossary#learning-rate)
- [Regularization](/machine-learning/glossary#regularization)
- [Regularization rate](/machine-learning/glossary#regularization-rate)