Is a neural network consisting of a single softmax classification layer only a linear classifier?























Since the softmax function is a generalization of the logistic function, it is continuous and non-linear.

The output of the softmax layer is: softmax(weight_matrix * input_activation)

Here, weight_matrix * input_activation is a purely linear combination of the features.

The question is: does applying the softmax activation still yield a linear classifier, or is the model then capable of representing non-linear functions?











































      neural-networks generalized-linear-model softmax















      asked Nov 22 at 14:07









      tamtam_























          1 Answer













A neural network with no hidden layers and a softmax output layer is exactly logistic regression (multinomial logistic regression when there are more than 2 classes) when it is trained to minimize categorical cross-entropy (equivalently, to maximize the log-likelihood of a multinomial model).
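To make the equivalence concrete, here is a short derivation sketch. The notation is an assumption introduced for illustration: $x$ stands for input_activation, $w_k$ for row $k$ of weight_matrix, and $K$ for the number of classes. No bias term is included, matching the expression in the question.

```latex
% Softmax output for class k (the multinomial logistic regression model):
p(y = k \mid x) = \frac{\exp(w_k^\top x)}{\sum_{j=1}^{K} \exp(w_j^\top x)}

% The normalizing sum cancels in the log-ratio of any two classes,
% so the log-odds are linear in x:
\log \frac{p(y = k \mid x)}{p(y = j \mid x)} = (w_k - w_j)^\top x

% The boundary between classes k and j is where the two probabilities are equal,
% which is a hyperplane through the origin:
(w_k - w_j)^\top x = 0
```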



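Your explanation is right on the money: a linear combination of inputs can only produce linear decision boundaries, and the softmax function merely turns those linear scores into a probability vector. The class probabilities are non-linear in the input, but the classifier itself is still linear.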
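A quick numerical check can illustrate this. The sketch below uses made-up weights and points (all values and helper names are hypothetical, chosen only for illustration) and plain Python. It checks two things: softmax never changes which class has the highest score, and the log-ratio of two class probabilities equals the difference of the corresponding linear scores.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)  # subtract the max to avoid overflow in exp
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def linear_scores(weight_matrix, x):
    """weight_matrix * input_activation, written out as row-wise dot products."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight_matrix]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


# Hypothetical example: 3 classes, 2 input features.
W = [[1.0, -2.0], [0.5, 0.5], [-1.0, 1.5]]
points = [[0.3, -1.2], [2.0, 2.0], [-1.5, 0.7], [0.0, 0.0]]

for x in points:
    z = linear_scores(W, x)
    p = softmax(z)
    # Softmax is monotonic, so the predicted class is decided by the linear scores alone.
    assert argmax(p) == argmax(z)
    # The log-odds between two classes equal the difference of their linear scores.
    assert abs(math.log(p[0] / p[1]) - (z[0] - z[1])) < 1e-9
```

Because the predicted class depends only on which linear score is largest, the regions assigned to each class are separated by hyperplanes, whatever the weights are.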













                edited Nov 22 at 15:36

























                answered Nov 22 at 14:31









                Sycorax






















