Scale vector in scaled pivoting (numerical methods)


























In the scaled pivoting version of Gaussian elimination, you exchange rows/columns based not simply on the largest element available as a pivot, but on the element that is largest relative to the entries in its own row.



You first calculate the scaling vector $S$ (each element of $S$ is the largest absolute value in its respective row), then choose the pivot element at each iteration based on this same original $S$. I understand that $S$ isn't recalculated at each iteration because "it's not worth it", meaning that the proportions generally stay the same throughout all iterations.



From Numerical Mathematics and Computing:




Notice that the scale factors are not changed after each pivot step. Intuitively, one might think that after each step in the Gaussian algorithm, the remaining (modified) coefficients should be used to recompute the scale factors instead of using the original scale vector. Of course, this could be done, but it is generally believed that the extra computations involved in this procedure are not worthwhile in the majority of linear systems.




I'm looking for a more formal explanation of this: how can I actually see that the amount of computation involved in recalculating the remaining elements of $S$ at each iteration is large compared to the accuracy that would be gained?
(Maybe some upper bound on the accuracy, compared against the number of recalculations of $S$?)
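For concreteness, here is a minimal sketch (my own, in Python with NumPy, which the question does not mention) of Gaussian elimination with scaled partial pivoting. The scale vector is computed once from the original rows and afterwards only permuted together with them, which is exactly the behaviour the quoted passage defends:

```python
import numpy as np

def solve_scaled_pivoting(A, b):
    """Gaussian elimination with scaled partial pivoting (illustrative sketch).

    The scale vector s holds the largest absolute entry of each original row.
    It is computed once and only swapped along with the rows, never updated.
    """
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    n = len(b)
    s = np.max(np.abs(A), axis=1)          # scale vector S, computed once

    for k in range(n - 1):
        # choose the pivot row: largest entry in column k relative to its row scale
        p = k + np.argmax(np.abs(A[k:, k]) / s[k:])
        if p != k:                          # swap rows of A, b and s together
            A[[k, p]] = A[[p, k]]
            b[[k, p]] = b[[p, k]]
            s[[k, p]] = s[[p, k]]
        for i in range(k + 1, n):           # eliminate below the pivot
            m = A[i, k] / A[k, k]
            A[i, k:] -= m * A[k, k:]
            b[i] -= m * b[k]

    x = np.zeros(n)                         # back substitution
    for i in range(n - 1, -1, -1):
        x[i] = (b[i] - A[i, i + 1:] @ x[i + 1:]) / A[i, i]
    return x

# Example where the scaled criterion picks the second row as first pivot:
print(solve_scaled_pivoting([[2.0, 100000.0], [1.0, 1.0]], [100000.0, 2.0]))  # ~ [1. 1.]
```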










numerical-methods gaussian-elimination






asked May 27 '15 at 15:11 by Cauthon






















2 Answers



















If Numerical Analysis says that something is not worth it, and that it is generally believed that the extra computations involved in this procedure are not worthwhile in the majority of linear systems, then all of your alarm bells should go off. How much is a "belief" worth in mathematics in the first place? And how would such a "majority" of linear systems be defined? So I wouldn't be surprised if the real reason for scaled pivoting is ease of calculation, and that's it.



That being said, instead of heading for pivoting strategies, what I'm going to advertise here is rather to avoid any form of pivoting.
The method to be employed for that purpose is called Least Squares. Instead of solving the original system
$A x = b$, we first square the equations and solve the system $S x = A^T A x = A^T b$ instead. There are
a number of advantages and disadvantages to this method. A disadvantage is that the equations must first
be squared, at a cost in computation time. But there are dumb methods and there are smart methods to do the squaring.
Another disadvantage is that the conditioning of the equations becomes worse, namely proportional to
the square of the original conditioning. The latter phenomenon may be cured by first optimizing the condition
of the original equations, which can be done by norming them row by row, meaning that the coefficients
$a_{ki}$ of each row $(k)$ are divided by the square root of the sum of the squared coefficients of that row:
$$
a_{ki} := \frac{a_{ki}}{\sqrt{\sum_i a_{ki}^2}}
$$
Let's talk about the advantages now. It's easy to see that the equations have become symmetric, which
definitely is an advantage. (Note that if the symmetry of the original system is destroyed by norming,
then it is restored again by squaring.) There is no pivoting needed, because:
$$
S_{ij} = S_{ji} = \sum_k a_{ki} a_{kj} \quad \Longrightarrow \quad S_{ii} = \sum_k a_{ki}^2
$$
And this sum is only zero if all coefficients in a column are zero, which is certainly not the case with a
non-singular matrix. Thus all pivots are on the main diagonal. Furthermore:
$$
\vec{a}_i = \begin{bmatrix} a_{1i} \\ a_{2i} \\ \cdots \\ a_{Ni} \end{bmatrix}
\quad ; \quad
\vec{a}_j = \begin{bmatrix} a_{1j} \\ a_{2j} \\ \cdots \\ a_{Nj} \end{bmatrix}
\qquad \Longrightarrow \qquad \left\{ \begin{matrix}
\left(\vec{a}_i\cdot\vec{a}_j\right) = \sum_k a_{ki} a_{kj} = S_{ij} \\
\left(\vec{a}_i\cdot\vec{a}_i\right) = \sum_k a_{ki} a_{ki} = S_{ii} \\
\left(\vec{a}_j\cdot\vec{a}_j\right) = \sum_k a_{kj} a_{kj} = S_{jj} \end{matrix} \right.
$$
Schwarz inequality:
$$
\left(\vec{a}_i\cdot\vec{a}_j\right)^2 \le
\left(\vec{a}_i\cdot\vec{a}_i\right)\left(\vec{a}_j\cdot\vec{a}_j\right)
\qquad \Longleftrightarrow \qquad
S_{ij}^2 \le S_{ii} S_{jj}
$$
With the AM-GM inequality we find from here:
$$
\frac{S_{ii}+S_{jj}}{2} \ge \sqrt{S_{ii}S_{jj}} \ge \left| S_{ij} \right|
$$
In other words: the (arithmetic/geometric) mean of two main diagonal elements is positive
and always at least as large as the absolute value of the corresponding off-diagonal element.
This is consistent with the fact that all pivots are on the main diagonal.

Further properties are that for an orthogonal matrix $A$ the squared equations become the
identity, $S = A^T A = I$, and the sum of all main diagonal elements equals the order of
the system: $\sum_k S_{kk} = N$.



Simple $2 \times 2$ example:
$$
\left\{ \begin{matrix} x_2 = 1 \\ 2 x_1 - x_2 = 1 \end{matrix} \right.
\qquad \Longleftrightarrow \qquad
\begin{bmatrix} 0 & 1 \\ 2 & -1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
$$
Standard Gaussian elimination will fail because there is a zero in the first row and first column.
Let the equations be normed in the first place:
$$
\begin{bmatrix} 0 & 1 \\ 2 & -1 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 1 \\ 1 \end{bmatrix}
\qquad \Longleftrightarrow \qquad
\begin{bmatrix} 0 & 1 \\ 2/\sqrt{5} & -1/\sqrt{5} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 1 \\ 1/\sqrt{5} \end{bmatrix}
$$
Least squares equivalent (calculating the square roots can always be avoided):
$$
\begin{bmatrix} 0 & 2/\sqrt{5} \\ 1 & -1/\sqrt{5} \end{bmatrix}
\begin{bmatrix} 0 & 1 \\ 2/\sqrt{5} & -1/\sqrt{5} \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 0 & 2/\sqrt{5} \\ 1 & -1/\sqrt{5} \end{bmatrix}
\begin{bmatrix} 1 \\ 1/\sqrt{5} \end{bmatrix}
\qquad \Longleftrightarrow \\
\begin{bmatrix} 4/5 & -2/5 \\ -2/5 & 6/5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 2/5 \\ 4/5 \end{bmatrix}
$$
Gaussian elimination with the first row as a pivot:
$$
\begin{bmatrix} 4/5 & -2/5 \\ 0 & 5/5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} =
\begin{bmatrix} 2/5 \\ 5/5 \end{bmatrix}
$$
Followed by backsubstitution, as usual.

EDIT. More information about Condition Numbers available at the personal website MSE link.
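For what it's worth, here is a small numerical sketch of the procedure described above, written in Python with NumPy (my own illustration, not part of the original answer): row norming, forming $S = A^T A$, and eliminating with the pivots taken from the main diagonal, applied to the same $2 \times 2$ example.

```python
import numpy as np

def solve_normal_equations(A, b):
    """Solve A x = b via row-normed normal equations, without pivoting (sketch).

    Each equation is divided by its Euclidean row norm, then the squared
    system S x = A^T A x = A^T b is reduced by plain Gaussian elimination,
    taking every pivot from the main diagonal.
    """
    A = np.array(A, dtype=float)
    b = np.array(b, dtype=float)
    norms = np.linalg.norm(A, axis=1)
    A /= norms[:, None]                    # norm the equations row by row
    b /= norms
    S = A.T @ A                            # "square" the equations
    c = A.T @ b
    n = len(b)
    for k in range(n - 1):                 # eliminate, pivots on the diagonal
        for i in range(k + 1, n):
            m = S[i, k] / S[k, k]
            S[i, k:] -= m * S[k, k:]
            c[i] -= m * c[k]
    x = np.zeros(n)
    for i in range(n - 1, -1, -1):         # back substitution
        x[i] = (c[i] - S[i, i + 1:] @ x[i + 1:]) / S[i, i]
    return x

# The 2 x 2 example above: x2 = 1 and 2 x1 - x2 = 1, whose solution is (1, 1).
print(solve_normal_equations([[0.0, 1.0], [2.0, -1.0]], [1.0, 1.0]))   # [1. 1.]
```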






answered Jun 4 '15 at 12:59 by Han de Bruijn, edited Jun 10 '15 at 13:29
• Thanks for your suggestion of an alternative method. I have learned a little about Least Squares, but this is a great explanation. However, I am still interested in my original question. I am assuming that there's some kind of analysis that demonstrates why $S$ isn't recalculated, aside from ease of calculation, but I still cannot find any such thing. – Cauthon, Jun 5 '15 at 11:28

• @Cauthon: Nobody has been able to claim the bounty, right? Alas, sometimes I only have to wait in order to be sure that my judgment has not been wrong. – Han de Bruijn, Jun 10 '15 at 13:36

• Nope, no answer out there :( – Cauthon, Jun 10 '15 at 19:51




















The cited passage does not suggest that you should skip applying the same row permutation to the vector of scaling coefficients. You do need to permute $S$ along with the rows.

However, recall that reducing the pivot and the associated column and/or row (depending on the choice of algorithm) changes the rest of the matrix. So while you may have collected per-row maxima at the start, they will no longer be the true maxima after such a step. To be strictly correct, you would therefore need to recompute the scaling vector.

The scaling vector needs to hold the maxima over all elements of all of the remaining rows (not only the unreduced portions of the remaining rows). That amounts to $O(n^2)$ work in the first step, $O(n \cdot (n - 1))$ in the second step, down to $O(n)$ in the second-to-last step. This sums to roughly $O(n^3/2)$, which is comparable to the $O(n^3)$ cost of the whole elimination, so it is deemed too costly to be worthwhile. Numerical experiments also mostly show that the reduction in error is not worth it.
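As a rough sanity check of these counts (my own sketch in Python, not from the original answer), one can compare the total rescanning work against the usual $\tfrac{2}{3}n^3$ flop count of the elimination itself:

```python
# Rough count: rescanning all n entries of every remaining row before each of
# the n-1 pivot choices, versus the ~ (2/3) n^3 flops of the elimination itself.
for n in (10, 100, 1000):
    rescan = sum(n * (n - k) for k in range(n - 1))   # ~ n^3 / 2 comparisons
    flops = 2 * n**3 / 3                               # classical elimination cost
    print(n, rescan, round(rescan / flops, 2))
```

For large $n$ the ratio settles near 0.75, i.e. the extra scanning is a constant fraction of the elimination work rather than a lower-order term.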



The upper bound on accuracy depends a lot on the matrix at hand. In any case, for some problems you want to turn to singular value decomposition or some other more robust method. Some problems simply cannot be solved with Gaussian elimination, no matter how hard you pivot them.






answered Jan 14 '17 at 1:12 by the swine, edited Jan 14 '17 at 13:56