Using a hash function as a random number generator












7












$begingroup$


Using MD5 or SHA1 for instance, and applying integers (as seed so to speak) to the hash function, in sequence, and only keeping, say, the first 64 bits of the resulting hash, do we always have a probability close to $1/{2^{64}}$ to have $$text{hash}_{64}(n) = text{hash}_{64}(n+1)$$ for every $ninbig[1,2^{64}]$?










share|improve this question











$endgroup$








  • 2




    $begingroup$
    Related Is SHA-256 secure as a CTR block cipher?
    $endgroup$
    – kelalaka
    Jan 31 at 16:53










  • $begingroup$
    This is the argument of Homebrew vs standardized. You are not too far of the NIST DRBG Hash variant
    $endgroup$
    – patrickf
    Feb 5 at 21:49
















7












$begingroup$


Using MD5 or SHA1 for instance, and applying integers (as seed so to speak) to the hash function, in sequence, and only keeping, say, the first 64 bits of the resulting hash, do we always have a probability close to $1/{2^{64}}$ to have $$text{hash}_{64}(n) = text{hash}_{64}(n+1)$$ for every $ninbig[1,2^{64}]$?










share|improve this question











$endgroup$








  • 2




    $begingroup$
    Related Is SHA-256 secure as a CTR block cipher?
    $endgroup$
    – kelalaka
    Jan 31 at 16:53










  • $begingroup$
    This is the argument of Homebrew vs standardized. You are not too far of the NIST DRBG Hash variant
    $endgroup$
    – patrickf
    Feb 5 at 21:49














7












7








7


1



$begingroup$


Using MD5 or SHA1 for instance, and applying integers (as seed so to speak) to the hash function, in sequence, and only keeping, say, the first 64 bits of the resulting hash, do we always have a probability close to $1/{2^{64}}$ to have $$text{hash}_{64}(n) = text{hash}_{64}(n+1)$$ for every $ninbig[1,2^{64}]$?










share|improve this question











$endgroup$




Using MD5 or SHA1 for instance, and applying integers (as seed so to speak) to the hash function, in sequence, and only keeping, say, the first 64 bits of the resulting hash, do we always have a probability close to $1/{2^{64}}$ to have $$text{hash}_{64}(n) = text{hash}_{64}(n+1)$$ for every $ninbig[1,2^{64}]$?







hash pseudo-random-generator sha-1 md5 probability






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Jan 31 at 16:03









kelalaka

7,41022244




7,41022244










asked Jan 31 at 15:43









Ring ØRing Ø

1363




1363








  • 2




    $begingroup$
    Related Is SHA-256 secure as a CTR block cipher?
    $endgroup$
    – kelalaka
    Jan 31 at 16:53










  • $begingroup$
    This is the argument of Homebrew vs standardized. You are not too far of the NIST DRBG Hash variant
    $endgroup$
    – patrickf
    Feb 5 at 21:49














  • 2




    $begingroup$
    Related Is SHA-256 secure as a CTR block cipher?
    $endgroup$
    – kelalaka
    Jan 31 at 16:53










  • $begingroup$
    This is the argument of Homebrew vs standardized. You are not too far of the NIST DRBG Hash variant
    $endgroup$
    – patrickf
    Feb 5 at 21:49








2




2




$begingroup$
Related Is SHA-256 secure as a CTR block cipher?
$endgroup$
– kelalaka
Jan 31 at 16:53




$begingroup$
Related Is SHA-256 secure as a CTR block cipher?
$endgroup$
– kelalaka
Jan 31 at 16:53












$begingroup$
This is the argument of Homebrew vs standardized. You are not too far of the NIST DRBG Hash variant
$endgroup$
– patrickf
Feb 5 at 21:49




$begingroup$
This is the argument of Homebrew vs standardized. You are not too far of the NIST DRBG Hash variant
$endgroup$
– patrickf
Feb 5 at 21:49










2 Answers
2






active

oldest

votes


















7












$begingroup$

What you are talking about is a counter based algorithm and the integer is then of course called the counter. Usually it is first encoded to a statically sized encoding of the number, as hash functions operate on bits / bytes, not numbers.



Yes, these kind of constructions are considered secure, if the other requirements are met. For instance, you should seed your algorithm correctly. Instead of defining a hash based DRBG of your own, you probably want to use a known good implementation such as Hash_DRBG as defined by NIST.



You should also try and stay away from broken hash functions such as MD5 and SHA-1. Although they are not broken for this purpose, attacks only get better, never worse.






share|improve this answer









$endgroup$





















    4












    $begingroup$

    Caveat: that answer is written with my mind in Vulcan mode. That is, I'm answering the question as taken literally (in the first part); or in only a minor variant (second part); and ignoring the question's title, which I find unrelated, especially given the limited span of $n$.



    The probability considered is $0$ for each of the many different common ways to define $text{hash}_{64}(n)$ as the first 64 bits of the MD5 or SHA-1 of integer $n$. Whether $0$ is "close to $1/2^{64}$" is a matter of opinion.



    For a few other definitions of $text{hash}_{64}(n)$, that probability is $1$, which is not "close to $1/2^{64}$" by many accounts.



    Justification: it is considered the probability $p$ that for every $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



    For fixed function $text{hash}_{64}$ this is a probability that a certain determined fact holds or not, and thus $p$ is either $0$ or $1$ depending on the function $text{hash}_{64}$. This also holds (perhaps with a different $p$) if we change every to some.



    For the many different common ways to define the function $text{hash}_{64}$ for an integer parameter as the first 64 bits of the MD5 or SHA-1 of that integer, that probability is $0$. For example, if we define $text{hash}_{64}$ to be the first 64 bits of the SHA-1 of the shortest big-endian decimal representation of $n$ encoded in ASCII, then $text{hash}_{64}(1)$ is the SHA-1 of the single byte 0x31 and starts with a $0$ bit, while $text{hash}_{64}(2)$ is the SHA-1 of the single byte 0x32 and starts with a $1$ bit, thus they do not match.



    We could try with MD5 and SHA-1, various common bases (including 2, 8, 10, 16, 64, 256, 232), common suitable alphabets (ASCII, EBCDIC, binary, bytes) and common variants (uppercase or lowercase hex, various base64 idioms), endianness (big, little, various mixed), and width for $n$ (fixed to say 64 with some exception for $n=2^{64}$, 65, 72 or 96 bits, adaptive according to $n$ with various steps like even which is common with hex). But I'm not convinced that we would find any such $text{hash}_{64}$ with $text{hash}_{64}(1)=text{hash}_{64}(2)$. And I'm ready to bet the house $text{hash}_{64}(2)=text{hash}_{64}(3)$ won't hold in addition.



    When we average the probability asked over all the possible definitions of $text{hash}_{64}$, we get $2^{left(64-2^{70}right)}$. That ludicrously small number is much less than $1/2^{64}$, yet still closer to $1/2^{64}$ than $0$ is.





    Now perhaps we should instead consider the probability $p$ that for a random $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



    That probability $p$ is $k/{2^{64}}$ for some integer $k$, which can be computed with significant but feasible effort from the exact definition of $text{hash}_{64}$.



    When we average $p$ over all the possible definitions of $text{hash}_{64}$, we get $1/2^{64}$; thus "close to $1/2^{64}$" for sure.



    We can also define the probability $p_k$ that $p$ is $k/{2^{64}}$ for a random function $text{hash}_{64}$, which is a plausible model for the various $text{hash}_{64}$ that the question considers. The largest $p_k$ is $p_1$ (meaning $p=1/{2^{64}}$ is the most likely), with $p_0$ extremely close second, both extremely close to $1/eapprox36.8%$. $p_2$ is about half that, and $p_k$ goes down fast as $k$ grows, down to $p_{2^{64}}=2^{left(-2^{70}right)}$, then $p_k=0$ for $k>2^{64}$. It holds $1=sum p_k$.



    We see that $p$ is in the 3-element set ${0,1/2^{64},1/2^{63}}$ with probability about $5/2eapprox92.0%$. We can say $p$ is very often "close to $1/2^{64}$".






    share|improve this answer











    $endgroup$













    • $begingroup$
      If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
      $endgroup$
      – kelalaka
      Jan 31 at 17:56










    • $begingroup$
      @kelalaka: I do not get "for any subsequent to equal".
      $endgroup$
      – fgrieu
      Jan 31 at 23:39






    • 3




      $begingroup$
      @fgrieu For any subsequent [64-bit block] to [be] equal.
      $endgroup$
      – Future Security
      Feb 1 at 16:23











    Your Answer





    StackExchange.ifUsing("editor", function () {
    return StackExchange.using("mathjaxEditing", function () {
    StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
    StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
    });
    });
    }, "mathjax-editing");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "281"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: false,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: null,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    noCode: true, onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcrypto.stackexchange.com%2fquestions%2f66932%2fusing-a-hash-function-as-a-random-number-generator%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    7












    $begingroup$

    What you are talking about is a counter based algorithm and the integer is then of course called the counter. Usually it is first encoded to a statically sized encoding of the number, as hash functions operate on bits / bytes, not numbers.



    Yes, these kind of constructions are considered secure, if the other requirements are met. For instance, you should seed your algorithm correctly. Instead of defining a hash based DRBG of your own, you probably want to use a known good implementation such as Hash_DRBG as defined by NIST.



    You should also try and stay away from broken hash functions such as MD5 and SHA-1. Although they are not broken for this purpose, attacks only get better, never worse.






    share|improve this answer









    $endgroup$


















      7












      $begingroup$

      What you are talking about is a counter based algorithm and the integer is then of course called the counter. Usually it is first encoded to a statically sized encoding of the number, as hash functions operate on bits / bytes, not numbers.



      Yes, these kind of constructions are considered secure, if the other requirements are met. For instance, you should seed your algorithm correctly. Instead of defining a hash based DRBG of your own, you probably want to use a known good implementation such as Hash_DRBG as defined by NIST.



      You should also try and stay away from broken hash functions such as MD5 and SHA-1. Although they are not broken for this purpose, attacks only get better, never worse.






      share|improve this answer









      $endgroup$
















        7












        7








        7





        $begingroup$

        What you are talking about is a counter based algorithm and the integer is then of course called the counter. Usually it is first encoded to a statically sized encoding of the number, as hash functions operate on bits / bytes, not numbers.



        Yes, these kind of constructions are considered secure, if the other requirements are met. For instance, you should seed your algorithm correctly. Instead of defining a hash based DRBG of your own, you probably want to use a known good implementation such as Hash_DRBG as defined by NIST.



        You should also try and stay away from broken hash functions such as MD5 and SHA-1. Although they are not broken for this purpose, attacks only get better, never worse.






        share|improve this answer









        $endgroup$



        What you are talking about is a counter based algorithm and the integer is then of course called the counter. Usually it is first encoded to a statically sized encoding of the number, as hash functions operate on bits / bytes, not numbers.



        Yes, these kind of constructions are considered secure, if the other requirements are met. For instance, you should seed your algorithm correctly. Instead of defining a hash based DRBG of your own, you probably want to use a known good implementation such as Hash_DRBG as defined by NIST.



        You should also try and stay away from broken hash functions such as MD5 and SHA-1. Although they are not broken for this purpose, attacks only get better, never worse.







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Jan 31 at 15:58









        Maarten BodewesMaarten Bodewes

        54.3k679193




        54.3k679193























            4












            $begingroup$

            Caveat: that answer is written with my mind in Vulcan mode. That is, I'm answering the question as taken literally (in the first part); or in only a minor variant (second part); and ignoring the question's title, which I find unrelated, especially given the limited span of $n$.



            The probability considered is $0$ for each of the many different common ways to define $text{hash}_{64}(n)$ as the first 64 bits of the MD5 or SHA-1 of integer $n$. Whether $0$ is "close to $1/2^{64}$" is a matter of opinion.



            For a few other definitions of $text{hash}_{64}(n)$, that probability is $1$, which is not "close to $1/2^{64}$" by many accounts.



            Justification: it is considered the probability $p$ that for every $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            For fixed function $text{hash}_{64}$ this is a probability that a certain determined fact holds or not, and thus $p$ is either $0$ or $1$ depending on the function $text{hash}_{64}$. This also holds (perhaps with a different $p$) if we change every to some.



            For the many different common ways to define the function $text{hash}_{64}$ for an integer parameter as the first 64 bits of the MD5 or SHA-1 of that integer, that probability is $0$. For example, if we define $text{hash}_{64}$ to be the first 64 bits of the SHA-1 of the shortest big-endian decimal representation of $n$ encoded in ASCII, then $text{hash}_{64}(1)$ is the SHA-1 of the single byte 0x31 and starts with a $0$ bit, while $text{hash}_{64}(2)$ is the SHA-1 of the single byte 0x32 and starts with a $1$ bit, thus they do not match.



            We could try with MD5 and SHA-1, various common bases (including 2, 8, 10, 16, 64, 256, 232), common suitable alphabets (ASCII, EBCDIC, binary, bytes) and common variants (uppercase or lowercase hex, various base64 idioms), endianness (big, little, various mixed), and width for $n$ (fixed to say 64 with some exception for $n=2^{64}$, 65, 72 or 96 bits, adaptive according to $n$ with various steps like even which is common with hex). But I'm not convinced that we would find any such $text{hash}_{64}$ with $text{hash}_{64}(1)=text{hash}_{64}(2)$. And I'm ready to bet the house $text{hash}_{64}(2)=text{hash}_{64}(3)$ won't hold in addition.



            When we average the probability asked over all the possible definitions of $text{hash}_{64}$, we get $2^{left(64-2^{70}right)}$. That ludicrously small number is much less than $1/2^{64}$, yet still closer to $1/2^{64}$ than $0$ is.





            Now perhaps we should instead consider the probability $p$ that for a random $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            That probability $p$ is $k/{2^{64}}$ for some integer $k$, which can be computed with significant but feasible effort from the exact definition of $text{hash}_{64}$.



            When we average $p$ over all the possible definitions of $text{hash}_{64}$, we get $1/2^{64}$; thus "close to $1/2^{64}$" for sure.



            We can also define the probability $p_k$ that $p$ is $k/{2^{64}}$ for a random function $text{hash}_{64}$, which is a plausible model for the various $text{hash}_{64}$ that the question considers. The largest $p_k$ is $p_1$ (meaning $p=1/{2^{64}}$ is the most likely), with $p_0$ extremely close second, both extremely close to $1/eapprox36.8%$. $p_2$ is about half that, and $p_k$ goes down fast as $k$ grows, down to $p_{2^{64}}=2^{left(-2^{70}right)}$, then $p_k=0$ for $k>2^{64}$. It holds $1=sum p_k$.



            We see that $p$ is in the 3-element set ${0,1/2^{64},1/2^{63}}$ with probability about $5/2eapprox92.0%$. We can say $p$ is very often "close to $1/2^{64}$".






            share|improve this answer











            $endgroup$













            • $begingroup$
              If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
              $endgroup$
              – kelalaka
              Jan 31 at 17:56










            • $begingroup$
              @kelalaka: I do not get "for any subsequent to equal".
              $endgroup$
              – fgrieu
              Jan 31 at 23:39






            • 3




              $begingroup$
              @fgrieu For any subsequent [64-bit block] to [be] equal.
              $endgroup$
              – Future Security
              Feb 1 at 16:23
















            4












            $begingroup$

            Caveat: that answer is written with my mind in Vulcan mode. That is, I'm answering the question as taken literally (in the first part); or in only a minor variant (second part); and ignoring the question's title, which I find unrelated, especially given the limited span of $n$.



            The probability considered is $0$ for each of the many different common ways to define $text{hash}_{64}(n)$ as the first 64 bits of the MD5 or SHA-1 of integer $n$. Whether $0$ is "close to $1/2^{64}$" is a matter of opinion.



            For a few other definitions of $text{hash}_{64}(n)$, that probability is $1$, which is not "close to $1/2^{64}$" by many accounts.



            Justification: it is considered the probability $p$ that for every $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            For fixed function $text{hash}_{64}$ this is a probability that a certain determined fact holds or not, and thus $p$ is either $0$ or $1$ depending on the function $text{hash}_{64}$. This also holds (perhaps with a different $p$) if we change every to some.



            For the many different common ways to define the function $text{hash}_{64}$ for an integer parameter as the first 64 bits of the MD5 or SHA-1 of that integer, that probability is $0$. For example, if we define $text{hash}_{64}$ to be the first 64 bits of the SHA-1 of the shortest big-endian decimal representation of $n$ encoded in ASCII, then $text{hash}_{64}(1)$ is the SHA-1 of the single byte 0x31 and starts with a $0$ bit, while $text{hash}_{64}(2)$ is the SHA-1 of the single byte 0x32 and starts with a $1$ bit, thus they do not match.



            We could try with MD5 and SHA-1, various common bases (including 2, 8, 10, 16, 64, 256, 232), common suitable alphabets (ASCII, EBCDIC, binary, bytes) and common variants (uppercase or lowercase hex, various base64 idioms), endianness (big, little, various mixed), and width for $n$ (fixed to say 64 with some exception for $n=2^{64}$, 65, 72 or 96 bits, adaptive according to $n$ with various steps like even which is common with hex). But I'm not convinced that we would find any such $text{hash}_{64}$ with $text{hash}_{64}(1)=text{hash}_{64}(2)$. And I'm ready to bet the house $text{hash}_{64}(2)=text{hash}_{64}(3)$ won't hold in addition.



            When we average the probability asked over all the possible definitions of $text{hash}_{64}$, we get $2^{left(64-2^{70}right)}$. That ludicrously small number is much less than $1/2^{64}$, yet still closer to $1/2^{64}$ than $0$ is.





            Now perhaps we should instead consider the probability $p$ that for a random $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            That probability $p$ is $k/{2^{64}}$ for some integer $k$, which can be computed with significant but feasible effort from the exact definition of $text{hash}_{64}$.



            When we average $p$ over all the possible definitions of $text{hash}_{64}$, we get $1/2^{64}$; thus "close to $1/2^{64}$" for sure.



            We can also define the probability $p_k$ that $p$ is $k/{2^{64}}$ for a random function $text{hash}_{64}$, which is a plausible model for the various $text{hash}_{64}$ that the question considers. The largest $p_k$ is $p_1$ (meaning $p=1/{2^{64}}$ is the most likely), with $p_0$ extremely close second, both extremely close to $1/eapprox36.8%$. $p_2$ is about half that, and $p_k$ goes down fast as $k$ grows, down to $p_{2^{64}}=2^{left(-2^{70}right)}$, then $p_k=0$ for $k>2^{64}$. It holds $1=sum p_k$.



            We see that $p$ is in the 3-element set ${0,1/2^{64},1/2^{63}}$ with probability about $5/2eapprox92.0%$. We can say $p$ is very often "close to $1/2^{64}$".






            share|improve this answer











            $endgroup$













            • $begingroup$
              If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
              $endgroup$
              – kelalaka
              Jan 31 at 17:56










            • $begingroup$
              @kelalaka: I do not get "for any subsequent to equal".
              $endgroup$
              – fgrieu
              Jan 31 at 23:39






            • 3




              $begingroup$
              @fgrieu For any subsequent [64-bit block] to [be] equal.
              $endgroup$
              – Future Security
              Feb 1 at 16:23














            4












            4








            4





            $begingroup$

            Caveat: that answer is written with my mind in Vulcan mode. That is, I'm answering the question as taken literally (in the first part); or in only a minor variant (second part); and ignoring the question's title, which I find unrelated, especially given the limited span of $n$.



            The probability considered is $0$ for each of the many different common ways to define $text{hash}_{64}(n)$ as the first 64 bits of the MD5 or SHA-1 of integer $n$. Whether $0$ is "close to $1/2^{64}$" is a matter of opinion.



            For a few other definitions of $text{hash}_{64}(n)$, that probability is $1$, which is not "close to $1/2^{64}$" by many accounts.



            Justification: it is considered the probability $p$ that for every $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            For fixed function $text{hash}_{64}$ this is a probability that a certain determined fact holds or not, and thus $p$ is either $0$ or $1$ depending on the function $text{hash}_{64}$. This also holds (perhaps with a different $p$) if we change every to some.



            For the many different common ways to define the function $text{hash}_{64}$ for an integer parameter as the first 64 bits of the MD5 or SHA-1 of that integer, that probability is $0$. For example, if we define $text{hash}_{64}$ to be the first 64 bits of the SHA-1 of the shortest big-endian decimal representation of $n$ encoded in ASCII, then $text{hash}_{64}(1)$ is the SHA-1 of the single byte 0x31 and starts with a $0$ bit, while $text{hash}_{64}(2)$ is the SHA-1 of the single byte 0x32 and starts with a $1$ bit, thus they do not match.



            We could try with MD5 and SHA-1, various common bases (including 2, 8, 10, 16, 64, 256, 232), common suitable alphabets (ASCII, EBCDIC, binary, bytes) and common variants (uppercase or lowercase hex, various base64 idioms), endianness (big, little, various mixed), and width for $n$ (fixed to say 64 with some exception for $n=2^{64}$, 65, 72 or 96 bits, adaptive according to $n$ with various steps like even which is common with hex). But I'm not convinced that we would find any such $text{hash}_{64}$ with $text{hash}_{64}(1)=text{hash}_{64}(2)$. And I'm ready to bet the house $text{hash}_{64}(2)=text{hash}_{64}(3)$ won't hold in addition.



            When we average the probability asked over all the possible definitions of $text{hash}_{64}$, we get $2^{left(64-2^{70}right)}$. That ludicrously small number is much less than $1/2^{64}$, yet still closer to $1/2^{64}$ than $0$ is.





            Now perhaps we should instead consider the probability $p$ that for a random $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            That probability $p$ is $k/{2^{64}}$ for some integer $k$, which can be computed with significant but feasible effort from the exact definition of $text{hash}_{64}$.



            When we average $p$ over all the possible definitions of $text{hash}_{64}$, we get $1/2^{64}$; thus "close to $1/2^{64}$" for sure.



            We can also define the probability $p_k$ that $p$ is $k/{2^{64}}$ for a random function $text{hash}_{64}$, which is a plausible model for the various $text{hash}_{64}$ that the question considers. The largest $p_k$ is $p_1$ (meaning $p=1/{2^{64}}$ is the most likely), with $p_0$ extremely close second, both extremely close to $1/eapprox36.8%$. $p_2$ is about half that, and $p_k$ goes down fast as $k$ grows, down to $p_{2^{64}}=2^{left(-2^{70}right)}$, then $p_k=0$ for $k>2^{64}$. It holds $1=sum p_k$.



            We see that $p$ is in the 3-element set ${0,1/2^{64},1/2^{63}}$ with probability about $5/2eapprox92.0%$. We can say $p$ is very often "close to $1/2^{64}$".






            share|improve this answer











            $endgroup$



            Caveat: that answer is written with my mind in Vulcan mode. That is, I'm answering the question as taken literally (in the first part); or in only a minor variant (second part); and ignoring the question's title, which I find unrelated, especially given the limited span of $n$.



            The probability considered is $0$ for each of the many different common ways to define $text{hash}_{64}(n)$ as the first 64 bits of the MD5 or SHA-1 of integer $n$. Whether $0$ is "close to $1/2^{64}$" is a matter of opinion.



            For a few other definitions of $text{hash}_{64}(n)$, that probability is $1$, which is not "close to $1/2^{64}$" by many accounts.



            Justification: it is considered the probability $p$ that for every $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            For fixed function $text{hash}_{64}$ this is a probability that a certain determined fact holds or not, and thus $p$ is either $0$ or $1$ depending on the function $text{hash}_{64}$. This also holds (perhaps with a different $p$) if we change every to some.



            For the many different common ways to define the function $text{hash}_{64}$ for an integer parameter as the first 64 bits of the MD5 or SHA-1 of that integer, that probability is $0$. For example, if we define $text{hash}_{64}$ to be the first 64 bits of the SHA-1 of the shortest big-endian decimal representation of $n$ encoded in ASCII, then $text{hash}_{64}(1)$ is the SHA-1 of the single byte 0x31 and starts with a $0$ bit, while $text{hash}_{64}(2)$ is the SHA-1 of the single byte 0x32 and starts with a $1$ bit, thus they do not match.



            We could try with MD5 and SHA-1, various common bases (including 2, 8, 10, 16, 64, 256, 232), common suitable alphabets (ASCII, EBCDIC, binary, bytes) and common variants (uppercase or lowercase hex, various base64 idioms), endianness (big, little, various mixed), and width for $n$ (fixed to say 64 with some exception for $n=2^{64}$, 65, 72 or 96 bits, adaptive according to $n$ with various steps like even which is common with hex). But I'm not convinced that we would find any such $text{hash}_{64}$ with $text{hash}_{64}(1)=text{hash}_{64}(2)$. And I'm ready to bet the house $text{hash}_{64}(2)=text{hash}_{64}(3)$ won't hold in addition.



            When we average the probability asked over all the possible definitions of $text{hash}_{64}$, we get $2^{left(64-2^{70}right)}$. That ludicrously small number is much less than $1/2^{64}$, yet still closer to $1/2^{64}$ than $0$ is.





            Now perhaps we should instead consider the probability $p$ that for a random $n$ in $big[1,2^{64}big]$ it holds $text{hash}_{64}(n)=text{hash}_{64}(n+1)$.



            That probability $p$ is $k/{2^{64}}$ for some integer $k$, which can be computed with significant but feasible effort from the exact definition of $text{hash}_{64}$.



            When we average $p$ over all the possible definitions of $text{hash}_{64}$, we get $1/2^{64}$; thus "close to $1/2^{64}$" for sure.



            We can also define the probability $p_k$ that $p$ is $k/{2^{64}}$ for a random function $text{hash}_{64}$, which is a plausible model for the various $text{hash}_{64}$ that the question considers. The largest $p_k$ is $p_1$ (meaning $p=1/{2^{64}}$ is the most likely), with $p_0$ extremely close second, both extremely close to $1/eapprox36.8%$. $p_2$ is about half that, and $p_k$ goes down fast as $k$ grows, down to $p_{2^{64}}=2^{left(-2^{70}right)}$, then $p_k=0$ for $k>2^{64}$. It holds $1=sum p_k$.



            We see that $p$ is in the 3-element set ${0,1/2^{64},1/2^{63}}$ with probability about $5/2eapprox92.0%$. We can say $p$ is very often "close to $1/2^{64}$".







            share|improve this answer














            share|improve this answer



            share|improve this answer








            edited Feb 1 at 6:38

























            answered Jan 31 at 17:53









            fgrieufgrieu

            79.5k7169336




            79.5k7169336












            • $begingroup$
              If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
              $endgroup$
              – kelalaka
              Jan 31 at 17:56










            • $begingroup$
              @kelalaka: I do not get "for any subsequent to equal".
              $endgroup$
              – fgrieu
              Jan 31 at 23:39






            • 3




              $begingroup$
              @fgrieu For any subsequent [64-bit block] to [be] equal.
              $endgroup$
              – Future Security
              Feb 1 at 16:23


















            • $begingroup$
              If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
              $endgroup$
              – kelalaka
              Jan 31 at 17:56










            • $begingroup$
              @kelalaka: I do not get "for any subsequent to equal".
              $endgroup$
              – fgrieu
              Jan 31 at 23:39






            • 3




              $begingroup$
              @fgrieu For any subsequent [64-bit block] to [be] equal.
              $endgroup$
              – Future Security
              Feb 1 at 16:23
















            $begingroup$
            If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
            $endgroup$
            – kelalaka
            Jan 31 at 17:56




            $begingroup$
            If the hash is a RO then the probability must be $1/2^{64}$ for any subsequent to equal.
            $endgroup$
            – kelalaka
            Jan 31 at 17:56












            $begingroup$
            @kelalaka: I do not get "for any subsequent to equal".
            $endgroup$
            – fgrieu
            Jan 31 at 23:39




            $begingroup$
            @kelalaka: I do not get "for any subsequent to equal".
            $endgroup$
            – fgrieu
            Jan 31 at 23:39




            3




            3




            $begingroup$
            @fgrieu For any subsequent [64-bit block] to [be] equal.
            $endgroup$
            – Future Security
            Feb 1 at 16:23




            $begingroup$
            @fgrieu For any subsequent [64-bit block] to [be] equal.
            $endgroup$
            – Future Security
            Feb 1 at 16:23


















            draft saved

            draft discarded




















































            Thanks for contributing an answer to Cryptography Stack Exchange!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            Use MathJax to format equations. MathJax reference.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcrypto.stackexchange.com%2fquestions%2f66932%2fusing-a-hash-function-as-a-random-number-generator%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Biblatex bibliography style without URLs when DOI exists (in Overleaf with Zotero bibliography)

            ComboBox Display Member on multiple fields

            Is it possible to collect Nectar points via Trainline?