Implementation differences between parser combinators and packrat algorithm












3















In order to have a better understanding of packrat I've tried to have a look at the provided implementation coming with the paper (I'm focusing on the bind):



instance Derivs d => Monad (Parser d) where 

-- Sequencing combinator
(Parser p1) >>= f = Parser parse

where parse dvs = first (p1 dvs)

first (Parsed val rem err) =
let Parser p2 = f val
in second err (p2 rem)
first (NoParse err) = NoParse err

second err1 (Parsed val rem err) =
Parsed val rem (joinErrors err1 err)
second err1 (NoParse err) =
NoParse (joinErrors err1 err)

-- Result-producing combinator
return x = Parser (dvs -> Parsed x dvs (nullError dvs))

-- Failure combinator
fail = Parser (dvs -> NoParse (nullError dvs))
fail msg = Parser (dvs -> NoParse (msgError (dvPos dvs) msg))


For me it looks like (errors handling aside) to parser combinators (such as this simplified version of Parsec):



bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ s -> concatMap ((a, s') -> parse (f a) s') $ parse p s


I'm quite confused because before that I thought that the big difference was that packrat was a parser generator with a memoization part.
Sadly it seems that this concept is not used in this implementation.



What is the big difference between parser combinators and packrat at implementation level?



PS: I also have had a look at Papillon but it seems to be very different from the implementation coming with the paper.










share|improve this question

























  • The implementation you linked to mixes tabs and spaces, and uses a different tab width than Stack Overflow does, so your copy/paste looks mis-indented. Obviously fixing this will not answer your question, but it will make it easier to read.

    – amalloy
    Nov 21 '18 at 23:09






  • 1





    "Parser generators vs. parser combinators vs. hand-written parsers" and "packrat vs. predictive parsing vs. recursive descent with backtracking vs. bottom-up algorithms" are two completely separate classifications. One's about the tools you use to write the parser, the other's about the algorithm the parser uses. Your linked implementation is a parser combinator implementing packrat parsing. That's not a contradiction.

    – sepp2k
    Nov 22 '18 at 2:06






  • 1





    @sepp2k you're right that his question should have said "what's the difference between the packrat library implementation and recursive descent or predictive parser combinator libraries?". However, the implementation he links to does not actually implement packrat parsing: you can use it, but you have to apply the packrat trick yourself in your own code. See my answer below.

    – Dominique Devriese
    Jan 25 at 13:59


















3















In order to have a better understanding of packrat I've tried to have a look at the provided implementation coming with the paper (I'm focusing on the bind):



instance Derivs d => Monad (Parser d) where 

-- Sequencing combinator
(Parser p1) >>= f = Parser parse

where parse dvs = first (p1 dvs)

first (Parsed val rem err) =
let Parser p2 = f val
in second err (p2 rem)
first (NoParse err) = NoParse err

second err1 (Parsed val rem err) =
Parsed val rem (joinErrors err1 err)
second err1 (NoParse err) =
NoParse (joinErrors err1 err)

-- Result-producing combinator
return x = Parser (dvs -> Parsed x dvs (nullError dvs))

-- Failure combinator
fail = Parser (dvs -> NoParse (nullError dvs))
fail msg = Parser (dvs -> NoParse (msgError (dvPos dvs) msg))


For me it looks like (errors handling aside) to parser combinators (such as this simplified version of Parsec):



bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ s -> concatMap ((a, s') -> parse (f a) s') $ parse p s


I'm quite confused because before that I thought that the big difference was that packrat was a parser generator with a memoization part.
Sadly it seems that this concept is not used in this implementation.



What is the big difference between parser combinators and packrat at implementation level?



PS: I also have had a look at Papillon but it seems to be very different from the implementation coming with the paper.










share|improve this question

























  • The implementation you linked to mixes tabs and spaces, and uses a different tab width than Stack Overflow does, so your copy/paste looks mis-indented. Obviously fixing this will not answer your question, but it will make it easier to read.

    – amalloy
    Nov 21 '18 at 23:09






  • 1





    "Parser generators vs. parser combinators vs. hand-written parsers" and "packrat vs. predictive parsing vs. recursive descent with backtracking vs. bottom-up algorithms" are two completely separate classifications. One's about the tools you use to write the parser, the other's about the algorithm the parser uses. Your linked implementation is a parser combinator implementing packrat parsing. That's not a contradiction.

    – sepp2k
    Nov 22 '18 at 2:06






  • 1





    @sepp2k you're right that his question should have said "what's the difference between the packrat library implementation and recursive descent or predictive parser combinator libraries?". However, the implementation he links to does not actually implement packrat parsing: you can use it, but you have to apply the packrat trick yourself in your own code. See my answer below.

    – Dominique Devriese
    Jan 25 at 13:59
















3












3








3








In order to have a better understanding of packrat I've tried to have a look at the provided implementation coming with the paper (I'm focusing on the bind):



instance Derivs d => Monad (Parser d) where 

-- Sequencing combinator
(Parser p1) >>= f = Parser parse

where parse dvs = first (p1 dvs)

first (Parsed val rem err) =
let Parser p2 = f val
in second err (p2 rem)
first (NoParse err) = NoParse err

second err1 (Parsed val rem err) =
Parsed val rem (joinErrors err1 err)
second err1 (NoParse err) =
NoParse (joinErrors err1 err)

-- Result-producing combinator
return x = Parser (dvs -> Parsed x dvs (nullError dvs))

-- Failure combinator
fail = Parser (dvs -> NoParse (nullError dvs))
fail msg = Parser (dvs -> NoParse (msgError (dvPos dvs) msg))


For me it looks like (errors handling aside) to parser combinators (such as this simplified version of Parsec):



bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ s -> concatMap ((a, s') -> parse (f a) s') $ parse p s


I'm quite confused because before that I thought that the big difference was that packrat was a parser generator with a memoization part.
Sadly it seems that this concept is not used in this implementation.



What is the big difference between parser combinators and packrat at implementation level?



PS: I also have had a look at Papillon but it seems to be very different from the implementation coming with the paper.










share|improve this question
















In order to have a better understanding of packrat I've tried to have a look at the provided implementation coming with the paper (I'm focusing on the bind):



instance Derivs d => Monad (Parser d) where 

-- Sequencing combinator
(Parser p1) >>= f = Parser parse

where parse dvs = first (p1 dvs)

first (Parsed val rem err) =
let Parser p2 = f val
in second err (p2 rem)
first (NoParse err) = NoParse err

second err1 (Parsed val rem err) =
Parsed val rem (joinErrors err1 err)
second err1 (NoParse err) =
NoParse (joinErrors err1 err)

-- Result-producing combinator
return x = Parser (dvs -> Parsed x dvs (nullError dvs))

-- Failure combinator
fail = Parser (dvs -> NoParse (nullError dvs))
fail msg = Parser (dvs -> NoParse (msgError (dvPos dvs) msg))


For me it looks like (errors handling aside) to parser combinators (such as this simplified version of Parsec):



bind :: Parser a -> (a -> Parser b) -> Parser b
bind p f = Parser $ s -> concatMap ((a, s') -> parse (f a) s') $ parse p s


I'm quite confused because before that I thought that the big difference was that packrat was a parser generator with a memoization part.
Sadly it seems that this concept is not used in this implementation.



What is the big difference between parser combinators and packrat at implementation level?



PS: I also have had a look at Papillon but it seems to be very different from the implementation coming with the paper.







parsing haskell parsec parser-combinators peg






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Nov 22 '18 at 6:44







GlinesMome

















asked Nov 21 '18 at 20:02









GlinesMomeGlinesMome

48411024




48411024













  • The implementation you linked to mixes tabs and spaces, and uses a different tab width than Stack Overflow does, so your copy/paste looks mis-indented. Obviously fixing this will not answer your question, but it will make it easier to read.

    – amalloy
    Nov 21 '18 at 23:09






  • 1





    "Parser generators vs. parser combinators vs. hand-written parsers" and "packrat vs. predictive parsing vs. recursive descent with backtracking vs. bottom-up algorithms" are two completely separate classifications. One's about the tools you use to write the parser, the other's about the algorithm the parser uses. Your linked implementation is a parser combinator implementing packrat parsing. That's not a contradiction.

    – sepp2k
    Nov 22 '18 at 2:06






  • 1





    @sepp2k you're right that his question should have said "what's the difference between the packrat library implementation and recursive descent or predictive parser combinator libraries?". However, the implementation he links to does not actually implement packrat parsing: you can use it, but you have to apply the packrat trick yourself in your own code. See my answer below.

    – Dominique Devriese
    Jan 25 at 13:59





















  • The implementation you linked to mixes tabs and spaces, and uses a different tab width than Stack Overflow does, so your copy/paste looks mis-indented. Obviously fixing this will not answer your question, but it will make it easier to read.

    – amalloy
    Nov 21 '18 at 23:09






  • 1





    "Parser generators vs. parser combinators vs. hand-written parsers" and "packrat vs. predictive parsing vs. recursive descent with backtracking vs. bottom-up algorithms" are two completely separate classifications. One's about the tools you use to write the parser, the other's about the algorithm the parser uses. Your linked implementation is a parser combinator implementing packrat parsing. That's not a contradiction.

    – sepp2k
    Nov 22 '18 at 2:06






  • 1





    @sepp2k you're right that his question should have said "what's the difference between the packrat library implementation and recursive descent or predictive parser combinator libraries?". However, the implementation he links to does not actually implement packrat parsing: you can use it, but you have to apply the packrat trick yourself in your own code. See my answer below.

    – Dominique Devriese
    Jan 25 at 13:59



















The implementation you linked to mixes tabs and spaces, and uses a different tab width than Stack Overflow does, so your copy/paste looks mis-indented. Obviously fixing this will not answer your question, but it will make it easier to read.

– amalloy
Nov 21 '18 at 23:09





The implementation you linked to mixes tabs and spaces, and uses a different tab width than Stack Overflow does, so your copy/paste looks mis-indented. Obviously fixing this will not answer your question, but it will make it easier to read.

– amalloy
Nov 21 '18 at 23:09




1




1





"Parser generators vs. parser combinators vs. hand-written parsers" and "packrat vs. predictive parsing vs. recursive descent with backtracking vs. bottom-up algorithms" are two completely separate classifications. One's about the tools you use to write the parser, the other's about the algorithm the parser uses. Your linked implementation is a parser combinator implementing packrat parsing. That's not a contradiction.

– sepp2k
Nov 22 '18 at 2:06





"Parser generators vs. parser combinators vs. hand-written parsers" and "packrat vs. predictive parsing vs. recursive descent with backtracking vs. bottom-up algorithms" are two completely separate classifications. One's about the tools you use to write the parser, the other's about the algorithm the parser uses. Your linked implementation is a parser combinator implementing packrat parsing. That's not a contradiction.

– sepp2k
Nov 22 '18 at 2:06




1




1





@sepp2k you're right that his question should have said "what's the difference between the packrat library implementation and recursive descent or predictive parser combinator libraries?". However, the implementation he links to does not actually implement packrat parsing: you can use it, but you have to apply the packrat trick yourself in your own code. See my answer below.

– Dominique Devriese
Jan 25 at 13:59







@sepp2k you're right that his question should have said "what's the difference between the packrat library implementation and recursive descent or predictive parser combinator libraries?". However, the implementation he links to does not actually implement packrat parsing: you can use it, but you have to apply the packrat trick yourself in your own code. See my answer below.

– Dominique Devriese
Jan 25 at 13:59














2 Answers
2






active

oldest

votes


















2














The point here is really that this Packrat parser combinator library is not a full implementation of the Packrat algorithm, but more like a set of definitions that can be reused between different packrat parsers.



The real trick of the packrat algorithm (namely the memoization of parse results) happens elsewhere.
Look at the following code (taken from Ford's thesis):



data Derivs = Derivs {
dvAdditive :: Result Int,
dvMultitive :: Result Int,
dvPrimary :: Result Int,
dvDecimal :: Result Int,
dvChar :: Result Char}


pExpression :: Derivs -> Result ArithDerivs Int
Parser pExpression = (do char ’(’
l <- Parser dvExpression
char ’+’
r <- Parser dvExpression
char ’)’
return (l + r))
</> (do Parser dvDecimal)


Here, it's important to notice that the recursive call of the expression parser to itself is broken (in a kind of open-recursion fashion) by simply projecting the appropriate component of the Derivs structure.



This recursive knot is then tied in the "recursive tie-up function" (again taken from Ford's thesis):



parse :: String -> Derivs
parse s = d where
d = Derivs add mult prim dec chr
add = pAdditive d
mult = pMultitive d
prim = pPrimary d
dec = pDecimal d
chr = case s of
(c:s’) -> Parsed c (parse s’)
-> NoParse


These snippets are really where the packrat trick happens.
It's important to understand that this trick cannot be implemented in a standard way in a traditional parser combinator library (at least in a pure programming language like Haskell), because it needs to know the recursive structure of the grammar.
There are experimental approaches to parser combinator libraries that use a particular representation of the recursive structure of the grammar, and there it is possible to provide a standard implementation of Packrat.
For example, my own grammar-combinators library (not maintained atm, sorry) offers an implementation of Packrat.






share|improve this answer


























  • Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

    – GlinesMome
    Nov 25 '18 at 21:53











  • Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

    – Dominique Devriese
    Nov 26 '18 at 22:20













  • Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

    – Dominique Devriese
    Nov 26 '18 at 22:24











  • Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

    – Dominique Devriese
    Nov 26 '18 at 22:25






  • 1





    Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

    – Dominique Devriese
    Nov 30 '18 at 8:12





















0














As stated elsewhere, packrat is not an alternative to combinators, but is an implementation option in those parsers. Pyparsing is a combinator that offers an optional packrat optimization by calling ParserElement.enablePackrat(). Its implementation is almost a drop-in replacement for pyparsing's _parse method (renamed to _parseNoCache), with a _parseCache method.



Pyparsing uses a fixed-length FIFO queue for its cache, since packrat cache entries get stale once the combinator fully matches the cached expression and moves on through the input stream. A custom cache size can be passed as an integer argument to enablePackrat(), or if None is passed, the cache is unbounded. A packrat cache with the default value of 128 was only 2% less efficient than an unbounded cache against the supplied Verilog parser, with significant savings in memory.






share|improve this answer

























    Your Answer






    StackExchange.ifUsing("editor", function () {
    StackExchange.using("externalEditor", function () {
    StackExchange.using("snippets", function () {
    StackExchange.snippets.init();
    });
    });
    }, "code-snippets");

    StackExchange.ready(function() {
    var channelOptions = {
    tags: "".split(" "),
    id: "1"
    };
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function() {
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled) {
    StackExchange.using("snippets", function() {
    createEditor();
    });
    }
    else {
    createEditor();
    }
    });

    function createEditor() {
    StackExchange.prepareEditor({
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader: {
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    },
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    });


    }
    });














    draft saved

    draft discarded


















    StackExchange.ready(
    function () {
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53419694%2fimplementation-differences-between-parser-combinators-and-packrat-algorithm%23new-answer', 'question_page');
    }
    );

    Post as a guest















    Required, but never shown

























    2 Answers
    2






    active

    oldest

    votes








    2 Answers
    2






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    2














    The point here is really that this Packrat parser combinator library is not a full implementation of the Packrat algorithm, but more like a set of definitions that can be reused between different packrat parsers.



    The real trick of the packrat algorithm (namely the memoization of parse results) happens elsewhere.
    Look at the following code (taken from Ford's thesis):



    data Derivs = Derivs {
    dvAdditive :: Result Int,
    dvMultitive :: Result Int,
    dvPrimary :: Result Int,
    dvDecimal :: Result Int,
    dvChar :: Result Char}


    pExpression :: Derivs -> Result ArithDerivs Int
    Parser pExpression = (do char ’(’
    l <- Parser dvExpression
    char ’+’
    r <- Parser dvExpression
    char ’)’
    return (l + r))
    </> (do Parser dvDecimal)


    Here, it's important to notice that the recursive call of the expression parser to itself is broken (in a kind of open-recursion fashion) by simply projecting the appropriate component of the Derivs structure.



    This recursive knot is then tied in the "recursive tie-up function" (again taken from Ford's thesis):



    parse :: String -> Derivs
    parse s = d where
    d = Derivs add mult prim dec chr
    add = pAdditive d
    mult = pMultitive d
    prim = pPrimary d
    dec = pDecimal d
    chr = case s of
    (c:s’) -> Parsed c (parse s’)
    -> NoParse


    These snippets are really where the packrat trick happens.
    It's important to understand that this trick cannot be implemented in a standard way in a traditional parser combinator library (at least in a pure programming language like Haskell), because it needs to know the recursive structure of the grammar.
    There are experimental approaches to parser combinator libraries that use a particular representation of the recursive structure of the grammar, and there it is possible to provide a standard implementation of Packrat.
    For example, my own grammar-combinators library (not maintained atm, sorry) offers an implementation of Packrat.






    share|improve this answer


























    • Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

      – GlinesMome
      Nov 25 '18 at 21:53











    • Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

      – Dominique Devriese
      Nov 26 '18 at 22:20













    • Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

      – Dominique Devriese
      Nov 26 '18 at 22:24











    • Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

      – Dominique Devriese
      Nov 26 '18 at 22:25






    • 1





      Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

      – Dominique Devriese
      Nov 30 '18 at 8:12


















    2














    The point here is really that this Packrat parser combinator library is not a full implementation of the Packrat algorithm, but more like a set of definitions that can be reused between different packrat parsers.



    The real trick of the packrat algorithm (namely the memoization of parse results) happens elsewhere.
    Look at the following code (taken from Ford's thesis):



    data Derivs = Derivs {
    dvAdditive :: Result Int,
    dvMultitive :: Result Int,
    dvPrimary :: Result Int,
    dvDecimal :: Result Int,
    dvChar :: Result Char}


    pExpression :: Derivs -> Result ArithDerivs Int
    Parser pExpression = (do char ’(’
    l <- Parser dvExpression
    char ’+’
    r <- Parser dvExpression
    char ’)’
    return (l + r))
    </> (do Parser dvDecimal)


    Here, it's important to notice that the recursive call of the expression parser to itself is broken (in a kind of open-recursion fashion) by simply projecting the appropriate component of the Derivs structure.



    This recursive knot is then tied in the "recursive tie-up function" (again taken from Ford's thesis):



    parse :: String -> Derivs
    parse s = d where
    d = Derivs add mult prim dec chr
    add = pAdditive d
    mult = pMultitive d
    prim = pPrimary d
    dec = pDecimal d
    chr = case s of
    (c:s’) -> Parsed c (parse s’)
    -> NoParse


    These snippets are really where the packrat trick happens.
    It's important to understand that this trick cannot be implemented in a standard way in a traditional parser combinator library (at least in a pure programming language like Haskell), because it needs to know the recursive structure of the grammar.
    There are experimental approaches to parser combinator libraries that use a particular representation of the recursive structure of the grammar, and there it is possible to provide a standard implementation of Packrat.
    For example, my own grammar-combinators library (not maintained atm, sorry) offers an implementation of Packrat.






    share|improve this answer


























    • Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

      – GlinesMome
      Nov 25 '18 at 21:53











    • Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

      – Dominique Devriese
      Nov 26 '18 at 22:20













    • Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

      – Dominique Devriese
      Nov 26 '18 at 22:24











    • Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

      – Dominique Devriese
      Nov 26 '18 at 22:25






    • 1





      Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

      – Dominique Devriese
      Nov 30 '18 at 8:12
















    2












    2








    2







    The point here is really that this Packrat parser combinator library is not a full implementation of the Packrat algorithm, but more like a set of definitions that can be reused between different packrat parsers.



    The real trick of the packrat algorithm (namely the memoization of parse results) happens elsewhere.
    Look at the following code (taken from Ford's thesis):



    data Derivs = Derivs {
    dvAdditive :: Result Int,
    dvMultitive :: Result Int,
    dvPrimary :: Result Int,
    dvDecimal :: Result Int,
    dvChar :: Result Char}


    pExpression :: Derivs -> Result ArithDerivs Int
    Parser pExpression = (do char ’(’
    l <- Parser dvExpression
    char ’+’
    r <- Parser dvExpression
    char ’)’
    return (l + r))
    </> (do Parser dvDecimal)


    Here, it's important to notice that the recursive call of the expression parser to itself is broken (in a kind of open-recursion fashion) by simply projecting the appropriate component of the Derivs structure.



    This recursive knot is then tied in the "recursive tie-up function" (again taken from Ford's thesis):



    parse :: String -> Derivs
    parse s = d where
    d = Derivs add mult prim dec chr
    add = pAdditive d
    mult = pMultitive d
    prim = pPrimary d
    dec = pDecimal d
    chr = case s of
    (c:s’) -> Parsed c (parse s’)
    -> NoParse


    These snippets are really where the packrat trick happens.
    It's important to understand that this trick cannot be implemented in a standard way in a traditional parser combinator library (at least in a pure programming language like Haskell), because it needs to know the recursive structure of the grammar.
    There are experimental approaches to parser combinator libraries that use a particular representation of the recursive structure of the grammar, and there it is possible to provide a standard implementation of Packrat.
    For example, my own grammar-combinators library (not maintained atm, sorry) offers an implementation of Packrat.






    share|improve this answer















    The point here is really that this Packrat parser combinator library is not a full implementation of the Packrat algorithm, but more like a set of definitions that can be reused between different packrat parsers.



    The real trick of the packrat algorithm (namely the memoization of parse results) happens elsewhere.
    Look at the following code (taken from Ford's thesis):



    data Derivs = Derivs {
    dvAdditive :: Result Int,
    dvMultitive :: Result Int,
    dvPrimary :: Result Int,
    dvDecimal :: Result Int,
    dvChar :: Result Char}


    pExpression :: Derivs -> Result ArithDerivs Int
    Parser pExpression = (do char ’(’
    l <- Parser dvExpression
    char ’+’
    r <- Parser dvExpression
    char ’)’
    return (l + r))
    </> (do Parser dvDecimal)


    Here, it's important to notice that the recursive call of the expression parser to itself is broken (in a kind of open-recursion fashion) by simply projecting the appropriate component of the Derivs structure.



    This recursive knot is then tied in the "recursive tie-up function" (again taken from Ford's thesis):



    parse :: String -> Derivs
    parse s = d where
    d = Derivs add mult prim dec chr
    add = pAdditive d
    mult = pMultitive d
    prim = pPrimary d
    dec = pDecimal d
    chr = case s of
    (c:s’) -> Parsed c (parse s’)
    -> NoParse


    These snippets are really where the packrat trick happens.
    It's important to understand that this trick cannot be implemented in a standard way in a traditional parser combinator library (at least in a pure programming language like Haskell), because it needs to know the recursive structure of the grammar.
    There are experimental approaches to parser combinator libraries that use a particular representation of the recursive structure of the grammar, and there it is possible to provide a standard implementation of Packrat.
    For example, my own grammar-combinators library (not maintained atm, sorry) offers an implementation of Packrat.







    share|improve this answer














    share|improve this answer



    share|improve this answer








    edited Jan 25 at 13:52

























    answered Nov 22 '18 at 7:22









    Dominique DevrieseDominique Devriese

    2,7201920




    2,7201920













    • Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

      – GlinesMome
      Nov 25 '18 at 21:53











    • Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

      – Dominique Devriese
      Nov 26 '18 at 22:20













    • Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

      – Dominique Devriese
      Nov 26 '18 at 22:24











    • Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

      – Dominique Devriese
      Nov 26 '18 at 22:25






    • 1





      Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

      – Dominique Devriese
      Nov 30 '18 at 8:12





















    • Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

      – GlinesMome
      Nov 25 '18 at 21:53











    • Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

      – Dominique Devriese
      Nov 26 '18 at 22:20













    • Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

      – Dominique Devriese
      Nov 26 '18 at 22:24











    • Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

      – Dominique Devriese
      Nov 26 '18 at 22:25






    • 1





      Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

      – Dominique Devriese
      Nov 30 '18 at 8:12



















    Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

    – GlinesMome
    Nov 25 '18 at 21:53





    Thanks for your answer, but I'm not sure to understand how this co-recursion make the memoization work. I have done some tests with fix (on a fibonacci sequence) for example and it have to impact on performances. If you can add some link/insights about it, thanks.

    – GlinesMome
    Nov 25 '18 at 21:53













    Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

    – Dominique Devriese
    Nov 26 '18 at 22:20







    Sorry, but it's not clear to me what you're asking. The memoization is the core trick of the packrat algorithm. Essentially, the only values of type Derivs that will ever be generated for a given input, is the initial result of parse "input", as well as the ones generated in the chr case of parse above. For those latter ones, only one value of type Derivs is generated for every character consumed. So there will be as many Derivs values as inputs in the string.

    – Dominique Devriese
    Nov 26 '18 at 22:20















    Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

    – Dominique Devriese
    Nov 26 '18 at 22:24





    Those Derivs values have one field for every non-terminal (plus the chr primitive), so in summary, there is one Result ArithDerivs x for every non-terminal (plus the chr primitive) at every input position. All other non-terminals only ever return one of these pre-existing Derivs values, because they just project a field of one of those pre-generated Derivs values.

    – Dominique Devriese
    Nov 26 '18 at 22:24













    Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

    – Dominique Devriese
    Nov 26 '18 at 22:25





    Any field of those Derivs values will only be evaluated once, by virtue of Haskell's lazyness, so this means you get the desired memoization.

    – Dominique Devriese
    Nov 26 '18 at 22:25




    1




    1





    Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

    – Dominique Devriese
    Nov 30 '18 at 8:12







    Evaluating pAdditive d will then first evaluate advMultitive d. It's important that it's advMultitive d that's being evaluated (a field of the record d), not pMultitive d (the actual parse of a multiplicative expression a this position).

    – Dominique Devriese
    Nov 30 '18 at 8:12















    0














    As stated elsewhere, packrat is not an alternative to combinators, but is an implementation option in those parsers. Pyparsing is a combinator that offers an optional packrat optimization by calling ParserElement.enablePackrat(). Its implementation is almost a drop-in replacement for pyparsing's _parse method (renamed to _parseNoCache), with a _parseCache method.



    Pyparsing uses a fixed-length FIFO queue for its cache, since packrat cache entries get stale once the combinator fully matches the cached expression and moves on through the input stream. A custom cache size can be passed as an integer argument to enablePackrat(), or if None is passed, the cache is unbounded. A packrat cache with the default value of 128 was only 2% less efficient than an unbounded cache against the supplied Verilog parser, with significant savings in memory.






    share|improve this answer






























      0














      As stated elsewhere, packrat is not an alternative to combinators, but is an implementation option in those parsers. Pyparsing is a combinator that offers an optional packrat optimization by calling ParserElement.enablePackrat(). Its implementation is almost a drop-in replacement for pyparsing's _parse method (renamed to _parseNoCache), with a _parseCache method.



      Pyparsing uses a fixed-length FIFO queue for its cache, since packrat cache entries get stale once the combinator fully matches the cached expression and moves on through the input stream. A custom cache size can be passed as an integer argument to enablePackrat(), or if None is passed, the cache is unbounded. A packrat cache with the default value of 128 was only 2% less efficient than an unbounded cache against the supplied Verilog parser, with significant savings in memory.






      share|improve this answer




























        0












        0








        0







        As stated elsewhere, packrat is not an alternative to combinators, but is an implementation option in those parsers. Pyparsing is a combinator that offers an optional packrat optimization by calling ParserElement.enablePackrat(). Its implementation is almost a drop-in replacement for pyparsing's _parse method (renamed to _parseNoCache), with a _parseCache method.



        Pyparsing uses a fixed-length FIFO queue for its cache, since packrat cache entries get stale once the combinator fully matches the cached expression and moves on through the input stream. A custom cache size can be passed as an integer argument to enablePackrat(), or if None is passed, the cache is unbounded. A packrat cache with the default value of 128 was only 2% less efficient than an unbounded cache against the supplied Verilog parser, with significant savings in memory.






        share|improve this answer















        As stated elsewhere, packrat is not an alternative to combinators, but is an implementation option in those parsers. Pyparsing is a combinator that offers an optional packrat optimization by calling ParserElement.enablePackrat(). Its implementation is almost a drop-in replacement for pyparsing's _parse method (renamed to _parseNoCache), with a _parseCache method.



        Pyparsing uses a fixed-length FIFO queue for its cache, since packrat cache entries get stale once the combinator fully matches the cached expression and moves on through the input stream. A custom cache size can be passed as an integer argument to enablePackrat(), or if None is passed, the cache is unbounded. A packrat cache with the default value of 128 was only 2% less efficient than an unbounded cache against the supplied Verilog parser, with significant savings in memory.







        share|improve this answer














        share|improve this answer



        share|improve this answer








        edited Dec 29 '18 at 17:07

























        answered Dec 29 '18 at 16:54









        PaulMcGPaulMcG

        46.7k969111




        46.7k969111






























            draft saved

            draft discarded




















































            Thanks for contributing an answer to Stack Overflow!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid



            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.


            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function () {
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f53419694%2fimplementation-differences-between-parser-combinators-and-packrat-algorithm%23new-answer', 'question_page');
            }
            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            How to change which sound is reproduced for terminal bell?

            Title Spacing in Bjornstrup Chapter, Removing Chapter Number From Contents

            Can I use Tabulator js library in my java Spring + Thymeleaf project?