How to properly read an HTTP Post message segmented into two TCP segments?












When I execute the following Python code on a pcap file:



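    # Runs inside a loop over the packets of the capture; tcp is the decoded TCP segment.
    if tcp.dport == 80:
        try:
            http = dpkt.http.Request(tcp.data)
        except (dpkt.dpkt.NeedData, dpkt.dpkt.UnpackError):
            # The payload is not a complete HTTP request, so skip this packet.
            continue
        if http.method == 'POST':
            print('POST Message')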


Packets such as the following ones create a problem:
[image]



These two packets carry a single HTTP POST message that was split into two TCP segments, each sent in a different packet. Because the first segment is TCP only and the second one is recognised as HTTP, it seems that dpkt.http.Request fails when it tries to read the first segment as HTTP.



So far, no problem: it is fine for that to fail, since the first segment is not a full HTTP message. The issue is that the second segment does not seem to be read at all ("POST Message" is never printed), as if it did not exist. The only explanation I can think of is that dpkt reads both segments at once because it recognises them as segments of the same message.



If that assumption holds, the issue is that, although both TCP segments are read at once, the resulting tcp.data is not recognised as an HTTP packet; it is still treated as TCP only, because the first segment of the message is a TCP-only packet.



So what should I do to read the HTTP header and data from such a pcap file?










      python tcp dpkt






      asked Nov 16 '18 at 16:33









      mksoi

          1 Answer
dpkt only works at the packet level. dpkt.http.Request expects the full HTTP request as input, not only the part contained in the current packet. This means you have to collect the input from all packets belonging to the connection, i.e. reassemble the TCP data stream.
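As a rough illustration of collecting the input per connection, here is a minimal sketch in plain Python. The names `complete_request` and `RequestCollector` are hypothetical helpers, not part of dpkt. The sketch assumes the payloads are fed in order, without gaps or duplicates, and that the body length is given by a Content-Length header (no chunked encoding).

```python
from collections import defaultdict

HEADER_END = b"\r\n\r\n"


def complete_request(buf):
    """Return the first complete HTTP request in buf, or None if more data is needed.

    Assumption: the body length is given by Content-Length (no chunked encoding).
    """
    head, sep, rest = buf.partition(HEADER_END)
    if not sep:
        return None  # header block not finished yet
    length = 0
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if len(rest) < length:
        return None  # body still incomplete
    return head + sep + rest[:length]


class RequestCollector:
    """Buffers TCP payloads per connection until a full request is available."""

    def __init__(self):
        self._buffers = defaultdict(bytes)

    def feed(self, conn, payload):
        """conn is any hashable connection key, e.g. (src, sport, dst, dport)."""
        self._buffers[conn] += payload
        request = complete_request(self._buffers[conn])
        if request is not None:
            # Keep any bytes that follow; they belong to the next request.
            self._buffers[conn] = self._buffers[conn][len(request):]
        return request
```

In the packet loop, one would call `feed()` with each `tcp.data` and hand the result to `dpkt.http.Request` only once it is not `None`.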



Reassembling is not simply concatenating packets. It also means making sure that there are no lost packets and no duplicates, and that the packets get reassembled in the proper order, which might not be the order on the wire. Essentially you need to do everything the OS kernel would do before putting the extracted payload into a socket buffer.
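The ordering, duplicate and gap handling described above can be sketched as follows. This is a simplified illustration, not what a kernel actually does: `reassemble` is a hypothetical helper, it works on one direction of one connection, and it assumes that sequence numbers do not wrap around within the capture.

```python
def reassemble(segments, isn):
    """Rebuild the byte stream of one direction of a TCP connection.

    segments: iterable of (seq, payload) pairs in capture order.
    isn: sequence number of the first payload byte (SYN sequence number + 1).
    Returns (data, complete); complete is False if a gap stopped the reassembly.
    Assumption: sequence numbers do not wrap around 2**32 within the capture.
    """
    data = bytearray()
    expected = isn
    for seq, payload in sorted(segments, key=lambda s: s[0]):
        if not payload:
            continue  # pure ACKs carry no data
        end = seq + len(payload)
        if end <= expected:
            continue  # retransmission of bytes we already have
        if seq > expected:
            return bytes(data), False  # missing segment: stop at the gap
        # Drop the already-seen prefix of a partially overlapping segment.
        data += payload[expected - seq:]
        expected = end
    return bytes(data), True
```

With dpkt, the pairs would come from `tcp.seq` and `tcp.data` of each packet belonging to the same connection and direction.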



For an example of how parts of this can be done, see Follow HTTP Stream (with decompression). Note that the example there blindly assumes that the packets are already in order, complete and without duplicates, an assumption which is not guaranteed in real life.






          • Thank you. I will try your way and get back to you.
            – mksoi
            Nov 18 '18 at 20:54










          • It works with some modifications. Thank you.
            – mksoi
            Nov 20 '18 at 22:33











          answered Nov 16 '18 at 20:01









          Steffen Ullrich
