Using preg_match with HTML comments





I want to extract, as a string, the HTML contained between these two comments:



<!--content-start-->
desired html
<!--content-end-->


So I use preg_match, right?



preg_match("/<!--content-start-->(.*)<!--content-end-->/i", $rss, $content);


But it won't work. Maybe there is a problem with the regex?



Thank you.










php preg-match






edited Nov 23 '18 at 4:47 by miken32










asked Nov 23 '18 at 3:58 by Cain Nuke













  • No, you don't use regular expressions to parse HTML. You use an HTML parser!

    – miken32
    Nov 23 '18 at 4:09











  • for example please?

    – Cain Nuke
    Nov 23 '18 at 4:11











  • @miken32 Although, arguably, they aren't "parsing HTML". They are simply extracting one block of text between two unique tokens (regardless of content type). Using an HTML parser in this particular example (a simple one-off pattern-matching exercise) is overkill IMO. (Only a small tweak to the OP's regex is required.)

    – MrWhite
    Dec 1 '18 at 1:28
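
The point above, that this is plain text extraction between two unique tokens, can also be illustrated without any regex. The following is only a minimal sketch using PHP's strpos() and substr(); the helper name between() and the sample string are hypothetical, not something from the question. Note that, unlike the /i pattern in the question, this comparison is case-sensitive.

```php
<?php
// Hypothetical helper (not from the original post): returns the text found
// between two literal markers, or null if either marker is missing.
function between($haystack, $start, $end)
{
    $from = strpos($haystack, $start);
    if ($from === false) {
        return null;
    }
    $from += strlen($start);              // skip past the opening marker
    $to = strpos($haystack, $end, $from); // search for the end after it
    if ($to === false) {
        return null;
    }
    return substr($haystack, $from, $to - $from);
}

// Hypothetical sample input standing in for the asker's $rss string.
$rss = "intro\n<!--content-start-->\n<p>desired html</p>\n<!--content-end-->\noutro";
$inner = between($rss, '<!--content-start-->', '<!--content-end-->');
$missing = between($rss, '<!--no-such-marker-->', '<!--content-end-->');
```

Here $inner holds the markup between the markers, including the surrounding newlines, so trim() may be wanted afterwards.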





















2 Answers
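The /s modifier will probably help: without it, the dot does not match newlines, so (.*) cannot span the line breaks between the two comments. Check the documentation: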




s (PCRE_DOTALL)



If this modifier is set, a dot metacharacter in the pattern matches all characters,
including newlines. Without it, newlines are excluded. This modifier is equivalent to
Perl's /s modifier. A negative class such as [^a] always matches a newline character,
independent of the setting of this modifier.
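
As a rough illustration of the modifier described above, here is a minimal sketch applied to the pattern from the question. The sample string is made up, and the lazy quantifier (.*?) is an extra assumption on my part, not part of the original pattern; it simply stops at the first end marker if the input happens to contain several.

```php
<?php
// Hypothetical sample input standing in for the asker's $rss string.
$rss = "intro\n<!--content-start-->\n<p>desired html</p>\n<!--content-end-->\noutro";

// Original pattern: "." stops at each newline, so nothing matches.
$withoutS = preg_match('/<!--content-start-->(.*)<!--content-end-->/i', $rss, $ignored);

// With /s the dot also matches newlines. The lazy ".*?" is an addition,
// not part of the original pattern; it stops at the first end marker.
$withS = preg_match('/<!--content-start-->(.*?)<!--content-end-->/is', $rss, $content);

// The captured group keeps the newlines around the markup, so trim it.
$extracted = $withS === 1 ? trim($content[1]) : null;

var_dump($withoutS);   // int(0)
var_dump($withS);      // int(1)
echo $extracted, "\n"; // <p>desired html</p>
```

preg_match() returns 1 on a match, 0 on no match and false on error, which is why the result is compared against 1 before $content[1] is read.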








answered Nov 23 '18 at 4:08 by drmad













  • Yes, this is all that's required in the OP's example. Parsing the HTML (as mentioned in the other answer) is overkill IMO - for what is really just a simple pattern-matching exercise.

    – MrWhite
    Dec 1 '18 at 1:19



















Something like this should work. The XPath query looks for a comment whose text is exactly "content-start" and returns the sibling nodes that follow it. We then loop through those nodes until we reach the closing "content-end" comment.



$html = <<< HTML
<!--content-start-->
<p>Here is my <i>desired html</i></p>
<!-- a comment -->
<div class="foo">Here is more</div>
<!--content-end-->
<p>Not returning this</p>
HTML;

$return = "";
$dom = new DOMDocument;
// The second argument (libxml options) requires PHP 5.4 or later.
$dom->loadHTML($html, LIBXML_HTML_NODEFDTD | LIBXML_HTML_NOIMPLIED);
$xpath = new DOMXPath($dom);
// Select every node that follows the opening marker comment.
$siblings = $xpath->query("//comment()[.='content-start']/following-sibling::node()");
foreach ($siblings as $node) {
    // Stop at the closing marker comment.
    if ($node instanceof DOMComment && $node->textContent === "content-end") {
        break;
    }
    $return .= $dom->saveHTML($node) . "\n";
}
echo $return;


Output:



<p>Here is my <i>desired html</i></p>
<!-- a comment -->
<div class="foo">Here is more</div>






edited Nov 23 '18 at 4:50

answered Nov 23 '18 at 4:37 by miken32













  • will this work if the html is from another website?

    – Cain Nuke
    Nov 23 '18 at 17:17











  • It's HTML, it doesn't matter where it's from.

    – miken32
    Nov 23 '18 at 17:18











  • great, I will try it. Thanks

    – Cain Nuke
    Nov 23 '18 at 17:20











  • Sorry, but I got these warnings: "DOMDocument::loadHTML() expects exactly 1 parameter, 2 given"; "Warning: DOMXPath::query() [domxpath.query]: Invalid or inclomplete context"; "Warning: Invalid argument supplied for foreach()"

    – Cain Nuke
    Nov 23 '18 at 17:30











  • Are you kidding me? PHP 5.3 has been EOL for 5 years now. You gotta upgrade!

    – miken32
    Nov 23 '18 at 17:31


















