Formatting Regex to Match Table of Contents [duplicate]











This question already has an answer here:

  • Using ^ to match beginning of line in Python regex (3 answers)
  • My regex is matching too much. How do I make it stop? (5 answers)

I need to pull the Table of Contents from a README file in a GitHub repository. I used the 'requests' module in Python to fetch the README text, and now I'm trying to match the Table of Contents with a regular expression. Here's the code leading up to my question:



import requests
import os
import sys
import re

# Get readme page info via GitHub API.
rm_pg_url = 'https://api.github.com/repos/PillarOfSand/Projects/readme'
rm_pg = requests.get(rm_pg_url, timeout=10)
rm_pg_content = rm_pg.json()

# Isolate download page. Get actual text from readme file.
download_url = rm_pg_content['download_url']
real_rm = requests.get(download_url, timeout=10)
all_text = real_rm.text

toc_regex = re.compile(r'(?s)^## Table of Contents.*security\)$')
table_of_contents = toc_regex.search(all_text)


The last two lines are the part I'm asking about. The table_of_contents variable comes back as None, so the regular expression I'm using isn't matching anything. The text I'm searching can be found at the following URL:



ReadME Text



So, my actual question is, where am I going wrong? How does my regular expression need to be adjusted to match the entire table of contents?



Thanks.










marked as duplicate by Wiktor Stribiżew (regex badge), Nov 12 at 22:48


This question has been asked before and already has an answer. If those answers do not fully address your question, please ask a new question.















  • You probably want r'(?sm)^## Table of Contents.*?security\)$'
    – Wiktor Stribiżew
    Nov 12 at 22:46
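
The suggestion in the comment above changes two things: the m flag lets ^ and $ match at line boundaries instead of only at the start and end of the whole string, and the lazy .*? stops at the first line ending in "security)" rather than the last. A minimal sketch of the difference follows. The sample_readme string is a hypothetical stand-in for the real README (which is not reproduced here), and the closing parenthesis is assumed to be escaped as a literal character.

```python
import re

# Hypothetical stand-in for the README text; the real file is fetched over the network.
sample_readme = (
    "# Projects\n"
    "\n"
    "## Table of Contents\n"
    "- [Networking](#networking)\n"
    "- [Security](#security)\n"
    "\n"
    "## Networking\n"
    "See also [Security](#security)\n"
)

# Pattern from the question: without the m flag, ^ only matches at the very
# start of the string, so a heading on the third line can never match.
original = re.compile(r'(?s)^## Table of Contents.*security\)$')

# Adding only the m flag finds the heading, but the greedy .* runs on to the
# LAST line that ends in "security)".
greedy = re.compile(r'(?sm)^## Table of Contents.*security\)$')

# Pattern from the comment: m flag plus lazy .*? stops at the FIRST such line.
suggested = re.compile(r'(?sm)^## Table of Contents.*?security\)$')

original_match = original.search(sample_readme)
greedy_match = greedy.search(sample_readme)
suggested_match = suggested.search(sample_readme)

print(original_match)
print(greedy_match.group(0))
print(suggested_match.group(0))
```

On this sample, the original pattern finds nothing, the greedy variant overshoots into the Networking section, and the suggested pattern returns only the heading and its list. Whether "security)" is the right end marker depends on the actual README contents.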















regex python-3.x






asked Nov 12 at 22:44 by Sean
