find newline with words starting with underscore with specific pattern
I need to extract the following from C code using a regular expression in Python, but somehow I could not get the pattern right.
if(condition)
/*~T*/
{
/*~T*/
_getmethis = FALSE;
/*~T*/
}
..........
/*~T*/
_findmethis = FALSE;
......
/*~T*/
_findthat = True;
I need to find all variables that follow /*~T*/ and start with an underscore, and write them to a new file. My code does not find them: I tried several regex patterns, and the output file is always empty.
import re
fh = open('filename.c', "r")
output = open("output.txt", "w")
pattern = re.compile(r'(/\*~T\*/)(\s*?\n\s*)(_[aA-zZ]*)')
for line in fh:
    for m in re.finditer(pattern, line):
        output.write(m.group(3))
        output.write("\n")
output.close()
regex python-3.x
asked Nov 21 '18 at 15:44, edited Nov 21 '18 at 15:57
fastlearner
[aA-zZ] does not only match letters; it also matches [, \, ], ^, _ and `. You must have meant [a-zA-Z]. All you need to do is remove for line in fh: and use re.finditer(pattern, fh.read())
– Wiktor Stribiżew
Nov 21 '18 at 16:55
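The character-class point in the comment above can be checked directly. This is a minimal sketch, not part of the original thread; the helper name non_letters_matched is made up for illustration. It lists every ASCII character that a class accepts even though it is not a letter:

```python
import re

def non_letters_matched(char_class):
    """Return the ASCII characters matched by char_class that are not letters."""
    return [ch for ch in map(chr, range(128))
            if re.fullmatch(char_class, ch) and not ch.isalpha()]

# In [aA-zZ] the range A-z spans code points 65..122, which includes
# the six punctuation characters that sit between 'Z' and 'a'.
broken = non_letters_matched(r'[aA-zZ]')
# Two separate ranges cover the letters only.
fixed = non_letters_matched(r'[a-zA-Z]')

print(broken)
print(fixed)
```

Note that the broken class also accepts the underscore itself, so it can silently swallow names such as _find_that in one piece, while [a-zA-Z] stops at the second underscore.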
3 Answers
You need to read the whole file at once with fh.read(), and you should amend the pattern so that it only matches letters, since [aA-zZ] matches more than just letters.
The pattern I suggest is
(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)
See the regex demo. Note that I deliberately excluded \n from the first whitespace class ([^\S\n]* instead of \s*) to make matching more efficient.
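To make the [^\S\n] idiom concrete, here is a small sketch. It is an illustration only: the sample string is a hypothetical stand-in modelled on the snippet in the question, and the variable names are not from the thread. The class reads as "not a non-whitespace character and not a newline", i.e. horizontal whitespace only:

```python
import re

horizontal_ws = re.compile(r'[^\S\n]')   # any whitespace except a newline

space_ok = bool(horizontal_ws.fullmatch(' '))
tab_ok = bool(horizontal_ws.fullmatch('\t'))
newline_ok = bool(horizontal_ws.fullmatch('\n'))
letter_ok = bool(horizontal_ws.fullmatch('x'))

# Hypothetical sample: trailing spaces after the first marker, a marker
# followed by "}" (no variable), and a marker followed by a variable.
sample = "/*~T*/  \n    _getmethis = FALSE;\n/*~T*/\n}\n/*~T*/\n_findthat = True;\n"
pattern = re.compile(r'(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)')
names = [m.group(3) for m in pattern.finditer(sample)]
print(names)
```

Because the first class cannot consume the newline, the engine has only one place to match \n, which avoids the backtracking that \s*?\n\s* can trigger.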
When reading files, it is more convenient to use with so that you do not have to call .close():
import re
pattern = re.compile(r'(/\*~T\*/)([^\S\n]*\n\s*)(_[a-zA-Z]*)')
with open('filename.c', "r") as fh:
    contents = fh.read()
with open("output.txt", "w") as output:
    output.write("\n".join([x.group(3) for x in pattern.finditer(contents)]))
answered Nov 21 '18 at 18:25
Wiktor Stribiżew
The reason you do not find anything is that your pattern spans multiple lines, but you are only looking at your file one line at a time.
Consider using this:
t = """
if(condition)
/*~-*/
{
    /*~T*/
    _getmethis = FALSE;
/*~-*/
}
..........
/*~T*/
    _findmethis = FALSE;
/*~T*/
    do_not_findme_this = FALSE;
"""
import re
pattern = re.compile(r'/\*~T\*/.*?\n\s+(_[a-zA-Z]*)', re.MULTILINE|re.DOTALL)
for m in re.finditer(pattern, t): # use the whole file here - not line-wise
    print(m.group(1))
The pattern uses two flags: re.DOTALL makes the dot . also match newlines (by default it does not), and re.MULTILINE makes ^ and $ match at every line boundary (this pattern uses neither anchor, so that flag is optional here). The non-greedy .*? keeps the gap between /*~T*/ and the following group as small as possible.
Printout:
_getmethis
_findmethis
Docs:
- re.MULTILINE
- re.DOTALL
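The line-by-line problem named at the top of this answer can be reproduced without touching the disk. This is a sketch under assumptions: the source string is a hypothetical stand-in for the C file, and io.StringIO is used only to imitate iterating over an open file:

```python
import io
import re

# Hypothetical stand-in for the C file, kept in memory so the sketch runs anywhere.
source = "/*~T*/\n    _getmethis = FALSE;\n/*~T*/\n    _findmethis = FALSE;\n"
pattern = re.compile(r'/\*~T\*/.*?\n\s+(_[a-zA-Z]*)', re.DOTALL)

# Line-wise: every string handed to the regex ends at its first newline,
# so the marker and the variable never appear in the same string.
per_line = [m.group(1)
            for line in io.StringIO(source)
            for m in pattern.finditer(line)]

# Whole text: a single match is free to span the line break.
whole = [m.group(1) for m in pattern.finditer(source)]

print(per_line)
print(whole)
```

The first list comes back empty, which matches the empty output file described in the question; the second contains both variable names.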
answered Nov 21 '18 at 15:56, edited Nov 21 '18 at 18:03
Patrick Artner
I am so silly that I always check the regex but not the Python. I will try this.
– fastlearner
Nov 21 '18 at 16:00
But this also finds words where the underscore is in the middle of a variable name.
– fastlearner
Nov 21 '18 at 17:46
@fastlearner Then adjust the pattern? So the (_[aA-zZ]*) is only allowed after a newline and spaces? See edit ... If you want to play with regex, use regex101.com and put it in Python mode: copy your text and pattern into it and modify the pattern until it fits. Your example text did not contain any pattern "to be excluded" ...
– Patrick Artner
Nov 21 '18 at 18:05
This is my final version, where I also try to avoid duplicates:
import re
fh = open('filename.c', "r")
filecontent = fh.read()
fh.close()
output = open("output.txt", "w")
createlist = []  # names already written, used to skip duplicates
pattern = re.compile(r"(/\*~T\*/)(\s*?\n\s*)(_[a-zA-Z]*)")
for m in re.finditer(pattern, filecontent):
    if m.group(3) not in createlist:
        createlist.append(m.group(3))
        output.write(m.group(3))
        output.write('\n')
output.close()
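As a possible alternative to the membership test on a list, the duplicates can be dropped with dict.fromkeys, which keeps the first occurrence of each key in insertion order. This is only a sketch: the in-memory filecontent is a hypothetical stand-in for the C file, and the pattern is a shortened variant with a single capture group:

```python
import re

# Hypothetical in-memory input; a real run would read the C file instead.
filecontent = "/*~T*/\n_a = 1;\n/*~T*/\n_b = 2;\n/*~T*/\n_a = 3;\n"
pattern = re.compile(r'/\*~T\*/[^\S\n]*\n\s*(_[a-zA-Z]*)')

# dict keys are unique and keep insertion order (guaranteed since Python 3.7),
# so this removes repeats while preserving the order of first appearance.
unique_names = list(dict.fromkeys(m.group(1) for m in pattern.finditer(filecontent)))

print("\n".join(unique_names))
```

For large files this avoids the linear scan that "not in createlist" performs for every match.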
answered Nov 21 '18 at 20:33
fastlearner