RegEx to match string between two strings in Powershell
Here is my sample data:
Option failonnomatch on
Option batch on
Option confirm Off
open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"
get File*.txt localpathClientFile.txt
mv File*.txt /remote/archive/
close
exit
I would like to create a powershell script to extract pieces of information out of this text file.
List of items I need:
- Username
- Password
- Host
- Port
- ssh key
- File Name
- Local Path
- Remote Path
I'm hoping that if I learn how to do a couple of these, the method will be applicable to all items. I attempted to extract the ssh key with the following powershell/regex:
$doc -match '(?<=hostkey=")(.*)(?=")'
where $doc is the sample data, but it appears to be returning the whole line. Any help would be greatly appreciated. Thank you.
regex powershell
edited Nov 19 at 20:42
asked Nov 15 at 21:51
Michael S Palatsi
If they're all key/value like that, just use (?<=\bkey=")([^"]*)(?=")
Or, you could do a global match using (?<=\b\w+=")([^"]*)(?=")
– sln
Nov 15 at 21:56
Your command will only return $true/$false. To return a value you need to evaluate the $Matches collection. Also, to what file do you refer? Edit your question to contain some sample data.
– LotPings
Nov 15 at 22:12
what part of the last line is the "file" and what part is the "path"? the File*.txt looks like a file specification. the next part seems to be the full file name. i presume you want that broken into \\SERVER\Path\Client & File.txt but i'm unsure of that.
– Lee_Dailey
Nov 15 at 22:20
2 Answers
this uses named matches with flags set to singleline, multiline, case insensitive and then uses $Matches.MatchName to get the items into a custom object.
# fake reading in a text file as one string
# in real life, use Get-Content -Raw
$InStuff = @'
open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"
get File*.txt \\SERVER\Path\Client\File.txt
'@
# (?smi): . also matches newlines, ^/$ match at line boundaries, matching is case-insensitive
$Null = $InStuff -match '(?smi).+//(?<UserName>.+):(?<Password>.+)@(?<HostName>.+):(?<Port>.+) hostkey="(?<SshKey>.+)".+get .+ (?<FullFileName>\\\\.+)$'
[PSCustomObject]@{
    UserName = $Matches.UserName
    Password = $Matches.Password
    Port     = $Matches.Port
    SshKey   = $Matches.SshKey
    PathName = Split-Path -Path $Matches.FullFileName -Parent
    FileName = Split-Path -Path $Matches.FullFileName -Leaf
}
output ...
UserName : username
Password : password
Port     : 22
SshKey   : ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
PathName : \\SERVER\Path\Client
FileName : File.txt
It's an effective solution (+1), but if you provide a complete solution that is specific to the OP's exact scenario without addressing the misconceptions implied by the question (around how -match works), you'll make the OP very happy, but future readers with similar misconceptions - but different scenarios - won't necessarily benefit.
– mklement0
Nov 18 at 0:17
1
@mklement0 - i see what you mean ... i took the mention of "as one string" covered that idea. yours is far more detailed on the subject. i'll try to keep that in mind. [grin]
– Lee_Dailey
Nov 18 at 0:53
Hi Lee, I failed to mention I have additional lines preceding and following the given sample. How can I accommodate those lines? Thank you.
– Michael S Palatsi
Nov 19 at 18:40
@MichaelSPalatsi - you will need to add a complete text to your original post so that folks can have a realistic sample to code against. if the text is too long, post it to Pastebin or Gist.GitHub and add a link to it into your OP.
– Lee_Dailey
Nov 19 at 19:56
@Lee_Dailey That makes sense. Sorry, I'm new. :) I've updated the OP.
– Michael S Palatsi
Nov 19 at 20:43
If -match is returning a whole line, the implication is that the LHS of your -match operation is an array, which in turn suggests that you used Get-Content without -Raw, which yields the input as an array of lines, in which case -match acts as a filter.
Instead, read your file as a single, multi-line string with Get-Content -Raw; with a scalar LHS, -match then returns a [bool], and the results of the matching operation are reported in the automatic variable $Matches (a hashtable whose 0 entry contains the overall match, 1 what the 1st capture group matched, ...):
# Read file as a whole, into a single, multi-line string.
$doc = Get-Content -Raw file.txt
if ($doc -match '(?<=hostkey=")(.*)(?=")') {
    # Output what the 1st capture group captured
    $Matches[1]
}
With your sample input, the above yields:
ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
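To see the two behaviors of -match side by side, here is a minimal sketch. The two-line input is made up for illustration and only stands in for what Get-Content would return with and without -Raw:

```powershell
# Hypothetical two-line input (not the OP's file).
$lines = 'open sftp://host hostkey="abc"', 'close'   # array, like Get-Content without -Raw
$text  = $lines -join "`n"                           # one string, like Get-Content -Raw

# Array LHS: -match acts as a filter and returns the matching elements.
$filtered = $lines -match '(?<=hostkey=")[^"]*(?=")'
$filtered       # outputs the whole first line

# Scalar LHS: -match returns a [bool] and populates $Matches.
$found = $text -match '(?<=hostkey=")[^"]*(?=")'
$found          # True
$Matches[0]     # abc
```

With the array on the left, the "result" is the matching line itself, which is the symptom described in the question.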
You can then extend the approach to capture multiple tokens, in which case I suggest using named capture groups ((?<name>...)); the following example uses such named capture groups to extract several of the tokens of interest:
if ($doc -match '(?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+)') {
    # Output the named capture-group values.
    # Note that index notation (['username']) and property
    # notation (.username) can be used interchangeably.
    $Matches.username
    $Matches.password
    $Matches.host
}
With your sample input, the above yields:
username
password
host.name.net
You can extend the above to capture all tokens of interest.
Note that . by default doesn't match \n (newline) characters.
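If a value can span lines, one option is the inline s (Singleline) option, which makes . match newlines as well. The following is a small sketch with a made-up two-line value, not the OP's data:

```powershell
# Hypothetical value that spans two lines.
$text = "hostkey=`"line1`nline2`""

# Default: . stops at the newline, so the closing quote is never reached.
$text -match '(?<=hostkey=").*(?=")'    # False

# (?s) lets . match newlines too; the lazy .*? stops at the first closing quote.
if ($text -match '(?s)(?<=hostkey=")(.*?)(?=")') {
    $Matches[1]    # outputs both lines of the value
}
```

A negated character class such as [^"]* (as suggested in the comments on the question) also crosses line breaks, without needing the s option.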
Optional reading: Using the x (IgnorePatternWhitespace) option to make regexes more readable:
Extracting that many tokens can result in a complex regex that is hard to read, in which case the x (IgnorePatternWhitespace) regex option can help (as an inline option, (?x) at the start of the regex):
# Note: \n matches LF only, so this assumes LF-only line endings in $doc.
if ($doc -match '(?x)
    (?<=sftp://)(?<username>[^:]+)
    :(?<password>[^@]+)
    @(?<host>[^:]+)
    :(?<port>\d+)
    \s+hostkey="(?<sshkey>.+?)"
    \n+get\ File\*\.txt\ (?<localpath>.+)
    \nmv\ File\*\.txt\ (?<remotepath>.+)
  ') {
    # Output the named capture-group values.
    $Matches.GetEnumerator() | Where-Object Key -ne 0
}
Note how the whitespace used for making the regex more readable (spreading it across multiple lines) is ignored while matching, whereas whitespace to be matched in the input must be escaped (e.g., to match a single space, use "\ " (a backslash followed by a space) or [ ], or use \s to match any whitespace char.)
With your sample input, the above yields the following:
Name                           Value
----                           -----
host                           host.name.net
localpath                      localpathClientFile.txt
port                           22
sshkey                         ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
remotepath                     /remote/archive/
password                       password
username                       username
Note that the reason the capture groups are out of order is that $Matches is a hash table (of type [hashtable]), whose key enumeration order is an implementation artifact: no particular enumeration order is guaranteed.
However, random access to capture groups works just fine; e.g., $Matches.port will return 22.
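If a predictable order matters for output, one way is to copy the values into a [pscustomobject], whose properties keep the order in which they are written. This is only a sketch; the one-line input and the property names are assumptions for illustration:

```powershell
# Hypothetical one-line input (not the OP's full file).
$doc = 'open sftp://username:password@host.name.net:22'

if ($doc -match '(?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+):(?<port>\d+)') {
    # Random access by name works regardless of the hashtable's enumeration order.
    $result = [pscustomobject]@{
        Username = $Matches.username
        Password = $Matches.password
        Host     = $Matches.host
        Port     = [int]$Matches.port   # capture groups are strings; cast if a number is wanted
    }
    $result
}
```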
edited Nov 19 at 21:58
answered Nov 15 at 22:32
mklement0
I like this method as the regex seems to make a little more sense but I'm getting stuck when I go down to grab the file name. I think it's because I'm moving to a new line but I'm not sure how to include that in the regex. Thank you.
(?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+):(?<port>[^-]+) -hostkey="(?<sshkey>[^"]+)(?<=get )(?<filename>[^/])
– Michael S Palatsi
Nov 19 at 19:54
@MichaelSPalatsi: You need to match intervening whitespace as well (and, as stated, by default . doesn't match \n (newlines)). Please see my update for using the IgnorePatternWhitespace regex option to make complex expressions more manageable.
– mklement0
Nov 19 at 21:45
Awesome! That will certainly clean things up. I believe I have one last question. Say I have a group of files and I intend on using this regex against all of those files BUT in some files, one of my groupings is likely to not match anything. How can I handle that?
– Michael S Palatsi
Nov 20 at 4:24
Glad to hear it, @MichaelSPalatsi. As for your follow-up question: that's hard to answer in the abstract. I suggest you create a new question that focuses just on that problem with specific examples. Feel free to ping me here once you have done so, and I'm happy to take a look.
– mklement0
Nov 20 at 5:02
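Without specific examples the follow-up can't be answered definitively, but one possible reading is that a part such as hostkey is simply absent from some files. Under that assumption, a sketch is to wrap the part in an optional non-capturing group, (?:...)?, and then check whether the named group captured anything. The sample strings and variable names below are made up:

```powershell
# Two hypothetical inputs: one with a hostkey, one without.
$samples = @(
    'open sftp://user:pw@host.name.net:22 hostkey="ssh-rsa 1024 00:00"'
    'open sftp://user:pw@host.name.net:22'
)

# The hostkey part is wrapped in (?:...)? so the overall match succeeds either way.
$pattern = '(?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+):(?<port>\d+)(?:\s+hostkey="(?<sshkey>[^"]+)")?'

$results = foreach ($s in $samples) {
    if ($s -match $pattern) {
        # A group that captured nothing tests as false here.
        $key = if ($Matches['sshkey']) { $Matches['sshkey'] } else { $null }
        [pscustomobject]@{
            Host   = $Matches.host
            Port   = $Matches.port
            SshKey = $key
        }
    }
}
$results
```

Files where the required parts are missing still fail the overall match, so the if guard remains necessary.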
this uses named matches with the flags set to singleline, multiline, and case-insensitive, and then uses $Matches.MatchName to get the items into a custom object.
# fake reading in a text file as one string
# in real life, use Get-Content -Raw
$InStuff = @'
open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"
get File*.txt \\SERVER\Path\Client\File.txt
'@
# the named groups end up in the automatic $Matches variable; $Null discards the True/False result
$Null = $InStuff -match '(?smi).+//(?<UserName>.+):(?<Password>.+)@(?<HostName>.+):(?<Port>.+) hostkey="(?<SshKey>.+)".+get .+ (?<FullFileName>\\\\.+)$'
[PSCustomObject]@{
    UserName = $Matches.UserName
    Password = $Matches.Password
    Port = $Matches.Port
    SshKey = $Matches.SshKey
    PathName = Split-Path -Path $Matches.FullFileName -Parent
    FileName = Split-Path -Path $Matches.FullFileName -Leaf
}
output ...
UserName : username
Password : password
Port : 22
SshKey : ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
PathName : \\SERVER\Path\Client
FileName : File.txt
It's an effective solution (+1), but if you provide a complete solution that is specific to the OP's exact scenario without addressing the misconceptions implied by the question (around how -match works), you'll make the OP very happy, but future readers with similar misconceptions - but different scenarios - won't necessarily benefit.
– mklement0
Nov 18 at 0:17
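The behavior of -match that this comment alludes to can be sketched as follows: with an array on the left-hand side the operator filters and returns whole elements (here: whole lines), while with a single string it returns True/False and fills $Matches. The shortened sample text and the names $text, $lines, $filtered, $found and $key are assumptions made for this illustration.

```powershell
# Shortened, hypothetical sample text held in a single string.
$text = @'
Option batch on
open sftp://u:p@h:22 hostkey="ssh-rsa 1024 00:00"
close
'@

# Array LHS (what Get-Content returns without -Raw): -match acts as a FILTER
# and returns the matching elements, i.e. the whole line, not the captured part.
$lines    = $text -split '\r?\n'
$filtered = $lines -match '(?<=hostkey=")[^"]*(?=")'

# Scalar LHS: -match returns a Boolean and populates the automatic $Matches
# variable; index 0 holds the text matched by the whole pattern.
$found = $text -match '(?<=hostkey=")[^"]*(?=")'
$key   = if ($found) { $Matches[0] }

$filtered   # the complete "open ..." line
$key        # only the text between the double quotes
```

This is why the command in the question appeared to return the whole line: the input was presumably an array of lines rather than one string.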
@mklement0 - i see what you mean ... i took the mention of "as one string" covered that idea. yours is far more detailed on the subject. i'll try to keep that in mind. [grin]
– Lee_Dailey
Nov 18 at 0:53
Hi Lee, I failed to mention I have additional lines preceding and following the given sample. How can I accommodate those lines? Thank you.
– Michael S Palatsi
Nov 19 at 18:40
@MichaelSPalatsi - you will need to add a complete text to your original post so that folks can have a realistic sample to code against. if the text is too long, post it to Pastebin or Gist.GitHub and add a link to it into your OP.
– Lee_Dailey
Nov 19 at 19:56
@Lee_Dailey That makes sense. Sorry, I'm new. :) I've updated the OP.
– Michael S Palatsi
Nov 19 at 20:43
answered Nov 15 at 22:39
Lee_Dailey
what part of the last line is the "file" and what part is the "path"? the File*.txt looks like a file specification. the next part seems to be the full file name. i presume you want that broken into \\SERVER\Path\Client & File.txt but i'm unsure of that.
– Lee_Dailey
Nov 15 at 22:20