RegEx to match string between two strings in Powershell












Here is my sample data:




Option failonnomatch on

Option batch on

Option confirm Off

open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"



get File*.txt localpathClientFile.txt

mv File*.txt /remote/archive/



close

exit




I would like to create a powershell script to extract pieces of information out of this text file.



List of items I need:




  • Username

  • Password

  • Host

  • Port

  • ssh key

  • File Name

  • Local Path

  • Remote Path


I'm hoping that if I learn how to do a couple of these, the method will be applicable to all items. I attempted to extract the ssh key with the following powershell/regex:



$doc -match '(?<=hostkey=")(.*)(?=")' 


$doc being the sample data



but it appears to be returning the whole line. Any help would be greatly appreciated. Thank you.







regex powershell














edited Nov 19 at 20:42

























asked Nov 15 at 21:51









Michael S Palatsi

133












  • If they're all key/value like that, just use (?<=\bkey=")([^"]*)(?=") Or, you could do a global match using (?<=\b\w+=")([^"]*)(?=")
    – sln
    Nov 15 at 21:56








  • Your command will only return $true/$false. To return a value you need to evaluate the $Matches collection. Also, to what file do you refer? Edit your question to contain some sample data.
    – LotPings
    Nov 15 at 22:12










  • what part of the last line is the "file" and what part is the "path"? the File*.txt looks like a file specification. the next part seems to be the full file name. i presume you want that broken into \\SERVER\Path\Client & File.txt but i'm unsure of that.
    – Lee_Dailey
    Nov 15 at 22:20
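
The "global match" idea from the first comment can be sketched as follows. This is only an illustration: -match stops at the first match, so the sketch uses the .NET [regex]::Matches() method instead, and the timeout="30" pair in the sample line is made up to have a second key/value to find.

```powershell
# Hypothetical one-line sample; timeout="30" is invented for illustration.
$line = 'open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00" timeout="30"'

# [regex]::Matches() returns every match, not just the first one.
$pairs = [regex]::Matches($line, '\b(?<key>\w+)="(?<value>[^"]*)"') |
    ForEach-Object {
        [pscustomobject]@{
            Key   = $_.Groups['key'].Value
            Value = $_.Groups['value'].Value
        }
    }

$pairs
```

Each output object carries one key and its quoted value, so the result can be filtered with Where-Object like any other collection.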


























2 Answers
































this uses named matches with flags set to singleline, multiline, case insensitive and then uses $Matches.MatchName to get the items into a custom object.



# fake reading in a text file as one string
# in real life, use Get-Content -Raw
$InStuff = @'
open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"

get File*.txt \\SERVER\Path\Client\File.txt
'@

$Null = $InStuff -match '(?smi).+//(?<UserName>.+):(?<Password>.+)@(?<HostName>.+):(?<Port>.+) hostkey="(?<SshKey>.+)".+get .+ (?<FullFileName>\\\\.+)$'

[PSCustomObject]@{
    UserName = $Matches.UserName
    Password = $Matches.Password
    Port = $Matches.Port
    SshKey = $Matches.SshKey
    PathName = Split-Path -Path $Matches.FullFileName -Parent
    FileName = Split-Path -Path $Matches.FullFileName -Leaf
    }


output ...



UserName : username
Password : password
Port     : 22
SshKey   : ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
PathName : \\SERVER\Path\Client
FileName : File.txt
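
To see what the individual flags contribute, here is a minimal sketch on a made-up two-line string (the string and variable names are assumptions, not part of the sample data). Note that -match is already case-insensitive, so the i flag mainly documents intent; -cmatch is the case-sensitive variant.

```powershell
# Hypothetical two-line input.
$text = "open sftp://host`nGET File.txt"

# -match ignores case by default; -cmatch does not.
$caseInsensitive = $text -match 'get file'     # True
$caseSensitive   = $text -cmatch 'get file'    # False

# Without (?m), ^ anchors only at the very start of the input.
$noMultiline = $text -match '^GET'             # False
# With (?m), ^ also matches right after a newline.
$multiline   = $text -match '(?m)^GET'         # True
```

The s flag is the one that lets . cross line breaks; combined with greedy .+ it can swallow more than intended, which is why narrower classes such as [^:]+ are often easier to reason about.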


























  • It's an effective solution (+1), but if you provide a complete solution that is specific to the OP's exact scenario without addressing the misconceptions implied by the question (around how -match works), you'll make the OP very happy, but future readers with similar misconceptions - but different scenarios - won't necessarily benefit.
    – mklement0
    Nov 18 at 0:17






  • @mklement0 - i see what you mean ... i took the mention of "as one string" covered that idea. yours is far more detailed on the subject. i'll try to keep that in mind. [grin]
    – Lee_Dailey
    Nov 18 at 0:53










  • Hi Lee, I failed to mention I have additional lines preceding and following the given sample. How can I accommodate those lines? Thank you.
    – Michael S Palatsi
    Nov 19 at 18:40












  • @MichaelSPalatsi - you will need to add a complete text to your original post so that folks can have a realistic sample to code against. if the text is too long, post it to Pastebin or Gist.GitHub and add a link to it into your OP.
    – Lee_Dailey
    Nov 19 at 19:56










  • @Lee_Dailey That makes sense. Sorry, I'm new. :) I've updated the OP.
    – Michael S Palatsi
    Nov 19 at 20:43











If -match is returning a whole line, the implication is that the LHS of your -match operation is an array, which in turn suggests that you used Get-Content without -Raw, which yields the input as an array of lines, in which case -match acts as a filter.



Instead, read your file as a single, multi-line string with Get-Content -Raw; with a scalar LHS, -match then returns a [bool], and the results of the matching operation are reported in the automatic variable $Matches (a hashtable whose 0 entry contains the overall match, 1 what the 1st capture group matched, ...):



# Read file as a whole, into a single, multi-line string.
$doc = Get-Content -Raw file.txt

if ($doc -match '(?<=hostkey=")(.*)(?=")') {
    # Output what the 1st capture group captured
    $Matches[1]
}


With your sample input, the above yields
ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
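
The difference between an array and a scalar LHS can be reproduced without a file. The following is a minimal sketch; the two input lines are shortened stand-ins for the sample data, and the variable names are made up.

```powershell
# Hypothetical stand-in for the file content, as an array of lines.
$lines = 'Option batch on',
         'open sftp://u:p@h:22 hostkey="ssh-rsa 1024 00:00"'
# The same content as one multi-line string (what Get-Content -Raw yields).
$text  = $lines -join "`n"

# Array LHS: -match acts as a filter and returns the matching lines.
$filtered = $lines -match '(?<=hostkey=")(.*)(?=")'

# Scalar LHS: -match returns a Boolean and populates $Matches.
$found = $text -match '(?<=hostkey=")(.*)(?=")'
$key   = $Matches[1]

$filtered   # the whole "open ..." line
$found      # True
$key        # ssh-rsa 1024 00:00
```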





You can then extend the approach to capture multiple tokens, in which case I suggest using named capture groups ((?<name>...)); the following example uses such named capture groups to extract several of the tokens of interest:



if ($doc -match '(?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+)'){
    # Output the named capture-group values.
    # Note that index notation (['username']) and property
    # notation (.username) can be used interchangeably.
    $Matches.username
    $Matches.password
    $Matches.host
}


With your sample input, the above yields:



username
password
host.name.net


You can extend the above to capture all tokens of interest.

Note that . by default doesn't match \n (newline) characters.
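
A small sketch of that newline behavior, assuming a made-up value that spans two lines: the inline option (?s) makes . match newlines too, and a negated character class such as [^"]+ sidesteps the issue altogether.

```powershell
# Hypothetical quoted value that contains a line break.
$text = "hostkey=`"ssh-rsa 1024`n00:00`""

# Default: . stops at the newline, so the closing quote is never reached.
$default = $text -match 'hostkey="(?<key>.+?)"'

# [^"]+ matches any character except a double quote, newlines included.
$negated = $text -match 'hostkey="(?<key>[^"]+)"'

# (?s) lets . match newlines as well.
$singleLine = $text -match '(?s)hostkey="(?<key>.+?)"'
$key = $Matches.key

$default      # False
$negated      # True
$singleLine   # True
```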





Optional reading: Using the x (IgnorePatternWhitespace) option to make regexes more readable:



Extracting that many tokens can result in a complex regex that is hard to read, in which case the x (IgnorePatternWhitespace) regex option can help (as an inline option, (?x) at the start of the regex):



# Note: \n+ and \n assume LF line endings; for CRLF files, \r?\n is the more tolerant form.
if ($doc -match '(?x)
    (?<=sftp://)(?<username>[^:]+)
    :(?<password>[^@]+)
    @(?<host>[^:]+)
    :(?<port>\d+)
    \s+hostkey="(?<sshkey>.+?)"
    \n+get\ File\*\.txt\ (?<localpath>.+)
    \nmv\ File\*\.txt\ (?<remotepath>.+)
  '){
    # Output the named capture-group values.
    $Matches.GetEnumerator() | ? Key -ne 0
}


Note how the whitespace used for making the regex more readable (spreading it across multiple lines) is ignored while matching, whereas whitespace to be matched in the input must be escaped (e.g., to match a single space, use a backslash-escaped space (\ ) or [ ]; use \s to match any whitespace char.)



With your sample input, the above yields the following:



Name                           Value
----                           -----
host                           host.name.net
localpath                      localpathClientFile.txt
port                           22
sshkey                         ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
remotepath                     /remote/archive/
password                       password
username                       username


Note that the reason the capture groups are out of order is that $Matches is a hash table (of type [hashtable]), whose key enumeration order is an implementation artifact: no particular enumeration order is guaranteed.



However, random access to capture groups works just fine; e.g., $Matches.port will return 22.
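If a predictable output order matters, one option is to copy the values out of $Matches into a custom object, since casting a hashtable literal to [pscustomobject] keeps the properties in the order in which they are written. The following is only a sketch: the $sample string stands in for your data, and the property names are chosen for illustration.

```powershell
# Hypothetical one-line input standing in for the real file content.
$sample = 'open sftp://username:password@host.name.net:22 hostkey="..."'

if ($sample -match '(?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+):(?<port>\d+)') {
  # The properties appear in the order listed here,
  # independent of the enumeration order of $Matches.
  $result = [pscustomobject] @{
    Username = $Matches.username
    Password = $Matches.password
    Host     = $Matches.host
    Port     = $Matches.port
  }
  $result
}
```

Note that the captured values are strings; cast explicitly (e.g., [int] $Matches.port) if you need a number.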















edited Nov 19 at 21:58

























answered Nov 15 at 22:32









mklement0













  • I like this method as the regex seems to make a little more sense but I'm getting stuck when I go down to grab the file name. I think it's because I'm moving to a new line but I'm not sure how to include that in the regex. Thank you. (?<=sftp://)(?<username>[^:]+):(?<password>[^@]+)@(?<host>[^:]+):(?<port>[^-]+) -hostkey="(?<sshkey>[^"]+)(?<=get )(?<filename>[^/])
    – Michael S Palatsi
    Nov 19 at 19:54










  • @MichaelSPalatsi: You need to match intervening whitespace as well (and, as stated, by default . doesn't match \n (newlines)). Please see my update for using the IgnorePatternWhitespace regex option to make complex expressions more manageable.
    – mklement0
    Nov 19 at 21:45










  • Awesome! That will certainly clean things up. I believe I have one last question. Say I have a group of files and I intend on using this regex against all of those files BUT in some files, one of my groupings is likely to not match anything. How can I handle that?
    – Michael S Palatsi
    Nov 20 at 4:24










  • Glad to hear it, @MichaelSPalatsi. As for your follow-up question: that's hard to answer in the abstract. I suggest you create a new question that focuses just on that problem with specific examples. Feel free to ping me here once you have done so, and I'm happy to take a look.
    – mklement0
    Nov 20 at 5:02



































1














this uses named matches with flags set to singleline, multiline, case insensitive and then uses $Matches.MatchName to get the items into a custom object.



# fake reading in a text file as one string
# in real life, use Get-Content -Raw
$InStuff = @'
open sftp://username:password@host.name.net:22 hostkey="ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00"

get File*.txt \\SERVER\Path\Client\File.txt
'@

$Null = $InStuff -match '(?smi).+//(?<UserName>.+):(?<Password>.+)@(?<HostName>.+):(?<Port>.+) hostkey="(?<SshKey>.+)".+get .+ (?<FullFileName>\\\\.+)$'

[PSCustomObject]@{
    UserName = $Matches.UserName
    Password = $Matches.Password
    Port = $Matches.Port
    SshKey = $Matches.SshKey
    PathName = Split-Path -Path $Matches.FullFileName -Parent
    FileName = Split-Path -Path $Matches.FullFileName -Leaf
    }


output ...



UserName : username
Password : password
Port     : 22
SshKey   : ssh-rsa 1024 00:00:00:00:00:00:00:00:00:00:00:00:00:00:00:00
PathName : \\SERVER\Path\Client
FileName : File.txt
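The comment in the code above points to Get-Content -Raw for real files. As a rough sketch of why that switch matters, the following writes a small made-up script to a temporary file and reads it back both ways; the file content, the simplified regex, and all variable names are assumptions for the demo only.

```powershell
# Hypothetical sample file written to a temp path for the demo.
$path = [System.IO.Path]::GetTempFileName()
Set-Content -LiteralPath $path -Value 'open sftp://username:password@host.name.net:22', '', 'get File*.txt /tmp/File.txt'

# Without -Raw, Get-Content returns an array of lines, and -match
# acts as a filter that returns the matching lines instead of a Boolean.
$lines = Get-Content -LiteralPath $path
$filtered = $lines -match '^get '

# With -Raw, the file is one multi-line string, so -match returns
# a Boolean and fills $Matches.
$raw = Get-Content -LiteralPath $path -Raw
$found = $raw -match '(?s)//(?<UserName>[^:]+):.+get \S+ (?<FullFileName>\S+)'
$user = $Matches.UserName
$file = $Matches.FullFileName

Remove-Item -LiteralPath $path
```

A regex that spans several lines, like the one in this answer, therefore needs the -Raw form.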


























  • It's an effective solution (+1), but if you provide a complete solution that is specific to the OP's exact scenario without addressing the misconceptions implied by the question (around how -match works), you'll make the OP very happy, but future readers with similar misconceptions - but different scenarios - won't necessarily benefit.
    – mklement0
    Nov 18 at 0:17






  • @mklement0 - i see what you mean ... i thought the mention of "as one string" covered that idea. yours is far more detailed on the subject. i'll try to keep that in mind. [grin]
    – Lee_Dailey
    Nov 18 at 0:53










  • Hi Lee, I failed to mention I have additional lines preceding and following the given sample. How can I accommodate those lines? Thank you.
    – Michael S Palatsi
    Nov 19 at 18:40












  • @MichaelSPalatsi - you will need to add a complete text to your original post so that folks can have a realistic sample to code against. if the text is too long, post it to Pastebin or Gist.GitHub and add a link to it into your OP.
    – Lee_Dailey
    Nov 19 at 19:56










  • @Lee_Dailey That makes sense. Sorry, I'm new. :) I've updated the OP.
    – Michael S Palatsi
    Nov 19 at 20:43


























answered Nov 15 at 22:39









Lee_Dailey
