PowerShell Tips: LDAP String Manipulation and why not to use Trim
I've been working with PowerShell a lot recently and stumbled a little bit with String Manipulation.  So, I've decided to post some details of what I've learned, to hopefully help others avoid my mistakes.
String manipulation can be very important when scripting, especially if the purpose of the script is to output a human readable report. First, there's a great TechNet article on string manipulation in PowerShell. It covers the basics in great detail; my intent is to supplement that article with some of the pain points that I came across while using those techniques.
How to Trim a String
Trimming a string simply means removing characters from the ends of that string. There are 3 Trim commands: .Trim(), .TrimStart() and .TrimEnd(). As the names imply, TrimStart() removes characters from the start of a string, TrimEnd() removes characters from the end of a string, and Trim() removes characters from both ends of the string. Those Trim commands are just methods of the .NET String class that every PowerShell string is an instance of, so invoking them is easy. If you have a string called $myString, you can invoke Trim on it by using $myString.Trim().
So, what does Trim() do? If you call it like that, with no arguments, it defaults to trimming whitespace (spaces, tabs, newlines and so on). Trim() removes all whitespace from the start and end of the string; the TrimStart() and TrimEnd() methods allow you to target one end or the other. The really useful thing about Trim is that you can pass it characters to remove, by including them inside the parentheses in the form of TrimStart("X"). Bear in mind that it is removing characters, not strings... but more on that later.
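To make those defaults concrete, here is a small sketch. The variable names and sample strings are made up for illustration; the expected result is noted in the comment beside each call.

```powershell
# Hypothetical sample string with a tab and spaces on both ends.
$padded = "`t  hello world  "

# With no arguments, each method strips whitespace from its end(s).
$both  = $padded.Trim()        # 'hello world'
$start = $padded.TrimStart()   # 'hello world  '  (trailing spaces kept)
$end   = $padded.TrimEnd()     # tab + two spaces + 'hello world'

# Passing a character changes what gets stripped.
$dots = "...hello...".Trim(".")   # 'hello'
```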
Let's look at an example that I've come across - LDAP strings. When interacting with Active Directory, we'll often find ourselves dealing with strings like the following: "CN=Jason Coleman,OU=Consultants,OU=Accounts,DC=Company,DC=Local". How do we extract just the user name from that string?
First, we'll prep a string for the example:
PS C:\> $myStr = "CN=Jason Coleman,OU=Consultants,OU=Accounts,DC=Company,DC=Local"
We basically want to remove the "CN=" from that string and everything after the comma. We'll deal with the post-comma stuff later; for now let's focus on the "CN=". I've seen people use TrimStart("CN=") for this purpose, and it will seem to work... but you will find situations where it doesn't behave the way that you'd like. Look at these commands:
PS C:\> $myStr.TrimStart("CN=")
Jason Coleman,OU=Consultants,OU=Accounts,DC=Company,DC=Local
Great! It worked! Right? Well, try this command:
PS C:\> $myStr.TrimStart("N=C")
Jason Coleman,OU=Consultants,OU=Accounts,DC=Company,DC=Local
It's the same output... so what's going on? Remember, the Trim() methods deal in characters, not in strings. So, when we pass it "CN=" we're really just passing it the characters "C", "N" and "=". We're passing it the exact same set when we pass it "N=C", or "NC=" or even "NN==CCCNN=N" for that matter. (The comparison is case-sensitive, so a lowercase "c" or "n" would count as a different character.) TrimStart() looks at the first character of the string, checks whether it matches one of the characters that it was passed, and removes it if it does. Then it moves on to the next character and does the same thing, and the next, and so on, until it finds a character that isn't on its "hit list". At that point, its job is done and it returns the modified string.
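If it helps to see that scan spelled out, here is a rough re-implementation of the idea. This is only a sketch of the behavior described above, not how .NET actually implements TrimStart(), and the function name Remove-LeadingChars is something I made up for the example.

```powershell
# Hypothetical helper that mimics the scan; not a built-in cmdlet.
function Remove-LeadingChars {
    param(
        [string]$Text,
        [string]$HitList   # treated as a set of characters, order irrelevant
    )
    $i = 0
    # Walk forward while the current character is on the hit list.
    # String.IndexOf(char) is case-sensitive, like TrimStart() itself.
    while ($i -lt $Text.Length -and $HitList.IndexOf($Text[$i]) -ge 0) {
        $i++
    }
    # Return whatever is left once a character not on the list is found.
    return $Text.Substring($i)
}

$dn = "CN=Jason Coleman,OU=Consultants,OU=Accounts,DC=Company,DC=Local"
$first  = Remove-LeadingChars -Text $dn -HitList "CN="
$second = Remove-LeadingChars -Text $dn -HitList "N=C"
$first               # Jason Coleman,OU=Consultants,OU=Accounts,DC=Company,DC=Local
$first -ceq $second  # True
```

The order of the characters in the hit list never enters into it, which is why "CN=" and "N=C" behave identically.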
So, what happens if our string is the following:
PS C:\> $myStr = "CN=Citrix Users,OU=Consultants,OU=Accounts,DC=Company,DC=Local"
... and then we use TrimStart() to remove the leading CN= from the LDAP string:
PS C:\> $myStr.TrimStart("CN=")
itrix Users,OU=Consultants,OU=Accounts,DC=Company,DC=Local
Oops, that wasn't our desired output. Since the "C" in "Citrix" is on TrimStart()'s hit list, it went ahead and removed that character as well. What do we do about that? Well, it seems to me that the best solution is to not use Trim(). The way I see it, there are 2 good options here. The best in this case (since we know that all of our strings are going to begin with 3 characters that we don't want) is to use Substring() instead. Substring() takes a zero-based starting position and returns the rest of the string from that position onward (and, optionally, a second number for the length of the substring).
PS C:\> $myStr.Substring(3)
Citrix Users,OU=Consultants,OU=Accounts,DC=Company,DC=Local
Since these are LDAP strings, we're pretty sure that there are going to be 3 characters at the start that we want to get rid of. If we wanted to specifically target the "CN=" and get rid of it, we could use the Replace() method instead (although it replaces all instances of the string, not just the one at the start).
 
PS C:\> $myStr.Replace("CN=","")
Citrix Users,OU=Consultants,OU=Accounts,DC=Company,DC=Local
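The "all instances" caveat matters as soon as a string has more than one "CN=" in it. As a sketch, assume a group that lives in a container, so the string has two CN components (the sample value below is made up). One alternative is PowerShell's -replace operator, which takes a regular expression, so a leading ^ can pin the match to the start of the string:

```powershell
# Hypothetical DN with two CN= components.
$nested = "CN=Citrix Users,CN=Users,DC=Company,DC=Local"

# Replace() removes both occurrences of "CN=".
$viaMethod = $nested.Replace("CN=", "")
$viaMethod     # Citrix Users,Users,DC=Company,DC=Local

# -replace with ^ only touches a "CN=" at the very start.
$viaOperator = $nested -replace '^CN=', ''
$viaOperator   # Citrix Users,CN=Users,DC=Company,DC=Local
```

Note that -replace ignores case by default; -creplace is the case-sensitive form if that distinction matters for your data.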
So, that very easily deals with the "CN=", but how do we strip out the rest of the string? Well, remember how Substring() can take both a starting position and length? We just need to find the location of that first comma, and use that to determine the length of the substring. We can easily find the position of that comma by using the .IndexOf() method.
IndexOf() returns the zero-based position of the first comma:
PS C:\> $myStr.IndexOf(",")
15
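Because positions are counted from zero, the comma's position is also the number of characters in front of it, so we can use it for the length of a substring as follows: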
PS C:\> $myStr.Substring(0,$myStr.IndexOf(","))
CN=Citrix Users
And combine it with our earlier logic to start after the 3rd character (and adjust for the now shorter length) as such:
 
PS C:\> $myStr.Substring(3,$myStr.IndexOf(",")-3)
Citrix Users
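If you end up doing this in more than one script, the Substring() and IndexOf() logic above could be wrapped in a small function. What follows is only a sketch under a few assumptions: the function name Get-FirstRdnValue is hypothetical, it looks for the first "=" rather than assuming a 3-character prefix, and it falls back to the rest of the string when there is no comma (where IndexOf() returns -1 and Substring() would otherwise throw).

```powershell
# Hypothetical helper; not a built-in cmdlet.
function Get-FirstRdnValue {
    param([string]$DistinguishedName)

    # Position just past the first "=" (the end of "CN=", "OU=", etc.).
    $start = $DistinguishedName.IndexOf("=") + 1
    # Position of the first comma after that point; -1 if there is none.
    $comma = $DistinguishedName.IndexOf(",", $start)

    if ($comma -lt 0) {
        # No comma: everything after the "=" is the value.
        return $DistinguishedName.Substring($start)
    }
    return $DistinguishedName.Substring($start, $comma - $start)
}

$name = Get-FirstRdnValue "CN=Citrix Users,OU=Consultants,OU=Accounts,DC=Company,DC=Local"
$name   # Citrix Users
```

One limitation to be aware of: distinguished names can contain escaped commas (written as \,) inside a value, and this sketch would cut such a value short at the escaped comma.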
I hope that this helps!
Split also works well for this:
$myStr.Split(',')[0].Split('=')[1]
Good call, thanks!