Wednesday, January 13, 2010

PowerShell – Piping

One of the most heavily used features I’ve seen so far in PowerShell is piping: passing the results of one command into another to ultimately return something formatted, limited, or otherwise morphed into the desired output. The simplest examples might be something like:
dir | more
dir | sort-object length -desc | format-table -autosize

The first command returns a listing of the files in the current folder one screen at a time, pausing for you to read each screen and tell the command when to advance. The second is a chain of commands that gets a listing of files, sorts it by file size from largest to smallest, then auto-sizes each column for a best fit based on the data. Because results travel through the pipeline as objects rather than text, you don’t have to parse or convert text between commands. PowerShell hands the objects from one command to the next until they reach the end of the pipeline. By default PowerShell appends a hidden “| Out-Default” command to display the results on screen. If you want to see more options for output, you can run:

Get-Command Out*
Get-Command Export*

There are many other commands that can be used in the pipeline, but these are especially useful if you want to send the results somewhere other than the screen. Also note that converting your results along the way changes what the rest of the pipeline receives: Format-Table replaces your objects with formatting objects meant only for display, and Out-String turns them into text, so later commands can no longer work with the original properties. (Out-String can still sit in the middle of a pipeline, because it passes its text on to the next command. Other "Out-" commands such as Out-File, Out-Host, and Out-Null consume their input and pass nothing on, so they end the pipeline.)
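To make the object-versus-text point concrete, here is a small sketch. It uses Get-Process purely as a convenient source of objects (any command would do), and the variable names are only for illustration.

```powershell
# Objects straight from a command keep their type and properties.
$objects = Get-Process | Select-Object -First 3
$objects[0].GetType().FullName       # System.Diagnostics.Process

# Format-Table replaces them with formatting records meant for display only.
$formatted = $objects | Format-Table Name, Id
$formatted[0].GetType().Namespace    # Microsoft.PowerShell.Commands.Internal.Format

# Out-String renders everything to text: a single string by default,
# or one string per line with -Stream.
$text  = $objects | Out-String
$lines = $objects | Out-String -Stream
$text.GetType().Name                 # String
```

Once the data has gone through Format-Table or Out-String, something like Sort-Object Id no longer has an Id property to work with, which is why formatting usually belongs at the very end of the pipeline.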

Blocking vs. Non-Blocking

As with other languages, some Cmdlets “block” the pipeline: they have to collect all of their input before they can pass anything on. For example, a sort cannot pass its results on until the entire result set has been sorted. The “more” command in PowerShell is also blocking, but the equivalent “Out-Host -Paging” does the same thing without blocking. Which commands you choose is an important consideration when you pipe multiple Cmdlets together, especially if you will be dealing with large objects or result sets.
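One way to see the difference is to record the order in which each stage of a pipeline touches each item. This is only a sketch: the $log list and the "produce"/"consume" labels are made up for the demonstration.

```powershell
# Record the order in which each stage sees each item.
$log = New-Object System.Collections.ArrayList

# Streaming: each number passes through both stages before the next one starts.
1..3 | ForEach-Object { [void]$log.Add("produce $_"); $_ } |
       ForEach-Object { [void]$log.Add("consume $_") }
$streaming = $log -join ', '

$log.Clear()

# Blocking: Sort-Object holds everything until its input is complete.
1..3 | ForEach-Object { [void]$log.Add("produce $_"); $_ } |
       Sort-Object -Descending |
       ForEach-Object { [void]$log.Add("consume $_") }
$blocking = $log -join ', '

$streaming   # produce 1, consume 1, produce 2, consume 2, produce 3, consume 3
$blocking    # produce 1, produce 2, produce 3, consume 3, consume 2, consume 1
```

With three numbers the difference is cosmetic, but with a very large result set the blocking version has to hold everything in memory before the next stage sees its first item.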


Filtering

Filtering is accomplished by piping your result set to Cmdlets such as Where-Object, Select-Object, ForEach-Object, or Get-Unique.
  • Where-Object allows you to examine all objects and return just those matching your criteria. This is very similar to a WHERE clause in SQL Server.
  • Select-Object acts in a manner similar to the SELECT clause of a SQL statement and allows you to pick which properties are displayed. It can also handle things such as “TOP 5” (-First 5) or “Distinct” (-Unique), and it can pick rows by position with -Index. For example, you can choose to display every other line of the result set, or the first, last, and middle rows. (Note that this will create a new object in a lot of cases because you're filtering the columns.)
  • ForEach-Object runs a block of commands against each result in the result set.
  • Get-Unique eliminates duplicates in the result set. It only compares each item with the one before it, so the input needs to be sorted first.
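The four Cmdlets above can be sketched against a plain list of numbers. The $sizes values are made up and simply stand in for something like file lengths.

```powershell
# Made-up sample data standing in for real command output.
$sizes = 300, 120, 300, 45, 980, 120

# Where-Object: keep only the items matching a condition (like SQL WHERE).
$large = $sizes | Where-Object { $_ -gt 100 }        # 300, 120, 300, 980, 120

# Select-Object: -First works like TOP, -Unique like DISTINCT,
# and -Index picks rows by position.
$top2     = $sizes | Select-Object -First 2          # 300, 120
$distinct = $sizes | Select-Object -Unique           # 300, 120, 45, 980
$picked   = $sizes | Select-Object -Index 0, 2, 5    # 300, 300, 120

# ForEach-Object: run a script block once per item.
$doubled = $sizes | ForEach-Object { $_ * 2 }        # 600, 240, 600, 90, 1960, 240

# Get-Unique only drops adjacent duplicates, so sort first.
$unique = $sizes | Sort-Object | Get-Unique          # 45, 120, 300, 980
```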
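Filtering in the pipeline is not free. Even though Where-Object handles one object at a time, every object still has to be created and passed down the pipeline before it can be discarded. (Some filters are also blocking: Select-Object -Last, for example, cannot return anything until it has seen all of its input.) Filtering can often be done more efficiently through the native commands or Cmdlets themselves. For example, if you need to pull back just the files of type “.txt”, you may be better off filtering in your dir command (dir *.txt) rather than passing the entire result set of all files to the Where-Object Cmdlet.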
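Here is a rough sketch of the two approaches side by side. It builds a throwaway folder first so it can run anywhere; the folder and file names are hypothetical.

```powershell
# Build a throwaway folder with a few files (names are hypothetical).
$dir = Join-Path ([System.IO.Path]::GetTempPath()) ("pipe-demo-" + [guid]::NewGuid())
New-Item -ItemType Directory -Path $dir | Out-Null
'a.txt', 'b.txt', 'c.log' | ForEach-Object {
    Set-Content -Path (Join-Path $dir $_) -Value 'sample'
}

# Filter late: every file object is created, then the non-matches are thrown away.
$late = Get-ChildItem $dir | Where-Object { $_.Extension -eq '.txt' }

# Filter early: ask the file system provider for just the matches.
$early = Get-ChildItem $dir -Filter *.txt

# Both approaches find the same two files.
$lateNames  = $late  | ForEach-Object { $_.Name } | Sort-Object
$earlyNames = $early | ForEach-Object { $_.Name } | Sort-Object

# Clean up the demo folder.
Remove-Item $dir -Recurse -Force
```

On a folder with three files the difference is not measurable, but on a large directory tree the early filter avoids creating objects that would only be discarded.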

Tee-Object

It’s worth mentioning the Tee-Object Cmdlet because it could be really useful in troubleshooting. A simple example would be:

dir | Tee-Object -variable t1 | Select-Object Name,Length | Tee-Object -variable t2 | Sort-Object Length -descending

This would pipe the output of the “dir” command into a variable called $t1. It would then pass the output to Select-Object to limit the results down to Name and Length, then pass that resultset to a variable called $t2. Finally, it would output your results ordered by length descending.  You would now have two variables to examine results as you stepped through the pipeline. If you are getting unexpected results, this would be very valuable in troubleshooting.  You can also use “-filepath” instead of “-variable” to store results to a file instead of a variable.
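To try this without depending on what happens to be in the current folder, the same pipeline can be run against a few hand-made objects. The $items list below is a hypothetical stand-in for the output of dir.

```powershell
# Hypothetical stand-ins for the file objects that dir would return.
$items = @(
    (New-Object PSObject -Property @{ Name = 'a.txt'; Length = 300; Extension = '.txt' }),
    (New-Object PSObject -Property @{ Name = 'b.log'; Length = 45;  Extension = '.log' }),
    (New-Object PSObject -Property @{ Name = 'c.txt'; Length = 980; Extension = '.txt' })
)

$sorted = $items |
    Tee-Object -Variable t1 |
    Select-Object Name, Length |
    Tee-Object -Variable t2 |
    Sort-Object Length -Descending

$t1[0].Extension    # .txt  (the first snapshot still has every property)
$t2[0].Extension    # empty (the second snapshot only has Name and Length)
$sorted[0].Name     # c.txt (the largest item comes first)
```

Comparing $t1 and $t2 shows exactly what each stage received, which helps pin down the step where the results stopped looking the way you expected.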

There is a lot of information available on piping commands because this is one of the true strengths of PowerShell. I'm not going to go into a lot of detail at this point. There will be more examples in future posts and there are lots of other posts detailing the process. Piping takes a set of results and passes it to the next command, and to the next, and to the next, until it gets to the end of the pipeline. At that point, the results are returned, stored, or discarded (if you chose to output to Out-Null). You can store these results along the way for comparison or even store them in a file for later analysis.

I plan to spend a pretty significant amount of time familiarizing myself with the various uses for the pipeline. Because it's used so heavily, this is going to be a key component to understanding and getting the most benefit from PowerShell.
