One of the more common approaches to ETL is using comma separated (csv) files. It’s easy to understand copying files and the file contents too. In this example we are using a few cmdlets to import a CSV file into a Datatable. We will grab the header info from the file and use it for the Datatable column names so this code can be used for a different file with the same structure (comma, header, quoted text).
WorldWideImporters Query and CSV File
To create our sample CSV file we used a query on the WorldWideImporters (WWI) Database. For our example purposes we are saving the file using comma separated fields where the text fields are quoted (“”) and the first line contains the header field.
This is the SQL Query for this file:
use [WideWorldImporters];
SELECT TOP (100)
po.[PurchaseOrderID]
,ps.SupplierName
,po.[OrderDate]
,po.[DeliveryMethodID]
,p.FullName
,po.[ExpectedDeliveryDate]
,po.[SupplierReference]
,po.[IsOrderFinalized]
FROM [Purchasing].[PurchaseOrders] AS po
INNER JOIN [Purchasing].[Suppliers] AS ps
ON po.SupplierID = ps.SupplierID
INNER JOIN [Application].[People] AS p
ON po.ContactPersonID = p.PersonID
The results are saved to a CSV file called “orders.csv” and when exporting from SSMS, I am including the header information.
Create A New Datatable
$Datatable = New-Object System.Data.DataTable
Using Powershell Import-Csv
The default setting for Import-Csv will grab the first line as the Header for the file. This is what we are working with but if you need to define the headers it can be done using the “-header” parameter.
We can import the CSV file to a variable this way:
$csvall = Import-Csv -Path "C:\Users\user\Documents\orders.csv"
Use Get-Member For Headers
We need to create the headers in our Datatable but “Import-Csv” does not provide a method to get the Headers. We need to use another cmdlet called “Get-Member” and grab the (db tales) property called “NoteProperty”. Then we separate the data and the header as we need to create the columns based on the headers first.
Using Get-Member we assign another variable with only the header records:
$csvheaders = $csvall | Get-member -MemberType NoteProperty
At this point we can create the Columns based on each header entry:
Foreach ($header in $csvheaders ) {
$Datatable.Columns.Add($header.Name)
}
Import CSV Data to the Datatable
Now that our columns are setup we can add the rows from our original CSV file object.
Foreach($csv in $csvall) {
$row = $Datatable.NewRow()
$row.DeliveryMethodID = $csv.DeliveryMethodID
$row.ExpectedDeliveryDate = $csv.ExpectedDeliveryDate
$row.FullName = $csv.FullName
$row.IsOrderFinalized = $csv.IsOrderFinalized
$row.OrderDate = $csv.OrderDate
$row.PurchaseOrderID = $csv.PurchaseOrderID
$row.SupplierName = $csv.SupplierName
$row.SupplierReference = $csv.SupplierReference
$Datatable.Rows.Add($row)
}
$Datatable
Export Datatable Using Export-Csv
and of course how do we go back to the csv file?
by using the “Export-csv” cmdlet
$Datatable | export-csv -path "C:\Users\user\Documents\orders_export.csv" -Encoding UTF8
Complete Powershell Code
$Datatable = New-Object System.Data.DataTable
## By Default Import-csv uses first row as Header
$csvall = Import-Csv -sdgfsd -Path "C:\Users\user\Documents\orders.csv"
$csvheaders = $csvall | Get-member -MemberType NoteProperty
Foreach ($header in $csvheaders) {
$Datatable.Columns.Add($header.Name)
}
Foreach($csv in $csvall) {
$row = $Datatable.NewRow()
$row.DeliveryMethodID = $csv.DeliveryMethodID
$row.ExpectedDeliveryDate = $csv.ExpectedDeliveryDate
$row.FullName = $csv.FullName
$row.IsOrderFinalized = $csv.IsOrderFinalized
$row.OrderDate = $csv.OrderDate
$row.PurchaseOrderID = $csv.PurchaseOrderID
$row.SupplierName = $csv.SupplierName
$row.SupplierReference = $csv.SupplierReference
$Datatable.Rows.Add($row)
}
$Datatable | Format-Table