BearFF

In my last post on using ActiveX Data Objects (ADO) with Excel VBA, I demonstrated some code to load a Recordset, filter the Recordset using the Recordset’s Filter Prpoerty, and copy the Filtered Recordset to a Worksheet using the CopyFromRecordset Method of the Range Object.

I put a link to the blog post on the Excel VBA and Users Group on LinkedIn. To our good fortune, James Wilson was reading. James responded with some nice comments and some very good code of his own. I was impressed and asked James if he would like to do a write up to post on the blog.

James kindly accepted my offer as follows in James’ words. Take it away James!



Bringing the full power of SQL to bear in Excel

James Wilson
September 13, 2014

“I feel the need, the need for speed.” Top Gun

I love Excel, but sometimes you just want a bit more power to analyse your data. My favourite tool for analysing large quantities of data has always been SQL. While Microsoft includes MS Query in Excel out-of-the-box, it does have many limitations and is relatively slow. Using VBA and ADO is the next logical step.

The code below is the latest incarnation of a general purpose SQL function I’ve been using for the last five years or so. For me the data is the thing – I want to be able to start querying my data using SQL without having to start coding from scratch each time. Just copy and paste into a module in your workbook, and you’re ready to go.

Code first then some explanation:

Function SQL(ByVal SQLstr As String, ByVal Destination As String, Optional ByVal ConnectionString As String) As Boolean

    On Error GoTo ErrorHandler
    Application.ScreenUpdating = False
    Application.Calculation = xlCalculationManual
    Application.DisplayAlerts = False

    Dim myConnection As Object
    Dim myRecordSet As Object
    Dim myQueryTable As QueryTable

    ThisWorkbook.Sheets(Destination).Activate
    ThisWorkbook.Sheets(Destination).Cells.Delete

    If ConnectionString = "" Then ConnectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" & ThisWorkbook.FullName & ";Extended Properties=Excel 12.0"
    
    Set myConnection = CreateObject("ADODB.Connection")
    Set myRecordSet = CreateObject("ADODB.Recordset")
    
    myConnection.ConnectionString = ConnectionString
    myConnection.Open
    myRecordSet.ActiveConnection = ConnectionString
    myRecordSet.Source = SQLstr
    myRecordSet.Open
    
    Set myQueryTable = Sheets(Destination).QueryTables.Add(Connection:=myRecordSet, Destination:=Range("'" & Destination & "'!a1"))
    myQueryTable.Refresh
    
    If myRecordSet.State <> adStateClosed Then myRecordSet.Close
    If Not myRecordSet Is Nothing Then Set myRecordSet = Nothing
    If Not myConnection Is Nothing Then Set myConnection = Nothing
    
    Err.Clear
ErrorHandler:
    If Err Then
        Sheets(Destination).Cells(1, 1) = "SQL Error: " & Err.Description
        SQL = False
    Else
        SQL = True
    End If
    
    Application.ScreenUpdating = True
    Application.Calculation = xlCalculationAutomatic
    Application.DisplayAlerts = True
End Function

I wrote this as a VBA function rather than a sub procedure, because I wanted to be able to trap errors in the calling procedure. It is a really a matter of preference whether you like your functions to return True or False (lines 28 and 30). I’ve chosen to return True if it worked, so I’d call it using coding something like:

Sub test()
    DidItWork = SQL("SELECT * FROM [Sheet1$] WHERE [Sheet1$].[date] > #01/03/2014#", "Sheet2")
    If DidItWork = False Then … 'Do some error handling
End Sub

Basically this function sets up an ADO link to a data source, executes an SQL query on that data, and returns the results of that query to a worksheet of our choice in the current workbook using a query table.

When you call this function (line 1), you pass a string with your SQL query, a string with the name of the worksheet you want the data to be returned to, and optionally a connection string to the data source. By default the data source is the workbook the code is in. That’s right – you can use SQL to query data in other tabs in the same workbook (make sure your workbook is saved first).

If you find that most of the time you are querying a corporate database or other data source, then you’d just tweak line 15 of the coding to default to the connection string to the data source you are using most often.

Lines 4 to 6 and 42 to 44 are just the standard VBA codes that you’d put in to speed up any bit of coding. If you are calling this function and have these bits of coding in the calling procedure, then you can safely delete these lines from this function.

Lines 8 to 10 are to set up a local connection and recordset object (we’re going to use ADO to get our data), and a query table (which we are going to use to return the results of the SQL query to Excel).

The way this function is written, your data output is always going to be a worksheet in the current workbook in Cell A1. You can have no other data in this worksheet as Lines 12 to 13 delete the contents of the worksheet, before it is refreshed again with the query table set up in Line 26.

Line 17 to 27 is the meat of the function, setting up an ADO link and returning the data using a query table.

Lines 29 to 31 are to tidy up objects and connections. Line 36 is to give you a clue if you’ve made an error in your SQL.

So I’ve a personal library function that allows me to use SQL in Excel without much further thought – what do I do with it? Let me give a few simple examples to give an idea of the possibilities.

  1. Treat my current spreadsheet a bit like a mini-database and run queries on it – that would be much harder to do just using VBA or manually copying and pasting.
  2. Suck data out of multiple corporate databases and spreadsheets and join it together. You don’t need even to open the spreadsheets to get the data (as I said to start with – it’s all about speed and power). So for example let’s say you have one spreadsheet from your sales guys with sales volumes, and you have another spreadsheet with the confidential prices for each customer, then you can do a bit of SQL coding like:
    DidItWork = SQL(“SELECT A.*, B.[Price], A.[Volume]*B.[Price] as [Revenue] from [C:\Sales Volumes.xlsx].[Data$] A LEFT OUTER JOIN [F:\Prices.xlsx].[Sheet1$] B ON A.[Product] = B.[Product] AND A.[Customer No] = B.[Customer] “, “Sales Forecast”)
    So I’m using SQL aliases A and B for brevity, and by using multipart identifiers specifying the full path and filename of the Excel workbooks I can suck data out of any file I have access to. Note if you specify the data source fully, the connection string ADO uses is virtually irrelevant.
  3. By using For…Next loops in VBA and a bit of text manipulation and the SQL command UNION I can consolidate multiple similar data sources simply. So using a string variable like below in a loop:

mySQLstring = mySQLstring & ” UNION ” & …
Good for consolidating budgets submitted in a similar format.

The limitation is really your knowledge of SQL. Beware of missing spaces and extra commas in your SQL if you are using the VBA & _ to join long strings together to form your SQL.


Tidy Up

Thanks James – great job! How do you use ADO, SQL, Recordsets and QueryTables in your Projects?

, , , , , , , , ,

SNLFinal

In Coffee Talk on Satruday Night Live, Linda Richman, played by Mike Myers would hurl out “Discussion Topics”. Here are a few of my favorites:

  1. “The Partridge Family were neither partridges, nor a family. Discuss.”
  2. “The radical reconstruction of the South after the Civil War was neither radical nor a reconstruction. Discuss.”
  3. “The jelly bean is neither made of jelly nor is it a bean. Discuss.”

However, today’s topic is not about SNL or the radical reconstruction of the south. It is about Excel.

Today’s discussion topic:

In their landmark tome, Professional Excel Development, 2nd Edition, Bovey et. al. contend that Row 1 and Column A should be left empty (Bovey et. al. pg. 70). I posit that it is fine to use Row 1 Column A (R1C1) for tabs in the Data Layer of the Workbook – Discuss.

,

DandelionFinal2

Dandelions are weeds. I’ve spent more hours weeding them out of my family’s yard and my grandparents’ yards than I care to count. Yet, when you come across them in a meadow, they are very nice to look at and add a kind of tranquility in their own right. They are edible and make a nice tasting wine. Here’s a recipe from AllRecipes. Let us know how it turns out.

Today’s post, however, is not about dandelions or wine making – it about the Filter Property of ADO Recordsets.


    edit: As is my usual practice, I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see these links for a detailed explanation of Late / Early Binding.

    1. Beyond Excel
    2. JP Software Technologies


In my last post on ADO Recordsets, I demonstrated some VBA to load a Worksheet Range into a Recordset – check it out here.

First Step

Before I filter the Recordset, I would like to manually filter the dataset so I can determine what the final results should be so I can compare to make sure everything goes correctly with the Recordset Filter.

RecordsAllFinal

Here’s some data I borrowed from Marco Russo and Alberto Ferrari. It looks as though it may have come from the AdventureWorks Database. There are a total of 60,398 records.

Now I’ll filter on the SalesManager field to look at records that are not related to Marco.

RecordsFilteredFinal

OK, 25,109 records remain after I filter out records for Marco so When I filter the Recordset, I should receive 25,109 records. I removed the AutoFilter, now I am ready to Filter the Recordset

Global Constants

In my last post. on ADO Recordsets, I began by adding some Global Constants to a module named “M_Globals”. I’m going to add a few new constants for the Filter Group Enumeration. I may use them, I may not. But at least I have them defined if I do need them.

'Filter Group Enumeration
    Public Const gcladFilterNone = 0                 'No filter. This value removes the current filter and restores all records to view.
    Public Const gcladFilterPendingRecords = 1       'Use the pending records. This value allows viewing only those records that have changed but have not yet been sent to the server. This value is only applicable for batch update mode.
    Public Const gcladFilterAffectedRecords = 2      'Use only records affected by the last Delete, Resync, UpdateBatch, or CancelBatch call.
    Public Const gcladFilterFetchedRecords = 3       'Use the last fetched records. This value allows viewing the records in the current cache returned as a result of the last call to retrieve records (implying a resynchronization).
    Public Const gcladFilterConflictingRecords = 5   'Use the conflicting records. This value allows viewing only those records that failed the last batch update.

Load The Recordset

I won’t clutter this tutorial by reposting the same code I posted in my last article on ADO Recordsets, check the Sub() out here.

Filter The Recordset

Now that I have a Recordset, I just need to add a bit of code to filter it. Recall, I am interested in all records where the Sales Manager is not Marco. So my criteria string will be something like “SalesManager <> ‘Marco Russo'”

I just need to add 6 lines to my original Sub() and of those, 2 lines are comment lines (I could use fewer lines, I’m using additional lines for clarity)

        Dim strFilter As String
        'Filter string
            strFilter = "SalesManager <> 'Marco Russo'"
        'Filter the Recordset and display the filter record and field count to check results
            rs.Filter = strFilter
            Debug.Print "The filtered recordset contains " & Format(rs.RecordCount, "##,##0") & " records and " & rs.Fields.Count & " fields"

Returns:

The original recordset contains 60,398 records and 23 fields
The filtered recordset contains 25,109 records and 23 fields

Perfect! The Filtered Recordset matches with the results I obtained earlier by manually filtering the Range.

Gimmee The Data…

Most likely, we want to return the dataset back to the user in either a new workbook or a new worksheet. For today, I’ll return the Filtered Recordset back to the same Workbook on a new Worksheet.

Add A Worksheet

I’ll create a Function to add a worksheet to a workbook so that I have a safe place to return the results of the Filtered Recordset

Public Function AddWorksheet(wb As Workbook) As Worksheet

    'Declare variables
        Dim wsNew As Worksheet
        
    'Add worksheet to end of other worksheets in the workbook
        With wb
            Set wsNew = .Worksheets.Add _
                                    (After:=.Worksheets(.Worksheets.Count))
                            
        End With
        
    'Return object to function
        Set AddWorksheet = wsNew
        
    'Tidy up
        Set wsNew = Nothing
        
End Function

And I call the Function here:

        'Add a worksheet for the filtered results
            Set wsResults = AddWorksheet(wb:=wb)

CopyFromRecordset Method

The Range Object has a CopyFromRecordset Method, so I’ll use that:

        'Copy the filtered recordset to the results range
        'The CopyFromRecordset Method does not include headers
            wsResults.Cells(1, 1).CopyFromRecordset rs

And the output:
RecordsetOutFinal

It’s looking good. The record count matches with what I expected from the manual filter process at the top of the post. The only problem is that the CopyFromRecordset Method does not include the field headers, so I’ll need a small Sub() to get the field headers and then output the Recordset to Cell(2,1) instead of Cell(1,1).

The Fields Collection

The Recordset Object has a Fields Collection, so I can loop through the Fields Collection to get the Field Names. The gotcha here is that the Fields Collection begins as zero – so be aware of that.

Here’s the Sub():

Public Sub GetRSFieldNames(ws As Worksheet, _
                           rs As Object)
                           
    'Declare variables
        Dim x As Long
        
    'Get field names
        For x = 0 To rs.Fields.Count - 1
            ws.Cells(1, x + 1).Value = rs.Fields(x).Name
        Next x
    
End Sub

Here is how I called the Sub():

        'Output Recordset Field Names to the worksheet
            Call GetRSFieldNames(ws:=wsResults, _
                                 rs:=rs)

And the Output:
RecordWFieldsFinal

Looks pretty good – I just want to add some formatting to improve readability.

CharlieDanielsFinal

Fiddle Factor

No – not Charlie Daniels pictured here sawing on a fiddle and playing it hot as in The Devil Went Down To Georgia. Rather, Fiddle Factor is a term I learned from one of my supervisors which refers to the amount of time and energy spent formatting an Excel Report. The more time and energy spent – the higher the Fiddle Factor.

But I think formatting is very important. Not only does it make data and information easier to read and understand, but if it is done well, it actually draws or invites the reader in. Stephen Few has quite a bit to say about well-done formatting on his blog, Perceptual Edge.

Enough of my soap box, my goal in this case is not so lofty. I just want to add a bit of color to the header row, fit the column width to the data and maybe play with the zoom level:

Here is the Sub() to format the output:

Sub FormatOutput(ws As Worksheet)

    'Declare variables
        Dim LastColumn As Long
        Dim rngHeader As Range
        Dim lngColor As Long
        
    'initialize
        lngColor = RGB(68, 84, 106)
        
    'Get last column of header row range
        LastColumn = GetLast(ws:=ws, _
                             strType:="c")
                             
    'Create Range Object - header row range
        With ws
            Set rngHeader = .Range(.Cells(1, 1), .Cells(1, LastColumn))
        End With
        
    'Format the header row range
        With rngHeader
            .Interior.Color = lngColor
            .Font.Bold = True
            .Font.Color = vbWhite
        End With
        
    'Format Dates
        With ws
            .Range("L2").EntireColumn.NumberFormat = "MM/DD/YYYY"
        End With
               
    'Set zoom level
        ws.Activate
        ActiveWindow.Zoom = 75
        
    'Fit column width to data
        Columns.AutoFit

End Sub

And here is how I call the Sub():

        'Format the output
            Call FormatOutput(ws:=wsResults)

The Full Monty

Here is the main Sub() with the additions to Filter the Recordset, Output the Recordset to a new Worksheet, and Format the data:

Sub FilterRecordset()

    'Declare variables
        Dim wb As Workbook
        Dim wbADO As Workbook
        Dim ws As Worksheet
        Dim wsResults As Worksheet
        Dim rng As Range
        Dim rngResults As Range
        Dim cn As Object
        Dim rs As Object
        Dim cmd As Object
        Dim strWorksheet As String
        Dim strSQL As String
        Dim strWorkbookADO As String
        Dim strFilter As String
                
    'Excel environemnt
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With

    'Initialize
        Set wb = ThisWorkbook
        
        'Get worksheet to be loaded into recordset
            strWorksheet = GetSelectedSheet(strPrompt:="Select a cell on the worksheet to be loaded into the recordset", _
                                            strTitle:="Worksheet To Recordset")
                                            
        'Create a new workbook to hold all data from the selected worksheet
            Set wbADO = Workbooks.Add
            
        'Copy everything from the selected worksheet to the new workbook
            Call CopyData(wbSource:=wb, _
                          wbDestination:=wbADO, _
                          strSource:=strWorksheet)
                          
        'Cleanup the destination workbook
            Call CleanupWorkbook(wb:=wbADO)
            
        'Save and close the data workbook
            With wbADO
                .SaveAs wb.Path & "\" & Mid(wb.Name, 1, Len(wb.Name) - 5) & "_ADO.xlsx", FileFormat:=xlOpenXMLWorkbook
                strWorkbookADO = wbADO.FullName
                .Close
            End With

        'Create a range object to measure source data against final recordset data
            Set ws = wb.Worksheets(strWorksheet)
            Set rng = ws.Range("A1").CurrentRegion

        'SQL string
            strSQL = "SELECT * FROM [Data$]"
            
        'Filter string
            strFilter = "SalesManager <> 'Marco Russo'"

        'Create ADO Connection Object
            Set cn = GetADOConnection()
            cn.Open ("Provider=Microsoft.ACE.OLEDB.12.0;" & _
                     "Data Source=" & strWorkbookADO & ";" & _
                     "Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1'")

        'Create ADO Command Object
            Set cmd = GetCommand()
            Set cmd.ActiveConnection = cn
            cmd.CommandType = gcladCmdText
            cmd.CommandText = strSQL                        'Pass SQL String to the command object

        'Create ADO Recordset Object and load records
            Set rs = GetRecordset()
            With rs
                .CursorLocation = gcladUseClient
                .CursorType = gcladOpenDynamic
                .LockType = gcladLockOptimistic
                .Open cmd
            End With

        'Compare recordset results to original data
            Debug.Print "The original recordset contains " & Format(rs.RecordCount, "##,##0") & " records and " & rs.Fields.Count & " fields"
            Debug.Print "The range contains " & Format(rng.Rows.Count - 1, "##,##0") & " rows and " & rng.Columns.Count & " columns" '-1 to discount header row
            
        'Filter the Recordset
            rs.Filter = strFilter
            
        'Add a worksheet for the filtered results
            Set wsResults = AddWorksheet(wb:=wb)
            
        'Output Recordset Field Names to the worksheet
            Call GetRSFieldNames(ws:=wsResults, _
                                 rs:=rs)
            
        'Copy the filtered recordset to the results range
        'The CopyFromRecordset Method does not include headers
            wsResults.Cells(2, 1).CopyFromRecordset rs
            
        'Format the output
            Call FormatOutput(ws:=wsResults)
    
            
        'Tidy up
            'Close objects
                rs.Close
                cn.Close
                
            'Destroy objects
                Set rs = Nothing
                Set cmd = Nothing
                Set cn = Nothing
                Set rng = Nothing
                Set ws = Nothing
                Set wsResults = Nothing
                Set wbADO = Nothing
                Set wb = Nothing
                
            'Excel environemnt
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
                    
End Sub

And the final output:

RecordWFieldsAutoFitComplete

Tidy up

Final Thoughts

This post was about Filtering ADO Recordsets. The Filter I used was very simple and only scratches the surface of what is possible. You may use the Filters in combinations with AND, OR, LIKE and Wildcard Characters. Make sure you check out the link to Recordset Filter Property. Lots of great information.

I don’t like that I hard coded the Filter String inside the Sub(). It would be better to offer a user form at run-time to read the fields in the recordset and prompt the user to make choices through Combo Boxes, Check Boxes, etc…

Other Recordset Posts At dataprose.org

Additional Resources

Downloads

Download the file from OneDrive. The filename is Excel – Recordset_v3.xlsm

Credits

Data courtesy Microsoft Excel 2013 Building Data Models with PowerPivot by Alberto Ferrari and Marco Russo (Mar 25, 2013)

FemmesFinal

I hope you know that this will go down on your permanent record
Oh yeah? Well, don’t get so distressed
Did I happen to mention that I’m impressed?

So go the lyrics from Kiss Off by the Violent Femmes pictured here. Great song from the 80’s. If you are not familiar with the Femmes, check ’em out – highly recommended.

However, today’s post is not about the Alternative Rock scene of the early 80’s. Rather, it is about ActiveX Data Objects (ADO) Recordsets.


    edit: As is my usual practice, I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see these links for a detailed explanation of Late / Early Binding.

    1. Beyond Excel
    2. JP Software Technologies


Global Constants

Because I am using Late Binding, I am going to setup Global Constants for the various Enumerations I need for Command Types, Cursor Locations, Cursor Types and Lock Types. It takes a little bit of extra work, but I think it makes the code clearer:

Clear:

cmd.CommandType = gcladCmdText

Not clear:

cmd.CommandType = 1

See? The Constant is self-documenting and makes the code clearer.

Here’s all the Global Contants. I put them in a Module named “M_Globals” because I’m creative that way.

'Command Type Enumeration Values
    Public Const gcladCmdUnspecified = -1       'Unspecified type of command
    Public Const gcladCmdText = 1               'Evaluates CommandText as a textual definition of a command or stored procedure call
    Public Const gcladCmdTable = 2              'Evaluates CommandText as a table name whose columns are returned by an SQL query
    Public Const gcladCmdStoredProc = 4         'Evaluates CommandText as a stored procedure name
    Public Const gcladCmdUnknown = 8            'Default. Unknown type of command
    Public Const gcladCmdFile = 256             'Evaluates CommandText as the file name of a persistently stored Recordset. Used with Recordset.Open or Requery only.
    Public Const gcladCmdTableDirect = 512      'Evaluates CommandText as a table name whose columns are all returned. Used with Recordset.Open or Requery only. To use the Seek method, the Recordset must be opened with adCmdTableDirect. Cannot be combined with the ExecuteOptionEnum value adAsyncExecute.

'Cursor Location Enumeration Values
    Public Const gcladUseNone = 1               'OBSOLETE (appears only for backward compatibility). Does not use cursor services
    Public Const gcladUseServer = 2             'Default. Uses a server-side cursor
    Public Const gcladUseClient = 3             'Uses a client-side cursor supplied by a local cursor library. For backward compatibility, the synonym adUseClientBatch is also supported
    
'Cursor Type Enumeration Values
    Public Const gcladOpenUnspecified = -1      'Unspecified type of cursor
    Public Const gcladOpenForwardOnly = 0       'Default. A forward-only cursor. This improves performance when you need to make only one pass through a Recordset
    Public Const gcladOpenKeyset = 1            'A keyset cursor. Like a dynamic cursor, except that you can't see records that other users add, although records that other users delete are inaccessible from your Recordset. Data changes by other users are still visible.
    Public Const gcladOpenDynamic = 2           'A dynamic cursor. Additions, changes, and deletions by other users are visible, and all types of movement through the Recordset are allowed
    Public Const gcladOpenStatic = 3            'A static cursor. A static copy of a set of records that you can use to find data or generate reports. Additions, changes, or deletions by other users are not visible.

'Lock Type Enumeration Values
    Public Const gcladLockUnspecified = -1      'Unspecified type of lock. Clones inherits lock type from the original Recordset.
    Public Const gcladLockReadOnly = 1          'Default. Read-only records
    Public Const gcladLockPessimistic = 2       'Pessimistic locking, record by record. The provider lock records immediately after editing
    Public Const gcladLockOptimistic = 3        'Optimistic locking, record by record. The provider lock records only when calling update
    Public Const gcladLockBatchOptimistic = 4   'Optimistic batch updates. Required for batch update mode

Let The User Choose Which Worksheet To Load To The Recordset

In the function below, I use an InputBox Type:=8 to let the user choose a cell on the worksheet that contains the data that should be loaded into the recordset:

Public Function GetSelectedSheet(strPrompt As String, _
                                 strTitle As String) As String
     
    'Declare variables
        Dim ws                          As Worksheet
        Dim rng                         As Range
     
    'Users - select a cell on a worksheet
        Set rng = Application.InputBox( _
                                       Prompt:=strPrompt, _
                                       Title:=strTitle, _
                                       Default:=ActiveCell.Address, _
                                       Type:=8) 'Range selection
                     
    'Get the parent worksheet of the selected cell
        Set ws = rng.Parent
         
    'Pass the name of the worksheet to the function
        GetSelectedSheet = ws.Name
     
    'Tidy up
        Set rng = Nothing
        Set ws = Nothing
 End Function

And I call the function like this:

        'Get worksheet to be loaded into recordset
            strWorksheet = GetSelectedSheet(strPrompt:="Select a cell on the worksheet to be loaded into the recordset", _
                                            strTitle:="Worksheet To Recordset")

Save The Data To A New Workbook

I could not get the ADO code to work with data in the same workbook, so in the end I decided to just save the data the user selected out to a new workbook:

First, I added a workbook:

        'Create a new workbook to hold all data from the selected worksheet
            Set wbADO = Workbooks.Add

Then I sent the original workbook, the new workbook and the the worksheet that the user selected to a Private Sub to handle the copying:

Call the sub:

        'Copy everything from the selected worksheet to the new workbook
            Call CopyData(wbSource:=wb, _
                          wbDestination:=wbADO, _
                          strSource:=strWorksheet)

The Sub() to copy the entire worksheet from one workbook to another:

Private Sub CopyData(wbSource As Workbook, _
                     wbDestination As Workbook, _
                     strSource As String)
                     
    wbSource.Worksheets(strSource).Copy wbDestination.Worksheets(1)

End Sub

And then the cleanup:

        'Cleanup the destination workbook
            Call CleanupWorkbook(wb:=wbADO)

The Sub() to handle any cleanup chores:

Private Sub CleanupWorkbook(wb As Workbook)

    'Declare variables
        Dim i As Long
        
    'Rename worksheets
    'Delete unneeded worksheets
        With wb
            .Worksheets(1).Name = "Data"
            For i = .Sheets.Count To 2 Step -1
                .Sheets(i).Delete
            Next i
        End With

End Sub

Lastly, I saved and closed the new workbook, since the ADO Process will want to open the workbook

        'Save and close the data workbook
            With wbADO
                .SaveAs wb.Path & "\" & Mid(wb.Name, 1, Len(wb.Name) - 5) & "_ADO.xlsx", FileFormat:=xlOpenXMLWorkbook
                strWorkbookADO = wbADO.FullName
                .Close
            End With

Create A Range Object To Measure Inputs

I would like to measure the amount of rows and columns in the input Range so that after I load the Recordset I can compare the the Range Dimensions to the Recordset Dimensions.

        'Create a range object to measure source data against final recordset data
            Set ws = wb.Worksheets(strWorksheet)
            Set rng = ws.Range("A1").CurrentRegion

SQL String

I like to create a SQL string and then pass the SQL string to the CommandText Property of the Command Object. I think this makes troubleshooting and tuning the SQL easier:

        'SQL string
            strSQL = "SELECT * FROM [Data$]"

Create The ADO Connection Object

I like to encapsulate any objects I am creating. Here is the function to create the ADO Connection Object:

Public Function GetADOConnection() As Object

    Set GetADOConnection = CreateObject("ADODB.Connection")
  
End Function

And here is how I call it:

        'Create ADO Connection Object
            Set cn = GetADOConnection()

ADO Connection Strings

ADO Connection Strings can be a little challenging, luckily, we have ConnectionStrings.com to help us out. Link is at the bottom of the post. I am using Office 2013 with a workbook in xlOpenXMLWorkbook format (.xlsx). So this is the connection string I’ll be using:

cn.Open ("Provider=Microsoft.ACE.OLEDB.12.0;" & _
         "Data Source=" & strWorkbookADO & ";" & _
         "Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1'")

Create The ADO Command Object

Next, I’ll create the ADO Command Object and set some of the properties of the object. Note here that I am using one of the Global Contants that I declared earlier. I also pass my SQL string here:

        'Create ADO Command Object
            Set cmd = GetCommand()
            Set cmd.ActiveConnection = cn
            cmd.CommandType = gcladCmdText
            cmd.CommandText = strSQL                        'Pass SQL String to the command object

And here is the function to create the Command Object :

Public Function GetCommand() As Object

    Set GetCommand = CreateObject("ADODB.Command")
  
End Function

Create And Load The ADO Recordset

Next I need to create and load the recordset.

'Create ADO Recordset Object and load records
            Set rs = GetRecordset()
            With rs
                .CursorLocation = gcladUseClient
                .CursorType = gcladOpenDynamic
                .LockType = gcladLockOptimistic
                .Open cmd
            End With

And here is the encapsulated function that creates the ADO Recordset Object:

Public Function GetRecordset() As Object

    Set GetRecordset = CreateObject("ADODB.Recordset")
  
End Function

Check Recordset Results Against Expected Results

Lastly, I want to compare the Recordset results against expected results. Do do this I will count the number of records and fields in the Recordset and compare them against the number of rows and columns in the Range Object I created earlier:

        'Compare recordset results to original data
            Debug.Print "The recordset contains " & Format(rs.RecordCount, "##,##0") & " records and " & rs.Fields.Count & " fields"
            Debug.Print "The range contains " & Format(rng.Rows.Count - 1, "##,##0") & " rows and " & rng.Columns.Count & " columns" '-1 to discount header row

Returns:

The recordset contains 60,398 records and 23 fields
The range contains 60,398 rows and 23 columns

Everything is working as it should.

The Main Procedure

Sub PopulateRecordset()

    'Declare variables
        Dim wb As Workbook
        Dim wbADO As Workbook
        Dim ws As Worksheet
        Dim rng As Range
        Dim cn As Object
        Dim rs As Object
        Dim cmd As Object
        Dim strWorksheet As String
        Dim strSQL As String
        Dim strWorkbookADO As String
                
    'Excel environemnt
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With

    'Initialize
        Set wb = ThisWorkbook
        
        'Get worksheet to be loaded into recordset
            strWorksheet = GetSelectedSheet(strPrompt:="Select a cell on the worksheet to be loaded into the recordset", _
                                            strTitle:="Worksheet To Recordset")
                                            
        'Create a new workbook to hold all data from the selected worksheet
            Set wbADO = Workbooks.Add
            
        'Copy everything from the selected worksheet to the new workbook
            Call CopyData(wbSource:=wb, _
                          wbDestination:=wbADO, _
                          strSource:=strWorksheet)
                          
        'Cleanup the destination workbook
            Call CleanupWorkbook(wb:=wbADO)
            
        'Save and close the data workbook
            With wbADO
                .SaveAs wb.Path & "\" & Mid(wb.Name, 1, Len(wb.Name) - 5) & "_ADO.xlsx", FileFormat:=xlOpenXMLWorkbook
                strWorkbookADO = wbADO.FullName
                .Close
            End With

        'Create a range object to measure source data against final recordset data
            Set ws = wb.Worksheets(strWorksheet)
            Set rng = ws.Range("A1").CurrentRegion

        'SQL string
            strSQL = "SELECT * FROM [Data$]"

        'Create ADO Connection Object
            Set cn = GetADOConnection()
            cn.Open ("Provider=Microsoft.ACE.OLEDB.12.0;" & _
                     "Data Source=" & strWorkbookADO & ";" & _
                     "Extended Properties='Excel 12.0 Xml;HDR=YES;IMEX=1'")

        'Create ADO Command Object
            Set cmd = GetCommand()
            Set cmd.ActiveConnection = cn
            cmd.CommandType = gcladCmdText
            cmd.CommandText = strSQL                        'Pass SQL String to the command object


        'Create ADO Recordset Object and load records
            Set rs = GetRecordset()
            With rs
                .CursorLocation = gcladUseClient
                .CursorType = gcladOpenDynamic
                .LockType = gcladLockOptimistic
                .Open cmd
            End With

        'Compare recordset results to original data
            Debug.Print "The recordset contains " & Format(rs.RecordCount, "##,##0") & " records and " & rs.Fields.Count & " fields"
            Debug.Print "The range contains " & Format(rng.Rows.Count - 1, "##,##0") & " rows and " & rng.Columns.Count & " columns" '-1 to discount header row
            
        'Tidy up
            'Close objects
                rs.Close
                cn.Close
                
            'Destroy objects
                Set rs = Nothing
                Set cmd = Nothing
                Set cn = Nothing
                Set rng = Nothing
                Set ws = Nothing
                Set wbADO = Nothing
                Set wb = Nothing
                
            'Excel environemnt
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
                    
End Sub

Tidy up

Additional Resources

Downloads

Credits

    Data courtesy Microsoft Excel 2013 Building Data Models with PowerPivot by Alberto Ferrari and Marco Russo (Mar 25, 2013)

Final Thoughts

    I could not get the ADO to work with data in the same workbook that contains the code. That does not mean you cannot – I don’t know. I would generally invoke a FileDialogFilePicker function to select a file that contains the data for processing. That’s it for today. I’ll come back later with more stuff on working with Recordsets. Thanks, Dennis!
, , ,

AdvanceFinal500

AAAAAARRRRGGGGHHHHH!!!!! I just hate reviewing an Excel Workbook with lots of tabs and the ActiveCell is different on every tab. Let’s look at some ways to fix this. Before I can get going, I need to:

  1. Add several copies of a worksheet to the active workbook
  2. Randomize the ActiveCell on each worksheet



    edit: While my main point of this post is the GoTo Method of the Excel Application Object, I touch on many other items as well

  1. Copy Method of the Worksheet Object
  2. Cells Property of the Range Object
  3. Cells Property of the Worksheet Object
  4. Find Method of the Range Object
  5. Range Property of the Worksheet Object
  6. Select..Case Statement
  7. Application Worksheet Function Randbetween
  8. InputBox Method of the Excel Application Object
  9. VBA Instr() Function


Add Worksheet Copies

I downloaded a sample P&L Statement from the Internet. Now I just want to add 29 copies of the P&L Statement to the workbook.

PLStatement

Option Explicit

Sub Foo()

    'Declare variables
        Dim wb As Workbook
        Dim wsPL As Worksheet
    
    'Excel environment
        With Application
            .ScreenUpdating = False
        End With
        
    'Initialize
        Set wb = ThisWorkbook
        Set wsPL = wb.Worksheets("PL_CC")
    
    'Add worksheets
        Call CopyWorksheets(wb:=wb, _
                            wsSource:=wsPL, _
                            NumberOfCopies:=30)

    'Tidy up
        'Destroy objects
            Set wsPL = Nothing
            Set wb = Nothing
            
        'Excel environment
            With Application
                .ScreenUpdating = True
            End With
        
End Sub

Private Sub CopyWorksheets(wb As Workbook, _
                           wsSource As Worksheet, _
                           NumberOfCopies As Long)
    
    'Declare variables
        Dim i As Long
                           
                                   
    'Make copies of worksheet
        For i = 2 To NumberOfCopies
            wsSource.Copy _
                After:=wb.Worksheets(Worksheets.Count)
                ActiveSheet.Name = wsSource.Name & i
        Next i
End Sub

Great! Added 29 copies of the P&L Statement and renamed each worksheet.

PLStatementsMult

Randomize The ActiveCell

Since I made copies of a worksheet the ActiveCell is the same on every worksheet. I want to randomize the ActiveCell before I create some code to set the ActiveCell on every worksheet. The P&L Statement I am using has 15 Columns and 77 Rows. But I would like to determine those values with some VBA so that my code is more dynamic and will wok with any worksheet that I use in the future.

First I’ll check the UsedRange

Private Sub GetUsedRange(ws As Worksheet)

    'Declare variables
        Dim rng As Range
        
    'Create range object
        Set rng = ws.UsedRange
        
    'Print adddress
        Debug.Print rng.Address
        
    'Tidy up
        Set rng = Nothing
  
End Sub

Returns:

$A$1:$Y$77

That’s not what I want. I’m looking for $O$77 as the last used cell.

Last Used Cell Function

This function returns the correct last used cell:

Private Function GetLastUsedCell(ws As Worksheet) As Range

    'Declare variables
        Dim rngLastUsedCell As Range
        
    'Create range object from last used cell
        Set rngLastUsedCell = ActiveSheet.Cells.Find(What:="*", _
                                               After:=ws.Cells(1, 1), _
                                               LookIn:=xlFormulas, _
                                               LookAt:=xlPart, _
                                               SearchOrder:=xlByRows, _
                                               SearchDirection:=xlPrevious, _
                                               MatchCase:=False)

    'Pass object to function
        Set GetLastUsedCell = rngLastUsedCell
    
    'Tidy up
        Set rngLastUsedCell = Nothing
        
End Function

Returns:

$O$77

Perfect! That’s what I was looking for. Now I need to find the first used cell.

First Used Cell Function

In pure violation of the recommendation of Bovey et. al., in their landmark tome, Professional Excel Development: The Definitive Guide to Developing Applications Using Microsoft Excel, VBA, and .NET (2nd Edition), I generally begin all of my worksheets at $A$1, but not always. So I need a function to find the first used cell on the worksheet no matter what worksheet I throw at the function.

Private Function GetFirstUsedCell(ws As Worksheet) As Range

    'Declare variables
        Dim rng As Range
        
    'Create range object from the first cell in the UsedRange
        Set rng = ws.UsedRange.Cells(1, 1)

    'Pass object to function
        Set GetFirstUsedCell = rng
    
    'Tidy up
        Set rng = Nothing
        
End Function

Returns:

$A$1

Excellent! That’s what I was looking for. Now that I have the first and last cells of the true UsedRange, I need to use the first cell and last cell to create a range of everything.

Create A Big Range

Private Function GetBigRange(ws As Worksheet, _
                             rngStart As Range, _
                             rngStop As Range) As Range
                             
    'Declare variables
        Dim rng As Range
        
    'Creat a range object from start and stop range positions
        Set rng = ws.Range(rngStart, rngStop)
        
    'Pass range object to function
        Set GetBigRange = rng
        
    'Tidy up
        Set rng = Nothing
                             
End Function

Returns:

$A$1:$O$77

Awesome! Now, I can use the entire range to generate random values to use for the ActiveCell.

Get Values

I’ll need to call a get value function 4 times to get the first row, the last row, the first column and the last column.

Here’s the function:

Private Function GetValueFromRange(rng As Range, _
                                   strType As String) As Long
    
    'Declare variables
        Dim x As Long
        
    'Get value depending on type
        With rng
            Select Case strType
                Case "FirstRow"
                    x = .Row
                Case "LastRow"
                    x = .Rows.Count
                Case "FirstColumn"
                    x = .Column
                Case "LastColumn"
                    x = .Columns.Count
            End Select
        End With
        
    'Pass value to function
        GetValueFromRange = x
     
End Function

And here is how I called the function 4 different times:

'Get values form total used range
        BeginRow = GetValueFromRange(rng:=rngAll, _
                                     strType:="FirstRow")
                                     
        EndRow = GetValueFromRange(rng:=rngAll, _
                                   strType:="LastRow")
                                   
        BeginColumn = GetValueFromRange(rng:=rngAll, _
                                   strType:="FirstColumn")
                                   
        EndColumn = GetValueFromRange(rng:=rngAll, _
                                   strType:="LastColumn")

Returns:

  • BeginRow 1
  • EndRow 77
  • BeginColumn 1
  • EndColumn 15
  • Gnarly! Now, I have the values that I can pass to a function to generate a random cell reference.

    Get Random Values

    Now that I have high-low pairs for rows and columns, I can use the Worksheet Function Randbetween to generate some random values for the row and column numbers.

    Private Function GetRandomValue(ValueLow As Long, _
                                    ValueHigh As Long) As Long
    
        'Declare variables
            Dim x As Long
            
        'Generate a random value
            x = Application.WorksheetFunction.RandBetween(ValueLow, ValueHigh)
            
        'Pass value to the function
            GetRandomValue = x
    
    End Function
    

    And here is how I call the function to get a random row and a random column:

                        'Get random row
                            RandomRow = GetRandomValue(ValueLow:=BeginRow, _
                                                       ValueHigh:=EndRow)
                                                       
                        'Get random column
                            RandomColumn = GetRandomValue(ValueLow:=BeginColumn, _
                                                          ValueHigh:=EndColumn)
    

    Go There!

    Once I have a Row and Column number, I can use the Cells property of the Worksheet Object in conjunction with the GoTo Method of the Excel Application Object to go to the desired cell:

                        'Go to the cell
                            Application.GoTo ws.Cells(RandomRow, RandomColumn), _
                                                      Scroll:=True
    

    ActiveCellOutput

    Bam! I sent the Worksheet Name and the ActiveCell Address to the Immediate Window.

    Let’s Go!

    Recall, the end in mind is to set the ActiveCell to the same cell on every sheet to aid in our visual review of each worksheet in the workbook. I introduced the Code Snippet just a few lines up:

                        'Go to the cell
                            Application.GoTo ws.Cells(RandomRow, RandomColumn), _
                                                      Scroll:=True
    

    So I just need to pass 2 values of datatype long and I can go to any cell I want. So to GoTo R1C1 is merely:

                        'Go to the cell
                            Application.GoTo ws.Cells(1, 1), _
                                                      Scroll:=True
    

    Pow! The ActiveCell on every Worksheet is now R1C1 (A1). The review is much more pleasurable.

    But I want flexibility!

    What if you want to set the ActiveCell as $R$10 on every Worksheet, next time you want $C$12 – you get the point – enter the InputBox.

    Get Input

    I previously demonstrated a function to allow the user to select a cell here. Read more about using InputBoxes and the various types on MSDN

    For my purposes, I want the user to select a cell, so I will use Type:=8 for my InputBox:

    Public Function GetCell(ws As Worksheet) As Range
         
        'Declare variables
            Dim rng                         As Range
         
        'Users - select a cell on a worksheet
            Set rng = Application.InputBox( _
                                           Prompt:="Please select a cell on the worksheet", _
                                           Title:="Select a cell", _
                                           Default:=ActiveCell.Address, _
                                           Type:=8) 'Range selection
             
        'Pass the range object to the function
            Set GetCell = rng
         
        'Tidy up
            Set rng = Nothing
    
     End Function
    

    Now, I will test the GetCell Function:

    Option Explicit
    
    Sub testit()
    
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim C As Range
        
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets("PL_CC")
        Set C = GetCell(ws:=ws)
        
        Debug.Print C.Address
        
        Set C = Nothing
        Set ws = Nothing
        Set wb = Nothing
    
    End Sub
    

    And spin the test code:

    SelectCellPrompt

    I am prompted to select a cell on the worksheet. I clicked on cell $C$8.

    ClickCellOutput

    Shazam! That worked perfectly! I can move the Function into production and dynamically let the user choose what should be the ActiveCell on every Worksheet.

    Option Explicit
    
    Sub SetActiveCell()
        
        'Declare variables
            Dim wb As Workbook
            Dim ws As Worksheet
            Dim wsPL As Worksheet
            Dim C As Range
            Dim userRow As Long
            Dim userColumn As Long
        
        'Initialize
            Set wb = ThisWorkbook
            Set wsPL = wb.Worksheets("PL_CC")
            Set C = GetCell(ws:=wsPL)
        
            With C
                userRow = .Row
                userColumn = .Column
            End With
        
        'Set the ActiveCell on each worksheet
            For Each ws In wb.Worksheets
                If InStr(ws.Name, "PL") Then
                    Application.GoTo ws.Cells(userRow, userColumn), _
                                    Scroll:=True
                End If
            Next ws
        
        'Tidy up
            Set C = Nothing
            Set wsPL = Nothing
            Set wb = Nothing
            
    End Sub
    

    ActiveCellC8

    All worksheets with “PL” as part of the name are now active on cell $C$8. Much easier to review!

    Tidy Up

    I’m outta here. I just landed on Boardwalk and the kid has the monopoly with a hotel – I’m pretty sure I’m bankrupt!

    BankruptFinal

    , , , , , , , , ,

    UndauntedFinalII

    Some time back, Shane Devenshire and I added our two cents to a question a user asked on LinkedIn here. So when I decided to write a post regarding UnPivot using nested For..Next Loops, it only made sense to ask Shane if he would like to write a guest-post and tutorial on his process and Normalization Utility.

    So, without further ado, take it away Shane!

    Un-pivoting Excel Spreadsheet data – Creating a Normalized Table

    Introduction:
    Recently I was working with Tableau and discovered that the company had an Excel add-in for taking data laid out like a pivot table or standard spreadsheet layout:

    SDTable1

    And converting it to data laid out like a table (normalized):

    SDTable2

    When I tested the add-in on a large data it took over an hour. I knew I could do this faster, even manually, but I decided to automate the process, which uses a number of Excel’s built-in features. The attached is the initial result of this endeavor.

    When one wants to convert spreadsheet data, laid out as in a pivot table, into what is called a normalized database or table one can use a VBA approach which loops through cells rearranging the data and copying repeated data as many times as necessary. You could do this manually but it’s rather labor intensive. However, even the looping approach is not very efficient, which becomes apparent when the data sets are large. An alternate approach using some of Excel’s less familiar features can speed up the process dramatically.

    I created the three step wizard shown below to provide some enhancements:

    In the first step, see Figure 1, the user indicates the top left corner of the values area, the sample show what we are looking for. This allows the VBA routine to decide where the column titles are and which columns will be retained in their default structure and which column will be “transposed” (placed in a singe “values” column.

    Figure 1 – First step of the Wizard:

    SDTable3

    If the user clicks Finish, the code accepts all the defaults and created the normalized table. If the user clicks Next the second step of the Wizard is displayed, see Figure 2.

    Figure 2 – This is the second step of the Wizard:

    SDTable4

    Here the Wizard displays the first ten fields to the left of the values area allowing the user to choose which ones to include. If there are dates at the top of the values area a default name for the new field is Date which is automatically entered in the Field Name box which the user can change or leave blank. The Delimiter box allows the user to indicate what delimiter to use when concatenating the fields together (discussed later). The delimiter should not be a character found in the chosen fields on the left in the dialog box.

    Clicking Next brings you to the third step of the wizard, shown in Figure 3.

    Figure 3 – This is the third step of the Wizard:

    SDTable5

    The first two options on this screen allow the user to retain or remove rows of data in the output table which have either a 0 (zero) or are blank in the values column. This reduces the file size when appropriate. Because the Excel feature used to normalize the data generates an Excel “table”, the last three options allow the user to turn on or off various features of tables or convert the table to a range.

    Overview:

    You can create a pivot table in Excel with a feature called Multiple Consolidation Ranges. You can then use the Show Details command on the Grand/Grand Total to produce a normalized table. Both of these commands are very fast! One limitation of this command is that is was designed to be used with data which has only one label field to the left of the values area. To bypass this limitation one can combine all the label fields by using a concatenation formula, but to later break it apart a delimiter is need. Then the multiple consolidation command is run against this one field and all of the value columns. The concatenated field of the output is then parsed using Excel’s Text to Columns command with the delimiter specified by the user. These steps are very fast. As noted in the discussion of step 3 of the Wizard, I have also incorporated a number of additional options.

    I create two modules and a user form to control this process. All of the code and form can be found in the attached file. I have tried not to obfuscate the code. There are four main components to consider

    Option Explicit		
    Dim k	As Integer	number of fields to be output
    		
    Sub NormalizeData()		
    Dim sTempSheet	As String	Name of the sheet where the pivot table is placed
    Dim j	As Integer	loop counter
    Dim i	As Integer	loop counter
    Dim sMyFormula	As String	the concatenated formula
    Dim sMySheet	As String	Name of the sheet where initial data is
    Dim lMyColumn	As Long	column number of column used for concatenated formula
    Dim sDataSheet	As String	Name of the sheet where the final output is placed
    Dim sSource	As String	Source range for the pivot table
    Dim sMyTitles	As String	The concatenated titles
        		
        'Speeds things up		
        With Application		
            .ScreenUpdating = False	
            .Calculation = xlCalculationManual	
        End With		
        		
        sMySheet = ActiveSheet.Name	
        		
        'uses the ref edit address to place the cursor in the top left corner of the values to be transposed
        Range(frmNormalize.refFirstData).Activate
        		
        'Inserts new blank column for concatenation formula
        ActiveCell.EntireColumn.Insert	
        lMyColumn = ActiveCell.Column	
        		
        'initializes loop counters		
        j = 1		
        k = 0		
        		
        'loops through each column and concatenates it if that column was choosen
        'this loop runs a max of ten times	
        For i = iTotCols To 1 Step -1	
            If frmNormalize.Controls("Checkbox" & j) = True Then
                If sMyFormula = "" Then	
                    sMyFormula = "=RC[" & -i & "]"	
                Else		
                    sMyFormula = sMyFormula & "&" & """" & sDelimiter & """" & "&RC[" & -i & "]"
                End If		
                k = k + 1		
            End If		
            j = j + 1		
        Next i		
    

    The above code creates a concatenation formula of the forma =A5&”*”&B5&”*”&C5, which is then converted to values like 2008*Coffee*Seattle. The above code loops once to build the concatenated formula for a single row, and then fills it down. All very fast. The delimiter must be unique because it will be used by the Text to Columns command to parse the data.

        'Selects the title row and inserts the formula there.
        'Here it will be the concatenated titles	
        ActiveCell.Offset(-1, 0).Select	
        ActiveCell = sMyFormula		
        Range(ActiveCell, Cells(lLastRow, ActiveCell.Column)).FillDown
        ActiveSheet.Calculate		
        		
        'converts the range to values	
        Selection = Selection.Value	
        		
        'the concatenated titles is saved for later use since it would be lost during the next commands
        sMyTitles = ActiveCell		
        		
        'Creates "consolidated ranges" pivot table, which will later be removed
        sSource = Range(Cells(lFirstDataRow - 1, lFirstDataColumn), Cells(lLastRow, lLastColumn)).Address(, , xlR1C1)
        ActiveWorkbook.PivotCaches.Create(SourceType:=xlConsolidation, _
            SourceData:=sSource).CreatePivotTable _
            TableDestination:="", _	
           TableName:="NPT"		
        On Error Resume Next	
        ActiveSheet.PivotTables("NPT").PivotFields("Count of Value").Function = xlSum
    

    The above code is one of the steps that make this process so fast. Excel builds a “consolidated range” pivot table. If some of the data is non-numeric the data field would use “Count” which we want to convert to “Sum”, this is the reason for lines 67 and 68. Manually you can reach this command by pressing Alt+D+P to expose the PivotTable and PivotChart Wizard from Excel 2003, shown in Figure 4, below:

    Figure 4 – This is the first step of Microsoft’s Wizard, which the code is emulating:

    SDTable6

    The command we are using is Multiple consolidation ranges. Excel gives us a pivot table which Is, in and of itself, not very useful. But in the following code, the ShowDetail command used on the Grand/Grand Total creates the desired output. And it’s very fast!

        'creates database detail from the above pivot table, which is pretty much what we want
        sTempSheet = ActiveSheet.Name	
        ActiveCell.SpecialCells(xlLastCell).ShowDetail = True   'Generates normalized database
        		
        'you can turn off these alerts earlier in the macro but its often safer to turn them of and on only where needed
        ‘The pivot table sheet is no longer need and can be deleted
        With Application		
            .DisplayAlerts = False		
            Sheets(sTempSheet).Delete	
            .DisplayAlerts = True
    

    The following code first inserts some blank columns and then uses Excel’s Text to Columns command to convert the concatenated field back to its original individual fields. Yet another very fast command!

            'inserts as many columns as need to display all the concatenated fields
            Range("B1:" & Cells(1, k).Address).EntireColumn.Insert
            'puts the concatenated titles in cell A1
            Range("A1") = sMyTitles
            Range("A1", [A1].End(xlDown)).Select
            
            'the text to columns command overwrites the table's default titles
            .DisplayAlerts = False
        
            'converts concatenated field to a set of columns
            Selection.TextToColumns _
                Destination:=Range("A1"), _
                DataType:=xlDelimited, _
                ConsecutiveDelimiter:=False, _
                Other:=True, _
                OtherChar:=sDelimiter
            .DisplayAlerts = True
        End With
        
        'best fit the data
        Range("A1:" & Cells(1, iTotCols).Address).EntireColumn.AutoFit
        
        'enters the field name for the transposed row
        Range("A1").Offset(0, k) = frmNormalize.txtNewFieldName
        
        'Options - removes rows With blanks or zeros in the value field
        If frmNormalize.cbBlanks = True Then
            RemoveBlanks
        End If
        If frmNormalize.cbZeros = True Then
            RemoveZeros
        End If
        ConvertToRange
        
        'Cleanup
        sDataSheet = ActiveSheet.Name
        Sheets(sMySheet).Activate
    
       ‘removes the concatenated formula column from the original data.
        Cells(1, lMyColumn).EntireColumn.Delete    
        Sheets(sDataSheet).Activate
        Application.Calculation = xlCalculationAutomatic
        
    End Sub
    

    Deleting rows can be done by looping from the bottom and deleting each row that meets a given condition. This is very slow! If you can select all the rows which meet some condition you can then delete them with one command. This is faster. However, if Excel needs to adjust cell references based on the deleted rows this can also slow things down. One solution is to move all the rows you want to delete to the bottom of the dataset and then delete them with a single command. In the following two modules something like that is done. For removing rows with blanks in the values column you simple need to sort by the values and the use the Go to Special, Blanks command to select all the rows.

    RemoveBlanks Subroutine

    Removes rows with blanks in value column
    Private Sub RemoveBlanks()
        
        'puts all blank cells at the bottom of the table making deleting rows faster
        With ActiveSheet.Sort
            .SortFields.Clear
            .SortFields.Add _
                Key:=Cells(1, k + 2), _
                SortOn:=xlSortOnValues, _
                Order:=xlAscending
            .SetRange ActiveCell.CurrentRegion
            .Header = xlYes
            .Orientation = xlTopToBottom
            .Apply
        End With
        
        'this command deletes all rows with blanks in the values column
        Selection.Offset(0, k + 1).SpecialCells(xlCellTypeBlanks).EntireRow.Delete
    End Sub
    

    Removing zeros is a little trickier, because, if the user has chosen not to remove blanks you can’t clear all cells with 0’s and then run the above command, extra row may be deleted. In this case you can replace the 0’s with text, then clear the cells with zeros and then use line 18 above to remove the rows. Finally you can clear the cells containing text. I chose not to do that; instead I entered the formula =1/0 in all the cells with 0 and then sorted putting this group near the bottom. Then I selected an deleted all the cells with error formulas with one step, line 25.

    RemoveZeros Subroutine

    This code removes all rows with zeros in the values field
    Private Sub RemoveZeros()
        
        'replaces all cells with 0 with a formula generating an DIV/0! error
        Range("A1").CurrentRegion.Resize(, 1).Offset(0, k + 1).Select
            Selection.Replace _
                What:="0", _
                Replacement:="=1/0", _
                LookAt:=xlWhole
            
        'this sorts all cells with errors to the bottom making deleting rows faster
        With ActiveSheet.Sort
            .SortFields.Clear
            .SortFields.Add _
                Key:=Cells(1, k + 2), _
                SortOn:=xlSortOnValues, _
                Order:=xlAscending
            .SetRange ActiveCell.CurrentRegion
            .Header = xlYes
            .Orientation = xlTopToBottom
            .Apply
        End With
        
        'this command deletes all rows with errors
        Selection.SpecialCells(xlCellTypeFormulas, 16).EntireRow.Delete
    End Sub
    

    ConvertToRange Subroutine

    'The user can keep the table or convert it to a regular spreadsheet range
    'The user can keep or remove the autofilters in either case
    'The user can keep or remove the formatting in either case
    Private Sub ConvertToRange()
        Dim rList As Range
     
        With ActiveSheet.ListObjects(1)
            Set rList = .Range
            If frmNormalize.cbConvertToRange = True Then
                .Unlist                                                                     'converts a table to a range
                If frmNormalize.cbFilter = False Then Range("A1").CurrentRegion.AutoFilter  'turns on  autofilter
                If frmNormalize.cbFormatting = True Then                                    'removes table formatting
                    With rList
                        .Interior.ColorIndex = xlColorIndexNone
                        .Font.ColorIndex = xlColorIndexAutomatic
                        .Borders.LineStyle = xlLineStyleNone
                    End With
                End If
            Else
                If frmNormalize.cbFilter = True Then .Range.AutoFilter  'Turns off autofilter
                If frmNormalize.cbFormatting = True Then                'Turns off table formatting
                    .TableStyle = ""
                End If
            End If
        End With
    End Sub
    

    Tidy Up

    That’s it for today. HUGE thanks to Shane Devenshire for sharing his Data Normalization Utility with us and allowing me to post it on my blog. I hope you enjoy it and find it useful in your daily work.

    Downloads

    Download the file from OneDrive. The zip archive contains the Data Normalizer in .xlsm format as well as a Microsoft Word version of the tutorial.

    Other posts regarding data normalization or “UnPivot” at dataprose.org

      Undaunted by UnPivot

    Other posts regarding data normalization or “UnPivot” around the horn

      UnPivot Shootout – DDoE
      Reshape Data
    , ,

    UndauntedFinal

    Well, maybe an UnPivot or Reverse CrossTab process in Excel with VBA does not require the same amount of courage as Meriwether Lewis (pictured here) must have had as he and Clark set out on their exploration of the Louisiana Territory – nonetheless, the process can seem a bit overwhelming to a newcomer. I’ll do my best to explain it in a straight-forward manner.

    Many times, we receive data in a crosstab format but we want it in a normalized format suitable for use with Pivot Tables or Excel Tables :

    Crosstab

    Here’s a fairly standard layout to retrieve data from a P&L Cube in Essbase using the Essbase Spreadsheet Add-in. This layout, however, is not conducive for creating PivotTables or for analyzing with Excel Tables. There are ways in Essbase to get this is a better “Normalized” format, but I have this at hand so it suits my needs for now.

    Tell Me What You Need….

    CrosstabMarkup

    I marked-up the spreadsheet to show which fields I am interested in and in what order I would like them in the final output :

    1. Organization – Some division, region, district, store in the company under analysis
    2. Scenario – Budget, Forecast, Actual and various versions of the aforementioned
    3. Time – I’m showing periods in the screen shot, could be any measure of time
    4. Accounts – The name and or number of the account
    5. Measure – What we are really interested in : Amounts, statistics, ratios, etc…

    How Would You Like That….

    Here is the desired output :

    CrosstabOutput

    Some of the values in the desired output are static, that is, the same value from the same cell must be output over and over again. Other times, I need to move across columns and still others I need to move down rows

    Segue To Some Functions…

    I’ll need a coupe of Functions so that my code is dynamic to some extent. I do not know the last column or the last row during development stage so I will need to determine those values at run-time.

    Last Used Column

    This function has only one argument, a worksheet, it will return the column number of the last used cell on the worksheet

    Public Function GetLastColumn(ws As Worksheet) As Long
    
        'Input  :   Worksheet
        'Output :   A column number of type long
        
        'Declare variables
            Dim rng As Range
            Dim lngColumn As Long
            
        'Get range address of last cell on worksheet
            Set rng = ws.Cells.SpecialCells(xlCellTypeLastCell)
        
        'Get column number of last cell
            lngColumn = rng.Column
            
        'Pass value to function
            GetLastColumn = lngColumn
        
        'Tidy up
            Set rng = Nothing
            
    End Function
    

    Last Used Row

    This function has only one argument, a worksheet, it will return the row number of the last used cell in Column A :

    Public Function GetRows(ws As Worksheet) As Long
    
        'Input          :   Worksheet
        'Output         :   A row number of type long
        'Assumptions    :   First column (A)
        
        'Declare variables
            Dim r As Long
        
        'Get last row
            With ws
                r = .Cells(Rows.Count, 1).End(xlUp).Row
            End With
            
        'Pass value to function
            GetRows = r
            
    End Function
    

    The Main Procedure

    Here is the main procedure. The main thing to pay attention to is the different variable and static cell values used to get the correct output. Take a look at the code inside the For..Next Loops :

    Option Explicit
    
    Sub UnPivotData()
    
        'Purpose    :   Convert crosstab data to normalized data
        'Author     :   Winston Snyder
        'Date       :   5/26/2014
    
        'Declare variables
            Dim wb As Workbook
            Dim wsData As Worksheet
            Dim wsDataNormalized As Worksheet
            Dim MaxColumns As Long
            Dim MaxRows As Long
            Dim i As Long
            Dim j As Long
            Dim k As Long
    
        'Excel environment - speed things up
            With Application
                .ScreenUpdating = False
                .DisplayAlerts = False
                .EnableEvents = False
                .Calculation = xlCalculationManual
            End With
    
        'Objects
            Set wb = ThisWorkbook
            With wb
                Set wsData = .Worksheets("Data")
                Set wsDataNormalized = .Worksheets("Normal")
            End With
    
        'Initialize
            wsDataNormalized.UsedRange.ClearContents
            MaxColumns = GetLastColumn(ws:=wsData)
            MaxRows = GetRows(ws:=wsData)
            k = 2
    
        'Convert cross-tab report to normalized data table structure
            With wsDataNormalized
                For i = 6 To MaxRows 'Begin with first row of (Measures)
                    For j = 2 To MaxColumns 'Begin with first column of data (Measures)
                        .Cells(k, 1).Value = wsData.Cells(4, 1).Value   'Organization
                        .Cells(k, 2).Value = wsData.Cells(2, 1).Value   'Scenario
                        .Cells(k, 3).Value = wsData.Cells(5, j).Value  'Time
                        .Cells(k, 4).Value = wsData.Cells(i, 1).Value  'Account
                        .Cells(k, 5).Value = wsData.Cells(i, j).Value  'Measure
                        k = k + 1
                    Next j
                Next i
    
                'Add headers
                .Range("A1").Value = "Organization"
                .Range("B1").Value = "Scenario"
                .Range("C1").Value = "Time"
                .Range("D1").Value = "Account"
                .Range("E1").Value = "Measure"
            End With
    
        'Tidy up
            'Destroy objects
                Set wsDataNormalized = Nothing
                Set wsData = Nothing
                Set wb = Nothing
    
            'Restore Excel environment
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
    End Sub
    

    CrosstabOutputNormal

    Perfect Ready to Pivot or analyze with Excel Tables an Structured Reference Formulas.

    It’s About Time….

    I’m going to add a couple of lines to the Sub() to see how long it takes. I’ll use the GetTickCount function from the Windows kernel32 library :

    Option Explicit
    Public Declare Function GetTickCount Lib "kernel32.dll" () As Long
    
    Sub UnPivotData()
    
        'Purpose    :   Convert crosstab data to normalized data
        'Author     :   Winston Snyder
        'Date       :   5/26/2014
        
        'Start timer
            Dim t As Long
            t = GetTickCount
    
        'Declare variables
            Dim wb As Workbook
            Dim wsData As Worksheet
            Dim wsDataNormalized As Worksheet
            Dim MaxColumns As Long
            Dim MaxRows As Long
            Dim i As Long
            Dim j As Long
            Dim k As Long
    
        'Excel environment - speed things up
            With Application
                .ScreenUpdating = False
                .DisplayAlerts = False
                .EnableEvents = False
                .Calculation = xlCalculationManual
            End With
    
        'Objects
            Set wb = ThisWorkbook
            With wb
                Set wsData = .Worksheets("Data")
                Set wsDataNormalized = .Worksheets("Normal")
            End With
    
        'Initialize
            wsDataNormalized.UsedRange.ClearContents
            MaxColumns = GetLastColumn(ws:=wsData)
            MaxRows = GetRows(ws:=wsData)
            k = 2
    
        'Convert cross-tab report to normalized data table structure
            With wsDataNormalized
                For i = 6 To MaxRows 'Begin with first row of (Measures)
                    For j = 2 To MaxColumns 'Begin with first column of data (Measures)
                        .Cells(k, 1).Value = wsData.Cells(4, 1).Value   'Organization
                        .Cells(k, 2).Value = wsData.Cells(2, 1).Value   'Scenario
                        .Cells(k, 3).Value = wsData.Cells(5, j).Value  'Time
                        .Cells(k, 4).Value = wsData.Cells(i, 1).Value  'Account
                        .Cells(k, 5).Value = wsData.Cells(i, j).Value  'Measure
                        k = k + 1
                    Next j
                Next i
    
                'Add headers
                .Range("A1").Value = "Organization"
                .Range("B1").Value = "Scenario"
                .Range("C1").Value = "Time"
                .Range("D1").Value = "Account"
                .Range("E1").Value = "Measure"
            End With
    
        'Tidy up
            'Destroy objects
                Set wsDataNormalized = Nothing
                Set wsData = Nothing
                Set wb = Nothing
    
            'Restore Excel environment
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
                
        'Timer
            MsgBox GetTickCount - t & " Milliseconds", , " Milliseconds"
    
    End Sub
    

    I ran three trials of the revised code to check the timer. It ran in 62 ms, 78 ms and 62 ms respectively.

    Tidy Up

      UnPivot

        UnPivot Shootout – DDoE
        Reshape Data

      Essbase

        Hyperion Essbase
        Hyperion Essbase Spreadsheet Add-In

      kernel32.dll

        kernel32.dll
        kernel32 documentation
      ,

    MrCleanFinalFinal

    In my last post on working with strings, I demonstrated some VBA with a call to the Regular Expression (RegExp) library to split alpha characters from numeric characters where no delimiter was present. Today I received a data set that contained 10’s of thousand of strings that contained some trailing stuff that I wanted to remove. My initial thought was to use the RegExp engine with the correct pattern, but I discovered a better way.

    The Requirements

    Upon review of the strings, the pattern I discovered:

    1. The string always begin with an alpha or numeric character with a mix of upper and lower case
    2. The string I need to preserve end with an alpha or numeric character with a mix of upper and lower case
    3. Everything from the beginning of the of the first alphanumeric to the last alphanumeric must be preserved as is, spaces, case, special characters, whatever
    4. Everything trailing to right of the last alphanumeric may safely be removed, special characters, non-printable characters, spaces, whatever
    5. Strings are of random lengths both input and output

    So, I need to :

    1. Search from the end (right) of the string
    2. Find the first alphanumeric character irregardless of case
    3. Return the string beginning from the first character to the character position determined in the previous step

    Quick Segue – The Functions

    Before I get to the Sub Procedure, I would like to review all of the Functions I am using to make the process fairly dynamic :

    GetSelectedSheet

    Here I am using an InputBox to allow the user to select a worksheet at run-time. The InputBox Method has 1 required parameter and 7 optional parameters. If the optional parameters are not utilized, then the InputBox returns a text value. However, The optional Type parameter makes the InputBox more powerful. In the function, I am using Type:=8 to return a cell reference as a Range Object. You can read more about the InputBox Method here.

    Public Function GetSelectedSheet() As String
        
        'Declare variables
            Dim ws                          As Worksheet
            Dim rng                         As Range
        
        'Users - select a cell on a worksheet
            Set rng = InputBox( _
                        Prompt:="Please select a cell on a worksheet", _
                        Title:="Select a worksheet", _
                        Default:=ActiveCell.Address, _
                        Type:=8) 'Range selection
                        
        'Get the parent worksheet of the selected cell
            Set ws = rng.Parent
            
        'Pass the name of the worksheet to the function
            GetSelectedSheet = ws.Name
        
        'Tidy up
            Set rng = Nothing
            Set ws = Nothing
     End Function
    

    GetRows

    Fairly straight forward, I pass the selected worksheet to the function and it returns the maximum rows of data based on Column 1. I could make this more dynamic by passing a column number to the function as well, but I generally always use Column 1.

    Public Function GetRows(ws As Worksheet) As Long
    
        Dim r As Long
        
        With ws
            r = .Cells(Rows.Count, 1).End(xlUp).Row
        End With
        
        GetRows = r
    End Function
    

    GetColumns

    Straight forward, I pass the selected worksheet to the function and it returns the maximum columns of data based on Row 1. I could make this more dynamic by passing a row number to the function as well, but I generally always use Row 1.

    Public Function GetColumns(ws As Worksheet) As Long
    
         'Declare variables    
               Dim c As Long
        
         'Get column count, store it in a variable    
              With ws
                   c = .Cells(1, Columns.Count).End(xlToLeft).Column
              End With
    
         'Pass the variable value to the function
              GetColumns = c
    End Function
    

    GetUserInput

    Again, fairly straight forward. Get a text value from the user to search for in the next function.
    I call the function like this :

        'User - What is search term?
            strSearchTerm = GetUserInput(strPrompt:="What is the search term?", _
                                         strTitle:="Find Column Number")
    
    Public Function GetUserInput(strPrompt As String, _
                                 strTitle As String) As String
           
        'Declare variables
             Dim strUserInput As String
           
        'Call the InputBox Method, pass user input to a variable
              strUserInput = InputBox(Prompt:=strPrompt, _
                                      Title:=strTitle)
    
        'Pass the variable value to the function                                 
             GetUserInput = strUserInput
    
    End Function
    

    GetColumnNumber

    The function has 2 arguments, a worksheet and a string value that I got from the user in the last function. The function will create a Range Object and search that Range for the term supplied by the user. Again, I am using Row 1 here, but I could make it more dynamic by passing a row number to the function as one of its arguments. Below, I am using the Named Argument, LookAt:= and passing the value xlPart instead of xlWhole. You may want to consider this in your VBA Projects as you program defensively around what the user may input. Since I am using this for myself, I am not too concerned for now,

    Public Function GetColumnNumber(ws As Worksheet, _
                                    strSearchTerm As String) As Long
    
        'Declare variables
            Dim rng As Range
            Dim MaxColumns As Long
            Dim lngField As Long
            
        'Initialize
            MaxColumns = GetColumns(ws:=ws)
            With ws
                Set rng = .Range(.Cells(1, 1), .Cells(1, MaxColumns))
            End With
            
        'Find columns number
            lngField = rng.Find(What:=strSearchTerm, _
                                LookIn:=xlValues, _
                                LookAt:=xlPart, _
                                MatchCase:=False).Column
                                
        'Pass the column number to the function
            GetColumnNumber = lngField
            
        'Tidy up
            Set rng = Nothing
    
    End Function
    

    GetCleanAlphaNumeric

    This is the money! All other functions to this point were setup work to allow this function to do the heavy lifting. The function uses the LIKE operator to compare a character beginning at the right-most position of the string to the pattern, “[0-9A-Za-z]”. As soon as a match is found, the For..Next Loop is exited, thus saving time by not checking characters unnecessarily. I then use the MID() Function to get the string from the 1st character to the last alphanumeric character position determined in the For..Next Loop. More on the LIKE operator here.

    Public Function GetCleanAlphaNumeric(strChar As String) As String
        
        'Comments   :   Remove non-alpha numeric characters from end of string
       
        'Declare variables
            Dim i As Long
            Dim lngLengthString As Long
            Dim blnTest As Boolean
            Dim posLastAlphaNumeric As Long
            Dim strClean As String
       
        'Initialize
            blnTest = False
    
        'Length of string to check
            lngLengthString = Len(CStr(strChar))
            
        'Compare each charcter to pattern
        'Begin at end of string
        'Stop as soon as find alphanumeric
            For posLastAlphaNumeric = lngLengthString To 1 Step -1
                blnTest = Mid(CStr(strChar), posLastAlphaNumeric, 1) Like "[0-9A-Za-z]"
                If blnTest = True Then Exit For
            Next posLastAlphaNumeric
            
        'posLastAlphaNumeric is the position of last AlphaNumeric character
        'Use the position of the last alphanumeric to get the final length of the string
        'Assign the value to the range
            strClean = CStr(Mid(strChar, 1, posLastAlphaNumeric))
            
        'Pass the clean string to the function
            GetCleanAlphaNumeric = strClean
    
     End Function
    

    The Main Procedure

    Here is the main procedure that calls all of the functions. Note: Screen updating must be on for the user to select a cell on a worksheet. Turn ScreenUpdating off after the user selects a cell on a worksheet.

    Option Explicit
    Sub CleanStrings()
         
        'Author:        Winston Snyder
        'Date:          3/28/14
        'Purpose:       Get string excluding non-alphanumeric trailing characters
        '---------------------------------------------------------------------------------------------------------------------------------------------
         
        'Declare variables
            Dim wb                                  As Workbook
            Dim ws                                  As Worksheet
            Dim rng                                 As Range
            Dim C                                   As Range
            Dim strSearchTerm                       As String
            Dim strStringToBeCleaned                As String
            Dim lngColumnNumber                     As Long
            Dim MaxRows                             As Long
         
        'Excel environment - speed things up
            With Application
                .DisplayAlerts = False
                .EnableEvents = False
                .Calculation = xlCalculationManual
            End With
         
        'Initialize
            Set wb = ThisWorkbook
            Set ws = wb.Worksheets(GetSelectedSheet)
            Application.ScreenUpdating = False
            
        'Get maximum number of rows on the worksheet
            MaxRows = GetRows(ws:=ws)
             
        'User - What is search term?
            strSearchTerm = GetUserInput(strPrompt:="What is the search term?", _
                                         strTitle:="Find Column Number")
                                         
        'Get the column number based on the search term
            lngColumnNumber = GetColumnNumber(ws:=ws, _
                                              strSearchTerm:=strSearchTerm)
                                              
        'Define the range that contains strings to be cleaned
            With ws
                Set rng = .Range(.Cells(2, lngColumnNumber), .Cells(MaxRows, lngColumnNumber))
            End With
            
        'Clean each string in the range
            For Each C In rng
                strStringToBeCleaned = CStr(C.Value)
                C.Value = GetCleanAlphaNumeric(strChar:=strStringToBeCleaned)
            Next C
    
        'Tidy up
                 
            'Destroy objects
                Set C = Nothing
                Set rng = Nothing
                Set ws = Nothing
                Set wb = Nothing
                 
            'Restore Excel environment
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
    End Sub
    

    Tidy up

      Final Thoughts

      That’s it today. I like the LIKE operator. This process is fast, reviewed 25K strings, and updated them when needed in no time. Awesome!

    ,

    SpyVsSpyLg

    In January, I wrote a post that demonstrated how to combine Excel Workbooks. That post assumed that each of the source workbooks contained at least one ListObject Object.

    Today, I will revise that code a bit to add a ListObject Object (LO) if the lo does not exist. Then I will demonstrate the code to merge Workbooks using Range Objects. Lastly, I will compare both processes to see which is the fastest using the Windows High Resolution Timer.


      edit: As is my usual practice, I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see these links for a detailed explanation of Late / Early Binding.

      1. Beyond Excel
      2. JP Software Technologies


    The ListObject Object (LO)

    One of the very useful features the LO, is that it offers us the three distinct properties: HeaderRowRange, DataBodyRange, TotalsRowRange. Once we have a LO, we can leverage these properties to streamline the heavy lifting.

    ListObjectRanges

    Check If The ListObject Object Exists

    I can check if the LO exists, if not, I can add one while I’m looping through workbooks:

    Option Explicit
    
    Sub AddListObjectIfDoesNotExist()
    
        'Does a  ListObject Object exist?
        
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim lo As ListObject
        
        Set wb = ThisWorkbook
        For Each ws In wb.Worksheets
            Set lo = ws.Range("A1").ListObject
            If lo Is Nothing Then
                Set lo = ws.ListObjects.Add( _
                                        SourceType:=xlSrcRange, _
                                        Source:=ws.Range("A1").CurrentRegion, _
                                        Destination:=ws.Range("A1"))
            Else
                'ListObject already exists - do nothing
            End If
        Next ws
        
        'Tidy up
            Set lo = Nothing
            Set wb = Nothing
    End Sub
    

    Now that I can add an LO dynamically, I can exploit its properties with assurance.

    Here is the fully revised code to merge Excel Workbooks using ListObject Objects:

    Option Explicit
     
    Sub MergeWorbooksUsingListObjects()
         
        'Log:
        'Date               Author                      Action                  Comment
        '-------------------------------------------------------------------------------------------------------------------
        '1/26/2014           ws                         Created                 Merge workbooks using ListObjects from source files
        '3/23/2014           ws                         Modified                If ListObject does not exist, add ListObject
        '-------------------------------------------------------------------------------------------------------------------
         
        'Delare variables
            Dim wb                  As Workbook
            Dim wbData              As Workbook
            Dim ws                  As Worksheet
            Dim wsData              As Worksheet
            Dim rngData             As Range
            Dim rngDestination      As Range
            Dim lo                  As ListObject
            Dim fso                 As Object
            Dim fsoFolder           As Object
            Dim fsoFile             As Object
            Dim strSelectedFolder   As String
            Dim strCurrentPath      As String
            Const strSpecifiedPath  As String = "C:\"
            Dim lngRows             As Long
            Dim blnFlag             As Boolean
             
        'Excel environment - speed things up
            With Application
                .ScreenUpdating = False
                .DisplayAlerts = False
                .EnableEvents = False
                .Calculation = xlCalculationManual
            End With
             
        'Initialize variables
            Set wb = ThisWorkbook
            Set ws = wb.Worksheets("Data")
            blnFlag = True
             
        'Clear data from control workbook from previous consolidations
            ws.UsedRange.ClearContents
         
        'Get the current path, so reset the path at the end of the procedure
            strCurrentPath = CurDir()
             
        'Set the target directory to get the user closer to the working folder
        'This will minimize the time the user must spend drilling into the file system
        'once they are presented with the FileDialog
            ChDir (strSpecifiedPath)
         
        'Create a FileSystemObject
            Set fso = GetFSO
     
        'Prompt the user to select a folder
        'Return the path of the selected folder
            strSelectedFolder = GetSelectedFolder
             
        'Get the FSO Folder of the selected folder
            Set fsoFolder = fso.GetFolder(strSelectedFolder)
             
        'Loop each file in folder
        'Copy data from each file to control workbook
            For Each fsoFile In fsoFolder.Files
                Set wbData = Workbooks.Open(fsoFile)
                Set wsData = wbData.Worksheets("Sheet1")
                 
                'Get next blank row from destination worksheet
                'If first time, need row 1, else, next blank row
                    lngRows = GetRows(ws:=ws)
                    If blnFlag = False Then lngRows = lngRows + 1
                     
                'The Destination Range
                    Set rngDestination = ws.Cells(lngRows, 1)
                 
                With wsData
                    'If a ListObject does not exist, add a ListObject
                        Set lo = .Range("A1").ListObject
                        If lo Is Nothing Then
                            Set lo = .ListObjects.Add( _
                                                  SourceType:=xlSrcRange, _
                                                  Source:=.Range("A1").CurrentRegion, _
                                                  Destination:=ws.Range("A1"))
                        Else
                            'Do nothing, ListObject already exists
                        End If
                    
                    'If first workbook, include the header row
                        For Each lo In .ListObjects
                            If blnFlag = True Then
                                Set rngData = Union(lo.HeaderRowRange, lo.DataBodyRange)
                                blnFlag = False
                            Else
                                Set rngData = lo.DataBodyRange
                            End If
                        Next lo
                End With
      
                'Copy the Data Range to the Destination Range
                    rngData.Copy
                    rngDestination.PasteSpecial xlPasteValuesAndNumberFormats
                     
                'Close the source file without saving
                    wbData.Close
             
            Next fsoFile
             
        'Tidy up
            'Restore to original path
                ChDir (strCurrentPath)
                 
            'Restore Excel environment
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
                 
            'Destroy objects
                Set fsoFolder = Nothing
                Set fso = Nothing
                Set lo = Nothing
                Set rngData = Nothing
                Set rngDestination = Nothing
                Set ws = Nothing
                Set wb = Nothing
             
    End Sub
    

    And the functions:

    
    Public Function GetRows(ws As Worksheet) As Long
     
        Dim r       As Long
         
        With ws
            r = .Cells(Rows.Count, 1).End(xlUp).Row
            GetRows = r
        End With
         
    End Function
    
    '-------------------------------------------------------------------------
    Public Function GetFSO()
     
        Dim fso             As Object
         
        Set fso = CreateObject("Scripting.FileSystemObject")
        Set GetFSO = fso
         
        Set fso = Nothing
     
    End Function
    
    '-------------------------------------------------------------------------
    Public Function GetSelectedFolder() As String
         
        Dim diaFolder       As FileDialog
        Dim strFolder       As String
     
        Set diaFolder = Application.FileDialog(msoFileDialogFolderPicker)
        With diaFolder
            .AllowMultiSelect = False
            .Show
            strFolder = .SelectedItems(1)
        End With
     
        GetSelectedFolder = strFolder
    End Function
    

    Works great!

    The Range Object

    The Range Object does not have the properties: HeaderRowRange, DataBodyRange, TotalsRowRange. I am ok with the entire Range from the first Workbook when Merging multiple Workbooks. But on subsequent Workbooks, I want to exclude the header row.

    Assume I have a Range named rng with address $A$1:$G$10, from this, I would like to exclude the header row on Row 1. The first step is to count the number of rows and columns in the range

    Dim r As Long
    dim c As Long
    r = rng.Rows.Count
    c = rng.Columns.Count
    

    Next, from the rng, move down 1 Row. This means the rng address is now $A$2:$G$11

    rng.Offset(1)
    

    The $A$2 part of the address is good, but the $G$11 part means I now have a blank row. So now I need to Resize the rng to exclude the blank row:

    rng.Offset(1).Resize(r-1,c)
    

    And test the final Range Address to make sure all is as expected:

    Debug.Print rng.Address
    

    $A$2:$G$10

    Perfect! Exactly what I was looking for. I’m now ready to modify the code again to use Range Objects instead of ListObject Objects.

    Merge Workbooks Using Range Objects

    Here’s the working code to merge workbooks using Range Objects instead of ListObject Objects:

    Option Explicit
     
    Sub MergeWorbooksUsingRangeObjects()
         
        'Log:
        'Date               Author                      Action                  Comment
        '-------------------------------------------------------------------------------------------------------------------------------------------
        '1/26/2014           ws                         Created                 Merge workbooks using ListObjects from source files
        '3/23/2014           ws                         Modified                If ListObject does not exist, add ListObject
        '3/23/2014           ws                         Modified                Use Range Object, modify range to exclude header row if not File = 1
        '-------------------------------------------------------------------------------------------------------------------------------------------
         
        'Delare variables
            Dim wb                  As Workbook
            Dim wbData              As Workbook
            Dim ws                  As Worksheet
            Dim wsData              As Worksheet
            Dim rngData             As Range
            Dim rngDestination      As Range
            Dim fso                 As Object
            Dim fsoFolder           As Object
            Dim fsoFile             As Object
            Dim strSelectedFolder   As String
            Dim strCurrentPath      As String
            Const strSpecifiedPath  As String = "C:\"
            Dim lngRows             As Long
            Dim r                   As Long
            Dim c                   As Long
            Dim blnFlag             As Boolean
             
        'Excel environment - speed things up
            With Application
                .ScreenUpdating = False
                .DisplayAlerts = False
                .EnableEvents = False
                .Calculation = xlCalculationManual
            End With
             
        'Initialize variables
            Set wb = ThisWorkbook
            Set ws = wb.Worksheets("Data")
            blnFlag = True
             
        'Clear data from control workbook from previous consolidations
            ws.UsedRange.ClearContents
         
        'Get the current path, so reset the path at the end of the procedure
            strCurrentPath = CurDir()
             
        'Set the target directory to get the user closer to the working folder
        'This will minimize the time the user must spend drilling into the file system
        'once they are presented with the FileDialog
            ChDir (strSpecifiedPath)
         
        'Create a FileSystemObject
            Set fso = GetFSO
     
        'Prompt the user to select a folder
        'Return the path of the selected folder
            strSelectedFolder = GetSelectedFolder
             
        'Get the FSO Folder of the selected folder
            Set fsoFolder = fso.GetFolder(strSelectedFolder)
             
        'Loop each file in folder
        'Copy data from each file to control workbook
            For Each fsoFile In fsoFolder.Files
                Set wbData = Workbooks.Open(fsoFile)
                Set wsData = wbData.Worksheets("Sheet1")
                 
                'Get next blank row from destination worksheet
                'If first time, need row 1, else, next blank row
                    lngRows = GetRows(ws:=ws)
                    If blnFlag = False Then lngRows = lngRows + 1
                     
                'The Destination Range
                    Set rngDestination = ws.Cells(lngRows, 1)
                 
                With wsData
                    'Create the data range
                        Set rngData = .Range("A1").CurrentRegion
                        r = rngData.Rows.Count
                        c = rngData.Columns.Count
                    
                    'If first workbook, include the header row, otherwise resize the range to exclude the header row
                        If blnFlag = True Then
                            blnFlag = False
                        Else
                            Set rngData = rngData.Offset(1).Resize(r - 1, c)
                        End If
                End With
      
                'Copy the Data Range to the Destination Range
                    rngData.Copy
                    rngDestination.PasteSpecial xlPasteValuesAndNumberFormats
                     
                'Close the source file without saving
                    wbData.Close
             
            Next fsoFile
             
        'Tidy up
            'Restore to original path
                ChDir (strCurrentPath)
                 
            'Restore Excel environment
                With Application
                    .ScreenUpdating = True
                    .DisplayAlerts = True
                    .EnableEvents = True
                    .Calculation = xlCalculationAutomatic
                End With
                 
            'Destroy objects
                Set fsoFolder = Nothing
                Set fso = Nothing
                Set rngData = Nothing
                Set rngDestination = Nothing
                Set ws = Nothing
                Set wb = Nothing
             
    End Sub
    

    That worked well, next I’ll add a little bit of code to time each of the Subs() to see if there is any appreciable difference.

    Windows High-Resolution Timer (WHRT)

    To use WHRT, I’ll just need to add 3 lines of code:

    Declare a reference to the kernel32 library

    Public Declare Function GetTickCount Lib "kernel32.dll" () As Long

    Get the tick count and assign it to a variable

    t = GetTickCount

    Get the tick count at the end, subtract the first to calculate elapsed time

    debug.print GetTickCount - t & " Milliseconds"

    I’ll put the Public Function “GetTickCount” in a Module with my other Public Functions.
    I normally put the first call to the Public Function near the top of a Sub(), but in this case I’m using a FileDialog to interact with the user so I want to place the call to the Function after the FileDialog:

        'Prompt the user to select a folder
        'Return the path of the selected folder
            strSelectedFolder = GetSelectedFolder
            
        'Get the tick count
            t = GetTickCount
    

    The last call to the GetTickCount Function goes at the very end of the code:

                Set ws = Nothing
                Set wb = Nothing
                
            'Get elapsed time
                 Debug.Print GetTickCount - t & " Milliseconds"
    

    The Results

    The Sub() adding the ListObject Object dynamically:

    1420 Milliseconds
    1233 Milliseconds
    1419 Milliseconds
    1419 Milliseconds
    1248 Milliseconds
    1451 Milliseconds
    1435 Milliseconds
    1264 Milliseconds
    1653 Milliseconds
    1264 Milliseconds
    ——————–
    1380.6 Milliseconds Avg

    The Sub() Resizing the Range Object:

    1217 Milliseconds
    1217 Milliseconds
    1451 Milliseconds
    1264 Milliseconds
    1232 Milliseconds
    1232 Milliseconds
    1248 Milliseconds
    1263 Milliseconds
    1264 Milliseconds
    1248 Milliseconds
    ————————
    1263.6 Milliseconds Avg

    Tidy Up

      Final Thoughts

      The Sub() using the Range Object is a bit faster, but not much. Bear in mind, I ran the process on 3 files -each file with 50 rows and 7 columns. The results may be more pronounced on more files with more data in each file.

      Downloads

      Download the file containing both Subs() and all Functions here.

      Additional Resources

      ListObject Objects

      ListObject Object Model
      ListObjects Object Model
      Working with Tables in Excel 2013, 2010 and 2007 (VBA)
      Working with Microsoft Excel tables in VBA

      Range Object

      The Range Object – Object Model

    , , ,

    ExportCrop

    A member over on one of the LinkedIn Excel Groups had a question about how to export 1 to several worksheets from an Excel Workbook to a pdf file.

    Below is the method I use to load a Sheets Array from the list the user defined on a worksheet. There are other ways to do this such as based on the ColorIndex of the tab, some character in the name of the worksheet or the code name of the worksheet.

    Option Explicit
        
    Sub ExportXLToPDF()
    
        'Comments:
        'Assume list of worksheets to be included in output are listed in Column 1 on "List"
    
        Dim wb                  As Workbook
        Dim ws                  As Worksheet
        Dim Arr()               As String
        Dim MaxRows             As Long
        Dim i                   As Long
        Dim strPath             As String
        Dim strFileName         As String
        Const strEXTENSION      As String = ".pdf"
        
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets("List")
        
        'User - where to save the output file
            strPath = GetFolder & "\"
            
        'User - what to name the output file
            strFileName = GetUserInput(strPrompt:="Please enter a name for the output file", _
                                       strTitle:="File Name")
            
        'Assume list to be included in sheets array in on worksheet named list in Column 1 beginning in Row 1
        'Total number of rows is dynamic
            MaxRows = GetRows(ws:=ws)
            
        'Redim the array to hold the name of the worksheets
            ReDim Preserve Arr(1 To MaxRows)
        
        'Load the list of sheets to be included into the array
            For i = 1 To MaxRows
                Arr(i) = ws.Cells(i, 1).Value
            Next i
            
        'Select the sheets array
            Sheets(Arr).Select
     
        'Export to the sheets array to pdf
            ActiveSheet.ExportAsFixedFormat Type:=xlTypePDF, _
                                            Filename:=strPath & strFileName & strEXTENSION, _
                                            Quality:=xlQualityStandard, _
                                            IncludeDocProperties:=True, _
                                            IgnorePrintAreas:=False, _
                                            OpenAfterPublish:=False
                        
        'Tidy up
            'Erase arrays
                Erase Arr
            
            'Destroy objects
                Set ws = Nothing
                Set wb = Nothing
    End Sub
    

    And the functions:

    
    Public Function GetRows(ws As Worksheet) As Long
     
        Dim r       As Long
         
        With ws
            r = .Cells(Rows.Count, 1).End(xlUp).Row
            GetRows = r
        End With
         
    End Function
    
    Public Function GetUserInput(strPrompt As String, _
                                 strTitle As String) As String
          
        Dim strUserInput As String
          
        strUserInput = InputBox(Prompt:=strPrompt, _
                                Title:=strTitle)
                                  
        GetUserInput = strUserInput
      
    End Function
    
    Public Function GetFolder() As String
      
        Dim fd As FileDialog
        Dim strFolderName As String
        Set fd = Application.FileDialog(msoFileDialogFolderPicker)
          
        With fd
            .Title = "Please select a folder"
            .AllowMultiSelect = False
            .Show
            strFolderName = .SelectedItems(1)
        End With
      
        GetFolder = strFolderName
          
        Set fd = Nothing
    End Function
    
    ,