BeatlesLg
I like music – lots of different kinds of music, including The Beatles pictured here. These days, I’m more into “Adult Alternative” music, but I still like to listen to some stuff from the ’60’s and ’70’s every now and again.

Around these parts, we get “Breakfast With The Beatles” with Andre Gardner every Sunday from 7am-9am. Check your local listings to see if it is available in your area or you can stream it on KSLX. Really excellent show if you are into The Beatles.

However, today’s post is not about music or The Beatles. It is about Merging Excel Workbooks. Over on the LinkedIn Excel Groups there are many questions about merging data from several Excel Workbooks into 1 Workbook. Many folks suggest linking the workbooks using a formula. Others suggest using the INDIRECT() function. I prefer to use some VBA to copy the desired data from the Source Workbooks to the Destination Workbook.

All testing, screen shots, code in this post are from Office 365. If you are using another version of Excel, your results may vary (though they should not).

Linking Files

A common way to get data from one file to another is to link them. In this sample, I opened 2 Excel files, entered a value in 1 file, activated another file, typed, “=” and clicked on a cell in the 1st file:

LinkFiles2

DestinationWB.xlsx in the active workbook and cell $A$1 is selected. The formula bar give us:

  1. [SourceWB.xlsx] <- The name of the workbook
  2. Sheet1! <- The name of the worksheet
  3. $A$1 <- The cell reference

When I close the source workbook, the file path is added to the linked cell formula:

LinkedCellwPath

The folder location ‘C:\Data\ has been added to the linked formula.

There are a few things I do not like about linked cells

  1. The absolute references were added automatically. If I drag the formula to the right, the formula will still reference $A$1, not $B$1 as we may require
  2. What happens if a Row is added at Row 1 pushing all subsequent rows down by 1? This could cause an error in the linked formula, erroneous results, or at least the need for additional maintenance.
  3. What happens if the source file(s) are moved from the referenced folder location?

Here is the formula with the file in the folder ‘C:\OriginalFolder\.

PathReference

Linking Files – File Migration

One problem with linking files is that files move in our directory structures from time-to-time. We may initiate this move ourselves as we come up with a new way to organize our file structures. Other times this move may be initiated by the IT Department as they are updating or migrating storage.

Let’s see what happens to the original link formula when I move the source file.
I moved the file from ‘C:\OriginalFolder\ to ‘C:\NewFolder\ in Windows Explorer.

MoveFileFinal

Now when I try to update the link value in the Destination Workbook, I receive an error message

EditLinksErrorFinal

I can click on the change source button, but I would like a solution that provides for as little maintenance needs as possible.

Linking Files – Data Moves On Worksheet

The original linked formula is linked to cell A1 in the source file. What happens if, unbeknownst to you, a user inserts a blank row at A1 and all of the data shifts from A1 to A2?

UpdateValueFinal

The Edit Links Dialog status reads, “OK”, but it is not. The Linked Cell formula is still linked to Cell $A$1 and the updated Cell Value is now 0. We could use Find-Replace to update from $1 to $2, but that could have unintended consequences. What if other workbook moved data 2-3 rows? You see the point I’m sure.

Copy Data From Source To Destination

I propose to loop through a folder and copy some data from a worksheet for each workbook found in the folder – I would like the solution to be as dynamic as possible. As I am looping, I will paste the data on a worksheet in Top-Down fashion such that the first file results will be near the top, then the data from the second file and so on.

For today, my source files all have an Excel Table in them, though we could make the process work with Ranges as well with just a bit more work. For additional resources working with Excel Tables, see the links at the bottom of the post.

Some Functions & Properties

First, I’ll look at some VBA functions that I am going to use in the final code. These functions will make your code more dynamic and user-friendly.

CurDir

CurDir returns the current path. I use CurDir to trap the current path so I can restore it at then before I change the path to make navigating the file structure faster. Use as follows:

Option Explicit

Sub foo2()

    Dim strDirectory As String
    strDirectory = CurDir()
    Debug.Print strDirectory
    
End Sub

Output:

C:\Users\wsnyder\Documents

ChDir

ChDir changes the the current path. I use ChDir to change the current path to get the user closer the final folder they will eventually choose using the FileDialog to select a folder for processing. Use as follows:

Option Explicit

Sub foo2()

    Dim strDirectory As String
    strDirectory = "C:\Data\"
    ChDir (strDirectory)
    Debug.Print CurDir()
    
End Sub

Output:

C:\Data

FileDialog Property

The FileDialog Property of the Application Object returns a FileDialog Object. This give you the ability to interact with users at runtime by allowing the user to choose File(s) or Folder(s) to work with. The FileDialog accepts one argument, the DialogType.

There are 4 DialogTypes in the MsoFileDialogType Enumeration :

  1. msoFileDialogFilePicker
  2. msoFileDialogFolderPicker
  3. msoFileDialogOpen
  4. msoFileDialogSaveAs

For today’s purposes, I’ll use msoFileDialogFolderPicker

Set diaFolder = Application.FileDialog(msoFileDialogFolderPicker)

The Setup…

As I mentioned earlier, each source file has an Excel Table in it. I would like to loop through each file in the source folder and copy the DataBodyRange of the Excel Table to the destination workbook. Additionally, on the first pass, I would also like to copy the HeaderRowRange to create headers in the destination workbook. Lastly, I would like to add some data to the right of the data from the source files, such as the date the data was copied and the name of the source file.

Create Some Sample Data and Files

I quickly whipped up some sample data using Dick’s Random Data Generator and creatively saved the files as File1.xlsx, File2.xlsx, File3.xlsx.

TablesLayered

Again, I highlighted the data [Ctrl] + [a] and added an Excel Table [Ctrl] + [t] in each file. I applied a different Table Style to each Table simply to highlight the fact that there are 3 different Tables in 3 different Excel files. If you need to brush up on Excel Tables or need to start at the beginning:

  1. Excel Table Tutorial – Contextures
  2. Sur la Excel Table
  3. Listing Toward ListObjects

Loop Through Files In A Folder

When I need to loop through files in a folder – I use the FileSystemObject (FSO). The FileSystemObject is a top-level object in the Microsoft Scriping Runtime Library (Scrrun.dll). Here are some additional references if you need to brush up or are not familiar with FSO.

  1. JP Software Tech
  2. Chip Pearson
  3. 4 Guys From Rolla
  4. dataprose.org
  5. MSDN

edit: I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see the “Additional Resources” at bottom for links to detailed explanation of Late / Early Binding.

Copying Data From Source To Destination

It’s finally time to copy the data from the source workbooks to the destination workbook. I’m using three files with 50 records each, but you could use this code with an unlimited number of records or variable number of records and unlimited number of files (3, 10, 50,…) – as long as you do not exceed 1,048,576 rows (though I would never use that many row in Excel – time to consider a database).

Option Explicit

Sub CopyDataFromSourceFiles()
    
    'Author         :           Winston Snyder
    'Created Date   :           1/26/2014
    'Comments       :           Assumes each source file contains at least one list object (Excel Table)
    
    'Delare variables
        Dim wb                  As Workbook
        Dim wbData              As Workbook
        Dim ws                  As Worksheet
        Dim wsData              As Worksheet
        Dim rngData             As Range
        Dim rngDestination      As Range
        Dim lo                  As ListObject
        Dim fso                 As Object
        Dim fsoFolder           As Object
        Dim fsoFile             As Object
        Dim strSelectedFolder   As String
        Dim strCurrentPath      As String
        Const strSpecifiedPath  As String = "C:\"
        Dim lngRows             As Long
        Dim blnFlag             As Boolean
        
    'Excel environment - speed things up
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With
        
    'Initialize variables
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets("Data")
        blnFlag = True
        
    'Clear data from control workbook from previous consolidations
        ws.UsedRange.ClearContents
    
    'Get the current path, so reset the path at the end of the procedure
        strCurrentPath = CurDir()
        
    'Set the target directory to get the user closer to the working folder
    'This will minimize the time the user must spend drilling into the file system
    'once they are presented with the FileDialog
        ChDir (strSpecifiedPath)
    
    'Create a FileSystemObject
        Set fso = GetFSO

    'Prompt the user to select a folder
    'Return the path of the selected folder
        strSelectedFolder = GetSelectedFolder
        
    'Get the FSO Folder of the selected folder
        Set fsoFolder = fso.GetFolder(strSelectedFolder)
        
    'Loop each file in folder
    'Copy data from each file to control workbook
        For Each fsoFile In fsoFolder.Files
            Debug.Print fsoFile.Name
            Set wbData = Workbooks.Open(fsoFile)
            Set wsData = wbData.Worksheets("Sheet1")
            
            'Get next blank row from destination worksheet
            'If first time, need row 1, else, next blank row
                lngRows = GetRows(ws:=ws)
                If blnFlag = False Then lngRows = lngRows + 1
                
            'The Destination Range
                Set rngDestination = ws.Cells(lngRows, 1)
            
            'If first time, include the header row
                With wsData
                    For Each lo In .ListObjects
                        If blnFlag = True Then
                            Set rngData = Union(lo.HeaderRowRange, lo.DataBodyRange)
                            blnFlag = False
                        Else
                            Set rngData = lo.DataBodyRange
                        End If
                    Next lo
                End With
                
            'Copy the Data Range to the Destination Range
                rngData.Copy
                rngDestination.PasteSpecial xlPasteValuesAndNumberFormats
                
            'Close the source file
                wbData.Close
        
        Next fsoFile
        
    'Tidy up
        'Restore to original path
            ChDir (strCurrentPath)
            
        'Restore Excel environment
            With Application
                .ScreenUpdating = True
                .DisplayAlerts = True
                .EnableEvents = True
                .Calculation = xlCalculationAutomatic
            End With
            
        'Destroy objects
            Set fsoFolder = Nothing
            Set fso = Nothing
            Set rngData = Nothing
            Set rngDestination = Nothing
            Set ws = Nothing
            Set wb = Nothing
        
End Sub
'------------------------------------------------------------------------------------
Private Function GetRows(ws As Worksheet) As Long

    Dim r       As Long
    
    With ws
        r = .Cells(Rows.Count, 1).End(xlUp).Row
        GetRows = r
    End With
    
End Function
'------------------------------------------------------------------------------------
Private Function GetFSO()

    Dim fso             As Object
    
    Set fso = CreateObject("Scripting.FileSystemObject")
    Set GetFSO = fso
    
    Set fso = Nothing

End Function
'-------------------------------------------------------------------------------
Private Function GetSelectedFolder() As String
    
    Dim diaFolder       As FileDialog
    Dim strFolder       As String

    Set diaFolder = Application.FileDialog(msoFileDialogFolderPicker)
    With diaFolder
        .AllowMultiSelect = False
        .Show
        strFolder = .SelectedItems(1)
    End With

    GetSelectedFolder = strFolder
End Function

MergeSourceFiles

I formatted the output and hid some rows to show that there are 151 records as expected (3 files * 50 records each + 1 header row).

My favorite part of the code is here

'If first time, include the header row
                With wsData
                    For Each lo In .ListObjects
                        If blnFlag = True Then
                            Set rngData = Union(lo.HeaderRowRange, lo.DataBodyRange)
                            blnFlag = False
                        Else
                            Set rngData = lo.DataBodyRange
                        End If
                    Next lo
                End With

The properties of the ListObject (Excel Table) such as HeaderRowRange and DataBodyRange are 2 reasons why the ListObject is far superior to the Range Object. Couple these kinds of properties with the fact that you can move the Excel Table anywhere on the worksheet you want and add rows to the Table or redact rows from the Table and the consolidation code will still work flawlessly. No Excel Hell! Awesome!

Tidy Up

, , , , , , ,