MrClean

Mr. Clean swept his way into American homes in 1958 after Proctor and Gamble bought the rights from its creator, Linwood Burton. Interestingly, Regular Expressions were first described in the 50’s by Stephen Cole Kleene. Coincidence? I think not.

However, today’s post is not about household cleaning products, nor is it a history lesson in Computer Science. It is about scrubbing data in VBA using Regular Expressions (RegExp).

Substitute ()

I see many folks using multiple nested levels of the Substitute() function to try to clean their data. This can be time consuming and can lead to some inaccuracies. Instead, we can use Regular Expressions (RegExp) to increase efficiency and accuracy as well as handle complex strings and large data sets quickly and efficiently.

User Defined Functions

I also see folks create User Defined Functions (UDF’s) to manipulate strings. Here is a nice example the other day from Doug Jenkins over at Newton Excel Bach. (btw, If you are not following Doug’s blog, you should be) But is there a better way? Enter Regular Expressions (RegExp).

Regular Expressions (RegExp)

A Regular Expression is a sequence of characters that create a pattern. The sequence could be something complicated like <([A-Z][A-Z0-9]*)\b[^>]*>(.*?) to something simple like \d . Regular Expressions are very useful in VBA for working with many different scenarios of strings and introducing automation for transforming your data before loading to target databases for OLAP such as Essbase, Power Pivot or SSAS.

Regular Expression – Methods

Regular Expressions in VBA offers 3 Methods:

  1. Test
  2. Replace
  3. Execute

For today, I will focus on the Execute Method and look at the other Methods in future posts.

An example

Here’s a silly nonsensical string, 12kj$%23fg^&*34950…345. Let’s say I’m only interested in the numbers in the string so I want to return 122334950345. To return just the numbers from the string, I’ll use the pattern [0-9].

Option Explicit

Sub RegExFoo()
    
    'Author: Winston Snyder
    'Date: 12/11/2013
    'Extract a pattern of interest from an input string

    'Declare variables
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim RE As Object
        Dim Match As Object
        Dim Matches As Object
        Dim strNumber As String
        Dim i As Long
    
    'Initialize variables
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets("Sheet1")
        Set RE = CreateObject("VBScript.RegExp")
    
    'Criteria for Regular Expression
        With RE
            .Pattern = "[0-9]"
            .Global = True
            Set Matches = .Execute(ws.Range("A1").Value)
        End With
        
    'Loop Matches collection to build string of all numbers in the sample string
        strNumber = ""
        For i = 0 To Matches.Count - 1
            strNumber = strNumber + Matches(i)
        Next i
        
    'Output
        MsgBox strNumber
    
    'Tidy up
        'Destroy objects
            Set Matches = Nothing
            Set RE = Nothing
            Set ws = Nothing
            Set wb = Nothing
    
End Sub

Output : 122334950345

Great! The code returned exactly what I was looking for.

Case Study: RegExp

The sample is a bit obscure and seems somewhat unlikely, but who knows – might be just the thing someone is looking for. Let’s take a look as something a bit more realistic.

Let’s say we receive a load file from Financial Analysis and Planning (FP&A) that they would like loaded to a cube as a forecast scenario. For our sample, we’ll use Customer Codes and Total Revenues. We receive the file, but we immediately see that there is a problem.

DataRegEx3

There is no delimiter between the customer code and the total revenue amount. Additionally, both sets of substrings are of varying length meaning we cannot use text to columns without some manual cleanup work. Let’s look at one way we might split these strings using Regular Expressions.

I’m going to:

  1. Read the strings into an array
  2. Loop the array
  3. Split the string into substrings – load the substrings into new arrays
  4. Output the contents of the new arrays to a worksheet

A Quick Note – Early / Late Binding

A discourse on Early / Late Binding is beyond the scope of this post. Suffice to say, I am using Late Binding as demonstrated:

 Dim RegEx                               As Object
'Create a regular epression object
        Set RegEx = GetRegEx
Private Function GetRegEx() As Object

    On Error Resume Next
    Set GetRegEx = CreateObject("VBScript.RegExp")
    
End Function

There are additional links at the bottom of the post for more information on Early / Late Binding.

The Complete Code

You can open a new workbook, launch the Visual Basic Editor (VBE) add a new module and paste the code below into the module. I broke the Subs() and Functions() into separate snippets to improve readability. Alternatively, you can download the workbook, the link is at the bottom of the post.

 Option Explicit
 Sub SplitStringNoDelimiter()
    
    'Author:        Winston Snyder
    'Date:          12/15/2013
    'Purpose:       Split string into text and value components
    'Comments:      No delimiter
    '               Loop array for output
    '---------------------------------------------------------------------------------------------------------------------------------------------
    
    Dim RegEx                               As Object
    Dim wb                                  As Workbook
    Dim wsInput                             As Worksheet
    Dim wsOutput                            As Worksheet
    
    Dim rngInput                            As Range
    Dim rngOutputDescriptions               As Range
    Dim rngOutputValues                     As Range
    
    Dim arrInput()                          As Variant
    Dim arrOutputDescriptions()             As Variant
    Dim arrOutputValues()                   As Variant
    
    Dim i                                   As Long
    Dim lngRowsData                         As Long
    
    Const strPatternDescriptions            As String = "\D+"
    Const strPatternValues                  As String = "\d+(\.\d{1,2})?"
    Const lngColumnDescriptions             As Long = 1
    Const lngColumnValues                   As Long = 2
    
    'Excel enrionment - speed things up
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With
    
    'Initialize
        Set wb = ThisWorkbook
        With wb
            Set wsInput = .Worksheets("Input")
            Set wsOutput = .Worksheets("Output")
        End With
        
    'Clear previous output
        wsOutput.UsedRange.ClearContents
        
    'Input range without header
        With wsInput
            lngRowsData = GetRow(ws:=wsInput)
            Set rngInput = .Range(.Cells(2, 1), .Cells(lngRowsData, 1))
        End With
        
    'Transfer input range to input array
        arrInput = rngInput
        
    'Dimension output arrays
        ReDim arrOutputDescriptions(LBound(arrInput) To UBound(arrInput))
        ReDim arrOutputValues(LBound(arrInput) To UBound(arrInput))
        
    'Create a regular epression object
        Set RegEx = GetRegEx
        
    'Loop through each string in the input array
        For i = LBound(arrInput) To UBound(arrInput)
            
            'Pass the string to regular expression function to return the descriptive portion of the string
                arrOutputDescriptions(i) = GetSubString(objRegEx:=RegEx, _
                                                        strString:=CStr(arrInput(i, 1)), _
                                                        strPattern:=strPatternDescriptions)
                                                      
            'Pass the string to regualr expressions functions to return the value portion of the string
                arrOutputValues(i) = GetSubString(objRegEx:=RegEx, _
                                                  strString:=CStr(arrInput(i, 1)), _
                                                  strPattern:=strPatternValues)
        Next i
        
    'Output all elements of each array to an output range
    'Description in Column 1, Values in Column 2
        
        'Descriptions
            Call OutputArray(ws:=wsOutput, _
                             vTmpArray:=arrOutputDescriptions, _
                             lngColumn:=lngColumnDescriptions)
        
        'Values
            Call OutputArray(ws:=wsOutput, _
                             vTmpArray:=arrOutputValues, _
                             lngColumn:=lngColumnValues)
                             
    'Add a header to the data
        With wsOutput
            .Range("A1").EntireRow.Insert shift:=xlDown
            .Cells(1, 1) = "Descriptions"
            .Cells(1, 2) = "Values"
        End With

    'Tidy up
        'Erase arrays
            Erase arrInput
            Erase arrOutputDescriptions
            Erase arrOutputValues
            
        'Destroy objects
            Set RegEx = Nothing
            Set rngInput = Nothing
            Set wsInput = Nothing
            Set wsOutput = Nothing
            Set wb = Nothing
            
        'Restore Excel environment
            With Application
                .ScreenUpdating = True
                .DisplayAlerts = True
                .EnableEvents = True
                .Calculation = xlCalculationAutomatic
            End With
End Sub

 

Private Sub OutputArray(ws As Worksheet, _
                        vTmpArray() As Variant, _
                        lngColumn As Long)

    Dim j As Long
    
    For j = LBound(vTmpArray) To UBound(vTmpArray)
        ws.Cells(j, lngColumn).Value = vTmpArray(j)
    Next j
                      
End Sub

 

Private Function GetRow(ws As Worksheet) As Long

    GetRow = ws.Cells(Rows.Count, 1).End(xlUp).Row

End Function

 

Private Function GetRange(ws As Worksheet, _
                          lngRowsStart As Long, _
                          lngRowsEnd As Long, _
                          lngColumn As Long) As Range

    Dim rng As Range

    With ws
        Set rng = .Range(.Cells(lngRowsStart, lngColumn), .Cells(lngRowsEnd, lngColumn))
    End With
    
    Set GetRange = rng
    
End Function

 

Private Function GetRegEx() As Object

    On Error Resume Next
    Set GetRegEx = CreateObject("VBScript.RegExp")

End Function

 

Private Function GetSubString(objRegEx As Object, _
                              strString As String, _
                              strPattern As String) As String
                              
    Dim reMatches As Object
    Dim strResult As String
    
    strResult = "No Match"
    
    With objRegEx
        .Pattern = strPattern
        .Global = True

        Set reMatches = .Execute(strString)
        
        If reMatches.Count <> 0 Then
            strResult = reMatches.Item(0)
        End If
    End With
    
    GetSubString = strResult

End Function

The Results

REOutput
Great! Exactly what I was looking for.

Tidy up

    Final Thoughts

    Regular Expressions are found in nearly all programming languages and much like Duct Tape, they have a Million and one uses. You are truly only limited by your imagination and ability to concoct the correct patterns. I use Regular Expressions as I Extract data from various Data Silos to Transform and normalize data prior to Loading to target reporting databases. Let us know how you use Regular Expressions in the comments.

    Downloads

    Download the workbook from OneDrive.

    Additional Resources


Duke
 
I’m a sucker for a good western. Here’s John Wayne as Rooster Cogburn in True Grit the only film for which Wayne won the Oscar for Best Actor. Wayne and a partner owned a ranch not far from here and tales spin around the old town of Wayne and actor buddy, Lee Marvin, closing down a local cantina many a night back in the day.

But I digress, as this post is not about life out here in the open range of the wild west. Rather, it is about the Range Object of the Excel Object Model. The Range Object has 78 Methods and 96 Properties. I’ll take a look at some of these over a series of posts. I’ll begin with these methods:

  1. AdvancedFilter
  2. AutoFilter
  3. Find
  4. SpecialCells
  5. PasteSpecial

AdvancedFilter

Here’s a code snippet of the final code at the bottom that demonstrates some of the syntax for the AdvancedFilter Method:

Range.AdvancedFilter _
                Action:=xlFilterCopy, _
                CopyToRange:=wsCriteria.Cells(1, 1), _
                Unique:=True

The most important parameters here are the Action and Unique parameters. This allows us to quickly get a unique list of data from the dataset that we can use as the criteria for the AutoFilter later

AutoFilter

Here’s a code snippet of the final code at the bottom that demonstrates some of the syntax for the AutoFilter Method:

    Range.AutoFilter _
             Field:=lngField, _
             Criteria1:=varCriteria

There are additional parameters available for the AutoFilter Method and I encourage you to investigate those. See the link at the bottom of this post. For now, I want to point out that I am passing variables to the named parameters. In the next section you’ll see how I use the Find Method to determine the column number of the header that I am interested in as “lngField”.

As I loop though the list of criteria created in the AdvancedFilter process I pass the value as “varCriteria” to the AutoFilter process.

Find

Here’s a code snippet of the final code at the bottom that demonstrates some of the syntax for the Find Method:

lngField = rngHeader.Find(What:=strFieldName, _
                          LookIn:=xlValues, _
                          LookAt:=xlWhole, _
                          MatchCase:=False).Column

The Find Method allows us to get a value at run-time using an InputBox for example, find the item, and pass the column number where the item is located to our process. This makes the process more robust and dynamic.

SpecialCells

Here’s a code snippet of the final code at the bottom that demonstrates some of the syntax for the SpecialCells Method:

Range.SpecialCells(xlCellTypeVisible).Copy

When most folks begin using VBA, they tend to loop through cells. However, SpecialCells used in conjunction with AutoFilter can offer much significant performance improvements as looping through cells tends to be slower. There are many types (enumeration) of SpecialCells. Link at bottom for full list. In the code sample below, I am using xlCellTypeVisible thereby ignoring all rows hidden by the AutoFilter.

PasteSpecial

Here’s a code snippet of the final code at the bottom that demonstrates some of the syntax for the PasteSpecial Method:

wbBifurcate.Worksheets("Sheet1").Cells(1, 1).PasteSpecial xlPasteValuesAndNumberFormats

There are 11 other paste types in the enumeration. I encourage you to explore them all. See the link at the bottom of this post.

Case Study: Bifurcate Excel File

ExcelSplitA common question posted on the LinkedIn Groups is, “How to split (bifurcate) an Excel file based on some criteria within the file?”

We need some test data so I downloaded a random data generator add-in over at Daily Dose of Excel by Dick Kusleika and whipped up some data:
 
BData
 Looks good so far. Now I’d like to bifurcate the master data into separate files based on the company name.

Get a unique list of company names

I’ll leverage the “unique” parameter of the AdvancedFilter Method of the Range Object

Range.AdvancedFilter _
                Action:=xlFilterCopy, _
                CopyToRange:=wsCriteria.Cells(1, 1), _
                Unique:=True

CData

The Complete Code

Option Explicit

Sub BifurcateFile()

    'Author:Winston Snyder
    'Date: 12/7/2013
    'Purpose: Bifurcate master file into component files
    
    'Declare variables
        Dim wb As Workbook
        Dim wbBifurcate As Workbook
        Dim ws As Worksheet
        Dim wsCriteria As Worksheet
        Dim rngList As Range
        Dim rngData As Range
        Dim rngCriteria As Range
        Dim rngHeader As Range
        Dim r As Long
        Dim c As Long
        Dim i As Long
        Dim lngField As Long
        Dim rCriteria As Long
        Dim strFieldName As String
        Dim strPath As String
        Dim varCriteria As Variant
        
    'Excel environment - speed things up
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With
        
    'Intialize variables
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets("Data")
        strFieldName = InputBox("Please enter the term to search for?") '<-Update as needed
        strPath = DocsPath & "Load Files\" '<-File output, update as needed
        r = ws.Cells(Rows.Count, 1).End(xlUp).Row
        c = ws.Cells(1, Columns.Count).End(xlToLeft).Column
        
    'Add a worksheet to hold filtered data
    'This list will become the criteria list for bifurcating the master file
        Set wsCriteria = wb.Worksheets.Add(After:=Worksheets(Worksheets.Count))
        
    'Find column number to be filtered
        Set rngHeader = ws.Range(ws.Cells(1, 1), ws.Cells(1, c))
        lngField = rngHeader.Find(What:=strFieldName, _
                                  LookIn:=xlValues, _
                                  LookAt:=xlWhole, _
                                  MatchCase:=False).Column
        
    'Define the range to be filtered
        With ws
            If .FilterMode = True Then
                .ShowAllData
            End If
                        
            Set rngList = .Range(.Cells(1, lngField), .Cells(r, lngField))
            Set rngData = .UsedRange
        End With
        
    'Filter the range
        With rngList
            .AdvancedFilter _
                Action:=xlFilterCopy, _
                CopyToRange:=wsCriteria.Cells(1, 1), _
                Unique:=True
        End With
        
    'Define the criteria range
    'Begin with row 2 to ignore the header row
        With wsCriteria
            rCriteria = .Cells(Rows.Count, 1).End(xlUp).Row
            Set rngCriteria = .Range(.Cells(2, 1), .Cells(rCriteria, 1))
        End With
        
    'Loop through criteria range
    'Use each value as criteria to apply to autofilter for data range
    'Create file
        For i = rngCriteria.Rows.Count To 1 Step -1
            varCriteria = rngCriteria.Cells(i, 1).Value
            
            'Add a workbook to hold filtered results
                Set wbBifurcate = Workbooks.Add
                
            'Filter the original data
                With ws
                    
                    'If data is filtered, remove filter
                        If .FilterMode = True Then
                            .ShowAllData
                        End If
                    
                    'Filter the data
                        .AutoFilterMode = False
                        If Not .AutoFilterMode Then
                            rngData.AutoFilter _
                                Field:=lngField, _
                                Criteria1:=varCriteria
                        End If
                            
                    'Copy the visible range of the data range - include the header row
                        rngData.SpecialCells(xlCellTypeVisible).Copy
                        wbBifurcate.Worksheets("Sheet1").Cells(1, 1).PasteSpecial xlPasteValuesAndNumberFormats
                        
                    'Remove the autofilter
                        If .AutoFilterMode = True Then
                            .AutoFilterMode = False
                        End If
                        
                End With
                
            'Save the bifurcated data workbook
                wbBifurcate.SaveAs strPath & varCriteria & ".xlsx", FileFormat:=51
                
            'Close the workbook
                wbBifurcate.Close
            
        Next i
        
    'Tidy up
        'Delete temporary worksheet
            wsCriteria.Delete
            
        'Delete objects
            Set rngList = Nothing
            Set rngData = Nothing
            Set rngCriteria = Nothing
            Set rngHeader = Nothing
            Set wsCriteria = Nothing
            Set ws = Nothing
            Set wbBifurcate = Nothing
            Set wb = Nothing
            
        'Excel environment - restore
            With Application
                .ScreenUpdating = True
                .DisplayAlerts = True
                .EnableEvents = True
                .Calculation = xlCalculationAutomatic
            End With
End Sub
Public Function DocsPath() As String
    
    'Purpose: Get the Environ value for User Docuents
    'Returns: C:\Users\%User Name%\Documents\
    
    DocsPath = Environ$("USERPROFILE") + "\Documents\"
End Function

I’m using the DocsPath function to get the Documents folder based on which user is logged in. This makes the code a bit more robust and transportable. I should really check for the existence of a subfolder or offer the user the option to create the subfolder if it does not exist – but I feel that complicates the code which may already be complicated enough.

Tidy up

    Final Thoughts

    The sample does take a little bit of time to run as there are 100K records and 100 unique company names – therefore 100 separate files. I’ve seen varying attempts to use passwords and other techniques to allow a user to only view certain data. However, the best way to make sure a user cannot access data that you do not want them to see is to make sure it is not in the workbook in the first place.

    The process may be faster if instead of looping through a range, we first transfer the range to an array and then loop the array to pass the elements of the array as criteria for the autofilter.

    Another option might be to get the size of each range after the autofilter, set the size of the receiving range and then transfer the value of one range to another. This would bypass the Windows Clipboard which is causing the process to run a little slower using the Copy-PasteSpecial. Something like:

    Set rngSource = Range.SpecialCells(xlCellTypeVisible)
    With rngSource
         rowsSource = .Rows.Count
         columnsSource = .Columns.Count
    End With
    
    Set rngDestination = wbBifurcate.Worksheets("Sheet1").Range("A1")
    Set rngDestination = rngDestination.Resize(rowsSource,columnsSource)
    
    rngDestination.Value = rngSource.Value
    

    The process I use above in the main code, may be used in conjunction with varying methods of automated distribution such as through Outlook or TransferSharePointList. Additionally, instead of pasting values into a new workbook, I could use a template workbook that has Pivot Tables, Charts, Formulas – everything that is needed – just need to append the new data or delete any old data and paste in the new data. I’ll post samples of those processes in future posts.

    Downloads

    Download the workbook from SkyDrive.

    Additional Resources

, ,


Scribe 
 The Microsoft Scripting Runtime Library (scrrun.dll) exposes a couple of objects that we can exploit to increase the power and functionality of VBA.

    Dictionary
    Drive
    FileSystemObject
    TextStream

For today, I’ll look at the FileSystemObject. I’ll take a look at the other objects in future posts.
 
edit: I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see the “Additional Resources” at bottom for links to detailed explanation of Late / Early Binding.
 
I would like a list of all files in a folder, C:\Data. You may use the FileSystemObject to loop through folders and files. In an Excel file, open the Visual Basic Editor, add a module and paste or enter this code:

Option Explicit
Sub Foo()

    'Author: Winston Snyder
    'Date: 11/26/2013
    'Purpose: Demonstrate looping through files in a folder using the FileSystemObject
    'Comment: Uses Late Binding
    '--------------------------------------

    'Declare variables
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim FSO As Object
        Dim fsoFolder As Object
        Dim fsoFile As Object
        Dim strPath As String
        Dim i As Long
        
    'Excel environment - speed things up
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With
        
    'Initialize variables
        strPath = "C:\Data\"
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets.Add(After:=Worksheets(Worksheets.Count))
        Set FSO = CreateObject("Scripting.FileSystemObject")
        Set fsoFolder = FSO.GetFolder("C:\Data\")
        i = 2

    'List files in folder
        For Each fsoFile In fsoFolder.Files
            ws.Cells(i, 1).Value = fsoFile.Name
            i = i + 1
        Next fsoFile
    
    'Add Header
        ws.Cells(1, 1).Value = "FileName"
    
    'Tidy up
        'Destroy objects
            Set fsoFolder = Nothing
            Set FSO = Nothing
            Set ws = Nothing
            Set wb = Nothing
        
        'Restore Excel environment
            With Application
                .ScreenUpdating = True
                .DisplayAlerts = True
                .EnableEvents = True
                .Calculation = xlCalculationAutomatic
            End With
End Sub

Ole P. Erlandsen has some nice examples on his site using FileSystemObject to loop through Folders and Subfolders to get all kinds of file information.

A Business Case

OK, we looked at some fairly simple code to use the FileSystemObject (FSO) to look at how to get a list of files in a folder and some of each file’s properties. But how can we use FSO in a business case? Again, I’ll be using Late Binding. Additionally, I’ll be using the DoCmd Object of the Microsoft Access Object Model as well as some VBA functions to look at the file names and manipulate string variables.

My goal is to examine each file in a specified location, if the file meets my criteria, I want to manipulate the name of the file into a table name that I can transfer to MS Access. The code below will append “dim” to the beginning of each file name as well as remove the extension from the file name. I am using “dim” in this case because this is code I use to load dimension (dim) tables to MSAccess and later import into PowerPivot. Without further ado:

Option Compare Database

Public Sub ImportLoadFiles()

    'Author: Winston Snyder
    'Date: 11/27/2013
    'Purpose: Load Excel files to Access database tables
    'Comment: Paste the code into a standard module in the VBE in MS Access
    '----------------------------------------------------
    
    'Declare variables
        Dim strPath As String
        Dim strTableName As String
        Dim FSO As Object
        Dim fsoFolder As Object
        Dim fsoFile As Object
        Dim strFileName As String
        
    'Initialize variables
        strPath = DocsPath & "Load Files\"
        Set FSO = CreateObject("Scripting.FileSystemObject")
        Set fsoFolder = FSO.GetFolder(strPath)
        
    'Transfer all Excel files that meet criteria to Access database tables
        With DoCmd
                    
            'Turn off warnings
                .SetWarnings False
                
            'Loop and transfer files to database
                For Each fsoFile In fsoFolder.Files
                    
                    'If Excel file, create a name for  dimension (dim) table
                        If InStr(fsoFile.Name, ".xlsx") Then
                            strFileName = Left(fsoFile.Name, Len(fsoFile.Name) - 5)
                            strFileName = "dim" & strFileName
                            
                            'Transfer the spreadsheet to MS Access table
                                .TransferSpreadsheet TransferType:=acImport, _
                                                     SpreadsheetType:=acSpreadsheetTypeExcel12Xml, _
                                                     TableName:=strFileName, _
                                                     FileName:=strPath & fsoFile.Name, _
                                                     HasFieldNames:=True
                        End If
                Next fsoFile
                
            'Turn on warnings
                .SetWarnings True
        End With
        
    'Tidy up
        'Destroy objects
            Set fsoFolder = Nothing
            Set FSO = Nothing
End Sub

Note the function call on this line

strPath = DocsPath & "Load Files\"

DocsPath is a function, so we’ll need to create a function to return the USERPROFILE of the host machine which we can then use.

Public Function DocsPath() As String
    
    'Purpose: Get the Environ value for User Docuents
    'Returns: C:\Users\%User Name%\Documents\
    
    DocsPath = Environ$("USERPROFILE") + "\Documents\"
End Function

Also, please note, for simplicity I assume the folder, “Load Files” exists in the “ImportLoadFiles” snippet above. If it does not, the code will generate an error. You can make the code more robust by adding some error handling to check if folders exist and if the folder contains files that you are interested in.

An Example

ExcelToAccess

Create Load File

I need to create a load file and I have a process to do that, but I will not go too far into it in this post – I’ll review the process in a future post. For now, two things are important.

  1. The name of the file must be similar to the name of the target table
  2. The column headers in the load file must match the field names in the target table
File Name

The code above takes care of the file to table naming. Recall – the code trims the file extension and appends “dim” to the front of the file name. Therefore, the filename, “Teams.xlsx” is translated to “dimTeams” which is the name of the target table. You just need to plan your load file names to align in some fashion with the name of the target table in the Access database.

Create Access Table

Field (Column) Names

ColumnFieldName

  1. Add a table to your database
  2. Save the table as “dim” and some descriptive name such as “dimTeam”
  3. Name the first field of the table with a descriptive term plus the suffix “Key”. For example, “TeamKey”
  4. Add another field. Give it a descriptive name that describes what the field will contain such as “TeamName”.
  5. Set the Data Type to the appropriate type such as “Short Text”.
  6. While still selected on the second field, click on “Indexed” in the Field Properties pane. Click on the drop-down, change from the default value of “No” to “Yes (No Duplicates)”.
  7. Save and close the table
  8. Run the “ImportLoadFiles” process to load the table from the Excel file.

In my sample of NFL teams, I loaded 32 records in my first pass as expected. I then deleted 4 records from the table, and deleted all but the same 4 teams plus one extra from the load file. I saved the load file and ran the import process. The 4 missing records were imported as expected. the 5th record was already in the table so it was disregarded by Access.

Additional Resources

Tidy up

    Final Thoughts

    The one load file, one Access table, 32 records is a silly example. But what if you have 10 tables…20…50? Now, hopefully, you see the true value. How do you use FileSystemObject (FSO) in your Excel / Access projects? Let us know in the comments section.

    Downloads

    Download the Excel file and Access database from SkyDrive.

, , ,