FishingTrim

I’m not much of an angler these days. In my younger days, I pulled my share out of the Skunk River and lakes around central Iowa and southern Minnesota. Sit back and I’ll tell you a tale about the one that got away….

Kidding – today’s post is the third in a series about Microsoft Scripting in VBA. In the first post I covered the FileSystemObjct (FSO). In the second post, I looked at Regular Expressions (RegExp).

Previous posts:

  1. FileSystemObject (FSO)
  2. Regular Expressions

Today, I’ll look at the TextStream Object.

The TextStream Object

The TextStream Object enables you to read from and write to text files from Excel using VBA and the Microsoft Scripting Runtime Library (scrrun.dll)


    edit: I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see these links for a detailed explanation of Late / Early Binding.

    1. Beyond Excel
    2. JP Software Technologies


Public Variables

Because I am using Late Binding, I’ll declare some Public Constants for working with the TextStream Object

Public Const gclForReading As Long = 1
Public Const gclForWriting As Long = 2
Public Const gclForAppending As Long = 8
Public Const gclTristateUseDefault As Long = -2
Public Const gclTristateTrue As Long = -1
Public Const gclTristateFalse As Long = 0

Tristate specifies the format of the text file:

  1. TristateUseDefault = -2 ; Opens the file using the system default
  2. TristateTrue = -1 ; Opens the file as Unicode
  3. TristateFalse = 0 ; Opens the file as ASCII

I also would like to create a function to let users choose a file or a folder. To use the function I’ll create global constants using the MsoFileDialogType Enumeration:

Public Const gclmsoFileDialogFilePicker As Long = 3
Public Const gclmsoFileDialogFolderPicker As Long = 4
Public Const gclmsoFileDialogOpen As Long = 1
Public Const gclmsoFileDialogSaveAs As Long = 2

CreateTextFile Method

I’ll use the CreateTextFile Method of the FileSystemObject (FSO) to create a TextStream Object and write a little message to the file:

Option Explicit
Sub WriteToText()

    'Author     :   Winston Snyder
    'Date       :   2/12/2014
    'Purpose    :   Write to a text file

    'Declare variables
        Dim fso             As Object
        Dim fsoFolder       As Object
        Dim fsoFile         As Object
        Dim ts              As Object
        Dim strFileName     As String
        Dim strFolderName   As String

    'Allow the user to choose a folder location to save the text file to
        strFolderName = GetFolder

    'Get file name from user
        strFileName = GetUserInput(strPrompt:="What would you like to name the file", _
                                   strTitle:="File Name")

    'Create a FileSystemObject (FSO)
        Set fso = GetFileSystemObject

    'Create a text file
        fso.CreateTextFile strFolderName & "\" & strFileName & ".txt"

    'Create an FSO file for the text file just created
        Set fsoFile = fso.GetFile(strFolderName & "\" & strFileName & ".txt")

    ' Open a TextStream for output.
        Set ts = fsoFile.OpenAsTextStream(gclForWriting, gclTristateUseDefault)

    ' Write to the TextStream
        ts.WriteLine "Jeff Weir"
        ts.WriteLine "is a prolific blogger!"
        ts.Close

    'Tidy up
        Set ts = Nothing
        Set fsoFile = Nothing
        Set fso = Nothing

End Sub

Textstream2

Append To A Text File

We don’t always want to create a new text file. Sometimes we want to append to an existing text file. We can use the OpenTextFile Method of the FileSystemObject:

Sub AppendToText()

    'Author     :   Winston Snyder
    'Date       :   2/15/2014
    'Purpose    :   Append to a text file

    'Declare variables
        Dim fso             As Object
        Dim fsoFile         As Object
        Dim ts              As Object
        Dim strFileName     As String

    'Allow the user to choose a file to append to
        strFileName = GetFile()

    'Create a FileSystemObject (FSO)
        Set fso = GetFileSystemObject

    'Create an FSO file for the text file just created
        Set fsoFile = fso.GetFile(strFileName)

    ' Open a TextStream for output.
        Set ts = fsoFile.OpenAsTextStream(gclForAppending, gclTristateUseDefault)

    ' Write to the TextStream
        ts.WriteLine
        ts.WriteLine "Mike Alexander is a handsome devil!"
        ts.Close

    'Tidy up
        Set ts = Nothing
        Set fsoFile = Nothing
        Set fso = Nothing

End Sub

At the GetFile() Dialog prompt, I chose the file I created in the “CreateTextFile Method”. I then used the constant, “gclForAppending” to specify that the new content was to be appended to the existing file. Had I used the constant, “gclForWriting”, the original contents would have been overwritten.

TextFileAppend2

The Functions

Here are the Functions() I used with the code snippets above

FileDialogFolderPicker

Public Function GetFolder() As String
 
    Dim fd As FileDialog
    Dim strFolderName As String
    Set fd = Application.FileDialog(msoFileDialogFolderPicker)
     
    With fd
        .Title = "Please select a folder"
        .AllowMultiSelect = False
        .Show
        strFolderName = .SelectedItems(1)
    End With
 
    GetFolder = strFolderName
     
    Set fd = Nothing
End Function

GetFileSystemObject

Public Function GetFileSystemObject() As Object
     
    On Error Resume Next
    Set GetFileSystemObject = CreateObject("Scripting.FileSystemObject")
     
End Function

GetUserInput

Public Function GetUserInput(strPrompt As String, _
                             strTitle As String) As String
     
    Dim strUserInput As String
     
    strUserInput = InputBox(Prompt:=strPrompt, _
                            Title:=strTitle)
                             
    GetUserInput = strUserInput
 
End Function

FileDialogFilePicker

Public Function GetFile() As String
 
    Dim fd As FileDialog
    Dim strFileName As String
    Set fd = Application.FileDialog(msoFileDialogFilePicker)
     
    With fd
        .Title = "Please select a file"
        .AllowMultiSelect = False
        .Show
        strFileName = .SelectedItems(1)
    End With
 
    GetFile = strFileName
     
    Set fd = Nothing
End Function

The sample Procedures() given are fine for learning and understanding how to create text files and how to write and append data to the files. But how can we leverage these in a business environment to solve business needs and goals?

Merge Text Files

We may have a series of text files that we wish to merge into 1. Back in November, 2013, Ken Puls showed us how to use Power Query to import multiple text files into a Data Model (Power Pivot). All well and good. Let’s see if we can merge text files using the TextStream Object.

Here are some .csv files that I would like to merge together:

TextFilesForMerge

Sub MergeTextFiles()
    
    'Author     :   Winston Snyder
    'Date       :   2/15/2014
    'Purpose    :   Merge text files in a folder
    
    'Declare variables
        Dim fso                             As Object
        Dim fsoFolder                       As Object
        Dim fsoFileOutput                   As Object
        Dim fsoFile                         As Object
        Dim tsOutput                        As Object
        Dim tsInput                         As Object
        Dim strFolderInputFiles             As String
        Dim strFolderOutputFiles            As String
        Dim strFileName                     As String
        Dim strMisc                         As String

    'Allow the user to choose a folder that contains files to be merged
        strFolderInputFiles = GetFolder(strTitle:="Folder for files to be merged")
        
    'Allow the user to choose a folder location for the output file
        strFolderOutputFiles = GetFolder(strTitle:="Folder for output files")

    'What to name the output file
        strFileName = GetUserInput(strPrompt:="What would you like to name the file", _
                                   strTitle:="File Name")

    'Create a FileSystemObject (FSO)
        Set fso = GetFileSystemObject
        
    'Get an FSO folder for the input files
        Set fsoFolder = fso.GetFolder(strFolderInputFiles)
        
    'Create a text file for output
        fso.CreateTextFile strFolderOutputFiles & "\" & strFileName & ".txt"
        
    'Create an FSO file for the text file just created
        Set fsoFileOutput = fso.GetFile(strFolderOutputFiles & "\" & strFileName & ".txt")

    'Open a TextStream for output.
        Set tsOutput = fsoFileOutput.OpenAsTextStream(gclForAppending, gclTristateUseDefault)
        
    'Loop through the files in the input folder
        For Each fsoFile In fsoFolder.Files
            Set tsInput = fsoFile.OpenAsTextStream(gclForReading, gclTristateUseDefault)
            Do Until tsInput.AtEndOfStream
                strMisc = tsInput.ReadLine                  'Read from the input file
                tsOutput.WriteLine strMisc                  'Write to the output file
            Loop
        Next fsoFile
        
    'Tidy up
        Set tsInput = Nothing
        Set tsOutput = Nothing
        Set fsoFolder = Nothing
        Set fso = Nothing

End Sub

Here is the merged file.

MergedFilesMultipleHeaders

Pretty good, except the headers are repeating. I want to revise the code a bit to only include the header from the first file and skip the header on the subsequent files. The revised code:

Sub MergeTextFilesFirstFileHeader()
    
    'Author     :   Winston Snyder
    'Date       :   2/15/2014
    'Purpose    :   Merge text files in a folder
    'Comments   :   Only uses header row from the first file
    '               subsequent files, header row is skipped
    
    'Declare variables
        Dim fso                             As Object
        Dim fsoFolder                       As Object
        Dim fsoFileOutput                   As Object
        Dim fsoFile                         As Object
        Dim tsOutput                        As Object
        Dim tsInput                         As Object
        Dim strFolderInputFiles             As String
        Dim strFolderOutputFiles            As String
        Dim strFileName                     As String
        Dim strMisc                         As String
        Dim blnTest                         As Boolean
        
    'Initialize variables
        blnTest = True                                          'First time through loop

    'Allow the user to choose a folder that contains files to be merged
        strFolderInputFiles = GetFolder(strTitle:="Folder for files to be merged")
        
    'Allow the user to choose a folder location for the output file
        strFolderOutputFiles = GetFolder(strTitle:="Folder for output files")

    'What to name the output file
        strFileName = GetUserInput(strPrompt:="What would you like to name the file", _
                                   strTitle:="File Name")

    'Create a FileSystemObject (FSO)
        Set fso = GetFileSystemObject
        
    'Get an FSO folder for the input files
        Set fsoFolder = fso.GetFolder(strFolderInputFiles)
        
    'Create a text file for output
        fso.CreateTextFile strFolderOutputFiles & "\" & strFileName & ".txt"
        
    'Create an FSO file for the text file just created
        Set fsoFileOutput = fso.GetFile(strFolderOutputFiles & "\" & strFileName & ".txt")

    'Open a TextStream for output.
        Set tsOutput = fsoFileOutput.OpenAsTextStream(gclForAppending, gclTristateUseDefault)
        
    'Loop through the files in the input folder
        For Each fsoFile In fsoFolder.Files
            Set tsInput = fsoFile.OpenAsTextStream(gclForReading, gclTristateUseDefault)
            If blnTest = True Then
                blnTest = False
            Else
                tsInput.SkipLine                            'Move the file pointer to the line below the header
            End If
            Do Until tsInput.AtEndOfStream
                strMisc = tsInput.ReadLine                  'Read from the input file
                tsOutput.WriteLine strMisc                  'Write to the output file
            Loop
        Next fsoFile
        
    'Tidy up
        Set tsInput = Nothing
        Set tsOutput = Nothing
        Set fsoFolder = Nothing
        Set fso = Nothing

End Sub

MergedFilesOneHeader

Awesome! The solution was to use the SkipLine Method of the TextStream Object everytime through the loop other than the first time:

If blnTest = True Then
   blnTest = False
Else
   tsInput.SkipLine                        'Move the file pointer to the line below the header
End If

Read From Text File

Another common problem, is to read the contents of a text file into Excel. I’ll demonstrate some code to take care of this using comma separated values (csv), but you can use any kind of delimiter such as tab, space, pipe “|”, etc….

Sub ReadTextFileIntoExcel()
    
    'Author     :   Winston Snyder
    'Date       :   2/15/2014
    'Purpose    :   Read data from text file, output to Excel
    
    'Declare variables
        Dim fso                         As Object
        Dim fsoFile                     As Object
        Dim ts                          As Object
        
        Dim vArrData()                  As Variant
        Dim strLine()                   As String
        Dim strData()                   As String
        
        Dim i                           As Long
        Dim j                           As Long
        
        Dim strTextFileName             As String
        Dim strFolderOutputFiles        As String
        Dim strExcelFileName            As String
        Dim strReadAll                  As String
        
    'User - choose a text file to read into Excel
        strTextFileName = GetFile()
        
    'User - choose a folder location for the output file
        strFolderOutputFiles = GetFolder(strTitle:="Folder for output files")

    'User - name the output file
        strExcelFileName = GetUserInput(strPrompt:="What would you like to name the file", _
                                        strTitle:="File Name")

    'Create a FileSystemObject (FSO)
        Set fso = GetFileSystemObject
        
    'Create an FSO file for the user selected text file
        Set fsoFile = fso.GetFile(strTextFileName)

    'Open a TextStream for reading
        Set ts = fsoFile.OpenAsTextStream(gclForReading, gclTristateUseDefault)
        
    'Read the text file and store it in a string variable
        strReadAll = ts.ReadAll
    
    'Split each line of the text document based on the new line delimiter
        strLine = Split(strReadAll, vbNewLine)
        
    'Get number of elements in the line
        strData = Split(strLine(0), ",")
        
    'Redim the data array
        ReDim vArrData(LBound(strData) To UBound(strData), LBound(strLine) To UBound(strLine))
        
    'Erase the strData Array
        Erase strData
        
    'Loop the strLine Array, split each line into data elements, load the data elements into the data array
        For i = LBound(strLine) To UBound(strLine)
            strData = Split(strLine(i), ",")
            For j = LBound(strData) To UBound(strData)
                vArrData(j, i) = strData(j)
            Next j
        Next i

    'Output the array to an Excel Worksheet
        Call ArrayToRange(vArr:=vArrData, _
                          strPath:=strFolderOutputFiles, _
                          strFileName:=strExcelFileName)
        
    'Tidy up
        'Erase arrays
            Erase vArrData
            Erase strLine
            Erase strData
            
        'Destroy objects
            Set ts = Nothing
            Set fsoFile = Nothing
            Set fso = Nothing

End Sub

'----------------------------------------------------------------------------------------

Public Sub ArrayToRange(ByRef vArr() As Variant, _
                        strPath As String, _
                        strFileName As String)
                        
    Dim wbNew As Workbook
    Dim wsNew As Worksheet
    Dim rngNew As Range
    Dim r As Long
    Dim c As Long
    
    c = UBound(vArr, 1)                                     '1st dimension of array
    r = UBound(vArr, 2)                                     '2nd dimension of array
    
    Set wbNew = Workbooks.Add
    Set wsNew = wbNew.Worksheets("Sheet1")
    Set rngNew = wsNew.Range("A1")

    'Resize the destination range
    'Use +1, +1 for Rows and columns since array begins at (0,0)
        rngNew.Resize(r + 1, c + 1).Value = Application.Transpose(vArr)
    
    wbNew.SaveAs strPath & "\" & strFileName & ".xlsx"
    wbNew.Close
    
    Set rngNew = Nothing
    Set wsNew = Nothing
    Set wbNew = Nothing
    
End Sub

TextToExcelFinal

Tidy Up

That’s it for today. Long post, sorry ’bout that. That last snjppet ended up being about Arrays as much as it was about the TextStream Object. Now, where did I leave my worms? I’m going fishing.

Download the file from SkyDrive

Previous Posts At dataprose.org – Scripting

  1. The FileSystemObject
  2. Regular Expressions

Microsoft Scripting

  1. Microsoft Scripting Center
  2. FileSystemObject Reference (Windows Scripting)
  3. TextStream Object
  4. 4 Guys From Rolla – FSO

Additional Resources – Arrays

  1. VBA Arrays And Worksheet Ranges
  2. Understanding Arrays
  3. Redim Statement
, , , ,


MrClean

Mr. Clean swept his way into American homes in 1958 after Proctor and Gamble bought the rights from its creator, Linwood Burton. Interestingly, Regular Expressions were first described in the 50’s by Stephen Cole Kleene. Coincidence? I think not.

However, today’s post is not about household cleaning products, nor is it a history lesson in Computer Science. It is about scrubbing data in VBA using Regular Expressions (RegExp).

Substitute ()

I see many folks using multiple nested levels of the Substitute() function to try to clean their data. This can be time consuming and can lead to some inaccuracies. Instead, we can use Regular Expressions (RegExp) to increase efficiency and accuracy as well as handle complex strings and large data sets quickly and efficiently.

User Defined Functions

I also see folks create User Defined Functions (UDF’s) to manipulate strings. Here is a nice example the other day from Doug Jenkins over at Newton Excel Bach. (btw, If you are not following Doug’s blog, you should be) But is there a better way? Enter Regular Expressions (RegExp).

Regular Expressions (RegExp)

A Regular Expression is a sequence of characters that create a pattern. The sequence could be something complicated like <([A-Z][A-Z0-9]*)\b[^>]*>(.*?) to something simple like \d . Regular Expressions are very useful in VBA for working with many different scenarios of strings and introducing automation for transforming your data before loading to target databases for OLAP such as Essbase, Power Pivot or SSAS.

Regular Expression – Methods

Regular Expressions in VBA offers 3 Methods:

  1. Test
  2. Replace
  3. Execute

For today, I will focus on the Execute Method and look at the other Methods in future posts.

An example

Here’s a silly nonsensical string, 12kj$%23fg^&*34950…345. Let’s say I’m only interested in the numbers in the string so I want to return 122334950345. To return just the numbers from the string, I’ll use the pattern [0-9].

Option Explicit

Sub RegExFoo()
    
    'Author: Winston Snyder
    'Date: 12/11/2013
    'Extract a pattern of interest from an input string

    'Declare variables
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim RE As Object
        Dim Match As Object
        Dim Matches As Object
        Dim strNumber As String
        Dim i As Long
    
    'Initialize variables
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets("Sheet1")
        Set RE = CreateObject("VBScript.RegExp")
    
    'Criteria for Regular Expression
        With RE
            .Pattern = "[0-9]"
            .Global = True
            Set Matches = .Execute(ws.Range("A1").Value)
        End With
        
    'Loop Matches collection to build string of all numbers in the sample string
        strNumber = ""
        For i = 0 To Matches.Count - 1
            strNumber = strNumber + Matches(i)
        Next i
        
    'Output
        MsgBox strNumber
    
    'Tidy up
        'Destroy objects
            Set Matches = Nothing
            Set RE = Nothing
            Set ws = Nothing
            Set wb = Nothing
    
End Sub

Output : 122334950345

Great! The code returned exactly what I was looking for.

Case Study: RegExp

The sample is a bit obscure and seems somewhat unlikely, but who knows – might be just the thing someone is looking for. Let’s take a look as something a bit more realistic.

Let’s say we receive a load file from Financial Analysis and Planning (FP&A) that they would like loaded to a cube as a forecast scenario. For our sample, we’ll use Customer Codes and Total Revenues. We receive the file, but we immediately see that there is a problem.

DataRegEx3

There is no delimiter between the customer code and the total revenue amount. Additionally, both sets of substrings are of varying length meaning we cannot use text to columns without some manual cleanup work. Let’s look at one way we might split these strings using Regular Expressions.

I’m going to:

  1. Read the strings into an array
  2. Loop the array
  3. Split the string into substrings – load the substrings into new arrays
  4. Output the contents of the new arrays to a worksheet

A Quick Note – Early / Late Binding

A discourse on Early / Late Binding is beyond the scope of this post. Suffice to say, I am using Late Binding as demonstrated:

 Dim RegEx                               As Object
'Create a regular epression object
        Set RegEx = GetRegEx
Private Function GetRegEx() As Object

    On Error Resume Next
    Set GetRegEx = CreateObject("VBScript.RegExp")
    
End Function

There are additional links at the bottom of the post for more information on Early / Late Binding.

The Complete Code

You can open a new workbook, launch the Visual Basic Editor (VBE) add a new module and paste the code below into the module. I broke the Subs() and Functions() into separate snippets to improve readability. Alternatively, you can download the workbook, the link is at the bottom of the post.

 Option Explicit
 Sub SplitStringNoDelimiter()
    
    'Author:        Winston Snyder
    'Date:          12/15/2013
    'Purpose:       Split string into text and value components
    'Comments:      No delimiter
    '               Loop array for output
    '---------------------------------------------------------------------------------------------------------------------------------------------
    
    Dim RegEx                               As Object
    Dim wb                                  As Workbook
    Dim wsInput                             As Worksheet
    Dim wsOutput                            As Worksheet
    
    Dim rngInput                            As Range
    Dim rngOutputDescriptions               As Range
    Dim rngOutputValues                     As Range
    
    Dim arrInput()                          As Variant
    Dim arrOutputDescriptions()             As Variant
    Dim arrOutputValues()                   As Variant
    
    Dim i                                   As Long
    Dim lngRowsData                         As Long
    
    Const strPatternDescriptions            As String = "\D+"
    Const strPatternValues                  As String = "\d+(\.\d{1,2})?"
    Const lngColumnDescriptions             As Long = 1
    Const lngColumnValues                   As Long = 2
    
    'Excel enrionment - speed things up
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With
    
    'Initialize
        Set wb = ThisWorkbook
        With wb
            Set wsInput = .Worksheets("Input")
            Set wsOutput = .Worksheets("Output")
        End With
        
    'Clear previous output
        wsOutput.UsedRange.ClearContents
        
    'Input range without header
        With wsInput
            lngRowsData = GetRow(ws:=wsInput)
            Set rngInput = .Range(.Cells(2, 1), .Cells(lngRowsData, 1))
        End With
        
    'Transfer input range to input array
        arrInput = rngInput
        
    'Dimension output arrays
        ReDim arrOutputDescriptions(LBound(arrInput) To UBound(arrInput))
        ReDim arrOutputValues(LBound(arrInput) To UBound(arrInput))
        
    'Create a regular epression object
        Set RegEx = GetRegEx
        
    'Loop through each string in the input array
        For i = LBound(arrInput) To UBound(arrInput)
            
            'Pass the string to regular expression function to return the descriptive portion of the string
                arrOutputDescriptions(i) = GetSubString(objRegEx:=RegEx, _
                                                        strString:=CStr(arrInput(i, 1)), _
                                                        strPattern:=strPatternDescriptions)
                                                      
            'Pass the string to regualr expressions functions to return the value portion of the string
                arrOutputValues(i) = GetSubString(objRegEx:=RegEx, _
                                                  strString:=CStr(arrInput(i, 1)), _
                                                  strPattern:=strPatternValues)
        Next i
        
    'Output all elements of each array to an output range
    'Description in Column 1, Values in Column 2
        
        'Descriptions
            Call OutputArray(ws:=wsOutput, _
                             vTmpArray:=arrOutputDescriptions, _
                             lngColumn:=lngColumnDescriptions)
        
        'Values
            Call OutputArray(ws:=wsOutput, _
                             vTmpArray:=arrOutputValues, _
                             lngColumn:=lngColumnValues)
                             
    'Add a header to the data
        With wsOutput
            .Range("A1").EntireRow.Insert shift:=xlDown
            .Cells(1, 1) = "Descriptions"
            .Cells(1, 2) = "Values"
        End With

    'Tidy up
        'Erase arrays
            Erase arrInput
            Erase arrOutputDescriptions
            Erase arrOutputValues
            
        'Destroy objects
            Set RegEx = Nothing
            Set rngInput = Nothing
            Set wsInput = Nothing
            Set wsOutput = Nothing
            Set wb = Nothing
            
        'Restore Excel environment
            With Application
                .ScreenUpdating = True
                .DisplayAlerts = True
                .EnableEvents = True
                .Calculation = xlCalculationAutomatic
            End With
End Sub

 

Private Sub OutputArray(ws As Worksheet, _
                        vTmpArray() As Variant, _
                        lngColumn As Long)

    Dim j As Long
    
    For j = LBound(vTmpArray) To UBound(vTmpArray)
        ws.Cells(j, lngColumn).Value = vTmpArray(j)
    Next j
                      
End Sub

 

Private Function GetRow(ws As Worksheet) As Long

    GetRow = ws.Cells(Rows.Count, 1).End(xlUp).Row

End Function

 

Private Function GetRange(ws As Worksheet, _
                          lngRowsStart As Long, _
                          lngRowsEnd As Long, _
                          lngColumn As Long) As Range

    Dim rng As Range

    With ws
        Set rng = .Range(.Cells(lngRowsStart, lngColumn), .Cells(lngRowsEnd, lngColumn))
    End With
    
    Set GetRange = rng
    
End Function

 

Private Function GetRegEx() As Object

    On Error Resume Next
    Set GetRegEx = CreateObject("VBScript.RegExp")

End Function

 

Private Function GetSubString(objRegEx As Object, _
                              strString As String, _
                              strPattern As String) As String
                              
    Dim reMatches As Object
    Dim strResult As String
    
    strResult = "No Match"
    
    With objRegEx
        .Pattern = strPattern
        .Global = True

        Set reMatches = .Execute(strString)
        
        If reMatches.Count <> 0 Then
            strResult = reMatches.Item(0)
        End If
    End With
    
    GetSubString = strResult

End Function

The Results

REOutput
Great! Exactly what I was looking for.

Tidy up

    Final Thoughts

    Regular Expressions are found in nearly all programming languages and much like Duct Tape, they have a Million and one uses. You are truly only limited by your imagination and ability to concoct the correct patterns. I use Regular Expressions as I Extract data from various Data Silos to Transform and normalize data prior to Loading to target reporting databases. Let us know how you use Regular Expressions in the comments.

    Downloads

    Download the workbook from OneDrive.

    Additional Resources


Scribe 
 The Microsoft Scripting Runtime Library (scrrun.dll) exposes a couple of objects that we can exploit to increase the power and functionality of VBA.

    Dictionary
    Drive
    FileSystemObject
    TextStream

For today, I’ll look at the FileSystemObject. I’ll take a look at the other objects in future posts.
 
edit: I am going to use Late Binding in the sample snippets below. A discussion on Late / Early Binding is beyond the scope of this post. Please see the “Additional Resources” at bottom for links to detailed explanation of Late / Early Binding.
 
I would like a list of all files in a folder, C:\Data. You may use the FileSystemObject to loop through folders and files. In an Excel file, open the Visual Basic Editor, add a module and paste or enter this code:

Option Explicit
Sub Foo()

    'Author: Winston Snyder
    'Date: 11/26/2013
    'Purpose: Demonstrate looping through files in a folder using the FileSystemObject
    'Comment: Uses Late Binding
    '--------------------------------------

    'Declare variables
        Dim wb As Workbook
        Dim ws As Worksheet
        Dim FSO As Object
        Dim fsoFolder As Object
        Dim fsoFile As Object
        Dim strPath As String
        Dim i As Long
        
    'Excel environment - speed things up
        With Application
            .ScreenUpdating = False
            .DisplayAlerts = False
            .EnableEvents = False
            .Calculation = xlCalculationManual
        End With
        
    'Initialize variables
        strPath = "C:\Data\"
        Set wb = ThisWorkbook
        Set ws = wb.Worksheets.Add(After:=Worksheets(Worksheets.Count))
        Set FSO = CreateObject("Scripting.FileSystemObject")
        Set fsoFolder = FSO.GetFolder("C:\Data\")
        i = 2

    'List files in folder
        For Each fsoFile In fsoFolder.Files
            ws.Cells(i, 1).Value = fsoFile.Name
            i = i + 1
        Next fsoFile
    
    'Add Header
        ws.Cells(1, 1).Value = "FileName"
    
    'Tidy up
        'Destroy objects
            Set fsoFolder = Nothing
            Set FSO = Nothing
            Set ws = Nothing
            Set wb = Nothing
        
        'Restore Excel environment
            With Application
                .ScreenUpdating = True
                .DisplayAlerts = True
                .EnableEvents = True
                .Calculation = xlCalculationAutomatic
            End With
End Sub

Ole P. Erlandsen has some nice examples on his site using FileSystemObject to loop through Folders and Subfolders to get all kinds of file information.

A Business Case

OK, we looked at some fairly simple code to use the FileSystemObject (FSO) to look at how to get a list of files in a folder and some of each file’s properties. But how can we use FSO in a business case? Again, I’ll be using Late Binding. Additionally, I’ll be using the DoCmd Object of the Microsoft Access Object Model as well as some VBA functions to look at the file names and manipulate string variables.

My goal is to examine each file in a specified location, if the file meets my criteria, I want to manipulate the name of the file into a table name that I can transfer to MS Access. The code below will append “dim” to the beginning of each file name as well as remove the extension from the file name. I am using “dim” in this case because this is code I use to load dimension (dim) tables to MSAccess and later import into PowerPivot. Without further ado:

Option Compare Database

Public Sub ImportLoadFiles()

    'Author: Winston Snyder
    'Date: 11/27/2013
    'Purpose: Load Excel files to Access database tables
    'Comment: Paste the code into a standard module in the VBE in MS Access
    '----------------------------------------------------
    
    'Declare variables
        Dim strPath As String
        Dim strTableName As String
        Dim FSO As Object
        Dim fsoFolder As Object
        Dim fsoFile As Object
        Dim strFileName As String
        
    'Initialize variables
        strPath = DocsPath & "Load Files\"
        Set FSO = CreateObject("Scripting.FileSystemObject")
        Set fsoFolder = FSO.GetFolder(strPath)
        
    'Transfer all Excel files that meet criteria to Access database tables
        With DoCmd
                    
            'Turn off warnings
                .SetWarnings False
                
            'Loop and transfer files to database
                For Each fsoFile In fsoFolder.Files
                    
                    'If Excel file, create a name for  dimension (dim) table
                        If InStr(fsoFile.Name, ".xlsx") Then
                            strFileName = Left(fsoFile.Name, Len(fsoFile.Name) - 5)
                            strFileName = "dim" & strFileName
                            
                            'Transfer the spreadsheet to MS Access table
                                .TransferSpreadsheet TransferType:=acImport, _
                                                     SpreadsheetType:=acSpreadsheetTypeExcel12Xml, _
                                                     TableName:=strFileName, _
                                                     FileName:=strPath & fsoFile.Name, _
                                                     HasFieldNames:=True
                        End If
                Next fsoFile
                
            'Turn on warnings
                .SetWarnings True
        End With
        
    'Tidy up
        'Destroy objects
            Set fsoFolder = Nothing
            Set FSO = Nothing
End Sub

Note the function call on this line

strPath = DocsPath & "Load Files\"

DocsPath is a function, so we’ll need to create a function to return the USERPROFILE of the host machine which we can then use.

Public Function DocsPath() As String
    
    'Purpose: Get the Environ value for User Docuents
    'Returns: C:\Users\%User Name%\Documents\
    
    DocsPath = Environ$("USERPROFILE") + "\Documents\"
End Function

Also, please note, for simplicity I assume the folder, “Load Files” exists in the “ImportLoadFiles” snippet above. If it does not, the code will generate an error. You can make the code more robust by adding some error handling to check if folders exist and if the folder contains files that you are interested in.

An Example

ExcelToAccess

Create Load File

I need to create a load file and I have a process to do that, but I will not go too far into it in this post – I’ll review the process in a future post. For now, two things are important.

  1. The name of the file must be similar to the name of the target table
  2. The column headers in the load file must match the field names in the target table
File Name

The code above takes care of the file to table naming. Recall – the code trims the file extension and appends “dim” to the front of the file name. Therefore, the filename, “Teams.xlsx” is translated to “dimTeams” which is the name of the target table. You just need to plan your load file names to align in some fashion with the name of the target table in the Access database.

Create Access Table

Field (Column) Names

ColumnFieldName

  1. Add a table to your database
  2. Save the table as “dim” and some descriptive name such as “dimTeam”
  3. Name the first field of the table with a descriptive term plus the suffix “Key”. For example, “TeamKey”
  4. Add another field. Give it a descriptive name that describes what the field will contain such as “TeamName”.
  5. Set the Data Type to the appropriate type such as “Short Text”.
  6. While still selected on the second field, click on “Indexed” in the Field Properties pane. Click on the drop-down, change from the default value of “No” to “Yes (No Duplicates)”.
  7. Save and close the table
  8. Run the “ImportLoadFiles” process to load the table from the Excel file.

In my sample of NFL teams, I loaded 32 records in my first pass as expected. I then deleted 4 records from the table, and deleted all but the same 4 teams plus one extra from the load file. I saved the load file and ran the import process. The 4 missing records were imported as expected. the 5th record was already in the table so it was disregarded by Access.

Additional Resources

Tidy up

    Final Thoughts

    The one load file, one Access table, 32 records is a silly example. But what if you have 10 tables…20…50? Now, hopefully, you see the true value. How do you use FileSystemObject (FSO) in your Excel / Access projects? Let us know in the comments section.

    Downloads

    Download the Excel file and Access database from SkyDrive.

, , ,