StoogesFinal

I’m a huge fan of the 3 Stooges (pictured here) and Curly Howard in particular. Without a doubt, they appeal to my 5 year-old sense of humor with their brand of inane slapstick.

Shemp Howard, Curly’s brother, was ok, but Curly Joe? Ah Marone! Don’t even get me started!

Curly Howard was pure slapstick comedic genius – undisputed. Unfortunately, Curly left us far too early.

Today’s post, however, is not about the Stooges or Curly. It is about Cartesian Products and creating “Controlled” random data.

Random Data Generators

Dick Kusleika and Jimmy Pena offer Random Data Generator add-ins on their respective sites. Jimmy’s will cost you a couple of bucks:

  1. Daily Dose of Excel – Dick Kusleika
  2. JP Software Technologies – Jimmy Pena

Cartesian Product

A Cartesian Product returns everything from Table A with everything from Table B which is generally unwanted and means you have not configured your join(s) properly.

Here I have a couple of Tables of NFL Teams and Divisions that I would like to join in a query to get the teams into their respective divisions.

AccessTables

Here’s my query in the Query Design Window

NFLQry1

When I run the query, I get a Cartesian Product where every team is returned against every division. It is not possible for the Arizona Cardinals to belong to 8 different divisions:

NFLCartesian

Revised Join

I revised the join to show the correct relationship between Division and Team:

NFLTeamDivRelate

Now I get the correct results when I run the query:

NFLTeamDivRelateData

Create “Controlled” Sample / “Dummy” Data

Now, I want to create some sample data to load for a PivotTable so I want some random values, but I want to control the Row Label and Column Field values. For Row Labels, I’ll use Regions and Representatives. This is a standard relationship, so we will reflect as such in the query design:

RegionsReps

Additionally, I would like to generate some sample (dummy) data for each rep for each day of the year. I’ll use a Cartesian Join to generate the dataset:

RegionsRepsCart

Note that there is no join between the Reps/Regions and Dates tables. The query results:

RegionsRepsCartResultRecAnnon

Let’s Create Some Random / “Dummy” Data

I’ll need a function in Access that I can add to the query to generate a random number

Public Function GetRandomValue(Optional x As Integer) As Double

    Dim dblRandom As Double
    
        dblRandom = (1000000 - 500000 + 1) * Rnd() + 500000
        GetRandomValue = dblRandom
        
End Function

Note the difference: in MS Access, the Built-in Function for generating a random number is Rnd(). In MS Excel, it’s Rand(). I added the User-Defined Function to the Query Design Grid and named the output field as SalesAmount:

QueryDesignRandomFunctionFinal

And the query results:

QueryRandomResultsFinal

The first value looks great, but do you notice how the same value keep repeating through every row? That’s not very random. The problem is the function is only called one time, so the random value is generated once and then repeated on every row. I’ll need to pass a unique value to the function to make it generate a new random number on every row of the query results.

To create the unique value, I’ll concatenate 3 fields in the query together

[RepName] & "-" & [RegionName] & "-" & [RecordDate]

I can then pass that as a unique value to the Random Function. First, I need to modify the Random Function a little bit. The Function was looking for me to pass an Integer Value. So I’ll just change the argument to a Variant.

'Old
Public Function GetRandomValue(Optional x As Integer) As Double

'New
Public Function GetRandomValue(v As Variant) As Double

Now I can update the query in the design grid:

QueryRandomValueUpdate

And the query results:

QueryResultsRandomValuesFinalAnno

Perfect! Ready to connect to the query with Excel PivotTable and pivot and slice ’til you get your fill.

MS Query

I was able to generate some random data with some controlled data in MS Access, can I repeat the same in MS Query? I’ll begin by copying the from each of the 3 Access tables to 3 different worksheets in an Excel Workbook:

AccessDataToExcel

Next, I’ll import the data from the 3 worksheets into MS Query

  1. Click on Data Tab on the Ribbon
  2. In the Get External Data group, click on From Other Sources
  3. In the resulting pop-up menu, click on From Microsoft Query

MSQueryMenuSteps

At the Choose Data Source Dialog, click on Excel Files and OK

ChooseDataSourceDiaAnn

At the Select Workbook Dialog, browse through the directory structure to find the workbook that contains the data that you would like to use to import from including the the current workbook.

SelectWBDiaAnn

At the Query Wizard – Choose Columns Dialog, expand nodes next to the tables in the panel on the left, find the columns from each table that you would like included in your query results, use arrow buttons to add and remove items from the pane, “Columns in your query”

QryWizardChooseColDia

You may receive an error message:

The Query Wizard can not continue because it can not join the tables in your query. You must join the tables manually in Microsoft Query by dragging the fields to join between the tables.

Go ahead and click, “OK”. The Query Results will be displayed with no joins between the tables:

MicrosoftQueryInitialResultsAnno

In the initial Query Results Window, notice that there are no Joins, so MS Query went ahead and created a Cartesian Product of Table A, B and C thus returning 14.6K records.

Go ahead and add the Join, by dragging RegionKey from the Reps Table to RegionsKey on the Regions Table. The query results will update to the 3,650 records as expected:

MicrosoftQueryUpdateResultsAnno

Once you have the query returning the correct results, click on, “Return Data” icon on the MS Query Toolbar.

MicrosoftQryReturnDataIconAnno

At the Import Data Dialog, select the option to view the data as a Table in the workbook and at the where to put the data prompt, choose, “New worksheet” option.

ImportDataDiaAnn

In $D$1 add a title for the column that will hold values such as “SalesAmount”. In $D$2, enter a randomization formula in the general form =Rand()*(High-Low)+Low. I used =Rand()*(1000000-500000)+500000. Double-click on the fill handle in the lower-right corner of the cell to send the formula to the bottom.

ExcelTableRandFrmla

ExcelTableSalesAmounts

Looks great! Ready to pivot and slice ’til you get your fill.

Tidy Up

    Final Thoughts

    Hopefully, I presented a case for when a Cartesian Product can be a good thing. How many times have you copied the same data move down 20 rows..paste..repeat..over and over? Lots I bet – I know I have.

    I prefer the MS Access method above, but I wanted to present some alternatives. I also wanted to lay some foundation for MS Query a very underutilized feature in Excel IMHO. I’ll take a look at it a bit more in future post(s).

    That’s it for today. I’m going to watch some Stooges – nyuk..nyuk..nyuk.

    Downloads

    Download the Access Database or the Excel Workbook from here on OneDrive.

    Additional Resource

    1. Contextures
    2. Excel User
, ,
Trackback

no comment untill now

Add your comment now