Retrieve Related Tables in Power Query

As I was working on one of the assignments for our upcoming Power Query training course, it occurred to me that I’ve never blogged about this feature: how to retrieve related tables in Power Query.

Retrieve Related Tables in Access

If you’re using Access, you’re stuffed.  I mean… you can do it manually then merge the tables.  Here’s the database I connected to:

image

Yet when I pull the tblChits table, I get the following columns:

SNAGHTML24047fb1

All the columns from the tblChits table, but nothing from the other tables.  Too bad really.

So basically, you can still retrieve related tables in Access, you just need to create a connection to each table, then merge the queries manually.  The steps to do this, in brief:

  • Create a “Connection Only” query called Chits that points to the tblChits table
  • Create a “Connection Only” query called Items that points to the tblItems table
  • Create a “Connection Only” query called Categories that points to the tblCategories table
  • Create a new query that references the Chits query
  • Merge the Items query, making the relationship between the POSItemCode in both queries
  • Merge the Categories query, making the relationship between the POSCategoryCode in both queries

So basically, we can still retrieve related tables in Access, we just need to understand the relationships in our database and merge them manually.

Retrieve Related Tables in SQL

In SQL server, it actually pulls some things over for you nicely.

In this case I connected to the AdventureWorks database that we’re hosting in Windows Azure for course participants to use.  (Does anyone else provide an Azure hosted database to practice with?)  Specifically, I:

  • Connected to the Azure hosted AdventureWorks database
  • Selected the SalesOrderHeader table (only)
  • Filtered down the dataset to get a shorter table
  • Removed a bunch of unnecessary columns from the SalesOrderHeader table

And look what I’m left with:

SNAGHTML240c8e12

The Sales.SalesOrderDetail contains a full table of related records for each row in my table.  Those related records come from a completely separate table that I didn’t even ask for, how awesome is that?

SNAGHTML240e1703

And the Sales.SalesTerritory column shows a “value” for each, which also has more data that can be expanded (including sub tables):

SNAGHTML240f07eb

Pretty slick, and saves me the effort of having to sleuth out and perform the joins manually.  I sure wish Access had this ability as well!

Don’t have an Azure database?

This technique works for SQL server (on prem) as well.  I’m not 100% sure which other databases support this technique, as I don’t have access to others, but if you do know, please comment here and I’ll add them to the list.

And if you really wish you had access to try out sourcing data from an Azure database… it’s not too late to sign up for our course!

Data From Different TimeZones

A friend of mine emailed yesterday asking how to compare data from different timezones.  With how good the UI is in Power Query, you’d think this would be easy.  Unfortunately it’s a bit less than that, so I thought it would make a good example for today’s post.

Background

Let’s assume that we’ve got two columns of data; an Order Date and a Shipping Date.  We’d like to work out the number of days it took to ship our order.  Easy enough, we just need to subtract one from the other… except… the system that holds the Order Date reports it in UTC +0:00, and the shipping date is done from my home time zone (UTC –7:00).

The data table we’re starting with looks like this:

image

And you can download a copy of the workbook from my OneDrive here if you’d like to follow along.

Avoiding Temptation

So the first thing to do is pull the data in to Power Query.  So I clicked in the table, went to the Power Query tab, and chose From Table.  At this point we’re greeted with a nice table, and our first temptation is to go directly to the Transform tab and set the Data Type to Date/Time/Timezone:

image

And herein lies a problem.  The system has forced my local TimeZone on the data.  As specified in the initial problem, I need this to carry a UTC +0:00 distinction.

It’s a shame that there is no intermediate step here (how often do I ask for MORE clicks?) which allowed you to specify WHICH TimeZone.  If you’re into working with data from different regions (I.e. this feature), I’d don’t think I’m venturing out on a limb to say that this is pretty important.

To further complicate things, that is the extent of the TimeZone functionality in the UI.  And that’s not going to help us.  So let’s knock off the “Changed Type” step and look at this another way.

Using M to Deal with Data From Different TimeZones

The secret to making this work is to take explicit control of the time zone settings using some Power Query M functions.  It’s not as hard as it sounds.  In fact, we’re only going to use two in this example:

  • DateTime.AddZone to add a time zone to a DateTime data type
  • DateTimeZone.SwitchZone to convert from one time zone to another

I discovered both of these functions by searching the Power Query formula categories article on Microsoft’s site.

Forcing a DateTime to a Specific Time Zone

So we’re currently looking at this data in Power Query:

image

Let’s create a new column to convert the OrderDate:

  • Add Column –> Add Custom Column
    • Name:  Order Date (UTC +0:00)
    • Formula:  =DateTime.AddZone([OrderDate],0)

The secret here is in the last parameter, as we get to specify the time zone.  Since we know these dates/times come out of our system in UTC +0:00, we’re good to not add anything to it.  The result is shown below:

image

Converting a DateTime to a Different Time Zone

Now, in order to be able to compare our DateTimes easily, we want them both to be based in our own time zone.  Since my business works in UTC –7:00, I really want my Order Date represented in that time zone as well.  So let’s convert it.

  • Add Column –> Add Custom Column
    • Name:  Order Date (UTC -7:00)
    • Formula:  =DateTimeZone.SwitchZone([#"OrderDate (UTC +0:00)"],-7)

SNAGHTML1f066319

Beautiful.

Just a note here… It may have been tempting to force this data to UTC –7:00 when we added the time zone above, but that would have assigned the date based in the wrong time zone.  I.e. our first record would have returned 7/4/1996 1:12:00 PM –07:00, which is not the same as what we ended up with.

Forcing another DateTime to a Different Time Zone

Now we need to deal with the ShippedDate column, forcing that to my local time.  I could just select the column and turn it into a Date/Time/Timezone data type, but I won’t.  Why?  What if I send this workbook to another user?  It will return THEIR time zone, not mine.  And that could be different.  Much better to explicitly set it.

  • Add Column –> Add Custom Column
    • Name:  ShippedDate (UTC –7:00)
    • Formula:  DateTime.AddZone([ShippedDate],-7)

Notice that this time we do force it to be in the –7 time zone, as these DateTimes originated from that time zone. The result:

SNAGHTML1f0e9e46

Fantastic.  We’ve added time zone data, without changing the original times.

Let’s just go do a little bit of cleanup now:

  • Select the OrderDate and ShippedDate columns
  • Transform –> Data Type –> Date/Time
  • Select OrderDate (UTC +0:00) through ShippedDate (UTC –7:00)
  • Transform –> Date Type –> Date/Time/Timezone

Excellent.  Now they should show up correctly when we load them to an Excel table instead of losing their formatting.

Making Comparisons

We’re at the final step now: Working out the time to ship.  This is relatively straight forward:

  • Add Column –> Add Custom Column
    • Name:  Days to Ship
    • Formula:  [#"ShippedDate (UTC -7:00)"]-[#"OrderDate (UTC -7:00)"]
  • Select the Days to Ship column
  • Transform –> Data Type –> Duration

Note:  You can just double click the column names in the formula wizard and it will put the # characters in there for you.

And the final look in Power Query:

SNAGHTML1f1602fa

With that all complete, the final step is to give the query a name (I chose ShippingTimes) and load it to a worksheet:

image

Final Thoughts

Personally, I like to take explicit control over my data types.  Call me a control freak if you like (I’ve been called much worse) but relying on implicit conversions that set to “local time” scare me a bit, particularly if I’m going to be sending my workbook off to someone who lives in a different zone than I do.  Once you know how to do this it’s not super difficult, and I now know EXACTLY how it will represent on their side.

I’ll admit also that I’m a bit disappointed in the UI for datetime conversions.  To me, anyone playing in this field needs very granular control over every column.  An extra step in the Transform to Date/Time/Timzone step would go a long way to solving this, as you’d be able to skip writing custom formulas.  Hopefully that’s on the Power Query team’s radar for the future, as well as a full datetime menu that would allow us to easily choose from/add/convert to the majority of the formulas found in the article referenced above.

Power Query Training

Also don't forget.  If you love Power Query or are intrigued by the things you can do with it, we have an online training course coming up soon.  Check it out and register at www.powerquery.training/course

What Power Query Functions Exist?

I know that this topic has been covered before by others, but I think it’s still pretty valuable for a user to be able to figure out what Power Query functions exist, especially since they are often different than what we’re used to in Excel.

NOTE:  This article was updated 2015-05-20 at the request of a reader to include more coverage on implementing the discovered function into the solution.

 

Power Query Functions Documentation on the web

There’s a pretty good resource site available on the Microsoft Support site.  Personally I have that one bookmarked and head over there often when I’m looking for a new function to do something.  I find that with a quick CTRL + F on the page, I can quickly search and narrow in on the function I think I need in order to learn it’s syntax.

To be fair, I’m not always in love with the actual examples (many lack a power query UI view), but overall the site is fairly useful.

Power Query Functions Documentation in the client

Now that’s all good, but what if you’re working on a plane with no WIFI, and you need to figure out the syntax for a new function?

As luck has it, there is a way to pull up the list for most functions right in the client.  To do this, I:

  • Clicked Add Column –> Add Custom Column
  • Typed a 1 and clicked OK
  • Went to the Power Query formula bar and typed the formula below.  (Notice that this is case sensitive)

 =#shared

(Why the custom column? Because typing in the formula bar replaces the previous step, and I want to be able to revert to that since it’s part of my logic:

image

Now, you’ll see you get a list of (almost) all the functions that you can access:

SNAGHTML894ffc53

Now, let’s assume I’m trying to find a formula to remove certain characters from a text string.  I really need to search for “Text.”, but there isn’t a search option.  No big deal, let’s convert this list into a table:

SNAGHTML89522a10

Once we’ve done that, we get a nice table of all of the functions, and we can filter them to our heart’s content.  Here’s my table filtered down to just rows that begin with “Text”:

SNAGHTML897b95c3

And a page or so down, I found something that looks like it might work:  Text.Remove.

Investigating the Function Syntax

I clicked on the green Function beside the Text.Remove entry.  It pops up an Invoke Function box, and behind that is the syntax for how it’s supposed to work.  So that’s pretty cool.  I tried it out with some text, as shown below:

txtremove1

Clicking OK returned the following:

txtremove2

Now this is a bit… weird… and frustrating. Value? Why Value? (I actually don’t know why, you’d think it would have been the function name, wouldn’t you?)

I stepped back to the Value step of the query, as I wanted to look at the syntax page that popped up behind the Invoke Function dialog:

image_thumb.png

 

My only complaint here is that once you land in this window, the only indicator of the actual function name is in the smallest font on the page, buried in the middle. You’d think that the name would should up a little more prominently. Regardless, I copied the name of the function, then stepped back to the Invoked FunctionValue step and replace Value in the formula bar with the function name:

txtremove3

Perfect, it works.

Implementing the Function in the Solution

Now let’s see if I can get it into my original query. To do that I:

  • Copied that entire line of M from the formula bar,
  • Selected the Source step (I wouldn’t be able to do this if I had typed #shared while I had the Source step selected originally),
  • Choose to Add New Column –> Add Custom Column –> Accept the inserted step,
  • Pasted the copied M into the formula area, and
  • Replaced the original text (“My –Dog –Has –Fleas”) with the name of the appropriate column from my data set.

Visually, it looks like this:

txtremove4

 

And then I checked the query to see that it worked:

txtremove5

 

Cleanup

Now that I’ve been able to explore the functions and found and implemented the one I’d like to use, I can just knock off the extra steps shown below in yellow, returning me back to my next step:

txtremove6

 

 

 

Learning more about Power Query functions

For reference, this is one of the many things that Miguel and I will be covering in our upcoming Power Query training workshops.  Learn more about the workshop and register here:  http://powerquery.training/course/

Announcing Online Power Query Training!

I’m really pleased to announce that a new project I’ve been working on is live: powerquery.training, a website that offers online Power Query training!

online Power Query training

PowerQuery.Training

About PowerQuery.Training

PowerQuery.Training is a joint effort between Miguel Escobar (of PoweredSolutions.co) and myself: the two guys who are writing that Power Query book – M is for Data Monkey book.

M is for Data Monkey

M is for Data Monkey

What you can find at PowerQuery.Training

We believe that Power Query is a super important tool in the toolbox of every Excel user out there.  It's so easy to use, and so powerful, that we think everyone should know about it.  To that end, we've decided to include the following two areas on the site:

Power Query patterns

One of the areas of focus is to showcase Power Query patterns (techniques).  These patterns are well illustrated articles with supporting workbooks, which will help you build solutions for common scenarios.  There will be more coming as we have time to add them, but today you'll already find:

Online Power Query Training

We know that not everyone wants to read pages of documentation, trying to figure out how to use a new technology.  Some people want to carve out a block of time, link up with an instructor, and be taught the basics, and how to avoid the inevitable pitfalls.  That's our aim with our online Power Query training workshops.

That’s right, there will be live online offerings in order to help get you skilled up on Power Query without ever leaving the comfort of your own office!  All you need to do is dedicate 3 days of your time – well okay… and some space on your credit card – and you’ll be off to the races with this software, taming and automating the cleanup and refresh of your data.

Interested in seeing what the course covers?  Download the Course Agenda here.