Excel Summit South

Yes, you read that right.  If you haven't heard yet, I'll be coming to New Zealand and Australia in just under a month!  And the entire purpose of the trip is to come and share Excel knowledge for with my friends and colleagues south of the Equator.

I'm pretty jazzed about this, and not just because I get to go to the southern hemisphere for the first time in my life.  And also not just because I get to talk about Excel when I'm there.  That would be enough, but no… I'm jazzed because I get to do this with some pretty cool friends who are world respected leaders in their area.

Excel Summit South

The main purpose for my trip is the Excel Summit South conference.  Two days, two tracks of advanced Excel material in 3 different cities:

  • Mar 3&4: Auckland, New Zealand
  • Mar 6&7: Syndey, Australia
  • Mar 9&10: Melbourne, Australia

And the best part about this conference is that – while it's sponsorsed by Price Waterhouse Coopers – regisration is open to everyone.  So basically, you can check the schedule, pick the sessions that interest you, and learn things that will impact your Excel skills.  In other words, if Valuation Modelling isn't your thing, then you can go to a Power Query class.  And if Power Query isn't your thing… well… you're kind of odd, but there will be something that is.  Smile

The cast and crew for this conference really can't be beat.  Charles Williams, Bill Jelen, Jon Peltier, Zack Barresse, Liam Bastick are all Excel MVP's on the bill (as well as Ingeborg Hawighorst in our New Zealand apperance.)  Heck, we've even got a couple of guys from Microsoft attending and presenting as well.  This is a fantastic opportunity to not only meet some of the big hitter independant Excel folks out there, but also to talk to Microsoft directly.  How can you pass that up?

My Sessions

If it hasn't shown yet, I'm seriously looking forward to this conference.  Personally I'll be leading two sessions:

An End to Manual Effort: The Power Query Effect

What Power Query is, why you care, and how it can re-shape and transform the data experience.  What's really special about this session is that I'm going to take this data and turn it over to Jon Peltier who is then going to take it and turn it into a dashboard.  This is perfect, as I'm demoing how to automate data cleanup, and Jon will show you how to use it to add true business value… the real life cycle of Excel data in just a couple of hours.

The Impact of Power Pivot

This one will be fascinating, especially for those who have never seen Power PIvot in action before.  In just an hour I'll show you how big business BI (business intelligence) is at the fingertips of anyone with an Excel Pro Plus license.  It's applicable to companies as small as one employee, and scales up to multi employee small businesses, and even large businesses.  (Departments in large corporations eat this up, as they effectively just act as a small business within the larger whole.)

Register for Excel Summit South now!

Tickets are going fast for this event in all cities, so we ecourage you to register sooner rather than later, and hope to see you there!  You can find out more details and register at:  https://excelsummitsouth.wordpress.com/

Creating a custom calendar in Power Query

As it’s the beginning of a new year, I thought it might be interesting to show my spin on creating a custom calendar in Power Query. This topic has been covered by many others, but I’ve never put my own signature on it.

Our goal

If you’re building calendar intelligence in Power Pivot for custom calendars, you pretty much need to use Rob Collie’s GFTIW pattern as shown below:

=CALCULATE([Measure],
ALL(Calendar445),
FILTER(
ALL(Calendar445),
Calendar445[PeriodID]=VALUES(Calendar445[PeriodID])-1
)
)

Note:  The pattern as written above assumes that your calendar table is called “Calendar445”.  If it isn’t, you’ll need to change that part.

This pattern is pretty robust, and, as shown above, will allow you to return the value of the measure for the prior period you provide.  But the big question here is how you create the needed columns to do that.  So this article will focus on building a calendar with the proper ID columns that you can use to create a 445, 454, 455 or 13 month/year calendar.  By doing so, we open up our ability to use Rob Collie’s GFITW pattern for a custom calendar intelligence in Power Pivot.

For more on this pattern see http://ppvt.pro/GFITW

A bit of background

If you’ve never used one of these calendars, the main concept is this:  Comparing this month vs last month doesn’t provide an apples to apples comparison for many businesses.  This is because months don’t have an consistent number of days.  In addition, comparing May 1 to May 1 is only good if your business isn’t influenced by the day of the week.  Picture retail for a second.  Wouldn’t it make more sense to compare Monday to Monday?  Or the first Tuesday of this month vs the first Tuesday of last month?  That’s hard to do with a standard 12 month calendar.

So this is the reason for the custom calendar.  It basically breaks up the year into chunks of weeks, with four usual variants:

  • 445:  These calendars have 4 quarters per year, 3 “months” per quarter, with 4 weeks, 4 weeks and 5 weeks respectively.
  • 454:  Similar to the 445, but works in a 4 week, 5 week, 4 week pattern.
  • 544:  Again, similar to 445, but works in a 5 week, 4 week, 4 week pattern
  • 13 periods: These calendars have 13 “months” per year, each made up of 4 weeks

The one commonality here is that, unlike a standard calendar, the custom calendar will always have 364 days per year (52 weeks x 7 days), meaning that their year end is different every year.

Creating a custom calendar

In order to work with Rob’s pattern, we need 5 columns:

  • A contiguous date column (to link to our Fact table)
  • YearID
  • QuarterID
  • MonthID
  • WeekID
  • DayID

With each of those, we can pretty much move backwards or forwards in time using the GFITW pattern.

Creating a custom calendar – Creating a contiguous date column

To create our contiguous date column, we have a few options.  We could follow the steps in this blog post on creating a dynamic calendar table.  Or we could skip the fnGetParameter function, and just directly query our parameter table.  Whichever method you choose, there is one REALLY important thing you need to do:

Your calendar start date must be the first date of (one of) your fiscal year(s).

It can be this year or last year, but you need to determine that.  I’m going to assume for this year that my year will start on Sunday, Jan 3, 2016, so I’ll set up a basic table in Excel to hold the dates for my calendar:

SNAGHTML25692

Notice the headers are “Parameter” and “Value”, and I also named this table “Parameters” via the Table Tools –> Design tab.  For reference, the Start Date is hard coded to Jan 3, 2016, and the End Date is a formula of B4+364*2 (running the calendar out two years plus a day.)

Now I’m ready to pull this into Power Query and build my contiguous list of dates.

  • Select any cell in the table –> Create a new query –> From Table
  • Remove the Changed Type step (as we don’t really need it)

This should leave you with a single step in your query (Source), and a look at your data table.

  • Click the fx button on the formula bar to add a new custom step

SNAGHTML78078[4]

This will create a new step that points to the previous step, showing =Source in the formula bar.  Let’s drill in to one of the values on the table.  Modify the formula to:

=Source[Value]{0}

image

Reading this, we’ve taken the Source step, drilled into the [Value] column, and extracted the value in position 0.  (Remembering that Power Query starts counting from 0.)

Now this is cool, but I’m going to want to use this in a list, and to get a range of values in a list, I need this as a number.  So let’s modify this again.

=Number.From(Source[Value]{0})

 

 

image

Great stuff, we’ve not got the date serial number for our start date. Let’s just rename this step of the query so we can recognize it.

  • Right click the Custom1 step –> Rename –> StartDate

Now, let’s get the end date.

  • Copy everything in the formula bar
  • Click the fx button to create a new step
  • Select everything in the formula bar –> paste the formula you copied
  • Update the formula as follows:

=Number.From(Source[Value]{1})

That should give you the date serial number for the End Date:

image

Let’s rename this step as well:

  • Right click the Custom1 step –> Rename –> EndDate

We’ve now got what we need to create our calendar:

  • Click the fx button to create a new step
  • Replace the text in the formula bar with this:

={StartDate..EndDate}

If you did this right, you’ve got a nice list of numbers (if you didn’t, check the spelling, as Power Query is case sensitive).  Let’s convert this list into something useable:

  • Go to List Tools –> Transform –> To Table –> OK
  • Right click Column1 –> Rename –> DateKey
  • Right click DateKey –> Change Type –> Date
  • Change the query name to Calendar445
  • Right click the Change Type step –> Rename –> DateKey

The result is a nice contiguous table of dates that runs from the first day of the fiscal year through the last date provided:

SNAGHTML16e36f

Creating a custom calendar – Adding the PeriodID columns

Now that we have a list of dates, we need to add our PeriodID columns which will allow the GFITW to function.

 

 

 

 

Creating a custom calendar – DayID column

This column is very useful when calculating other columns, but can also be used in the GFITW formula to navigate back and forward over days that overlap a year end.  To create it:

  • Go to Add Column –> Index –> From 1
  • Change the formula that shows up in the formula bar to:

=Table.AddIndexColumn(DateKey, "DayID", 1, 1)

  • Right click the Added Index step –> Rename –> DayID

NOTE:  The last two steps are optional.  Instead of changing the formula in the formula bar, you could right click and rename the Index column to DayID.  Personally, I like to have less steps in my window though, and by renaming those steps I can see exactly where each column was created when I’m reviewing it later.

What we have now is a number that starts at 1 and goes up for each row in the table.  If you scroll down the table, you’ll see that this value increases to 729 for the last row of the table.  (Day 1 + 364*2 = Day 729).

Creating a custom calendar – YearID column

Next, let’s create a field that will let us navigate over different years.  To do this, we will write a formula that targets the DayID column:

  • Go to Add Column –> Add Custom Column
    • Name:  YearID
    • Formula:  =Number.RoundDown(([DayID]-1)/364)+1
  • Right click the Added Custom step –> Rename –> YearID

If you scroll down the table, you’ll see that our first year shows a YearID of 1, and when we hit day 365 it changes:

image

The reason this works for us is this:  We can divide the DayID by 364 and round it down.  This gives us 0 for the first year values, hence the +1 at the end.  The challenge, however, is that this only works up to the last day of the year, since dividing 364 by 364 equals 1.  For that reason, we subtract 1 from the DayID column before dividing it by 364. The great thing here is that this is a pattern that we can exploit for some other fields…

Creating a custom calendar – QuarterID column

This formula is very similar to the YearID column:

  • Go to Add Column –> Add Custom Column
    • Name:  QuarterID
    • Formula:  =Number.RoundDown(([DayID]-1)/91)+1
  • Right click the Added Custom step –> Rename –> QuarterID

The result is a column that increased its value every 91 days:

image

It’s also worth noting here that this value does not reset at the year end, but rather keeps incrementing every 90 days.

Creating a custom calendar – MonthID column

The formula for this column is the tricky one, and depends on which version of the calendar you are using.  We’re still going to create a new custom column, and we’ll call it MonthID.  But you’ll need to pick the appropriate formula from this list based on the calendar you’re using:

Calendar Type Formula
445 Number.RoundDown([DayID]/91)*3+
( if Number.Mod([DayID],91)=0 then 0
else if Number.Mod([DayID],91)<=28 then 1
else if Number.Mod([DayID],91)<=56 then 2
else 3
)
454 Number.RoundDown([DayID]/91)*3+
( if Number.Mod([DayID],91)=0 then 0
else if Number.Mod([DayID],91)<=28 then 1
else if Number.Mod([DayID],91)<=63 then 2
else 3
)
544 Number.RoundDown([DayID]/91)*3+
( if Number.Mod([DayID],91)=0 then 0
else if Number.Mod([DayID],91)<=35 then 1
else if Number.Mod([DayID],91)<=63 then 2
else 3
)
13 periods Number.RoundDown(([DayID]-1)/28)+1

 

As I’m building a 445 calendar here, I’m going to go with the 445 pattern, which will correctly calculate an ever increasing month ID based on a pattern of 4 weeks, 4 weeks, then 5 weeks.  (Or 28 days + 28 days + 35 days.)

SNAGHTML5ac6c5

This formula is a bit tricky, and – like the GFITW pattern – you honestly don’t have to understand it to make use of it.  In this case this is especially true, as the formula above never changes.

If you’re interested however, the most important part to understand is what is happening in each of the Number.Mod functions.  That is the section that is influencing how many weeks are in each period.  The key values you see there:

  • 0:  Means that you hit the last day of the quarter
  • 28:  This is 4 weeks x 7 days
  • 35:  This is 5 weeks x 7 days
  • 56:  This is 8 weeks x 7 days
  • 63:  This is 9 weeks x 7 days

The Number.RoundDown portion divides the number of days in the DayID column by 91, then rounds down.  That will return results of 0 through 3 for any given value.  We then multiply that number by 3 in order to return values of 0, 3, 6, 9 (which turns out to be the month of the end of the prior quarter.)

The final piece of this equation is to add the appropriate value to the previous step in order to get it in the right quarter.  For this we look at the Mod (remainder) of days after removing all multiples of 91.  In the case of the 445, if the value is <= 28 that means we’re in the first 4 weeks, so we add one.  If it’s >28 but <=56, that means it’s in the second 4 weeks, so we add two.  We can assume that anything else should add 3… except if there was no remainder.  In that case we don’t add anything as it’s already correct.

Creating a custom calendar – WeekID column

WeekID is fortunately much easier, returning to the same pattern we used for the YearID column:

  • Go to Add Column –> Add Custom Column
    • Name:  WeekID
    • Formula:  =Number.RoundDown(([DayID]-1)/7)+1
  • Right click the Added Custom step –> Rename –> WeekID

The result is a column that increases its value every 7 days:

 

SNAGHTML4fc488

Finalizing the custom calendar

The last thing we should do before we load our calendar is define our data types.  Even though they all look like numbers here, the reality is that many are actually defined as the “any” data type.  This is frustrating, as you’d think a Number.Mod function would return a number and not need subsequent conversion.

  • Right click the DateKey column –> Change Type –> Date
  • Right click each ID column –> Change Type –> Decimal Number
  • Go to Home –> Close & Load To…
    • Choose Only Create Connection
    • Check Add to Data Model
    • Click OK

And after a quick sort in the data model, you can see that the numbers have continued to grow right through the last date:

image

Final Thoughts

We now have everything we need in order to use the GFITW pattern and get custom calendar intelligence from Power Pivot.  Simply update the PeriodID with the period you wish to use.  For example, if we had a Sales$ measure defined, we can get last month’s sales using the following:

=CALCULATE([Sales$],
ALL(Calendar445),
FILTER(
ALL(Calendar445),
Calendar445[MonthID]=VALUES(Calendar445[MonthID])-1
)
)

As an added bonus, as we’re using Power Query, the calendar will update every time we refresh the data in the workbook.  We never really have to worry about updating it, as we can use a dynamic formula to drive the start and end dates of the calendar.

As you can see from reading the post, the tricky part is really about grabbing the right formula for the MonthID.  The rest are simple and consistent, it’s just that one that gets a bit wonky, as the number of weeks can change.  (To be fair, this would be a problem for the quarter in a 13 period calendar as well… one of those quarter will need 4 weeks where the rest will need 3.)

One thing we don’t have here is any fields to use as Dimensions (Row or Column labels, Filters, or for Slicers.)  The reason I elected not to include those here is that the post is already very long, and they’re not necessary to the mechanics of the GFITW formula.

 

 

If you’d like a copy of the completed calendar, you can download it here.  Be warned though, that I created this in Excel 2016.  It should work nicely with Excel 2013 and higher, but you may have to rebuilt it in a new workbook if you’re using Excel 2010 due to the version difference on the Power Pivot Data Model.

Suggestion to Improve the Pivot Table Experience

This is a special post to to discuss a suggestion to improve the Pivot Table experience, especially for Power Pivot users.

This week I’m at the 2015 MVP Summit in Redmond, WA.  It’s a trip I’m lucky enough to make every year, and certainly one of the annual events that I look forward to the most.  It’s a chance to reunite with my friends in the global community of Excel experts, as  well as make some new friends there too.  In addition, we get the opportunity to meet with the Microsoft Excel engineers, give our feedback, and talk about the things that are/aren’t working in the program.

Of course, this doesn’t mean that they can or will implement the suggestions we have.  Excel is a massive program, and every feature change can cause bigger issues elsewhere.  But they do listen, and they do want this product to be the best it can be.  Like every company, they have to work out what they can afford to do, and where the best investments are for their limit of resources.

In the spirit of the summit, I thought I’d share one of the ideas I have that I think would be really beneficial to Power Pivot users.  Maybe it makes the radar, maybe it doesn’t, but I think it would be a really useful change.  I’m fairly certain it could also be implemented without causing any issues with other features in the product as well.

The Issue

For those working with Power Pivot, you know the power of DAX.  This leads to creating many different DAX measures, each of which are landed in the columns of the Pivot Table.  This is awesome, but it brings up a challenge with the usability of the Pivot Table field list:

image

Back when we just dropped singular fields into the Values area, things weren’t so bad.  I generally only ran with a few fields, and I didn’t feel super constrained by the size of the window.  Yes, I overran the limit on occasion, but it wasn’t a big deal.

With Power Pivot, things have changed.  I have so much more flexibility to write the DAX measures I need, which leads to many more columns being defined.  If you think about things like forecasting an annual cash flow statement, I’ll write at least 13 different measures (one for each month), plus a total.  And that’s just one scenario.  For a regular financial statement the same thing… Actual, Budget, Variances, Year to date Actuals, Year to date Budgets, and so on.  Again, it’s not uncommon to see a statement with over 12 columns.

This proliferation of measures leads us to the issue… the Values are of the Pivot Table field list is too small today.  It only holds 3-4 visible columns at a time.  Trying to move a measure into the right place is a real pain, especially if you add a new measure to the bottom, and you have to drag it up.  I’m sure you’ve had massive “overscroll” problems where the thing seems to speed up to mach 5 JUST as you are trying to move it up that one last row…

The Slightly Better View

The Pivot Table field list has an alternate view called “Field Section and Areas Section Side-By-Side”.

SNAGHTMLec87838

 

This is a bit better, as we can at least see more fields in the area on the left.  But that’s only helpful for scrolling and finding the fields we need, not placing them on the Pivot:

image

You see?  I’ve still only got three rows showing (four when my Excel is maximized on screen.)

But here’s the thing…

When I’m building my Pivot, I rarely end up putting anything in the Filters area, as I tend to use Slicers.  I might have a few fields in there that I don’t want users messing with (I hide the top rows of the Pivot Table), but generally I’m looking at between zero and two fields in there.

And when I build my Rows and Columns, I tend to drag them on the Pivot and call it a day.  I could use more space on occasion when I’m layering on my Row fields, but Columns are usually sufficient.  Especially now that I’m writing DAX formulas.  The measure gets dragged in to the Values area, and doesn’t need anything in the Columns area at all.  It’s partly for this reason that the small size of the Values area is killing me.  The old logic for how the Pivot was build has essentially changed, with the description moving from the Columns area to the Values area.

What that means is that I’ve got a ton of wasted whitespace in my Filters and Columns area.  So why not reclaim that whitespace?

Suggestion to Improve the Pivot Table Experience

So here’s my suggestion to improve the Pivot Table experience: modify the “Field Section and Areas Section Side-By-Side” view as follows (excuse the rough mockup…)

SNAGHTMLed7c5f1

The key changes here are really about the arrows to the right of the Filters, Rows, Columns and Values areas.  These are the same arrows as used in the Field List on the left, where the white arrow pointing to the right shows the area collapsed, and the black arrow shows the area expanded.

To be clear, the proportions aren’t correct here, but my thought is that the expanded areas consume an equal share of the remaining whitespace.  So if all four areas are expanded, they each get a 25% share of the remaining space, as it what we see in the current implementation.

But collapse one field (let’s say Filters), and each remaining area expands, as it now gets a 33% share of the remaining space.  Collapse two (as I’ve shown above), and the remaining two get 50% each.  Collapse three, and all remaining whitespace goes to the final area:

image

This would be fantastic, as it would let me build my Pivot much more easily.  I’d be able to see what I’m working with, especially on Pivot Tables with higher levels of Row or Values fields.

I didn’t scope this in, but it would also probably be a good idea to append a number in parenthesis to each area as well, indicating how many fields exist in each area.  So in this case: image

Naturally, when you’re first building a Pivot, it should open with all areas expanded to 25% of the share… but bonus points if there is a way to save the default view for a configured Pivot.  The reason that I say this is that my guess is that 75% of the time when I’m modifying a Pivot it’s the Values area I’m doing, 20% is Rows, 4% is Columns and the remaining 1% of the time I’m modifying Filters. Respecting that others have different uses though, the ability to choose which fields are expanded/collapsed by default on an already existing pivot would be incredible.

At any rate, that’s my idea.  Here’s hoping a program manager on the Excel team thinks there’s merit to it and starts to look at the feasibility.  Feel free to share your thoughts on the subject below.  :)

If you like this idea...

Please throw it some votes at Excel UserVoice.  The more votes it gets there, the more likely it will be implemented!

Power Query Errors: Please Rebuild This Data Combination

I got sent this today from a friend.  He was a bit frustrated, as he got a message from Power Query that read “Formula.Firewall: Query 'QueryName' (step 'StepName') references other queries or steps and so may not directly access a data source. Please rebuild this data combination.”

What Does This Mean?

This message is a bit confusing at first.  For anyone who has worked with Power Query for a bit, you know that it’s perfectly acceptable to merge two queries together.  So why when you do some merges does Power Query throw this error?

image

The answer is that you cannot combine an external data source with another query.

Looking At Some Code

Let’s have a quick look at the code that my friend sent:

image

Notice the issue here?  The Merge step near the end references a query called DimShipper.  No issue there.  But it tries to merge that to an external data source which is called in the first line.

This is actually a bit irritating.  But let’s break this down a bit further.  Looking at the first line, it is made up as follows:

Source=Excel.Workbook(File.Contents(Filename())),

The Filename() portion is calling a function to return the desired filename from a workbook table, (based somewhat on this approach.)  We already know that works, but we also know that this is definitely pulling data from an external workbook (even it the file path is for this current workbook!)  And to be fair, this would be the same if we were pulling from a web page, database or whatever.  As long as it’s an external source being combined with another query, we’re going to get smacked.

So it’s suddenly starting to become a bit clearer what Power Query is complaining about (even if we think it’s frustrating that it does complain about it!)

Let’s “rebuild this data combination”

Right.  Who cares why, we care how to deal with this.  No problem.  It’s actually fairly straightforward.  Here’s how I dealt with it:

  • Right click the Query in Excel’s Query window
  • Choose Edit

I’m now in the Power Query window.  On the left side, I found the Queries pane and clicked the arrow to expand it:

image

Duplicate the Existing Query

The query in question here is “Purchase”…

  • Right click “Purchase” and choose Duplicate
  • Immediately rename the query to PurchaseList
  • Go to View –> Advanced Editor
  • Selected everything from the comma on the second line down to the last row of the query:

image

  • Press Delete
  • Change the final line of “Merge” to “Purchase_Sheet”

Perfect… so my new PurchaseList query looks like this:

image

  • Click Done

This query will now load.  Even though it is pointing to an external data source, it’s not being combined with anything else.  So it’s essentially just a data stage.

Modify the Existing Query

Now we need to go back and modify our existing query.  So in that left pane we’ll again select the Purchase query.

  • Go to View –> Advanced Editor
  • Select the first two lines and delete them
  • Put a new line after the let statement that reads as follows

Source = PurchaseList,

NOTE:  Don’t forget that comma!

And what we end up with is as follows:

image

So What’s The Difference?

All we’ve really done is strip the “external” data source and put it in it’s own query.  Yet that is enough for Power Query to be happy.  The new Purchase query above now has two references that it is comfortable with, and it works.  (And that’s probably the main thing.)

Designing To Avoid This Issue

I make it a practice to land all of my data sources into specific “Staging Tables”, which are set to load as connections only.  From there I build my “finalization” tables before pushing them into my Data Model (or worksheet).  You can visualize it like this:

SNAGHTML409dfd44

The key takeaways here are:

  • I always go from data source into a Staging Query (and do a bit of reshaping here)
  • My staging queries are always set to load to “Connection Only” so that the aren’t executed until called upon
  • All combining of data from disparate sources is done in queries that reference other queries, not external data sources

I’ve found this approach works quite well, and always avoids the “rebuild this data combination” error.

Date Formats in Power Query

Date formats in Power Query are one of those little issues that drives me nuts… you have a query of different information in Power Query, at least one of the columns of which is a date.  But when you complete the query, it doesn’t show up as a date.  Why is this?

Demonstrating the Issue

Have a look at the following table from Excel, and how it loads in to Power Query:

SNAGHTML157a9ddf

That looks good… plainly it’s a date/time type in Power Query, correct?  But now let’s try an experiment.  Load this to the worksheet:

image

Why, when we have something that plainly renders as a date/time FROM a date format, are we getting the date serial number?  Yes, I’m aware that this is the true value in the original cell, but it’s pretty misleading, I think.

It gets even better

I’m going to modify this query to load to BOTH the worksheet and the Excel data model.  As soon as I do, the format of the Excel table changes:

image

Huh?  So what’s in Power Pivot then?

image

Curious… they match, but Power Pivot is formatted as Text, not a date?

(I’ve missed this in the past and spent HOURS trying to figure out why my time intelligence functions in Power Pivot weren’t working.  They LOOK so much like datetimes it’s hard to notice at first!)

Setting Date Formats in Power Query

When we go back and look at our Power Query, we can discover the source of the issue by looking at the Data Type on the Transform tab:

image

By default the date gets formatted as an “Any”.  What this means to you – as an Excel user – is that you could get anything out the other end.  No… that’s not quite true.  It means that it will be formatted as Text if Power Pivot is involved anywhere, or a Number if it isn’t.  I guess at least it’s consistent… sort of.

Fixing this issue is simple, it’s just annoying that we have to.  In order to take care of it we simply select the column in Power Query, then change the data type to Date.

Unfortunately it’s not good enough to just say that you’ve set it somewhere in the query.  I have seen scenarios where – even though a column was declared as a date – a later step gets it set back to Any.

Recommendations

I’ve been irritated by this enough that I now advise people to make it a habit to set the data types for all of their columns in the very last step of the query.  This ensures that you always know EXACTLY what is coming out after all of your hard work and eliminates any surprises.

Slicers For Value Fields

Earlier this week I received an email asking for help with a Power Pivot model.  The issue was that the individual had built a model, and wanted to add slicers for value fields.  In other words, they’d built the DAX required to generate their output, and wanted to use those values in their slicers.  Which you can’t do.  Except maybe you can…  :)

My approach to solve this issue is to use Power Query to load my tables.  This gives me the ability to re-shape my data and load it into the data model the way I need it.  I’m not saying this is the only way, by any means, but it’s an approach that I find works for me.  Here’s how I looked at it in Excel 2013.  (For Excel 2010 users, you have to run your queries through the worksheet and into Power Pivot as a linked table.)

Background

The scenario we’re looking at is a door manufacturer.  They have a few different models of doors, each of which uses different materials in their production.  The question that we want to solve is “how many unique materials are used in the construction of each door?”  And secondarily, we then want to be able to filter the list by the number of materials used.

The first question is a classic Power Pivot question.  And the setup is basically as follows:

image

  • Create a PivotTable with models on rows and material on columns
  • Create a DAX measure to return the distinct count of materials:
    • DistinctMaterials:=  DISTINCTCOUNT(MaterialsList[material])
  • Add a little conditional formatting to the PivotTable if you want it to look like this:

image

The secret to the formatting is to select the values and set up an icon set.  Modify it to ensure that it is set up as follows:

image

Great stuff, we’ve got a nice looking Pivot, and you can see that our grand total on the right side is showing the correct count of materials used in fabricating each door.

Creating Slicers For Value Fields

Now, click in the middle of your Pivot, and choose to insert a slicer.  We want to slice by the DistinctMaterials measure that we created… except.. it's not available.  Grr…

image

Okay, it’s not surprising, but it is frustrating.  I’ve wanted this ability a lot, but it’s just not there.  Let’s see if we can use Power Query to help us with this issue.

Creating Queries via the Editor

We already have a great query that has all of our data, so it would be great if we could just build a query off of that.  We obviously need the original still, as the model needs that data to feed our pivot, but can we base a query off a query?  Sure we can!

  • In the Workbook Queries pane, right click the existing “MaterialsList” query and choose Edit.
  • You’ll be taken into the Power Query editor, and on the right side you’ll see this little collapsed “Queries” window trying to hide from you:

image

  • When you expand that arrow, you’ll see your existing query there!
  • Right click your MaterialsList query and choose “Reference”.

You’ve now got a new query that is referring to your original.  Awesome.  This will let us preserve our existing table in the Power Pivot data model, but reshape this table into the format that we need.

Building the Query we need

Let’s modify this puppy and get it into the format that will serve us.  First thing, we need to make sure it’s got a decent name…

  • On the right side, rename it to MaterialsCount

Now we need to narrow this down to a list of unique material/model combinations, then count them:

  • Go to Add Column –> Add Custom Column
  • Leave the default name of “Custom” and use the following formula:  [model]&[material]
  • Sort the model column in ascending order
  • Sort the material column in ascending oder

We’ve not got a nicely ordered list, but there’s a few duplicates in it.

SNAGHTML4800f6be[4]

Those won’t help, so let’s get rid of them:

  • Select the “Custom” column
  • Go to Home –> Remove Duplicates

Now, let’s get that Distinct Count we’re looking for:

  • Select the “model” column
  • Go to Transform –> Group By
  • Set up the Group By window to count distinct rows as follows:

image

Very cool!  We’ve now got a nice count of the number of distinct materials that are used in the production of each door.

The final step we need to do in Power Query is load this to the data model, so let’s do that now:

  • File –> Close & Load To…
  • Only create the connection and load it to the Data Model

Linking Things in Power Pivot

We now need to head into Power Pivot to link this new table into the Data Model structure.  Jump into the Manage window, and set up the relationships between the model fields of both tables:

image

And that’s really all we need to do here.  Let’s jump back out of Power Pivot.

Add Slicers for Value Fields

Let’s try this again now. Click in the middle of your Pivot and choose to insert a slicer.  We’ve certainly got more options than last time!  Choose both fields from the “MaterialsCount” table:

image

And look at that… we can now slice by the total number materials in each product!

image

PowerXL Course Live in Victoria, BC

We are very excited to announce that we will be hosting an "Introduction to Power Excel" session in Victoria, BC on June 6, 2014.

PowerPivot is revolutionizing the way that we look at data inside Microsoft Excel. Allowing us to link multiple tables together without a single VLOOKUP statement, it enables us to pull data together from different tables and databases where we never could before. But linking data from multiple sources, while powerful, only scratches the surface of the impact that it is making in the business intelligence landscape. Not only do we look at PowerPivot in this session, but we'll also explore the incredible companion product Power Query; a tool that will surely blow your mind. Come join Ken as he walks you through the process of building a Business Intelligence system out of text files, databases and so much more.

Full details, including an early bird signup offer, can be found at http://www.excelguru.ca/forums/calendar.php?do=getinfo&e=42&day=2014-6-6

From TXT files to BI solution

Several years ago, when Power Pivot first hit the scene, I was talking to a buddy of mine about it. The conversation went something like this:

My Friend: “Ken, I’d love to use PowerPivot, but you don’t understand the database culture in my country. Our DB admins won’t let us connect directly to the databases, they will only email us text files on a daily, weekly or monthly basis.”

Me: “So what? Take the text file, suck it into PowerPivot and build your report. When they send you a new file, suck it into PowerPivot too, and you’re essentially building your own business intelligence system. You don’t need access to the database, just the data you get on a daily/weekly basis!”

My Friend: “Holy #*$&! I need to learn PowerPivot!”

Now, factually everything I said was true. Practically though, it was a bit more complicated than that. See, PowerPivot is great at importing individual files into tables and relating them. The issue here is that we really needed the data appended to an existing file, rather than related to an existing file. It could certainly be done, but it was a bit of a pain.

But now that we have Power Query the game changes in a big way for us; Power Query can actually import and append every file in a folder, and import it directly into the data model as one big contiguous table. Let’s take a look in a practical example…

Example Background

Assume we’ve got a corporate IT policy that blocks us from having direct access to the database, but our IT department sends us monthly listings of our sales categories. In the following case we’ve got 4 separate files which have common headers, but the content is different:

clip_image002

With PowerPivot it’s easy to link these as four individual tables in to the model, but it would be better to have a single table that has all the records in it, as that would be a better Pivot source setup. So how do we do it?

Building the Query

To begin, we need to place all of our files in a single folder. This folder will be the home for all current and future text files the IT department will ever send us.

Next we:

  • Go to Power Query --> From File --> From Folder
  • Browse to select our folder
  • Click OK

At this point we’ll be taken in to PowerQuery, and will be presented with a view similar to this:

clip_image004

Essentially it’s a list of all the files, and where they came from. The key piece here though, is the tiny little icon next to the “Content” header: the icon that looks like a double down arrow. Click it and something amazing will happen:

clip_image005

What’s so amazing is not that it pulled in the content. It’s that it has actually pulled in 128 rows of data which represents the entire content of all four files in this example. Now, to be fair, my data set here is small. I’ve also used this to pull in over 61,000 records from 30 csv files into the data model and it works just as well, although it only pulls the first 750 lines into Power Query’s preview for me to shape the actual query itself.

Obviously not all is right with the world though, as all the data came in as a single column. These are all Tab delimited files, (something that can be easily guessed by opening the source file in Notepad,) so this should be pretty easy to fix:

  • Go to Split Column --> Delimited --> Tab --> OK
  • Click Use First Row as Headers

We should now have this:

clip_image006

Much better, but we should still do a bit more cleanup:

  • Highlight the Date column and change the Data Type to Date
  • Highlight the Amount column and change the Data Type to Number

At this point it’s a good idea to scroll down the table to have a look at all the data. And it’s a good thing we did too, as there are some issues in this file:

clip_image007

Uh oh… errors. What gives there?

The issue is the headers in each text file. Because we changed the first column’s data type to Date, Power Query can’t render the text of “Date” in that column, which results in the error. The same is true of “Amount” which can’t be rendered as a number either since it is also text.

The manifestation as errors, while problematic at first blush, is actually a good thing. Why? Because now we have a method to filter them out:

  • Select the Date column
  • Click “Remove Errors”

We’re almost done here. The last things to do are:

  • Change the name (in the Query Settings task pane at right) from “Query 1” to “Transactions”
  • Uncheck “Load to worksheet” (in the bottom right corner)
  • Check “Load to data model” (in the bottom right corner)
  • Click “Apply & Close” in the top right

At this point, the data will all be streamed directly into a table in the Data Model! In fact, if we open PowerPivot and sort the dates from Oldest to Newest, we can see that we have indeed aggregated the records from the four individual files into one master table:

clip_image008

Implications

Because PowerQuery is pointed to the folder, it will stream in all files in the folder. This means that every month, when IT sends you new or updated files, you just need to save them into that folder. At that point refreshing the query will pull in all of the original files, as well as any you’ve added. So if IT sends me four files for February, then the February records get appended to the January ones that are already in my PowerPivot solution. If they send me a new file for a new category, (providing the file setup is 3 columns of tab delimited data,) it will also get appended to the table.

The implications of this are huge. You see, the issue that my friend complained about is not limited to his country at all. Every time I’ve taught PowerPivot to date, the same issue comes up from at least one participant. IT holds the database keys and won’t share, but they’ll send you a text file. No problem at all. Let them protect the integrity of the database, all you now need to do is make sure you receive your regular updates and pull them into your own purpose built solution.

Power Query and Power Pivot for the win here. That’s what I’m thinking. :)

PowerPivot training live in Victoria, BC!

Anyone who follows my website or Facebook Fan Page knows that I’m a huge fan of PowerPivot. Well great news now, that you can come and learn not only why this is the best thing to happen to Excel in 20 years, but also how to take advantage of it yourself!

I’ll be teaching a course on PowerPivot and DAX in Victoria, BC on November 22nd, 2013. While the course is hosted by the Chartered Professional Accountants, it’s open to anyone who wishes to subscribe.

If you’ve been trying to figure out how to get started with Power Pivot, you’re confused as to how and why things work, or you want to master date/time intelligence in PowerPivot, this course is for you.

100% hands on, we’ll start with basic pivot tables (just as a refresher.) Next we’ll look at how to build the PowerPivot versions and why they are so much more powerful. From linking multiple tables together without a single VLOOKUP to gaining a solid understanding of relating data, you’ll learn the key aspects to building solid PowerPivot models. We’ll work through understanding filter context, measures and relationships, and finish the day by building measures that allow you to pull great stats such as Month to Date for the same month last year, among others.

Without question, you’ll get the most out of this course if you’re experienced with PivotTables. If you don’t feel like you’re a pro in that area though, don’t worry! As an added bonus, to anyone who signs up via this notice, please let me know. I’ll provide you with a free copy of my Magic of PivotTables course so that you can make sure you’re 100% Pivot Table compatible before you arrive.

This is a hands on course, so you need to bring a laptop that is pre-loaded with either:

  • Excel 2010 and the free PowerPivot download
  • Excel 2013 with Office 2013 Professional Plus installed (yes the PLUS is key!)
  • Excel 2013 with Excel 2013 standalone installed.

If you have any questions as to which you have installed, simply drop me a note via the contact form on my website, and I’ll help you figure it out.

Full details of the course contents as well as a registration link, can be found at http://www.icabc-pd.com/pd-seminars-seminar.php?id=2849. Don’t wait too long though, as registration deadline is November 14th!

Hope to see you there!