Pain Points – Power Query Credential Management

After my last post on Power Query Security Woes, Ehren – a developer on the Power Query team – sent me a message on Twitter about it.  I will blog about that later, after I’ve had a chance to test it out, but before I did, I thought I’d go and clean up some PQ stuff that I messed up.

A little history…

A while back I got the bright idea to pull down our web leads (at my day job).  I then though, “Hey, wouldn’t it be cool if I could plot all of them on Power Map?”  Sure it would.  But all I had was their IP address.  “No problem!”, I figured, “I’ll just use Power Query to feed out the IP to a web service and return a function to turn it into a real address!”

Success!

I cooked up a little script fed the IP to http://ip-api.com, which did exactly that. Here’s a sample from a random address:

image

I actually tried several services before I settled on this one, but ultimately  it seemed to generate the most accurate results.  Perfect, so I let the script fly, and it was awesome!

Success turns to failure…

Except… somewhere during my batch of 3,500 addresses it stopped working.  And when I went back to the site, I found out that I’d been blocked.  Ooops.  Apparently if you send over a certain threshold of queries in a certain amount of time you’re not a good person. (That’s why I’m not sharing the script.)  Sorry about that ip-api.com, I promise I’ll be good in future!

… and creates a mess in the process…

At any rate, the issue I then ran into is that I ended up with an entry for each URL in my Data Source Settings.  I was a while back, so I don’t remember if it happened by default, or if it happened because I was learning and just clicked the wrong thing.  End result is that I have an entry in my list for each IP I queried.

image

Since I’m going to be a good boy and not steamroll their site any more, I kind of want to clean them up.  No problem, right?  Click on the first address on screen, hold down SHIFT and click on the last address on screen and… huh?  I’ve only got the last address?  Okay, let’s try with CTRL… click, click.. ARGH!!

… and the mess turns into major frustration!

This interface only allows you to select one item at a time?  Really?  So to delete my 200 entries I have to click the line item, click Delete (at the top), click “Yes I’m sure” and then move on to the next one.  Ouch.  I see a lot of wasted time in my future if I really want to clean up now.  :(

The missing feature(s)

Power Query team, (if you’re still listening,) please give us the ability to multi-select in this dialog, and multi-delete items.  We need to be able to clean up, as we’re all human and make mistakes as we go along.  This feels like a severe punishment right now!

In addition, while I’m here, why isn’t there an “Add” button in this interface?  When I go to edit one of these addresses, I see the following:

image

I’m guessing that if I’d just set that to the root at the beginning, I wouldn’t have an entry for every IP in that query.  Man!  If only I’d been able to declare this up front and realized that this was an option!  Yet there is no way to do this from a simple UI.  Instead, I (believe) I have to:

  • Create a query to one page at the domain
  • Set it’s security level as I pull data
  • Save the query
  • Go back to this interface and edit the anonymous access

Life would be SO much easier if I just had the ability to go into this UI and create my favourite and most used sites.  (And bonus points if I could push them out to users through group policy like I can with Office Trusted Locations!)

Okay, enough for now.  I’ll be back later once I’ve tested Ehren’s suggestion.  :)

Power Query Security Woes

Recently I’ve been working on automating a business process for a client.  It’s become a really interesting project that uses a mix of Excel tables, Power Query, Power Pivot and VBA, as well as a WinAutomation script.  Before I talk about the issue that is facing me, it would be helpful to provide a bit of background on what we’re doing, and why so many of the Power BI pieces are needed.

Solution Background

My client has a business in which they outsource employees to other companies.  Each of those companies maintains the records of hours, and lets my client download a spreadsheet version of the hours logged by the subcontracted employee.  This is fairly helpful, as my client is the one that pays all of the employees, so getting the hours lets him know who gets paid for how many hours.  Easy enough, right?  But each spreadsheet is in a different format, and that causes a lot of manual entry to try and standardize it into a file that can be uploaded to the payroll processor.

So here we enter Power Query.  With Power Query we can import each of the files individually, reformat the data into a consistent set of columns, and output it into a table.

Each pay period we start a new Excel file, and import the data files for that pay period by running a WinAutomation script.  The script logs in to the remote systems, passes the correct parameters to the system, and downloads the Excel and CSV files required.  It then saves them in a subdirectory of our application under the pay period end date.  My Power Query scripts then use the pay period end date, dynamically build a file path to the file and import the data.

It’s beautiful… except…

The Issue

Here’s where the pain begins…  Every time you touch a new data file in Power Query that you haven’t used before you get a security message asking you what type of data (Public, Private or Organizational) you are using.  (You can learn more about those here:  http://office.microsoft.com/en-us/excel-help/privacy-levels-HA104009800.aspx) The issue I have is that each payroll I create new files in a subdirectory, so Power Query sees them as unique.  To that end I can’t just trust the data files once and be done with it.

Now, there is a workaround… just enable the Fast Combine feature.  That kills off the warning and lets them go, but it also has an issue.  This setting is workbook specific, which means that when my VBA saves the master payroll control file under a new name (we preserve history this way) the setting doesn’t stay selected.  Grrr.  Given that there is no way to touch this setting via code, my user has to remember to click the Fast Combine button every time they run the update.  Is it minor?  Sure.  But it’s ridiculous, it gets forgotten, and concerns them when they get the permissions messages.

The Solution We Need

Now don’t get me wrong, I’m all for security, but where it makes sense.  In Office 2007 we got a new macro paradigm that allowed us to trust folders on our computers/networks.  This setting is set on an application level basis, and persists between Excel sessions.  Beautiful, as I can set it to a specific folder and forget it.  I throw trusted files in there, I don’t throw in the ones I don’t know.  It actually allows me to practice safer computing overall.

In my opinion, this setting is drastically lacking in Power Query.  I really need the ability to set my Power Query add-in to have trusted root folders with trusted subfolders.  This would allow me to trust my application’s data directory and not have to remember to click Fast Combine each time I create new files.  It also means that I might pay attention to the message when it does come up in future, as it would be unusual.  Currently I see the error so much I just cancel it and go straight to Fast Combine.  Hmm… kind of like how we set macro security to “Low” in Excel 2003 and earlier to bypass the annoying message, which essentially left us unprotected.

The Solution (Some of us) Can Implement

To be fair, I believe I do have a workaround for this now, but I don’t think it should be necessary.

Basically what I could do is use VBA to drive the refresh of my tables, and therefore the Power Query scripts.  Given that, I could use VBA to copy and replace the files in a central repository where the hierarchy does not change.  I.e. I could set up a folder like AppData\Timesheets\Current and use VBA to copy the required data files from my dated subfolders, replacing the ones in the “current” folder.  I would then target my Power Query scripts against the files in the “Current” folder, and build my solution off that.

Because Power Query holds a list of trusted files at an application level (rather than workbook level), these files should stay trusted even if I do create a new file, removing the need to constantly click the “Fast Combine” button.  So I think this should work.

So what’s the problem?  It only involves creating a VBA macro to do that.  Fine for me, as I know how to code with VBA and make this happen.  But for most of the Power Query target market I would suggest that this is probably outside of their normal skill set.

Final Thought

Even though it is technically possible to work around this issue, I still argue should not be necessary.  We need a proper option to trust the files in a local folder so that dynamically referring to a file in Power Query can be scripted without invoking a painful security paradigm.

Importing Dates in Power Query

A couple of weeks ago I was teaching a course on Power Query.  I distributed a bunch of sample files, started to walk the attendees through the steps, and a hand went up.

“I get an error in my date column…”

For a second it stumped me… I tested my examples 15 times before pushing them out to the class.  Why now?

As it turns out it’s that pesky issue that drives many non-North Americans crazy.  I keep my regional settings set to use the MM/DD/YYYY format.  I just find it makes life a lot easier for me.  What we ran up against here was a user who was using the Canadian standard of DD/MM/YYYY.  Yuck.

I promised a way to show how to fix it, and am finally getting around to posting it…

Replicating the Issue

The file I’m working with here is a csv file (saved in a txt format) that uses the DD/MM/YYYY format, as that replicates the issue for me.  So I import it and end up with this:

image

So what’s the problem?  Well, if you look at the areas I circled in red, the ChangedType step changed the data type for the TranDate column to Text.  That’s no good.  So I highlight the TranDate column and change it to Date:

image

Great.  An error.

So this is what I’m talking about.  The dates above have been converted to dates, but not the 1st of February, 2nd of February and so on.  Instead it’s January 2nd, February 2nd, etc.  As soon as it gets to the 13 it gags as there is no 13th month.

This is not an uncommon issue, by any means.  If you database exports in one format and your system settings are in another, you’ll run into this.  Multinationals that exchange files across borders see this all the time (just look in any forum for the amount of help posts on the topic.)

I REALLY wish that Power Query were smarter about this.  I also wish there was an easy way to just tell Power Query which date format your data (not your system) was using, or an easy way to convert it.  Anything to let me do something in one click versus what we need to do to fix this.  (As an added insult, the old text import wizard DID let you declare this.  Go figure!)

The Solution

All right, so how do we fix it?  I’m sure there are a variety of ways, and hopefully a new ribbon will come along to make this obsolete, but here is ONE way.

Step 1: Remove the garbage

First I deleted the “Changed Type” step of the query.

Step 2: Split by the / character

Next I selected the TranDate column, went to the Transform Tab (if you don’t have it download the latest Power Query update), chose Split Column, then By Delimiter, then Custom.  I put in a / and clicked OK.

As an aside here, I wish Power Query suggested what it thought you wanted to use as the delimiter, without actually clicking OK for you.  In this case I think it’s pretty obvious, even though it’s not in the list of defaults.  Granted, I might want to override it with something crazy like a 0, which is why I wouldn’t want it to just click OK for me, but it might save me time if it made an intelligent suggestion.

At any rate, I end up with this:

image

Step 3: Rename the columns

This part is important for the next piece.  Rename your columns to reflect which part of the date they belong to.  You can do this by right clicking the column header and choosing Rename…  (Alternately you can edit the M code in the formula bar manually, but you need to edit the “SplitColumnDelimiter” step first, then edit the ChangedType step to reflect the new column names.  Probably easier to just right click and rename them:

image

Step 4: Putting the date together… correctly

OK, now the tricky part.  We need to insert a column and write a Power Query formula.  To do this, go to the Insert tab, and click Insert Custom Column.  You’ll be presented with a dialog to write your formula.  And here’s the formula you need:

=#date([Year],[Month],[Day])

Sadly, there really isn’t a lot of documentation yet that explains this stuff.  I believe that the # sign tells PowerQuery that this is a data type, the “date” portion determines what type it is, and then we feed it the components that make it up (for each row).  I also changed the name at the top of the dialog to “Date” and clicked OK:

image

Step 5: Force the data type

I’m not sure why this is, but even after declaring this as a date, it formats the column visually as a date, but the data type is left blank.  For whatever reason, I had to select the column and force the data type to a date in order to make it recognize it properly as such.

Step 6: Cleanup

So this part is optional, but you can now delete the Day, Month and Year columns if you don’t need them any more.  Unlike standard Excel, this move won’t leave you with a bunch of #REF! errors down the Date column.  :)

Implications/Modification

The implications of this are somewhat interesting to me.  Let’s assume that your data goes the other way.  I send you a file in MM/DD/YYYY format, and you use DD/MM/YYYY format.  You should be able to follow all of the above steps, with the only difference being which column you identify as which when you rename them.  (In this case it would go Month Day Year after splitting them.)

I’d love to get some feedback from others on this, but I believe the formula in the custom column should work for you without modification as long as the columns are named correctly.

I’ve attached two files to this post for testing purposes.  Right click the links below and SaveAs to get copies to play with:

Try them both, you should be able to generate a correct date for your format from either using these steps.

PowerXL Course Live in Victoria, BC

We are very excited to announce that we will be hosting an “Introduction to Power Excel” session in Victoria, BC on June 6, 2014.

PowerPivot is revolutionizing the way that we look at data inside Microsoft Excel. Allowing us to link multiple tables together without a single VLOOKUP statement, it enables us to pull data together from different tables and databases where we never could before. But linking data from multiple sources, while powerful, only scratches the surface of the impact that it is making in the business intelligence landscape. Not only do we look at PowerPivot in this session, but we’ll also explore the incredible companion product Power Query; a tool that will surely blow your mind. Come join Ken as he walks you through the process of building a Business Intelligence system out of text files, databases and so much more.

Full details, including an early bird signup offer, can be found at http://www.excelguru.ca/forums/calendar.php?do=getinfo&e=42&day=2014-6-6

Activating Debugging Symbols with Add-In Express

Some time ago I published a blog post about an add-in that I’m building, and the reasons why I elected to start using Add-In Express to manage the process.  I still think Add-In Express is the best product for this kind of task, but figured I’d publish this post mainly to save me time the next time I get a new PC and have to go through this again.

The issue I ran into is that I need to set the CLR to work with .NET 3.5.  To do this you have to set the host file to v2.0x.  This is pretty easy, and Eugene has a good picture on how to do this here:  http://www.add-in-express.com/forum/read.php?FID=5&TID=7912

Following the steps that Eugene provided set me off to the races… with Excel 2010.  But when I tried to debug in Excel 2013 I wasn’t having much luck.  I changed the “Program to Start” to Excel 2013, set a breakpoint in my code, and started debugging.  Even though I’d set a breakpoint in my code, it was never activated.  Instead, the breakpoint goes from a red dot to a red circle with a white fill, and mouse-ing over it yields the message “The breakpoint will not currently be hit.  No symbols have been loaded for this document.”

SNAGHTML64bc16

This really sucks, as it’s pretty tough to work with.  So what’s different, and how come Excel 2010 works, but Excel 2013 doesn’t?

The action of modifying that host config file as described by Eugene actually creates a new file that is stored within the appropriate Office subfolder held within the Program Files folder.  Unfortunately, the program that is defined in your “Program to Start” settings seems to be ignored, and it actually creates the file in the folder that holds the oldest version of Office on your system.  So in my case it created the following file:

  • C:\Program Files\Microsoft Office\Office14\excel.exe.config

Now, I stand to be corrected on this, but I believe that this file is only used when launching the debugging tools from Visual Studio.  When the app is started it will look for this file and, if it finds it will use it to provides the debugging symbols back to Visual Studio for the .NET framework specified.

At any rate, the key here is to copy this file and paste it into the same folder that contains the exe file specified in the “Program to Start” area in the Project Properties of Visual Studio.  And once you do that:

SNAGHTML72dbee

Beautiful!  Works nicely.

It’s also worth mentioning that this setup works whether you are using a version of Office that comes from an MSI installer file (such as a DVD or volume license version of Excel/Office), or if you are using the Click to Run (C2R) version of Office that you can download from Office365.  The only thing you need to be concerned about is the file path in which to store the config file and find the Excel.exe executable:

  • 64bit MSI:  C:\Program Files\Microsoft Office\Office15
  • 32bit MSI:  C:\Program Files (x86)\Microsoft Office\Office15
  • 64bit C2R: C:\Program Files\Microsoft Office 15\root\office15
  • 32bit C2R: C:\Program Files (x86)\Microsoft Office 15\root\office15

I know this isn’t super Excel focussed, but hopefully it will help someone out there if they run into this error.

I should also throw a shout out to Andrei from Add-In Express who helped me sort this out last time I needed to figure this out.  The support there was awesome in helping me get it resolved.

Using PowerQuery with Email

At our last MVP summit, I sat in on a session on PowerQuery. Our presenter, who is pretty passionate about it, at one point asked us “Who has wanted to use PowerQuery on their email inbox?” And honestly, I have to admit, I was kind of “meh” to the idea. After all, email is a distraction, and not something I generally like to go back and study. Why would I care how many emails someone sent me, or what time they do? Not interesting.

But oh wait… after fooling around with it when I was bored one day, I came across something pretty damn cool. PowerQuery can do much more than just count emails… it can parse the contents! (Okay, that’s uber-geek for “it can read and examine the contents of the email.”) Why is that important? It’s because PowerQuery is all about sourcing and reshaping data.

We can build this…

Check this out… this chart is created, in Excel, showing the affiliate income that I’ve earned from selling Charley Kyd’s dashboard kits from Exceluser.com.

image

Yeah, okay, it’s not a huge amount of money, but that’s not the point.

… From This…

The point is this:  Charley sends an automated email each time there is a sale, of which the contents are always in the same structure:

image

And here’s how…

Step 1: Create a folder and a rule in Outlook

The first thing I did was set up a rule in Outlook to funnel all emails like the one above into the “Affiliates’’\Exceluser.com” folder.  While it’s not entirely necessary – I could target the inbox itself – this makes the updates run a bit faster as they don’t have to examine ALL of my email.  It’s also good housekeeping.  That folder will now be the ENTIRE source for the chart above.  Even better, I didn’t even have to read the emails myself!

Step 2: Open Excel and create a new Power Query

To do this, you must be running the December 2013 update at a minimum.  (Earlier versions don’t support Exchange as a data source.)  To set it up:

POWER QUERY –> From Other Sources –> From Microsoft Exchange

If you haven’t used an Exchange connection, you’ll be prompted for your credentials, which it will then save for you.  Unfortunately only one Exchange account is supported at a time right now, so if you need to change it go to POWER QUERY –> Data Source Settings to modify your credentials.

Now, for me it pulled up a Navigator task pane on the right with a list of Mail, Calendar, People, Tasks and Meeting Requests.  I personally wish it would take you into Power Query, as this is just one more step to slow me down.  I have to select Mail, then click Edit Query (at the bottom) to go and start building my query.  Once I do so, then I see this:

image

Wow… that’s a lot of columns, but not really where I want to be.  Let’s filter this list down a bit…

I clicked the dropdown arrow beside “Folder Path”, and then clicked the “Load More” button, since there is little in the list:

SNAGHTML38a48c

Perfect, now I uncheck Select All, check \Affiliates\Exceluser.com\ and I’m set… my list will filter to only those emails from Exceluser.com.

Step 3: Removing Irrelevant Columns

There’s a lot of them, so I’m going to get rid of them.  I select and remove each of the following:

“Subject”, “Sender”, “DisplayTo”, “DisplayCc”, “ToRecipients”, “CcRecipients”, “BccRecipients”, “DateTimeReceived”, “Importance”, “Categories”, “IsRead”, “HasAttachments”, “Attachments”, “Preview”, “Attributes”, “Id”

That leaves me with 3 columns in all:  “Folder Path”,”DateTimeSent” and “Body”:

SNAGHTML3d45e9

Step 4: Expand the Email Body

Now here’s where cool starts to get visible… the Body field contains a bunch of entries called “Record”, which are shown in green.  That means they are clickable and, if you do click it, will show you the entire body of the selected email.  But even better is that I can expand those into the table itself.  That funny little double headed arrow next to “Body” does just that.  When I click it, I can see all the items in the record which, in this case includes both the text and html bodies of the email:

image

I’d rather play with plain text than HTML, so I cleared the HTMLBody flag and click OK, resulting in the following:

image

So the body has expanded, and if I click on a specific record it shows the entire body in the window at the bottom.  Very cool, and I can see that this email shows earnings of $12.91.  Even better, as I click through the other records, Charley’s emails are identical, with only the value changing.  This is VERY good.

Step 5: Strip out the Value

Now the above is good, but it contains a bunch of hard returns and stuff.  I want to get rid of those.  So I select the Body.TextBody.1 column then go to Transform—>Text Transforms—>Clean.  That removes all the hard returns, resulting in this:

SNAGHTML46cabd

Now I’m going to do a find and replace to remove all characters before the value.  Since it’s always the same, I got to Replace values and I can replace “Hi Ken Puls,This email is to notify you that you have made a sale.You have earned $” with nothing.

The column will now start with the value, but I need to remove everything after the value.  Easiest way for me was to:

  • Replace ” in commission for this sale.” (notice the leading space) with a pipe: “|”
  • Choose Split Column –> By Delimiter –> Custom –> |

I then made sure that the Body.TextBody.1.1 column was selected and choose to set the Data Type to Number.

Cool, I’ve got my values now!

Step 6: Final cleanup

Finally, I did a bit of cleanup…

  • I removed the Body.TextBody.1.2 column as well as the Folder Path columns, since I don’t need either any more
  • I renamed the Body.TextBody.1.1 column to Commission
  • I renamed the table (on the right) to Commissions

And here it is:

SNAGHTML512197

And finally I clicked OK to load the table to the worksheet.

Step 7: Create the Report

And this part is probably the easiest… I created a simple Pivot:

  • DateTimeSent on rows
  • Group dates by Months and Years
  • Years on Columns
  • Commissions on Values

And then the PivotChart where I removed the Field Buttons, vertical axis and gridlines, and added data labels.  Pretty straight forward if you’re used to working with PivotTables or charts at all.

Final Thoughts

This tool is truly awesome.  All I need to do to get an update on my affiliate status is to open the file and refresh the data.  It will reach into my email folder and do the rest.  Simply amazing, really.

And while the setup may look like a lot of steps, this file took me less than 30 minutes to build, from setting up the rule in Outlook to pulling the data into PowerQuery through cleaning it and reporting on it.  Granted, I’ve been playing around with similar data lately, but still, it was quick from start to finish.

So… “Meh, it’s just email”?  Not any more.  :)

From TXT files to BI solution

Several years ago, when Power Pivot first hit the scene, I was talking to a buddy of mine about it. The conversation went something like this:

My Friend: “Ken, I’d love to use PowerPivot, but you don’t understand the database culture in my country. Our DB admins won’t let us connect directly to the databases, they will only email us text files on a daily, weekly or monthly basis.”

Me: “So what? Take the text file, suck it into PowerPivot and build your report. When they send you a new file, suck it into PowerPivot too, and you’re essentially building your own business intelligence system. You don’t need access to the database, just the data you get on a daily/weekly basis!”

My Friend: “Holy #*$&! I need to learn PowerPivot!”

Now, factually everything I said was true. Practically though, it was a bit more complicated than that. See, PowerPivot is great at importing individual files into tables and relating them. The issue here is that we really needed the data appended to an existing file, rather than related to an existing file. It could certainly be done, but it was a bit of a pain.

But now that we have Power Query the game changes in a big way for us; Power Query can actually import and append every file in a folder, and import it directly into the data model as one big contiguous table. Let’s take a look in a practical example…

Example Background

Assume we’ve got a corporate IT policy that blocks us from having direct access to the database, but our IT department sends us monthly listings of our sales categories. In the following case we’ve got 4 separate files which have common headers, but the content is different:

clip_image002

With PowerPivot it’s easy to link these as four individual tables in to the model, but it would be better to have a single table that has all the records in it, as that would be a better Pivot source setup. So how do we do it?

Building the Query

To begin, we need to place all of our files in a single folder. This folder will be the home for all current and future text files the IT department will ever send us.

Next we:

  • Go to Power Query –> From File –> From Folder
  • Browse to select our folder
  • Click OK

At this point we’ll be taken in to PowerQuery, and will be presented with a view similar to this:

clip_image004

Essentially it’s a list of all the files, and where they came from. The key piece here though, is the tiny little icon next to the “Content” header: the icon that looks like a double down arrow. Click it and something amazing will happen:

clip_image005

What’s so amazing is not that it pulled in the content. It’s that it has actually pulled in 128 rows of data which represents the entire content of all four files in this example. Now, to be fair, my data set here is small. I’ve also used this to pull in over 61,000 records from 30 csv files into the data model and it works just as well, although it only pulls the first 750 lines into Power Query’s preview for me to shape the actual query itself.

Obviously not all is right with the world though, as all the data came in as a single column. These are all Tab delimited files, (something that can be easily guessed by opening the source file in Notepad,) so this should be pretty easy to fix:

  • Go to Split Column –> Delimited –> Tab –> OK
  • Click Use First Row as Headers

We should now have this:

clip_image006

Much better, but we should still do a bit more cleanup:

  • Highlight the Date column and change the Data Type to Date
  • Highlight the Amount column and change the Data Type to Number

At this point it’s a good idea to scroll down the table to have a look at all the data. And it’s a good thing we did too, as there are some issues in this file:

clip_image007

Uh oh… errors. What gives there?

The issue is the headers in each text file. Because we changed the first column’s data type to Date, Power Query can’t render the text of “Date” in that column, which results in the error. The same is true of “Amount” which can’t be rendered as a number either since it is also text.

The manifestation as errors, while problematic at first blush, is actually a good thing. Why? Because now we have a method to filter them out:

  • Select the Date column
  • Click “Remove Errors”

We’re almost done here. The last things to do are:

  • Change the name (in the Query Settings task pane at right) from “Query 1” to “Transactions”
  • Uncheck “Load to worksheet” (in the bottom right corner)
  • Check “Load to data model” (in the bottom right corner)
  • Click “Apply & Close” in the top right

At this point, the data will all be streamed directly into a table in the Data Model! In fact, if we open PowerPivot and sort the dates from Oldest to Newest, we can see that we have indeed aggregated the records from the four individual files into one master table:

clip_image008

Implications

Because PowerQuery is pointed to the folder, it will stream in all files in the folder. This means that every month, when IT sends you new or updated files, you just need to save them into that folder. At that point refreshing the query will pull in all of the original files, as well as any you’ve added. So if IT sends me four files for February, then the February records get appended to the January ones that are already in my PowerPivot solution. If they send me a new file for a new category, (providing the file setup is 3 columns of tab delimited data,) it will also get appended to the table.

The implications of this are huge. You see, the issue that my friend complained about is not limited to his country at all. Every time I’ve taught PowerPivot to date, the same issue comes up from at least one participant. IT holds the database keys and won’t share, but they’ll send you a text file. No problem at all. Let them protect the integrity of the database, all you now need to do is make sure you receive your regular updates and pull them into your own purpose built solution.

Power Query and Power Pivot for the win here. That’s what I’m thinking. :)

Un-Pivoting Data in Power Query

I was fooling around with the latest build of Power Query (for Excel 2010/2013), and I’ve got to say that I’m actually really impressed with this piece. I think the user experience still needs a bit of work, but once you know how, this is really simple.

Whenever I teach courses on PivotTables people want to eat them up, but often they’ve already got their data in a format that resembles the PivotTable output, rather than a format that is consumable by the Pivot engine. Take the following set of data.

SNAGHTML111988

A classic setup for how people think and want to see the data. But now what if the user wanted to see the records by month under the category breakdowns? An easy task for a Pivot, but it would require a bit of manipulation without it. So how do we get it back to a Pivot compliant format?

With Power Query it’s actually super simple:

  • Step 1: Select the data and turn it into a table
  • Step 2: Select Power Query –> From Table to suck it in to Power Query
  • Step 3: Un-pivot it

Okay, so this one takes a bit of know how right now. If you just click Unpivot, you get something really weird:

image 

What actually needs to happen, which isn’t totally intuitive, is we need to click the header for the January column, hold down CTRL and click the February and March column headers:

SNAGHTML18e3d9

Once we’ve done that we click Unpivot and TA-DA!

image

So by virtue of being able to choose one or more columns, it’s actually quite flexible as you can choose HOW to un-pivot, which is uber-awesome. It’s a shame the UI doesn’t help you realize that today, but hopefully that gets fixed in future.

At any rate, you can right click the headers to rename the columns, then click “Apply and Close” and we end up with the un-pivoted source in our worksheet. And now we can create a new PivotTable off that source:

SNAGHTML1b48f1

Even better, if we add some new data to our original report:

SNAGHTML1cfc48

We can then refresh the query on the worksheet and the records are pulled in. With a refresh on the Pivot as well (yeah, unfortunately you can’t just refresh the pivot to drive the query to update) it’s all pulled in:

SNAGHTML1dae1a

Pretty damn cool!

PowerPivot training live in Victoria, BC!

Anyone who follows my website or Facebook Fan Page knows that I’m a huge fan of PowerPivot. Well great news now, that you can come and learn not only why this is the best thing to happen to Excel in 20 years, but also how to take advantage of it yourself!

I’ll be teaching a course on PowerPivot and DAX in Victoria, BC on November 22nd, 2013. While the course is hosted by the Chartered Professional Accountants, it’s open to anyone who wishes to subscribe.

If you’ve been trying to figure out how to get started with Power Pivot, you’re confused as to how and why things work, or you want to master date/time intelligence in PowerPivot, this course is for you.

100% hands on, we’ll start with basic pivot tables (just as a refresher.) Next we’ll look at how to build the PowerPivot versions and why they are so much more powerful. From linking multiple tables together without a single VLOOKUP to gaining a solid understanding of relating data, you’ll learn the key aspects to building solid PowerPivot models. We’ll work through understanding filter context, measures and relationships, and finish the day by building measures that allow you to pull great stats such as Month to Date for the same month last year, among others.

Without question, you’ll get the most out of this course if you’re experienced with PivotTables. If you don’t feel like you’re a pro in that area though, don’t worry! As an added bonus, to anyone who signs up via this notice, please let me know. I’ll provide you with a free copy of my Magic of PivotTables course so that you can make sure you’re 100% Pivot Table compatible before you arrive.

This is a hands on course, so you need to bring a laptop that is pre-loaded with either:

  • Excel 2010 and the free PowerPivot download
  • Excel 2013 with Office 2013 Professional Plus installed (yes the PLUS is key!)
  • Excel 2013 with Excel 2013 standalone installed.

If you have any questions as to which you have installed, simply drop me a note via the contact form on my website, and I’ll help you figure it out.

Full details of the course contents as well as a registration link, can be found at http://www.icabc-pd.com/pd-seminars-seminar.php?id=2849. Don’t wait too long though, as registration deadline is November 14th!

Hope to see you there!