PowerXL Course Live in Victoria, BC

We are very excited to announce that we will be hosting an “Introduction to Power Excel” session in Victoria, BC on June 6, 2014.

PowerPivot is revolutionizing the way that we look at data inside Microsoft Excel. Allowing us to link multiple tables together without a single VLOOKUP statement, it enables us to pull data together from different tables and databases where we never could before. But linking data from multiple sources, while powerful, only scratches the surface of the impact that it is making in the business intelligence landscape. Not only do we look at PowerPivot in this session, but we’ll also explore the incredible companion product Power Query; a tool that will surely blow your mind. Come join Ken as he walks you through the process of building a Business Intelligence system out of text files, databases and so much more.

Full details, including an early bird signup offer, can be found at http://www.excelguru.ca/forums/calendar.php?do=getinfo&e=42&day=2014-6-6

Activating Debugging Symbols with Add-In Express

Some time ago I published a blog post about an add-in that I’m building, and the reasons why I elected to start using Add-In Express to manage the process.  I still think Add-In Express is the best product for this kind of task, but figured I’d publish this post mainly to save me time the next time I get a new PC and have to go through this again.

The issue I ran into is that I need to set the CLR to work with .NET 3.5.  To do this you have to set the host file to v2.0x.  This is pretty easy, and Eugene has a good picture on how to do this here:  http://www.add-in-express.com/forum/read.php?FID=5&TID=7912

Following the steps that Eugene provided set me off to the races… with Excel 2010.  But when I tried to debug in Excel 2013 I wasn’t having much luck.  I changed the “Program to Start” to Excel 2013, set a breakpoint in my code, and started debugging.  Even though I’d set a breakpoint in my code, it was never activated.  Instead, the breakpoint goes from a red dot to a red circle with a white fill, and mouse-ing over it yields the message “The breakpoint will not currently be hit.  No symbols have been loaded for this document.”

SNAGHTML64bc16

This really sucks, as it’s pretty tough to work with.  So what’s different, and how come Excel 2010 works, but Excel 2013 doesn’t?

The action of modifying that host config file as described by Eugene actually creates a new file that is stored within the appropriate Office subfolder held within the Program Files folder.  Unfortunately, the program that is defined in your “Program to Start” settings seems to be ignored, and it actually creates the file in the folder that holds the oldest version of Office on your system.  So in my case it created the following file:

  • C:\Program Files\Microsoft Office\Office14\excel.exe.config

Now, I stand to be corrected on this, but I believe that this file is only used when launching the debugging tools from Visual Studio.  When the app is started it will look for this file and, if it finds it will use it to provides the debugging symbols back to Visual Studio for the .NET framework specified.

At any rate, the key here is to copy this file and paste it into the same folder that contains the exe file specified in the “Program to Start” area in the Project Properties of Visual Studio.  And once you do that:

SNAGHTML72dbee

Beautiful!  Works nicely.

It’s also worth mentioning that this setup works whether you are using a version of Office that comes from an MSI installer file (such as a DVD or volume license version of Excel/Office), or if you are using the Click to Run (C2R) version of Office that you can download from Office365.  The only thing you need to be concerned about is the file path in which to store the config file and find the Excel.exe executable:

  • 64bit MSI:  C:\Program Files\Microsoft Office\Office15
  • 32bit MSI:  C:\Program Files (x86)\Microsoft Office\Office15
  • 64bit C2R: C:\Program Files\Microsoft Office 15\root\office15
  • 32bit C2R: C:\Program Files (x86)\Microsoft Office 15\root\office15

I know this isn’t super Excel focussed, but hopefully it will help someone out there if they run into this error.

I should also throw a shout out to Andrei from Add-In Express who helped me sort this out last time I needed to figure this out.  The support there was awesome in helping me get it resolved.

Designing Software – How do YOU start?

Today I’m working on building a new reporting system at my day job.  Basically we’re looking at creating tools to budget and forecast results to be reviewed weekly, with a goal of being more on top of results so we can react better.  The stuff that every business really wants to do.

I’ve got the reports that we want at the end, and now I’m looking at them trying to figure out where to get the info from, how quickly it will need to be updated, where to source info that isn’t readily available and all that good stuff.  By the time we’re done I’m sure we’ll be hitting on Access, SQL Server, Excel, VBA, Power Pivot and Power Query, and possibly some other tools I haven’t thought of yet.

So at the end of the day, I’m building a system here.  Something to be used daily to capture, store and report on information.  Built of a collection of different pieces and technology, I’m essentially building my own software solution to a business problem.

I started with some sketching on paper, trying to illustrate the overall data collection/storage points and program flow.  Then I moved to Visio to try and draw it there.  (The pieces needed to move around as I added new dimensions and considerations.)  Then I got stuck on some of the considerations… some of them were big enough that they could change the way things were done.

Now I’m in Word trying to state – in English – the goals and non-goals of the overall project, so that I can get an overview on paper.

I’ve got a meeting this afternoon with a key player/driver to talk specifically about:

  • What the system should do
  • Key points that is absolutely will not be required to do
  • Key points that we want to make sure do not happen (different from above)
  • Who is going to use the system (each part)
  • How it’s going to be updated (each part)
  • How often updating will be needed for information drivers

I’m hoping with those stated that I can do two things:

  1. Get answers to the major design changing considerations I need before I build anything, and
  2. Generate a list that I can go back to later to ensure I haven’t overlooked or missed anything when I’m designing the system.

While I’m keenly aware that I’m not going to get everything covered, and that some design considerations will need to be made on the fly during development, I want to minimize the chance that any of those will major issues.  I hate coming up on a blocking point part way through the implementation that forces design re-work and retrofitting.

Now I’ve done this kind of things many times, but I don’t know that I’ve ever felt that I’ve settled into a piece of software that really works for me when trying to design this stuff.  I usually end up with Word to track goals, Visio for the flowcharts, but there’s a lot of iteration to get through it.

That’s the reason I’m reaching out here, is to see what others do in this area. I’m curious as to what tools/processes YOU use to scope and design a BI system or software.  Do you start with a goals document, do you start with flowcharts, or something else?  Are you happy with your process, or are there pain points and gotchas that you still get hit with?

Using PowerQuery with Email

At our last MVP summit, I sat in on a session on PowerQuery. Our presenter, who is pretty passionate about it, at one point asked us “Who has wanted to use PowerQuery on their email inbox?” And honestly, I have to admit, I was kind of “meh” to the idea. After all, email is a distraction, and not something I generally like to go back and study. Why would I care how many emails someone sent me, or what time they do? Not interesting.

But oh wait… after fooling around with it when I was bored one day, I came across something pretty damn cool. PowerQuery can do much more than just count emails… it can parse the contents! (Okay, that’s uber-geek for “it can read and examine the contents of the email.”) Why is that important? It’s because PowerQuery is all about sourcing and reshaping data.

We can build this…

Check this out… this chart is created, in Excel, showing the affiliate income that I’ve earned from selling Charley Kyd’s dashboard kits from Exceluser.com.

image

Yeah, okay, it’s not a huge amount of money, but that’s not the point.

… From This…

The point is this:  Charley sends an automated email each time there is a sale, of which the contents are always in the same structure:

image

And here’s how…

Step 1: Create a folder and a rule in Outlook

The first thing I did was set up a rule in Outlook to funnel all emails like the one above into the “Affiliates’’\Exceluser.com” folder.  While it’s not entirely necessary – I could target the inbox itself – this makes the updates run a bit faster as they don’t have to examine ALL of my email.  It’s also good housekeeping.  That folder will now be the ENTIRE source for the chart above.  Even better, I didn’t even have to read the emails myself!

Step 2: Open Excel and create a new Power Query

To do this, you must be running the December 2013 update at a minimum.  (Earlier versions don’t support Exchange as a data source.)  To set it up:

POWER QUERY –> From Other Sources –> From Microsoft Exchange

If you haven’t used an Exchange connection, you’ll be prompted for your credentials, which it will then save for you.  Unfortunately only one Exchange account is supported at a time right now, so if you need to change it go to POWER QUERY –> Data Source Settings to modify your credentials.

Now, for me it pulled up a Navigator task pane on the right with a list of Mail, Calendar, People, Tasks and Meeting Requests.  I personally wish it would take you into Power Query, as this is just one more step to slow me down.  I have to select Mail, then click Edit Query (at the bottom) to go and start building my query.  Once I do so, then I see this:

image

Wow… that’s a lot of columns, but not really where I want to be.  Let’s filter this list down a bit…

I clicked the dropdown arrow beside “Folder Path”, and then clicked the “Load More” button, since there is little in the list:

SNAGHTML38a48c

Perfect, now I uncheck Select All, check \Affiliates\Exceluser.com\ and I’m set… my list will filter to only those emails from Exceluser.com.

Step 3: Removing Irrelevant Columns

There’s a lot of them, so I’m going to get rid of them.  I select and remove each of the following:

“Subject”, “Sender”, “DisplayTo”, “DisplayCc”, “ToRecipients”, “CcRecipients”, “BccRecipients”, “DateTimeReceived”, “Importance”, “Categories”, “IsRead”, “HasAttachments”, “Attachments”, “Preview”, “Attributes”, “Id”

That leaves me with 3 columns in all:  “Folder Path”,”DateTimeSent” and “Body”:

SNAGHTML3d45e9

Step 4: Expand the Email Body

Now here’s where cool starts to get visible… the Body field contains a bunch of entries called “Record”, which are shown in green.  That means they are clickable and, if you do click it, will show you the entire body of the selected email.  But even better is that I can expand those into the table itself.  That funny little double headed arrow next to “Body” does just that.  When I click it, I can see all the items in the record which, in this case includes both the text and html bodies of the email:

image

I’d rather play with plain text than HTML, so I cleared the HTMLBody flag and click OK, resulting in the following:

image

So the body has expanded, and if I click on a specific record it shows the entire body in the window at the bottom.  Very cool, and I can see that this email shows earnings of $12.91.  Even better, as I click through the other records, Charley’s emails are identical, with only the value changing.  This is VERY good.

Step 5: Strip out the Value

Now the above is good, but it contains a bunch of hard returns and stuff.  I want to get rid of those.  So I select the Body.TextBody.1 column then go to Transform—>Text Transforms—>Clean.  That removes all the hard returns, resulting in this:

SNAGHTML46cabd

Now I’m going to do a find and replace to remove all characters before the value.  Since it’s always the same, I got to Replace values and I can replace “Hi Ken Puls,This email is to notify you that you have made a sale.You have earned $” with nothing.

The column will now start with the value, but I need to remove everything after the value.  Easiest way for me was to:

  • Replace ” in commission for this sale.” (notice the leading space) with a pipe: “|”
  • Choose Split Column –> By Delimiter –> Custom –> |

I then made sure that the Body.TextBody.1.1 column was selected and choose to set the Data Type to Number.

Cool, I’ve got my values now!

Step 6: Final cleanup

Finally, I did a bit of cleanup…

  • I removed the Body.TextBody.1.2 column as well as the Folder Path columns, since I don’t need either any more
  • I renamed the Body.TextBody.1.1 column to Commission
  • I renamed the table (on the right) to Commissions

And here it is:

SNAGHTML512197

And finally I clicked OK to load the table to the worksheet.

Step 7: Create the Report

And this part is probably the easiest… I created a simple Pivot:

  • DateTimeSent on rows
  • Group dates by Months and Years
  • Years on Columns
  • Commissions on Values

And then the PivotChart where I removed the Field Buttons, vertical axis and gridlines, and added data labels.  Pretty straight forward if you’re used to working with PivotTables or charts at all.

Final Thoughts

This tool is truly awesome.  All I need to do to get an update on my affiliate status is to open the file and refresh the data.  It will reach into my email folder and do the rest.  Simply amazing, really.

And while the setup may look like a lot of steps, this file took me less than 30 minutes to build, from setting up the rule in Outlook to pulling the data into PowerQuery through cleaning it and reporting on it.  Granted, I’ve been playing around with similar data lately, but still, it was quick from start to finish.

So… “Meh, it’s just email”?  Not any more.  :)

From TXT files to BI solution

Several years ago, when Power Pivot first hit the scene, I was talking to a buddy of mine about it. The conversation went something like this:

My Friend: “Ken, I’d love to use PowerPivot, but you don’t understand the database culture in my country. Our DB admins won’t let us connect directly to the databases, they will only email us text files on a daily, weekly or monthly basis.”

Me: “So what? Take the text file, suck it into PowerPivot and build your report. When they send you a new file, suck it into PowerPivot too, and you’re essentially building your own business intelligence system. You don’t need access to the database, just the data you get on a daily/weekly basis!”

My Friend: “Holy #*$&! I need to learn PowerPivot!”

Now, factually everything I said was true. Practically though, it was a bit more complicated than that. See, PowerPivot is great at importing individual files into tables and relating them. The issue here is that we really needed the data appended to an existing file, rather than related to an existing file. It could certainly be done, but it was a bit of a pain.

But now that we have Power Query the game changes in a big way for us; Power Query can actually import and append every file in a folder, and import it directly into the data model as one big contiguous table. Let’s take a look in a practical example…

Example Background

Assume we’ve got a corporate IT policy that blocks us from having direct access to the database, but our IT department sends us monthly listings of our sales categories. In the following case we’ve got 4 separate files which have common headers, but the content is different:

clip_image002

With PowerPivot it’s easy to link these as four individual tables in to the model, but it would be better to have a single table that has all the records in it, as that would be a better Pivot source setup. So how do we do it?

Building the Query

To begin, we need to place all of our files in a single folder. This folder will be the home for all current and future text files the IT department will ever send us.

Next we:

  • Go to Power Query –> From File –> From Folder
  • Browse to select our folder
  • Click OK

At this point we’ll be taken in to PowerQuery, and will be presented with a view similar to this:

clip_image004

Essentially it’s a list of all the files, and where they came from. The key piece here though, is the tiny little icon next to the “Content” header: the icon that looks like a double down arrow. Click it and something amazing will happen:

clip_image005

What’s so amazing is not that it pulled in the content. It’s that it has actually pulled in 128 rows of data which represents the entire content of all four files in this example. Now, to be fair, my data set here is small. I’ve also used this to pull in over 61,000 records from 30 csv files into the data model and it works just as well, although it only pulls the first 750 lines into Power Query’s preview for me to shape the actual query itself.

Obviously not all is right with the world though, as all the data came in as a single column. These are all Tab delimited files, (something that can be easily guessed by opening the source file in Notepad,) so this should be pretty easy to fix:

  • Go to Split Column –> Delimited –> Tab –> OK
  • Click Use First Row as Headers

We should now have this:

clip_image006

Much better, but we should still do a bit more cleanup:

  • Highlight the Date column and change the Data Type to Date
  • Highlight the Amount column and change the Data Type to Number

At this point it’s a good idea to scroll down the table to have a look at all the data. And it’s a good thing we did too, as there are some issues in this file:

clip_image007

Uh oh… errors. What gives there?

The issue is the headers in each text file. Because we changed the first column’s data type to Date, Power Query can’t render the text of “Date” in that column, which results in the error. The same is true of “Amount” which can’t be rendered as a number either since it is also text.

The manifestation as errors, while problematic at first blush, is actually a good thing. Why? Because now we have a method to filter them out:

  • Select the Date column
  • Click “Remove Errors”

We’re almost done here. The last things to do are:

  • Change the name (in the Query Settings task pane at right) from “Query 1” to “Transactions”
  • Uncheck “Load to worksheet” (in the bottom right corner)
  • Check “Load to data model” (in the bottom right corner)
  • Click “Apply & Close” in the top right

At this point, the data will all be streamed directly into a table in the Data Model! In fact, if we open PowerPivot and sort the dates from Oldest to Newest, we can see that we have indeed aggregated the records from the four individual files into one master table:

clip_image008

Implications

Because PowerQuery is pointed to the folder, it will stream in all files in the folder. This means that every month, when IT sends you new or updated files, you just need to save them into that folder. At that point refreshing the query will pull in all of the original files, as well as any you’ve added. So if IT sends me four files for February, then the February records get appended to the January ones that are already in my PowerPivot solution. If they send me a new file for a new category, (providing the file setup is 3 columns of tab delimited data,) it will also get appended to the table.

The implications of this are huge. You see, the issue that my friend complained about is not limited to his country at all. Every time I’ve taught PowerPivot to date, the same issue comes up from at least one participant. IT holds the database keys and won’t share, but they’ll send you a text file. No problem at all. Let them protect the integrity of the database, all you now need to do is make sure you receive your regular updates and pull them into your own purpose built solution.

Power Query and Power Pivot for the win here. That’s what I’m thinking. :)

Un-Pivoting Data in Power Query

I was fooling around with the latest build of Power Query (for Excel 2010/2013), and I’ve got to say that I’m actually really impressed with this piece. I think the user experience still needs a bit of work, but once you know how, this is really simple.

Whenever I teach courses on PivotTables people want to eat them up, but often they’ve already got their data in a format that resembles the PivotTable output, rather than a format that is consumable by the Pivot engine. Take the following set of data.

SNAGHTML111988

A classic setup for how people think and want to see the data. But now what if the user wanted to see the records by month under the category breakdowns? An easy task for a Pivot, but it would require a bit of manipulation without it. So how do we get it back to a Pivot compliant format?

With Power Query it’s actually super simple:

  • Step 1: Select the data and turn it into a table
  • Step 2: Select Power Query –> From Table to suck it in to Power Query
  • Step 3: Un-pivot it

Okay, so this one takes a bit of know how right now. If you just click Unpivot, you get something really weird:

image 

What actually needs to happen, which isn’t totally intuitive, is we need to click the header for the January column, hold down CTRL and click the February and March column headers:

SNAGHTML18e3d9

Once we’ve done that we click Unpivot and TA-DA!

image

So by virtue of being able to choose one or more columns, it’s actually quite flexible as you can choose HOW to un-pivot, which is uber-awesome. It’s a shame the UI doesn’t help you realize that today, but hopefully that gets fixed in future.

At any rate, you can right click the headers to rename the columns, then click “Apply and Close” and we end up with the un-pivoted source in our worksheet. And now we can create a new PivotTable off that source:

SNAGHTML1b48f1

Even better, if we add some new data to our original report:

SNAGHTML1cfc48

We can then refresh the query on the worksheet and the records are pulled in. With a refresh on the Pivot as well (yeah, unfortunately you can’t just refresh the pivot to drive the query to update) it’s all pulled in:

SNAGHTML1dae1a

Pretty damn cool!

Newsletter and blog updates

Today marked a new milestone for me, as I actually sent out a newsletter to everyone who has ever signed up at Excelguru.ca.  I was little hesitant, but figured I’d give it a go and see how people reacted.

I actually got some really nice feedback from people, which was great, even asking how to get others to sign up.

To that end, I’ve added a newsletter signup button on my Facebook page at www.facebook.com/xlguru.  And, since my blog theme was getting a bit stale, I’ve updated that as well, adding a link on the right hand side here.

I’m not totally sure I’m sold on the new theme, but let me know what you think.

PowerPivot training live in Victoria, BC!

Anyone who follows my website or Facebook Fan Page knows that I’m a huge fan of PowerPivot. Well great news now, that you can come and learn not only why this is the best thing to happen to Excel in 20 years, but also how to take advantage of it yourself!

I’ll be teaching a course on PowerPivot and DAX in Victoria, BC on November 22nd, 2013. While the course is hosted by the Chartered Professional Accountants, it’s open to anyone who wishes to subscribe.

If you’ve been trying to figure out how to get started with Power Pivot, you’re confused as to how and why things work, or you want to master date/time intelligence in PowerPivot, this course is for you.

100% hands on, we’ll start with basic pivot tables (just as a refresher.) Next we’ll look at how to build the PowerPivot versions and why they are so much more powerful. From linking multiple tables together without a single VLOOKUP to gaining a solid understanding of relating data, you’ll learn the key aspects to building solid PowerPivot models. We’ll work through understanding filter context, measures and relationships, and finish the day by building measures that allow you to pull great stats such as Month to Date for the same month last year, among others.

Without question, you’ll get the most out of this course if you’re experienced with PivotTables. If you don’t feel like you’re a pro in that area though, don’t worry! As an added bonus, to anyone who signs up via this notice, please let me know. I’ll provide you with a free copy of my Magic of PivotTables course so that you can make sure you’re 100% Pivot Table compatible before you arrive.

This is a hands on course, so you need to bring a laptop that is pre-loaded with either:

  • Excel 2010 and the free PowerPivot download
  • Excel 2013 with Office 2013 Professional Plus installed (yes the PLUS is key!)
  • Excel 2013 with Excel 2013 standalone installed.

If you have any questions as to which you have installed, simply drop me a note via the contact form on my website, and I’ll help you figure it out.

Full details of the course contents as well as a registration link, can be found at http://www.icabc-pd.com/pd-seminars-seminar.php?id=2849. Don’t wait too long though, as registration deadline is November 14th!

Hope to see you there!

Excel Power Map Sample

Yesterday, Microsoft released a preview of Power Map; a geo-spatial mapping Excel add-in, formerly know as GeoFlow.  It’s a pretty cool add-in that allows you to plot data based on geographic identifiers (longitude, latitude, country name, town names, postal codes, etc), and show it on a 3D or 2D map.  It works with tabular data sets, whether they be from and Excel table, database or Power Pivot.

The other cool thing is that, once you get it right, you can actually produce a video that can be shared with non-Excel users!  I’ve uploaded one of those to YouTube so you can check it out.

This video shows the wind speeds of hurricane Sandy as it travelled through the Caribbean to it’s eventual landfall on the USA’s east coast.  I used a heat map to show the speed (the redder it gets the faster), and played it over time so you can trace the path.

Pretty cool stuff!

If you have Excel 2013 Pro Plus, you can download the preview here:  http://www.microsoft.com/en-us/download/details.aspx?id=38395

OFFSET or Named Range – Which would you use?

I’m working on a spreadsheet where users will be able (required) to insert new rows at a later date.  When they do so, it’s critical that the section subtotals always… well… subtotal correctly.

The challenge, of course, is that you can’t rely on newly inserted rows being picked up by the subtotal formulas, so someone needs to check them.  At least, you can, but it takes more than just a SUM or SUBTOTAL formula to get it done.

I reached back to the method using a named range that I describe in the “Always Refer to the Cell Above” Excelguru KB Entry, resulting in a formula that looks like this:

SNAGHTML958c98

Of course, I don’t actually need to use the named range to do this.  I could make it work by using the OFFSET function in L66 as follows:

=SUBTOTAL(9,L62:OFFSET(L66,-1,0))

Either will work just fine, and will not be tripped up by a user inserting a new row within my boundary, so I should never (okay, never say never) run into an issue with this particular problem.

I’m curious which method you would use?  Named Range or OFFSET, and why…