This past weekend we published a major redesign at Skillwave.Training. Months in the making, this has been a total overhaul to focus on delivering the best online learning experience for our clients. Check out some of the images from the new site:
When you log in, you’ll be taken straight to your Dashboard. This is the one-stop console that lets you access any of your active course subscriptions, review forum posts, download your files, and manage your billing and profile details. We’ve worked hard to make this dashboard as intuitive and easy to use as possible, and to make it look great on mobile as well.
Re-Designed Course Player
The course player is completely custom built as well. Of course, you’d expect to see your navigation menu on the left to get to your lessons, but we’ve also added a “Materials” fly-out menu on the right where you can access files specific to any given lesson.
Community Forum Overhaul
We said it was a major redesign at Skillwave.Training, and we meant it. One of our big goals here was to do a better job with the Skillwave help forum and foster a sense of community within it. Our belief is that learning is great, but there can be another hurdle when trying to convert theory into practice with your own data. We see the forum experience and Skillwave community as a crucial part of solving this issue, giving students the ability to:
Ask questions about the course materials,
Get help with applying techniques to their own data,
Interact with other people in the same training,
Practice applying their skills to other data sets, and
Reinforce their knowledge and help others in the process.
Any of our clients with an active subscription to one of our paid products will find a completely revamped forum experience. As forum posters ourselves, there were a few very important areas where we wanted to make sure our community had a good set of tools:
Asking. To this end, we’ve made sure that we support topic tags, image and file uploads, code tags and a variety of rich formatting options. (Our old forum was quite weak in this regard.)
Answering. In addition to the tools above, we’ve added the ability to mark questions as solved. Our forums are searchable based on topic tags, answered status, solved status and more.
Ensuring high-quality answers. Our forum is private and monitored by our admin team. Even if Matt, Miguel or I aren’t the ones answering specific questions, we have a special “Recommended Answer” tag that we can apply to answers. This serves two purposes: it assures the asker that they got a great answer, and it validates the poster who provided a high-quality response.
Course to Question Integration
There’s one more really cool thing though… We also now give you the ability to post a forum question directly from a given lesson and provide links to all other questions that have been posted in this manner. This serves both askers and answerers as it links directly back to the source of the question. We’re super proud of this little feature and feel that it sets us apart from other platforms out there. Not because other platforms don’t offer the ability to ask questions – they do. But we serve all of that up right inside the lesson page.
Check Out the Major Redesign at Skillwave.Training
If you haven’t checked out Skillwave.Training yet, you really should. We’ve got all kinds of great courses related to Excel, Power BI, Power Query and DAX. You can even try out the platform via our free Power Query Fundamentals course. You won’t have access to the forums on the free tier, but you’ll be able to experience the rest of our new platform.
As we've just launched the site, we'd love to get your feedback. For the next month or so, you can do that by clicking the little Feedback widget on the right side of any site page. Let us know what you think!
We're super excited to let you know that we've just released some new Monkey Tools features! Let's take a quick look as to what is new...
The Table Monkey
This feature was actually released back in December. However, since we announced it at the KSA meetup (which you can see on YouTube), we decided that it needed a personality of its own. So now, on the Query Monkey menu you'll find the Table Monkey: a monkey who is dedicated to helping you build queries from Excel tables.
Some of the cool features of this Monkey are:
It can create multiple "From Table" queries at once.
You can rename the Excel tables by right-clicking the blue boxes that represent them.
You can rename the queries by right-clicking the green boxes that represent the data model tables.
It allows you to toggle the end query so you can load it to the data model or as a connection.
It provides a data typing algorithm that is smarter than Power Query's native algorithm.
Overall, we find this to be super useful. It allows us to create multiple table connections in a few seconds, rather than the minutes it would take us to set things up manually.
This feature is a Pro feature, but is fully functional in our free trial.
Create Query from M Code
The next feature that we included is a nice interface to create a new query from M code. If you post in forums and need to quickly create a query from someone else's code for testing, you can simply paste that code into the form, give it a name and click Create. Much easier than having to create a new query, edit the code, select everything and then paste:
The main benefit of this form is saving you the headache of jumping into the query editor to create your query. Additionally, we also added the ability to indent the code right in the form. So if you're just trying to read it, it can be useful without ever creating a query at all.
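Monkey Tools' actual indenting logic isn't described in this post, but the idea of indenting M code by nesting depth can be sketched in a few lines. This toy Python version (purely an illustration; the names are ours) tracks bracket depth only, ignoring M keywords like let/in, strings, and comments, all of which a real indenter must handle:

```python
def indent_m(code: str, indent: str = "    ") -> str:
    """Naive re-indenter for M code based on bracket depth.

    Illustrative sketch only: it ignores M keywords (let/in), strings
    containing brackets, and comments, which a real indenter must handle.
    """
    out, depth = [], 0
    for raw in code.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        # Closing brackets at the start of a line dedent that line itself.
        leading_closers = len(line) - len(line.lstrip(")]}"))
        out.append(indent * max(depth - leading_closers, 0) + line)
        # The line's net bracket balance sets the depth for the next line.
        opens = sum(line.count(c) for c in "([{")
        closers = sum(line.count(c) for c in ")]}")
        depth = max(depth + opens - closers, 0)
    return "\n".join(out)
```

Even this crude version makes a pasted one-liner far easier to read, which is exactly the "read it without ever creating a query" scenario above.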
We feel that this will be a super useful feature for those helping each other in the community. Thus, this feature falls into our "Forever Free" category and works at all license levels (including after your trial expires).
Convenience Features - Pivots & Filters
Another one of the new Monkey Tools features that we've added is a Pivots & Filters menu to the Monkey Tools ribbon. This is purely a convenience feature. It's designed to bring the commands closer to you so that you don't have to do as much tab switching:
The version on the left is what we are terming the "Classic" view, which shows you the Insert PivotTable button (as well as PivotCharts, Slicers & Timelines). The view on the right is what your menu will look like once the new Insert PivotTable button rolls out to your Office 365 install. (If your Monkey Tools menu starts with PivotCharts, then head to our Options screen and uncheck the "Use Legacy PivotTable Menu Buttons" option.)
And - of course - like every release we do, we have included a bunch of bug fixes. These are applicable to all users, including Pro, Trial and Free.
How do you get the new Monkey Tools features?
If you already have Monkey Tools installed, then head into Monkey Tools -> Options. If you are running 1.0.7678.28973, then you already have them. If not, click Check for Updates Now to update.
Don't have Monkey Tools installed? You can try the full feature set for free for two weeks before the license reverts to a "free" license. We think you'll be pleasantly surprised with how useful Monkey Tools is on a free license, and yet how much more it does in the Pro version.
Wow, it is hard to believe it is already December. And looking back at my blog, I realized that I forgot to tell you that we released a few more free features in Monkey Tools over the past month! In fact, November was a busy development month for us, so I thought it would be a good time to share what we have done.
One of my friends saw the ability to create a calendar using the Calendar Monkey. While he was suitably impressed, he did also ask me if it could do something he badly needed, which was to create a column displaying the ISO week that is commonly used in Europe. Unfortunately, the Calendar Monkey had not learned enough about ISO weeks at that time, so was unable to help. So, we sent a couple of the Monkeys back to school…!
If you are on a trial or free version of Monkey Tools, you will find that the Query Monkey will now allow you to add a custom Power Query function called GetISOWeek to your file. From there, you can manually call this function via the Invoke Custom Function button, or via writing a formula in the Custom Column dialog within Power Query. Simply feed the function any date column to get the ISO Week Number, and include “true” for the final (optional) parameter if you prefer the “precise” text version:
=fnGetISO( [Date] )
=fnGetISO( [Date], true )
Sun 30 Dec 2007
Mon 31 Dec 2007
Tue 1 Jan 2008
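Those boundary dates illustrate why ISO weeks trip people up: Monday 31 Dec 2007 already belongs to ISO week 1 of 2008. The add-in's function is written in M, but purely as an illustrative cross-check (and assuming the standard ISO 8601 rules), Python's standard library computes the same numbering:

```python
from datetime import date

def iso_week(d):
    """ISO 8601 week number: weeks start on Monday, and week 1 is the
    week containing the year's first Thursday."""
    return d.isocalendar()[1]

def iso_week_text(d):
    """A 'precise' text form such as 2008-W01. This format is an
    assumption for illustration; it includes the ISO year, which can
    differ from the calendar year near 1 January."""
    year, week, _ = d.isocalendar()
    return "{0}-W{1:02d}".format(year, week)
```

So for the dates above: Sun 30 Dec 2007 is week 52 of 2007, while Mon 31 Dec 2007 and Tue 1 Jan 2008 both fall in week 1 of 2008. (The exact text that fnGetISO returns in its "precise" mode isn't shown in this post; the format here is just one common convention.)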
Of course, adding a new function into your workbook is great, but for our Pro users, the Calendar Monkey wanted to make it even easier, and added it as a default column choice. No fuss, no mess, just choose the ISO date formats you need and let the Calendar Monkey do the rest!
Measure Monkey – Basic Explicit Measures
While we are super proud of our Measure Monkey, who helps create Multiple Explicit Measures, we also realize that there are times when you need to create individual measures. For this reason, we trained another Measure Monkey to do exactly that.
The Measure Monkey that focuses on Basic Explicit Measures provides you with a no-code experience to create… well… basic explicit aggregations. (Yes, you could make Implicit versions via drag and drop, but serious modelers far prefer the more customizable and scalable explicit versions.)
This Measure Monkey will help you create these measures without writing a single line of DAX (although it does show you the DAX it has created.) You will be provided a list of relevant aggregations (go home COUNTA!) and smart default formatting choices. The Monkey will even capture your preferred defaults to make you even faster next time.
Did I mention that my friend, whom I referred to above, runs a French version of Excel? Unfortunately, Monkey Tools had some challenges reading the queries in his model correctly. While we have always claimed that we only support English versions of Excel, this still bothered us.
One interesting part about being a coder is that MOST coding is written in English. But every now and then, Microsoft localizes something that we did not expect. So it was with the underlying Power Query connection name. To make a long story short, I have now learned that “Query” is “Requête” in French, “Abfrage” in German, and has other localized names in other languages. And now that we know? We have retrained our tool to deal with this challenge.
What this means to you if you are a user of a non-English version of Excel is – while we are not quite ready to say we fully support all non-English versions of Excel – we do believe Monkey Tools should work no matter the localization of your Excel install. (We do still recommend caution here. Until we say we OFFICIALLY support all languages, please do try the Trial version before you buy, and let us know if Monkey Tools has any issues reading your queries!)
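For the curious, the fix amounts to recognizing a localized prefix in the workbook's connection names. Here's a rough sketch in Python (the add-in itself is not written in Python, the `Query - ` naming pattern is our reading of Excel's behaviour, and the prefix list covers only the languages mentioned above):

```python
# Localized "Query" prefixes that Excel uses when naming Power Query
# workbook connections, e.g. "Query - Sales". Hypothetical shortlist:
# only the languages mentioned in this post; the real list is longer.
QUERY_PREFIXES = ("Query", "Requête", "Abfrage")

def query_name_from_connection(connection_name):
    """Strip a localized 'Query - ' prefix and return the query name,
    or None if the connection doesn't look like a Power Query one."""
    for prefix in QUERY_PREFIXES:
        marker = prefix + " - "
        if connection_name.startswith(marker):
            return connection_name[len(marker):]
    return None
```

The design lesson is the one in the paragraph above: never assume a generated name is a constant just because it happens to be English on your machine.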
Another question we received from time to time was “How do I give you feedback?” or “How do I report a bug?” It came up often enough that we realized we had done a poor job of giving you a mechanism to do so. So to that end, we have added the following to the Monkey Tools Help menu:
Log a Bug
QuerySleuth Indenter Issues (for issues specific to QuerySleuth indentation)
Each takes you to a form that you can fill out to get in contact with the dev team. And yes, we are open to hearing your suggestions!
Various Other Bug Fixes
Of course, no release would be complete without a few bug fixes. There were a half dozen fixes that were included in the various November updates (plus another half dozen published last night.) Each was minor, and not really worth mentioning on their own, but rest assured that we are trying to fix bugs whenever we find them.
What is the Current Version?
To make sure you have all of the current features, go to Monkey Tools -> Options. If you are running a version that is less than 1.0.7640.41496, then click Check for Updates Now to update.
And if you don’t have Monkey Tools installed yet… what are you waiting for? You can try the pro features for free for two weeks, and there are a ton of useful tools even if you don’t elect to purchase a pro license. Click here to get your copy of Monkey Tools. And hey… if you decide to upgrade to an Annual Pro license today, you can get 20% off with the code BF20MONKEYTOOLS.
So… What’s Next?
We are working on something cool that will help Excel modelers get started quickly. And if you want to be one of the first to hear about it and see it in action you should attend the inaugural KSA Excel Power Platform meetup, as I’ll be demoing this new feature.
We’re super excited to announce that we have given the Monkey Tools calendar creator feature an upgrade. In fact, it is so much of a power up, that this feature has graduated into its own full grown monkey! We call it the Calendar Monkey.
In our initial version, the Calendar Creator would create the queries necessary to load a single column calendar of unique dates into the data model and add PeriodID columns for 364-day calendars like 445 and 13 fiscal periods.
The Calendar Monkey adds a couple of pretty powerful features to this original mix. Let’s walk through the experience quickly to see what I mean.
Step 1: Define the Calendar Boundaries
Not much has really changed here. You can still:
Pick your calendar type (12 month, 13 month, 445, 454, 544), and define your custom year end (including a different month for 12 month calendars.)
Define a name for your calendar table query
Choose the load destination
Pick any valid date columns for the Start and End date of your data
These settings allow our monkey to build the calendar to dynamically span the entire range of your data on every refresh.
The only real difference here is that we’ve added a checkbox and a Next button. (Notice that Create is still available, if you just want to accept the monkey’s default choices for the rest of the options you’re about to see.)
Step 2: Choose Calendar Columns
One of the things that always bothered us about our original version is that it created the calendar’s Date column, but then left it up to you to add the different date formats that you wanted. So we decided to improve that, as you can see here:
Our monkey pre-selects the most common date formats, but if you ever check/uncheck one, it will learn your preferred defaults and provide those next time you go to inject a calendar. The columns shown dynamically react to your choices in Step 1 as well… if you use a Dec 31 year end for a 12 month calendar, you only need the first two columns – so that’s what the Calendar Monkey will show you. If you have a custom year end (like Jun 30 or Sep 30), you may also want Fiscal columns, so the monkey provides those as options too. And if you work with a 364-day calendar like a 445 variant or a 13 fiscal periods calendar… there is a final column of PeriodIDs that shows up in that blank spot too.
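To see why a custom year end changes which columns are useful: fiscal year and fiscal month have to be derived from the date rather than read off it. A minimal sketch of that derivation (illustrative only; the Calendar Monkey generates M inside Excel, not Python, and labelling a fiscal year by the calendar year in which it ends is just one common convention):

```python
from datetime import date

YEAR_END_MONTH = 6  # e.g. a Jun 30 fiscal year end

def fiscal_year(d, year_end_month=YEAR_END_MONTH):
    """Fiscal year, labelled by the calendar year in which it ends
    (one common convention; some shops label by the start year)."""
    return d.year + (1 if d.month > year_end_month else 0)

def fiscal_month(d, year_end_month=YEAR_END_MONTH):
    """Month number within the fiscal year (1..12); July is month 1
    when the year ends in June."""
    return (d.month - year_end_month - 1) % 12 + 1
```

With a Jun 30 year end, 1 Jul 2020 lands in fiscal year 2021 as fiscal month 1, which is exactly the kind of Fiscal column the monkey offers for custom year ends.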
Step 3: Adding Relationships
Depending on your choices in Step 1, the Calendar Monkey will determine whether you will be presented with this page or it will be skipped. If you choose to load your calendar to the Data Model (or Data Model & Worksheet), the monkey will list every date column loaded to the data model. The purpose of this is simple: to let you decide if your new calendar table should be linked to any of those columns listed.
For any columns you check, Calendar Monkey will do its best to create those relationships after loading your table to the data model. (There are some things that can prevent the monkey from accomplishing this, such as creating an inactive relationship.)
Step 4: Creating the Calendar
Even though the Calendar Monkey has a lot of work to do when you click create, it also knows the value of good feedback. For this reason, it will update you as to the progress as it completes all the individual tasks, as you can see here:
Now unfortunately, there are a couple of things that the monkey is unable to do (thanks to a lack of security clearance with the Excel data model). Rather than just ignore these essential tasks, however, it will tell you what needs to be done, with the exact steps to do so.
As a bit of a pro-tip here… you don’t actually need to close the Calendar Monkey window to take action on those steps… so keep it open until you’ve hopped into the data model and made the advised changes! Of course, if you understand what’s happening and know the steps you’ll need to take, there is also an option to automatically close the summary screen upon completion.
What the Calendar Monkey cannot do
There are three things that the Calendar Monkey can’t do at this time:
Automatically hide the keys on the “many” side of the relationship (the foreign keys)
Automatically create the sorting hierarchies to sort Month Name by Month Number and Day Name by Weekday Number
Create the calendar in a Power BI file
Honestly, while we joked earlier that the Calendar Monkey doesn’t have security clearance, the reality is that a limitation in the data model’s extensibility model prevents us from solving the first two items. The third? It’s on our backlog.
So how do you get the new Calendar Monkey?
You need Monkey Tools version 1.0.7493.29574 or higher, and you'll have the Calendar Monkey ready to do your bidding.
We’re pleased to announce the first Monkey Tools update is now available for download! This one contains a new feature, some new logic and an update to one of our data connectors. Read on for more information!
I can’t believe it’s been a month since our initial release, but here we are. Since that release, a couple of notable things happened:
We published a blog post on PivotSleuth and Measure Tables. This was a learning experience for me, as I discovered something new about Measure Tables. I always knew that you could mark your Fact tables as Measure tables by hiding all the unaggregated columns, but I didn’t realize that this also means that disconnected Measure tables will then suppress the “Relationships between tables may be needed” error.
One of my friends hit me with an interesting curve ball: he turned on “Store datasets using enhanced metadata format” in the Power BI preview features. And as it turned out, some of the methods we’d been using to analyze the Power BI model disappear when you do that.
Both of these have led to some improvements in the software, which we’re proud to say are finally available to you.
What’s new in version 1.0.7418.29970?
There are three major things that are new in this version:
A New Power BI Connector
This was actually a huge amount of work for us, as not only did we have to build a new connector to read the new Power BI file format, but we also had to analyze the file as it was opening to see if it was in the classic format or not. And to make it harder, if you have the Enhanced Metadata Format turned on, a legacy file requires using our initial connector, but any refresh must be done with the new connector. Fun times for a developer and, as you can imagine, it took us a bit of effort to pull it off. It’s actually this piece that held us back on the other features, as the connector MUST work and it impacted everything.
While most of this work is invisible to the end user, there are two things that we hope you do notice:
Connecting to the new file format is much faster than using the legacy format.
We also took the time to remove the reliance on configuring the version of Power BI that launches for you by default. We now just launch Power BI using your default connector, then bind to it, no matter how many versions you have on your machine, or which they are.
Updated Functionality in the PivotSleuth
As we learned in the PivotSleuth and Measure Tables blog post, hiding all unaggregated columns on a disconnected measure table flags the table as an official Measure table and suppresses the “Relationships between tables may be needed” message. For that reason, PivotSleuth needed to recognize that this is acceptable.
In other words, when the Measures table is a properly formatted disconnected table, it needed to (and now does) show that there are no issues:
But when that disconnected Measures table has a visible column, PivotSleuth should not only show the issues, but also tell you what needs to be done to fix them. And now it does:
New QueryMonkey Feature: Add Measure Table
As mentioned before, our philosophy is “Build better, faster”. For that reason we’ve added a new QueryMonkey feature for you: Add Measure Table.
This feature will prompt you for a name for the table…
And then create a new empty table in the data model for you:
Unfortunately, the Excel team hasn’t given us a way to programmatically hide columns in the data model (we could REALLY use that ability), so we can’t take that last critical step for you: Hiding the Measures column to prevent the "Relationships between tables may be needed" message. But never fear, we do tell you exactly what needs to be done:
So while we typically store our measures on the Fact tables, rather than in a disconnected Measure table, we totally get that a lot of people like this approach. Hopefully this makes it a lot easier for you!
How do I get the Monkey Tools Update?
The answer to this depends on whether or not you’ve installed Monkey Tools yet.
Last week I got an email from one of our readers, with some feedback related to how Monkey Tools’ PivotSleuth works with Measure tables in Power Pivot.
Best practices is to set up a disconnected table to house DAX measures. Because of this - all of the fields listed in the Pivot Sleuth end up being RED. So it is not really useful. Is there a way around this? Other than incorporating your DAX measures in an actual table?
And is this the reason for the " You might need a relationship" annoyance from Excel?
Personally, I challenge the term “best practices” here, as I think it’s borne a bit out of history, and no longer relevant. But more on that below...
My favourite part about this email was the last line, as this specific problem is actually one of the main reasons we wrote this feature. Let’s take a look and see what PivotSleuth says about this…
The model framework I’m using looks like this:
As you can see, we have a separate _Measures table in this instance, where all the measures are collected, rather than storing the measures on the Sales & Budget tables. They’re not complex at all, in fact, they’re about the easiest measures you could create:
Sales $: =SUM( Sales[Amount] )
Budget $: =SUM( Budgets[Amount] )
And when you drag them on to a PivotTable, they work nicely:
So, if everything is fine, what’s the issue?
It’s all about this nagging little message:
Why is it here?
PivotSleuth and Measure Tables
When you launch PivotSleuth against this PivotTable, you see the following:
Why are all the fields red? The answer is shown when you select one of the measures: there is no relationship between the _Measures table and the Sales, Categories, or Calendar tables.
So, what happens when you store these measures on the Sales and Budget tables? The irritating message goes away:
The practice of storing measures on a separate table as a recommended practice was born out of Power Pivot instability, back when things crashed a ton. Sometimes the fix would require removing the table from the data model and re-creating it, at which point you’d lose any measures or relationships built on those tables. It was frustrating and annoying, and led people to keep their measures in a separate table to protect themselves from having to redo that work. The challenge, however, was that it caused a “Relationships May be Needed” message every single time you used a measure. And there was no way to make that go away.
Since 2016, Microsoft has focused on fixing bugs related to Power Pivot, with many of the fixes making their way back into the Excel 2016 product, even if they were fixed after Excel 2019 was released. While the bugs are certainly not all gone, it’s now unusual to see issues that force tables to be deleted and rebuilt. To me, this reason for separating your DAX has basically become a non-issue, but the habit still remains.
Some people also argue that this gives you a central place to go to get your measures. I’d argue that the list can become overwhelming when all your measures are in one folder without any categorization. (Unlike Power BI, we can’t group measures into folders.)
I far prefer to put my measures on the appropriate tables, then hide all the unaggregated columns on the table. This offers three benefits:
It groups the measures by table, making them easier to find. (Sure, you can use the search function, if you like. I’m just saying those tables act like folders.)
It means the “Relationships May be Needed” warning only shows up when a measure can’t be cross filtered by a natural relationship in the model.
It changes the icon of the table to the sigma icon (Σ), which is synonymous with measures.
The Benefits of Hidden Columns
Let’s take a quick look at the benefits of hiding every unaggregated column in the data model, essentially leaving only measures visible on your table. This is super easy to do: you just hop into the Power Pivot window, right click every natural column and choose “Hide from Client Tools”. (I personally prefer to do this in Diagram view, but it works in table view as well.) The results are pretty impactful when you look at the PivotTable field list:
Notice that the tables are now flagged as “Measure Tables”. They inherit the sigma icon, as well as move to the top of the list. This is the primary reason I prefer to work this way… the measures are grouped logically based on their tables. Essentially, they act like Measure Folders.
Full Disclosure on PivotSleuth
As I was writing up this post, I discovered something that I hadn’t tripped on yet. Look what happens when I hide all unaggregated columns on a Measure table:
Notice that there is no “Relationships May be Needed” message. I didn’t realize that this flag would change that, and as of today Monkey Tools doesn’t check for it, so it still shows everything red. So, it looks like we need to update our logic a bit to add some more functionality. 🙂
Hopefully we both learned something here!
Personally, I’m sticking with the individual table approach, storing my measures as close as possible to the tables they summarize. I like that it categorizes my measures. But ultimately, it’s cool that we have the ability to work the way we want to work.
And we’ll look at modifying Monkey Tools to avoid showing red when – and only when – you’ve hidden every unaggregated column on your measure table.
PS: Did you know that you can’t call a table “Measures”? It’s a reserved word, so Excel will give you a table called “A” instead. That’s why I went with “_Measures”.
We are super excited to announce that we’ve (at last) released the first version of our Monkey Tools software! Ken has been working on this software on and off for the better part of 8 years now. But after showing it to a friend in Wellington last year, we decided it was finally time to get serious. We hired a full-time developer last summer and are finally ready to go live with the initial release!
What is Monkey Tools?
Monkey Tools is an Excel add-in (supported in Excel 2016 and higher) which provides tools for you - as a business intelligence author/reviewer - to:
Build models more rapidly
Follow recommended practices
Document your work
Audit files that you receive
It is targeted at modelers and analysts who work primarily in Excel, but also push their models into Power BI. (Our philosophy at Excelguru is to model in Excel first, then export to Power BI for reporting, sharing and security control.)
Oh, and super important… it installs on your system without requiring admin rights on your PC. How cool is that?
What does Monkey Tools actually do?
Well… lots! We’ve collected all the cool features under some themed buttons including:
QueryMonkey (for inserting new queries)
DestinationSleuth (to provide information on query load destinations)
QuerySleuth (helping understand your actual queries)
TimeSleuth (to benchmark query load times)
PivotSleuth (helping you diagnose Pivot Table field issues)
DAXSleuth (tools especially for working with DAX measures)
ModelSleuth (reporting on the properties of your queries and data model)
Cute names, right? The Monkey builds things, and the Sleuths investigate things. Here’s a high-level view of what they each contain.
Query Monkey gives you the ability to insert key queries like:
A “From Folder” setup that works with local and/or SharePoint hosted files
Dynamic calendar tables based on your data (for custom calendars, it even provides the option to insert the "periodicity" columns for Rob Collie's GFITW DAX pattern!)
DestinationSleuth is, today, simply a viewer to visually indicate the load destinations of your tables (better than just “Connection Only” or “x Rows Loaded”).
QuerySleuth is a single form, packed with information and features such as:
A dependency/precedent tree view layout
Full colour display based on load destination
Colourful and indented M code
The ability to modify the M code and write it back to the editor WITHOUT locking your Excel user interface!
TimeSleuth allows you to time query execution in Excel, and even chart comparisons between queries with or without privacy settings enabled. If you’ve ever wondered which query is slowing down your workbook, or wanted to time-test two different approaches, you may find this helpful!
Have you ever seen that irritating “relationships may be needed” error when building a Power Pivot based Pivot Table, and wondered why? Pivot Sleuth can tell you…
See the real, fully qualified names of the fields used in your Pivot Tables
Highlight potential or current issues in Pivot Table configurations
Debug cross filtering issues, “relationships may be needed” errors and errors where grand totals are returned for all rows on the Pivot Table
We believe that measure dependencies are just as important as query dependencies, and this is the reason we built the DAXSleuth. This form:
Displays a dependency/precedent treeview of your DAX measures
Provides a full colour display of Implicit and Explicit measures (with or without children), as well as Calculated Columns
Shows your DAX measures with colour highlighting in an indented format
Allows you to Indent, Un-Indent, Flatten, Duplicate and even Update measures without leaving the DAXSleuth
Exposes all locations a DAX Measure has been used (Pivot Tables, Pivot Charts, OLAP Formulae and Named Ranges), and even allows you to select those objects right from the DAX Sleuth!
Have you ever had to provide documentation for your model? Or picked up a model from someone else and had to review it? The ModelSleuth provides reports and utilities such as:
A full model summary report showing key statistics about your tables, relationships, columns, measures and queries. (Trial and Free licenses are limited to every other record in this report.)
A model memory usage report, complete with how much memory is recoverable (for Excel based data models).
An unused columns report (for Excel based data models).
A DMV Explorer (for those who want to build their own reports).
Monkey Tools Supported File Types
The Monkey Tools add-in is compatible with Excel 2016 or higher, and can read from:
Power BI Desktop files
Backup files (that you can export from the Monkey Tools software)
Will Monkey Tools get updates?
Oh yes, we have plans for many more features!
Our intended model is to deliver features (and bug fixes) as we develop them. That means that there could be periods with no updates as we work on something big, or periods with multiple updates delivered in a single week. We know that some people love frequent updates and some people don’t, so we let you control how often you get them:
The key thing to recognize here is that we are not holding new features for a vNext. They’ll be delivered when they’re ready.
Can I try Monkey Tools before I buy it?
Ken did not become or remain a Microsoft MVP without contributing a large portion of tools and help to the community for free, and that won’t change. Having said that, we’re paying a developer to work on this product full time and need to recoup those costs. For that reason, we will always have both a Free version, as well as a Pro version.
Naturally, we want you to look at it, as we're convinced you'll like it. That's why we have a two-week trial that provides access to almost the entire feature set. Once your trial expires, your license will automatically revert to a free license. You’ll still get fixes and new features; they’ll just render in free mode (without colour, without field advice, etc.). We do believe that you’ll still find the tool useful, just maybe not as useful as with a Pro license.
Ready to learn more about pricing options and download the free trial? Click here!
I’m super excited to be presenting a Data Insights 2 Day Master Class in Wellington, NZ with my good friend Matt Allington. This is the first time we’ll be working together to bring our unique strengths to our participants in a joint session format, and it’s going to be AWESOME!
How is the event going to work?
We think you’ll love this. We’re going to divide our group in two. You’ll get one full day with me on Dimensional Modeling, and one full day with Matt focused on the DAX formula language. These two components are essential to understand when you want to build truly dynamic, scalable and stable data models, and we're going to cover both in detail.
What is covered in the Dimensional Modeling day?
Ken will be looking deeply at how to structure your data for a successful Excel/Power BI data model. You’ll learn how your data should be shaped, what the data model expects in its tables, and a variety of techniques and patterns to work around common join problems. Our goal here is very simple: to teach you everything you need to lay the foundation for a data model that will stand the test of time.
But not only will you learn practical, hands-on techniques to lay this groundwork, you’ll learn the key terminology at play. By the time you leave this session you’ll be able to identify things like ‘facts’, ‘dimensions’, ‘relationships’, ‘schemas’, ‘slowly changing dimensions’ and much more. Armed with this knowledge, you’ll be able not only to design your own models properly, but also to understand other materials you reference during your career.
As you might expect from one of the world’s leading voices on Power Query, there’s going to be a heavy focus on Power Query in this course. But it's Power Query with a purpose: to feed a Power Pivot Data Model.
What is covered in the DAX Formula day?
Matt will take you into the world of DAX formulas, exploring how this incredible language can be used to summarize virtually any statistic you want to know. He’s one of the world’s experts in the DAX language and will teach you not only what you SHOULD do with DAX, but what you SHOULDN’T.
When Is This?
Soon! It’s going to be hosted in Wellington, NZ on Feb 24 and 25, 2020. But the good news is that there are still seats available, and we’d LOVE to see you there with us.
On this blog, I showcase a lot of different techniques for manipulating and reshaping data. For anyone that follows the blog, you already know this, and you know it's a pretty important topic to me. But the thing we shouldn't lose sight of is WHY we do this. It's to drive analytics. I'm fairly convinced that the majority of the loyal readers here already know this. Thus, I wanted to ask your opinion on something...
How do you design your data model?
What I'm specifically interested in is how you approach designing the Fact and Dimension tables you use for your Power Pivot model. And I'm not specifically talking about Power Query here. We all know you should be using what you learned from our recently relaunched Power Query Academy to do the technical parts. 😉
What I'm more interested in is the thought process you go through before you get to the technical bit of doing the data reshaping.
If you read books on setting up a data model, you'll probably be told that you need to do the following four steps:
Identify the business process
Determine the grain of the model
Design your Dimension tables
Design the Fact tables
So if you're asked "how do you design your data model", do these steps resonate with you, and why?
Do you consciously sit down, and work through each of these steps in order? I suspect that many self-service BI analysts skip the first step entirely as they are implicitly familiar with their business process. (As a consultant, I ask a lot of questions in this area to try and understand this before building anything.)
Do you design the reports on paper, then work backwards to the data you'll need, go find it and reshape it? Or do you go the other way, trying to collect and reshape the data, then build reports once you think you have what you need?
Do you explicitly define the model grain? And if you do, what does that mean to you? Is it restricted to "I want transactions on a monthly/daily/hourly basis"? Or do you go deeper, like "I want transactions on a daily basis and want to break them down by customer, region and product"?
Why the question?
There are actually two reasons why I'm asking this question:
Reason 1 is that I think healthy discussion makes all of us better. I'd like to hear your thoughts on this as I'm probably going to learn something that I haven't discovered in my own learning journey.
Reason 2 is that my whole business is around teaching people how to do these things, and I'm always looking to make things clearer. The more opinions I hear (even if they contrast with each other), the more I can help people understand this topic.
So sound off, please! We'd all love to hear how you approach the task of building a data model.
One of the things that I find a bit misunderstood in Power Query is whether or not Data Types matter. I mean, I’m sure everyone agrees to some extent that they do, but do you know just how much they matter, and why?
Over the weekend, I received an email from one of our loyal students which read, in part:
I am going under the assumption that in Power BI nothing is free. Steps, calculated columns, measures and so on could be cheap but they are never free. It is with this premise in mind that I pose the following theory.
Users should sparingly use Change Type in Power Query. It is fine to Change Type when converting Date/Time to Date, Decimal Number to Whole Number, and others that actually change the value. It is a waste of resources to Change Type from Decimal or Whole Number to Currency. Even if you want the column to look like Currency, you can apply that format in the Data Model view and save one Power Query step.
On the face of it, this theory seems reasonable. After all, adding additional steps is bound to add some overhead to the Power Query process in most cases. And let’s be honest: in small models, it may make no difference to you at all. But when things get bigger…
Data Types vs Formats
To understand why Data Types matter, we need to get something very clear right off the bat: Data Types and Formats are not the same thing. Data Types dictate what kind of data you have and determine how much memory is allocated to store a value. Formatting, on the other hand, tells you how you want the values to appear. To see this in practice, have a look at the following values in Power Query, where the Data Type has been set to Currency:
Notice that the only place you see a $ sign is in the header. And see how the decimal numbers do not line up? I can tell you from many years of teaching accountants that this drives them bonkers. That 1.7 needs to be 1.70! But you don’t fix this here; you do that in the Excel worksheet, Power Pivot model or Power BI visual.
The key to remember here:
In Power Query, you define Data TYPES
In Power Pivot, Excel or Power BI, you define Data FORMATTING
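To put the distinction in code terms, here's a minimal sketch in plain Python (nothing to do with Power Query itself): the value that is stored never changes; only its presentation does.

```python
# A sketch, not Power Query: the stored value and the displayed
# string are two different things.
from decimal import Decimal

value = Decimal("1.7")        # the data TYPE governs what is stored
display = f"${value:,.2f}"    # the FORMAT governs how it appears

print(value)    # 1.7   -- still the raw value
print(display)  # $1.70 -- dressed up for the reader
```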
Excel’s Data Types
In the classic days of Excel, we only had four data types that we had to be worried about. Those were:
(Although represented by textual patterns like #REF!, #N/A, #NAME?, and the like, errors actually count as a distinct data type.) Today it gets a bit more complicated with the Stock and Geography data types, as I blogged about here, but back then anything else was just a number formatted with a number formatting string. Examples of these include:
Date: format it as a number, and you’ll see the number of days since Jan 1, 1900.
Time: a little trickier, but this is just a fraction of a day where .25 is 6:00 AM, .5 is 12:00 PM and .75 is 6:00 PM. And if all you have is the time, and you format it as a DateTime, you get that time on Jan 1, 1900.
True/False (Boolean): while this shows in the cell as TRUE/FALSE, which looks like text, if you multiply them by 1 you’ll see that TRUE equates to 1, where FALSE equates to 0.
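The time-as-a-fraction idea is easy to demonstrate. Here's a quick Python sketch (illustrative only; Excel does this internally) that converts a fraction of a day back into a clock time:

```python
# Sketch: a "time" is just a fraction of a 24-hour day.
def fraction_to_time(fraction: float) -> str:
    # Convert the day-fraction into total minutes, then split into h:mm
    total_minutes = round(fraction * 24 * 60)
    hours, minutes = divmod(total_minutes, 60)
    suffix = "AM" if hours < 12 else "PM"
    hours12 = hours % 12 or 12  # 0 and 12 both display as 12
    return f"{hours12}:{minutes:02d} {suffix}"

print(fraction_to_time(0.25))  # 6:00 AM
print(fraction_to_time(0.5))   # 12:00 PM
print(fraction_to_time(0.75))  # 6:00 PM
```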
But that was Excel, where data types and number formats were the same thing. That’s not Power Query.
Power Query’s Data Types
Unlike Excel, which has a pretty short list of data types, Power Query seemingly has a ton:
Numeric Data Types:
Currency (Fixed Decimal)
DateTime Data Types:
Others (Binary, Tables, Records, Lists, and more)
Any (the dangerous “undefined” type which allows the application to determine the correct data type)
The key to recognize is that each of these data types is DISTINCT, meaning that each of these is different in some way from every other. (While we won’t get into it in this post, unlike Excel which implicitly converts data from one type to another, Power Query requires explicit type conversion via use of functions like Number.From() or Date.ToText(), which can be frustrating at times!)
For the purpose of this post, however, I want to focus on the first three numeric types: Decimal Number, Currency and Whole Number, and ask the question: Does the Data Type matter?
Let’s have a look at an example. For the illustration, I set up a Power Query chain that looks like this:
The Data table had 100,000 randomly generated [Sales] records that vary between 1.27317262341058 and 100000.017761279, and randomly generated [Customers] that vary between 1 and 1000.
The only change I made in the Whole, Currency and Decimal types was to set the Data Type for the [Sales] column accordingly. In the Customers table, I removed the [Sales] column and duplicates from the Customers column.
I then created 3 quick measures to sum the column up, and dropped them on a Pivot:
I don’t think it would surprise anyone here that the [Whole Sum] is slightly different than the [Currency Sum] and [Decimal Sum]. After all, the numbers were rounded at the source before being added up. And so far, we can see that the [Currency Sum] and [Decimal Sum] look pretty much the same. At least until we expand them a bit:
The only thing that should be a surprise here is that currency only holds up to four decimals, not two as most people expect. This actually makes sense when you start thinking about foreign exchange transactions, and how they are always carried to four decimal places.
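That four-decimal behaviour is easy to mimic outside of Power Query. Here's a Python sketch using the standard decimal module (the half-up rounding mode is my assumption for illustration; the engine's exact rounding behaviour may differ):

```python
# Sketch: Currency (Fixed Decimal) behaves like a value fixed at
# exactly four decimal places.
from decimal import Decimal, ROUND_HALF_UP

def as_currency(value: str) -> Decimal:
    # Quantize to four decimal places, the precision Currency carries
    return Decimal(value).quantize(Decimal("0.0001"), rounding=ROUND_HALF_UP)

print(as_currency("1.27317262341058"))  # 1.2732
print(as_currency("16.666666666"))      # 16.6667
```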
But is that it? Is a Data Type just about rounding? As it turns out the answer to that is both yes and no.
Testing the Model Memory
The next step here was to test the model memory and see how Power Pivot is storing the data. To that end, here’s a little table that shows exactly that:
Before we get into this, I want to call out something important here. The Data Type that is showing has been read from Power Pivot. Notice that Whole Number shows as such, consistent with Power Query. But the Currency and Decimal tables both show Decimal. As it turns out, Power Pivot doesn’t make a distinction between these two data types. However, the distinction between these two Data Types matters to you anyway, as I’ll explain.
So, what does that all mean?
In the Whole Number table, I had rounded off all the decimals. This left 63,815 unique values. And because Power Pivot compresses based on unique values, it deals with this column very well, resulting in a total of 260.54 KB to store these values.
In the Currency table, I effectively rounded all the values off to four decimal places. This left 99,996 unique values in the table (only 4 values were repeats). Despite this, Power Pivot did a good job of compressing the values in memory, resulting in 390.75 KB to store them.
Then we get to the Decimal column. There are only 4 more unique values than in the Currency column, but the memory takes a colossal 5,234.47 KB to store the values, vs the 390.75 KB of Currency. What the heck is going on?
The answer lies in the fact that Power Pivot has to carry all of those decimal places, and once it does, it can flip to storing data using Hash Encoded memory. That’s a deep topic for another time but suffice it to say that this is a bad thing, as Value encoding is much more efficient. (Hash is generally used for Text, and Values for… you guessed it… values!)
Interestingly, if you round the Decimal Number to 5 decimals you end up with 99,999 unique values in the column and a very minor change to the memory used. But if you round it to 4 decimals, the memory of the Decimals column compresses the same as Currency, and the memory need drops to the same 390.75 KB.
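You can get a feel for this cardinality effect with a few lines of Python. This sketch (plain Python, not the Vertipaq engine; the exact counts depend on the random seed) mirrors the experiment above: rounding to four decimals leaves the column almost entirely unique, while rounding to whole numbers collapses it dramatically:

```python
# Sketch: compression is driven by unique values, so precision
# directly controls a column's cardinality.
import random

random.seed(42)  # fixed seed for repeatability
sales = [random.uniform(1, 100_000) for _ in range(100_000)]

raw_unique = len(set(sales))                          # full float precision
currency_unique = len({round(s, 4) for s in sales})   # 4 decimals, like Currency
whole_unique = len({round(s) for s in sales})         # like Whole Number

# Expect: ~100,000 raw, nearly as many at 4 decimals,
# but far fewer once rounded to whole numbers.
print(raw_unique, currency_unique, whole_unique)
```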
Why Model Memory Matters
And this is the final part of the equation to me. Not all time is created equal. My users will accept a 2-minute refresh of the data model. They might say it takes time, but +/-30 seconds in a refresh isn’t anything that they’ll get overly concerned with. They’ll click Refresh, go grab a coffee, then come back to work with the file.
But if they click a slicer and it takes 10 seconds to redraw the Pivot Table or Power BI visuals? Watch out! They’ll claim the model is too slow, ineffective, wrong and useless. I’d MUCH rather push resource consumption into the initial refresh in order to build a memory-efficient model that performs well when being used for analysis.
Wait, what happened to the Data Types?
To be fair, Power Pivot’s compression mechanism is more about unique values and the length of precision than it is about Data Types. But it’s up to you to choose the correct Data Type to future-proof your model and make sure that the compression algorithms can be applied.
But due to the way Excel has always worked, the way the Data Types are named, and the fact that most modellers don’t have a clear understanding of Formatting vs Data Types… users are more likely to pick Decimal over Currency. I mean, why would I ever format my units as Currency? (In Power BI this is called a Fixed Decimal, although it still shows with the currency symbol.)
We need to recognize that trailing decimals really do happen in business. Let’s look at that Units column for a second. Naturally we never sell a partial unit… or do we? I’ll give you 3 examples of things I’ve seen in the Food and Beverage industry that forever changed my opinion on this:
We had a couple of customers who insisted that they be able to purchase half a muffin. I’m not even kidding here. The killer for me was that it only showed up every 10,000 rows of transactions or so, meaning that the column often got set to Whole Number incorrectly.
The salesperson agrees to provide those 300 units for $5,000. No big deal, except that they monkey with the price field in your database to make it work, and suddenly you have a sales price of $16.666667. Fortunately, this one usually gets rounded via setting it to a Currency Data Type, as that’s logical. But what if you set it to Decimal or left it undefined?
Things go the other way and the customer agrees to a package price of $5,000 for something that usually carries a price of 17.00 per unit. It gets rung into the system with a reverse engineered quantity of 294.1176470588235 to get it exactly right.
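That third scenario is just arithmetic. A few lines of Python (purely illustrative; the variable names are made up) show where the long-tail quantity comes from:

```python
# Illustrative only: how a package deal back-calculates into a
# quantity with a long decimal tail.
unit_price = 17.00
package_price = 5_000.00

# The system reverse engineers a quantity so price x quantity is exact
implied_quantity = package_price / unit_price

print(implied_quantity)            # roughly 294.1176470588235
print(round(implied_quantity, 4))  # 294.1176
```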
The last is the real kicker as the memory suddenly flips from Value to Hash, the RAM needed to refresh the data expands exponentially and brings your model to its knees. And now every slicer click has gone from 1 second to update your visuals to the better part of 10 seconds. And your audience is screaming that the model is “unusable”.
My Recommended Practice
The final step in every query I build which loads to a destination (an Excel Table or the Data Model) is to correctly define my Data Types. This does two things for me:
It ensures that I never accidentally load a column with an undefined Data Type. (In the case of Dates, they load to Excel as numbers, and to the Data Model as text!)
It ensures that I’m cutting down to the minimum number of decimals I’ll ever need for my analysis.
Does it take more processing time? Maybe marginally. But does it future-proof my solution to keep it performant? Definitely. And it ensures the time to do so happens in the refresh, not when the model is being used for analysis.
Some Final Thoughts
The issues you saw here with memory also affect DateTimes in a big way, as they are quite similar to decimal numbers, where time is represented as the fraction of a day.
The article touches on some things from a much bigger topic: how to optimize a Power Pivot/Power BI data model. I actually teach a full-day course on this topic for CPABC, where we cover how the Vertipaq engine compresses its data, and why you want to avoid calculated columns and fall in love with measures. We make extensive use of Power Query in this course to reshape tables (applying correct data types along the way) so that they can be consumed by the data model in a more efficient manner. We dissect a poorly built model, test its memory, and then rebuild it in stages, seeing how our changes make an impact.
And if you’re looking for tools to help with this process… stay tuned to this blog. I’m hoping to have an announcement on that front within the next couple of months.