I was teaching a course on Power Query yesterday in which we imported a text file. Almost immediately, some of my users pointed out that their dates weren’t importing correctly, and we had to cover how to fix date errors right away.
And to underscore the importance of this… this morning I woke up to comments on one of my previous blog posts with similar issues. I figured it’s time to cover the easy way to fix date errors.
My data came from a text file, and was shown in the following format:
As you can see, the data is set up in the Month/Day/Year format. The issue for the user is that their system is set to a different format… in the case of the users in my class, their default Windows settings were set to Day/Month/Year, which is the Canadian date format. When they tried to convert the column from the “any” data type to a date, some dates came in with the day and month swapped and others returned errors.
The problem comes into play because many systems export in an MDY format, as they were programmed using US date standards. But with our operating system set to a different format, Power Query tries to interpret dates using the standard set there. So when importing a text file, it looks at 1/12/2000 and interprets it as Dec 1, 2000, not Jan 12, 2000.
Then it hits a date like 1/13/2000. Because there is no 13th month, it returns an error.
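To make that concrete outside of Power Query, here is a minimal Python sketch of the same ambiguity. It is not what Power Query does internally; the `parse_date` helper and the two format constants are names I made up for illustration, and the sample values are the ones from the paragraphs above.

```python
from datetime import datetime

# Hypothetical format constants for the two orders discussed above.
MDY = "%m/%d/%Y"  # Month/Day/Year, the US-style order used by the text file
DMY = "%d/%m/%Y"  # Day/Month/Year, the order the class PCs were set to


def parse_date(text, fmt):
    """Parse text with an explicit format; return None if it does not fit."""
    try:
        return datetime.strptime(text, fmt).date()
    except ValueError:
        return None


# The same text yields two different dates depending on the assumed order.
as_mdy = parse_date("1/12/2000", MDY)  # January 12, 2000
as_dmy = parse_date("1/12/2000", DMY)  # December 1, 2000

# Read as Day/Month/Year, this value asks for month 13, so parsing fails.
bad = parse_date("1/13/2000", DMY)     # None
```

The first value is the dangerous one: it converts without complaint, just to the wrong date. Only the second value fails loudly.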
One of the class attendees brought up an interesting point as I was explaining this… “I thought that dates were just a format on top of a serial number? So how can it get it wrong?” He was absolutely correct. But in this case we are importing text from another source into Excel (via Power Query)… so Power Query has to work out what the date serial number should be, and it does that based on the system settings. That’s where the issue hits us.
How to fix date errors
At first, you might be tempted to flip the date format in your Windows settings, but that won’t actually help you in the long run. In fact, in the worst case it may fix the issue for the current import and blow apart other solutions that you’ve built. So that’s really not a practical solution. What we need is a way to tell Power Query what the date settings are for THIS data source. Fortunately, there is a way to override the date format. In truth, this doesn’t so much fix date errors as prevent them from occurring in the first place.
What we do is select the column with our dates in it then:
- Right click the column
- Choose Change Type –> Using Locale
(Yeah, I know… this is hardly a term that Excel users are familiar with, but it allows you to force a different regional setting on the data source.)
You’ll then be prompted with a new dialog where you’ll choose the Date data type, then the Locale you want to use to read it:
The key here is to recognize WHICH locale your data format is emulating. There are hundreds of countries in this listing. My guess is that you’re probably going to pick either your own or English (United States) most of the time. In truth, when working with dates, the country is actually not the important part. The important part is that you pick a country where the MDY or DMY format is consistent with your data source.
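If you aren’t sure which order your file uses, the data itself often gives it away: any value with a number above 12 in one position rules out that position being the month. Here is a rough Python sketch of that check. The `possible_orders` helper is hypothetical, and it assumes slash-separated values with the year last; it is not a Power Query feature.

```python
def possible_orders(values):
    """Return the day/month orders ("MDY", "DMY") that fit every value."""
    orders = {"MDY", "DMY"}
    for text in values:
        # Assumes three slash-separated numbers with the year in last place.
        first, second, _year = (int(part) for part in text.split("/"))
        if first > 12:
            orders.discard("MDY")  # the first number cannot be a month
        if second > 12:
            orders.discard("DMY")  # the second number cannot be a month
    return orders


sample = ["1/12/2000", "1/13/2000", "2/1/2000"]
consistent = possible_orders(sample)  # only "MDY" survives, thanks to 1/13/2000
```

If both orders come back, the sample is ambiguous and you need more rows, or some knowledge of where the file came from, before picking a locale.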
I’ve been fortunate enough to have to deal with this issue very little in my career. I generally leave my system in a US English configuration and most of my imports follow the US date standard, so no issue. But I recognize this as a huge issue for Europeans, as well as any company that conducts business in multiple countries. In both cases this issue comes up over and over again.
There are two great things about using the “Change Type With Locale” feature:
The first is that it explicitly declares the source data format instead of relying on whatever the system default happens to be. This is REALLY handy for future proofing. In Canada, we typically end up with some users in the organization using Canadian standards and some who use US standards (we are a very confused country sometimes.) By specifically declaring the locale of the data, I know that this solution will continue to work even when I send it to someone who uses a different date standard on their PC. Why? Because it’s now defined for the DATA, not the SYSTEM.
The second great thing is that this is a DATA SOURCE SPECIFIC feature. I can set a different format for each data source used in my solution, allowing me to combine several data sets from different countries and still get it consistent.
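The same idea can be sketched in Python: attach the format to each source rather than to the machine doing the import. This is only an illustration of the principle, not of Power Query itself. The file names and the `SOURCE_FORMATS` and `load_dates` names are invented for the example.

```python
from datetime import datetime

# Hypothetical sources, each declared with the date order its file uses.
SOURCE_FORMATS = {
    "us_export.txt": "%m/%d/%Y",  # Month/Day/Year
    "uk_export.txt": "%d/%m/%Y",  # Day/Month/Year
}


def load_dates(source_name, raw_values):
    """Parse raw text dates using the format declared for that source."""
    fmt = SOURCE_FORMATS[source_name]
    return [datetime.strptime(value, fmt).date() for value in raw_values]


# Two files write January 13, 2000 differently, yet both parse to the same date.
combined = (
    load_dates("us_export.txt", ["1/13/2000"])
    + load_dates("uk_export.txt", ["13/1/2000"])
)
```

Because nothing here reads the regional settings of the PC, the result is the same no matter whose machine runs it.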
One thing that I struggled with is this. Back in the old text import tool for Excel, we had a nice feature that looked like this when we were setting data types:
This was fantastic, as I didn’t need to try and figure out which region the data came from; I simply chose the date format that I could see. While it’s great that we can now exhibit some national pride by choosing our country, that doesn’t always help. In the case of Canada, I’d bet that if I asked 5 different people what our official date format is, I’d get 5 different answers.
It would be SO handy if the Power Query team would add some indicator at the end of these options to indicate what the format is. That would be such an easy change to make here, and SOOOOO useful. I honestly don’t think I need to care if my setting is set to English Australia/UK/Ireland/Canada/Belize, as long as it interprets my dates in the correct order.
(I actually did email this thought to one of the program managers a few days ago. So hopefully one day we’ll see that change take place.)