There are two kinds of nuthin' in Power Query: null, and blank. I tripped on this issue the other day, and Ken thought it would be a good idea for a blog post.
Let's just call out the two types of nuthin' in Power Query:
- null is literally "no value" for any data type from text to table. In other words, the cell is completely empty.
- A blank also has looks like "no value", but it is equivalent to a cell formula of ="" in Excel. In other words, the cell holds a value that renders as blank.
Why is this important? It's because, inside Power Query (and indeed many programming languages) null and blank are not equal! And it turns out that nuthin' matters more than getting the right nuthin' in Power Query!
Burned by nuthin' in Power Query
Some time ago I built a set of Excel Power Query transforms which report on data extracted from a client system. For some time the client data has been extracted into Excel files, but there were some problems. My solution was to extract the client data files as CSVs instead.
What I found, unexpectedly, was that blank data values are treated differently by the PQ import functions depending on whether the file being imported is an Excel file or a CSV file!
Here's the rules:
- For an Excel import, blanks are converted to null - always.
- For CSVs, blanks are imported as blanks (not nulls). But when a field type is changed in a Power Query step the numeric and date column blanks are converted to nulls, and text column blanks remain as blanks (not nulls)
Seeing nuthin' in Power Query
What does it matter? Have a look at the following cases...
1. Excel data with blanks loaded into PQ. Blank cells are imported as null.
2. CSV data with blanks imported to Power Query: Blanks are read as blanks.
3. CSV data with blanks: dates and numbers change to null after type change. Text blanks remain blank
The impacts of nuthin' in Power Query
Nuthin' in Power Query could have caused me more issues here... my transforms used conditional columns to check for nulls in text columns, a test which failed when the value is blank.
As Ken pointed out, another key issue is that the Fill Down and Fill Up functions in Power Query are used to fill null values. But blanks are not nulls, so the functions do not work as intended in my case.
How to deal? Caveat Emptor! It appears Power Query treats blank data differently in the CSV import and Excel import functions.
So, if you are building transforms based on a consistent source of imported data, then there is little impact. But if you should need to change the type of data source like I did then beware. I had to do some detail testing, and I was looking to re-write some longstanding and well tested transforms.
Instead, for my specific situation, I added a value change step to the CSV import in PQ to replace nuthin' (blank) with the keyword null for the whole data set. That allowed me to maintain my downstream logic, but cost me some processing speed.
I hope this alerts PQ users to a potential issue in their transforms. If there are any alternate solutions to the problem let's see them in the comments.