Source Cells with Double Quotes Show "null" in Power Query

RoguePotatoes

New member
Joined
Jul 10, 2019
Messages
5
Reaction score
0
Points
0
Excel Version(s)
2019 (x64)
I am using Power Query to consolidate data from several thousand workbooks, and some of the cells includes double quotes at the end of the string (representing inches). Example of cell contents: 17"

When imported to power query (via Get from Workbook or Get from Folder) cells like this show up as null in the editor. I experimented and if I remove the double quotes in the source workbook the cell contents are imported fine. Even stranger is that if I add double quotes at the end of longer text strings (5 or 6 characters) in the source data those cells are imported correctly, including the double quotes.

Does anyone know why this may be and how I can get power query to import these cells properly?
 
I can't reproduce this with a few variants of a simple columnar list like 1", 2", 3"; can you share a file, or a portion thereof?

Norm
 
What would happen if you change the regional settings of Power Query to France or some other country that might read quotes or symbols differently. Maybe the import will successfully read these and afterwards you can change the regional settings back to what they were? Just a 2 cents thought?
 
I can't reproduce this with a few variants of a simple columnar list like 1", 2", 3"; can you share a file, or a portion thereof?


Norm


What would happen if you change the regional settings of Power Query to France or some other country that might read quotes or symbols differently. Maybe the import will successfully read these and afterwards you can change the regional settings back to what they were? Just a 2 cents thought?


Cell K37 is where I have the problem. The yellow cells were added to demonstrate my findings below.


First I tried changing the regional settings and although that didn't work, it got me thinking that it might be a version/compatibility issue since many of the files are in the Excel 97-2003 (.xls) file type. When I converted the file to the current .xlsm file type, the cell in question is imported properly.




The second and more peculiar thing I discovered while removing any sensitive data is that upon clearing cell K1 the issue is resolved.


My findings:
if cell K1 is populated with a number then any other cell in column K with trailing double-quotes doesn't import
if cell K1 is empty or populated with letters then column K is imported normally
cells in any other columns are imported normally regardless


What I've tried:
inserting a blank top row: no affect
copying cell K1 to K15 (with formatting) and clearing K1: column K imports normally
get data from table/range in the same workbook: imports normally




Any ideas? Thanks for your help so far!
 
Cell K37 is where I have the problem. The yellow cells were added to demonstrate my findings below.


First I tried changing the regional settings and although that didn't work, it got me thinking that it might be a version/compatibility issue since many of the files are in the Excel 97-2003 (.xls) file type. When I converted the file to the current .xlsm file type, the cell in question is imported properly.


The second and more peculiar thing I discovered while removing any sensitive data is that upon clearing cell K1 the issue is resolved.


My findings:
if cell K1 is populated with a number then any other cell in column K with trailing double-quotes doesn't import
if cell K1 is empty or populated with letters then column K is imported normally
cells in any other columns are imported normally regardless


What I've tried:
inserting a blank top row: no affect
copying cell K1 to K15 (with formatting) and clearing K1: column K imports normally
get data from table/range in the same workbook: imports normally


Any ideas? Thanks for your help so far!

Sorry my reply wasn't going thru so I had to contact the mods. Here is the workbook:

View attachment 5858.test1.xls
 
There is no query in the attached workbook that we can check. How are you loading the data into the query?
 
Wow, I just tried it and you are right.

Microsoft Access has the same issue. If the first row in the field is a number then the odbc connection assumes that the rest of the column is a number. Non numeric fields either error out or nulled (I can't remember which). I wonder if this behavior is a carry over of the ODBC behavior.

Unfortunately, I do not have a workaround for Access and can't imagine one for this.
 
I saved as .xlsx and then imported the file. All of the text below the 5858 now shows up. Does your environment allow you to write VBA script to open the .xls files and save as .xlsx?
 
Looks like you can trick PQ by having it use the first row as headers when opening the file (set the second argument in Excel.Workbook to true), then demoting the headers.

Code:
let
    Source = Excel.Workbook(File.Contents("C:\Users\Norm\Documents\Excel\ExcelGuru\5858.test1.xls"), true, true),
    Sheet2 = Source{[Name="Sheet1"]}[Data],
    #"Demoted Headers" = Table.DemoteHeaders(Sheet2)
in
    #"Demoted Headers"

Norm
 
Hey great trick Norm! It pays to read the documentation!
 
Looks like you can trick PQ by having it use the first row as headers when opening the file (set the second argument in Excel.Workbook to true), then demoting the headers.

Code:
let
    Source = Excel.Workbook(File.Contents("C:\Users\Norm\Documents\Excel\ExcelGuru\5858.test1.xls"), true, true),
    Sheet2 = Source{[Name="Sheet1"]}[Data],
    #"Demoted Headers" = Table.DemoteHeaders(Sheet2)
in
    #"Demoted Headers"

Norm

Hey this works perfectly, thanks!
 
Back
Top