PDA

View Full Version : CSV file - number of columns import limited by parameter...any way around this?



El Cid
2016-08-22, 10:10 PM
Here's the code:

= Csv.Document(File.Contents("U:\A-FINSTATS\2016\20160630\WIP\Contract Jobs P&L.CSV"),[Delimiter=",", Columns=228, Encoding=1252, QuoteStyle=QuoteStyle.None])

Problem is the file actually had 338 columns. Of course i can hand edit the parameter and run the query again, but that's not what I want to do because it means I have to open the CSV file in Excel and count the number of columns with data in them. Is there a way to make PQ look at the file and dynamically change the "Columns=" parameter?

Thanks.

El Cid
2016-08-22, 10:59 PM
I need to add this to it....when I try to merge the above file with a another file the Table.Expand stops at the old Column parameter as mentioned above. = Table.ExpandTableColumn(Source, "NewColumn", {"Column2", ..., "Column284"})

There doesn't seem a way to make this dynamic.

Pecoflyer
2016-08-23, 08:38 AM
Hi El Cid and welcome

in the future please do not quote entire posts unnecessarily. They clutter th thread and make it hard to read. Thx

El Cid
2016-08-23, 02:23 PM
No problem.:behindsofa:

Ken Puls
2016-08-23, 06:15 PM
Hi El Cid,


Is there a way to make PQ look at the file and dynamically change the "Columns=" parameter?

Yes, but it's not about dynamically changing as much as it is actually removing the part that hard codes the columns:


= Csv.Document(File.Contents("U:\A-FINSTATS\2016\20160630\WIP\Contract Jobs P&L.CSV"),[Delimiter=",", Encoding=1252, QuoteStyle=QuoteStyle.None])

The column locking part is totally unnecessary, and prevents new columns from coming in. Personally, I don't think this should happen by default.


when I try to merge the above file with a another file the Table.Expand stops at the old Column parameter as mentioned above

Yeah, for the expansion... that's a bit trickier. The reason this is happening is because Power Query uses names to expand, and those also get hard coded. To fix that, we need to dynamically pass the column names into the expansion steps, which we can do with the Table.ColumnNames() method. Having said that, there are some caveats here:


I have only tested this where the ID (merge key) columns are the first columns in the table.
Every column in table 2 (except the ID column) MUST have a unique name compared to the first table


So assume that I have the following structure:


Table
Columns


Animals
ID, Name


Inventory
ID, Quantity, SalesPrice



Assume also that I've set those up as connection only queries called Animals and Inventory. (This is important, as we will call the Inventory query later.)

Now I set up a new query to merge the two together based on the ID column. When I do, and I expand it, I end up with the following M code:

let Source = Table.NestedJoin(Animals,{"ID"},Inventory,{"ID"},"NewColumn",JoinKind.LeftOuter),
#"Expanded NewColumn" = Table.ExpandTableColumn(Source, "NewColumn", {"ID", "Quantity"}, {"ID.1", "Quantity"})
in
#"Expanded NewColumn"

The issue is in the Expanded NewColumn line, as the columns are hard coded. I can fix that by nesting Table.ColumnNames(Inventory) in place of the hard coded values BUT... if my ID columns are the same name (which they are here), then I'll get an error since it already exists. But we can get around that by skipping the first item in the list as follows:


let Source = Table.NestedJoin(Animals,{"ID"},Inventory,{"ID"},"NewColumn",JoinKind.LeftOuter),
#"Expanded NewColumn" = Table.ExpandTableColumn(Source, "NewColumn", List.Skip(Table.ColumnNames(Inventory),1), List.Skip(Table.ColumnNames(Inventory),1))
in
#"Expanded NewColumn"

I've attached a sample file for you. Just go in and add a new column to either of the top two tables and you'll see that the output table will expand/shrink as required.

El Cid
2016-08-23, 11:14 PM
Awesome and thanks. Just squeezed down a ton of M language into a nothing merging 3 files. Now I just have one more monster to slay..."Table.TransformColumnTypes" is its name. All of the 570 columns merged together all came in as type: ANY. Is there anyway to have the type transformation change the type on all of the columns EXCEPT the first column?

Ken Puls
2016-08-26, 07:10 PM
I'm not going to say no, but I think it would be VERY difficult. Here's a look at the M code for 2 columns:

Table.TransformColumnTypes(Source,{{"Test", type text}, {"Another", type text}})

So the issue is that the M language doesn't take a list of columns for each data type, but rather a list that declares the type for each column individually. Here's what I wish would work:


= Table.TransformColumnTypes(Source,{{{"Test","Another"}, type text}})

If that were the case, we could use the Table.ColumnNames to feed in the list of columns we need.

Bill Szysz
2016-08-26, 07:47 PM
Hi Ken, hi El Cid,


Is there anyway to have the type transformation change the type on all of the columns EXCEPT the first column?
With a little bit of M we can do that ;-)

TransformTypes= Table.TransformColumnTypes(Source, List.Transform(List.Skip(Table.ColumnNames(Source),1), each {_, type text}))

Regards :-)

Ken Puls
2016-08-26, 07:53 PM
LOL. I have this little theory about forum postings that when I say "No", someone almost immediately comes along and proves me wrong. Thanks Bill, I was hoping it would work again. ;)

Bill Szysz
2016-08-26, 08:12 PM
This is lucky coincidence, Ken. :-)
Unfortunately i am not here as often as i wish. But as you can see sometimes have a luck... the right place in the right time

El Cid
2016-08-26, 10:30 PM
Wow, you guys just slayed it! I thought I had Sensei Ken on the ropes and at the last second Sensei Bill tags in lands a side kick to the throat! Thank you both. Ken, you need to pull in Bill on Data Monkey v2.0!:whoo: