Combining Data from multiple files with inconsistent date headings

simsy

New member
Joined
May 28, 2020
Messages
2
Reaction score
0
Points
0
Location
UK
Excel Version(s)
365
Hi,

I need to combine the data in around 30 files into a single Excel file for further analysis. I've been doing this with VBA for around the last 6 years, which, while slow, has worked.

However the number of files I need to combine has grown significantly and my VBA code struggles to finish in a reasonable time.

Now PQ is part of Excel, it seems to make sense to put all this data in the Data Model for my analysis.

The problem I have is, the format of the data is very simple (it's a matrix. . not a table). I thought the very useful Unpivot command would be the answer, but I can't make it work.

If you look at the attached file, you see the simple format that I'm trying to convert in to a table. . . . but the table headings ( week commencing dates ) can all have different start dates.

Could somebody please give me a brief guide of the best way to approach this problem, I think I need to do an independent Unpivot for each file, but that's purely me describing what I think is the solution.

Many thanks

View attachment Ex Inconsistent Column Headings.xlsx
 
Using your example of three files, mock up what you want the end result to look like. This will help significantly in determining how to combine them. How will duplicate dates be handled?
 
Last edited:
Pull all the files in from the folder, and run a function on them to unpivot, then pivot them back.

Code:
let
    Source = Folder.Files("D:\_projects\_8 Community\_forums\ExcelGuru\ExcelGuru 10655 - Combine Files Various Dates"),
    convert = (input as text) as table =>
    let
        Source = Excel.Workbook(File.Contents(input), null, true){[Item="Sheet1",Kind="Sheet"]}[Data],
        headers.Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        dates.Unpivot = Table.UnpivotOtherColumns(headers.Promote, {"Location", "Operator", "Work Type"}, "Date", "Value")
    in
        dates.Unpivot,
    file.AddCol = Table.AddColumn(Source, "Custom", each convert([Folder Path]&[Name])),
    extras.RemoveCol = Table.RemoveColumns(file.AddCol,{"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
    file.Expand = Table.ExpandTableColumn(extras.RemoveCol, "Custom", {"Location", "Operator", "Work Type", "Date", "Value"}, {"Location", "Operator", "Work Type", "Date", "Value"}),
    #"Renamed Columns" = Table.RenameColumns(file.Expand,{{"Date", "DateString"}}),
    data.Type = Table.TransformColumnTypes(#"Renamed Columns",{{"DateString", type date}, {"Location", type text}, {"Operator", type text}, {"Work Type", type text}, {"Value", type number}}),
    date.Format = Table.AddColumn(data.Type, "Date", each Date.ToText([DateString], "dd-MMM-yyyy")),
    datestring.RemoveCol = Table.RemoveColumns(date.Format,{"DateString"}),
    dates.Pivot = Table.Pivot(datestring.RemoveCol, List.Distinct(datestring.RemoveCol[Date]), "Date", "Value", List.Sum)
in
    dates.Pivot
 
Thanks Bob, that works a treat :smile:
 
Pull all the files in from the folder, and run a function on them to unpivot, then pivot them back.

Code:
let
    Source = Folder.Files("D:\_projects\_8 Community\_forums\ExcelGuru\ExcelGuru 10655 - Combine Files Various Dates"),
    [COLOR=#ff0000]convert = (input as text) as table =>[/COLOR]
    [COLOR=#ff0000]let
        Source = Excel.Workbook(File.Contents(input), null, true){[Item="Sheet1",Kind="Sheet"]}[Data],
        headers.Promote = Table.PromoteHeaders(Source, [PromoteAllScalars=true]),
        dates.Unpivot = Table.UnpivotOtherColumns(headers.Promote, {"Location", "Operator", "Work Type"}, "Date", "Value")
    in
        dates.Unpivot,[/COLOR]
    file.AddCol = Table.AddColumn(Source, "Custom", each convert([Folder Path]&[Name])),
    extras.RemoveCol = Table.RemoveColumns(file.AddCol,{"Content", "Name", "Extension", "Date accessed", "Date modified", "Date created", "Attributes", "Folder Path"}),
    file.Expand = Table.ExpandTableColumn(extras.RemoveCol, "Custom", {"Location", "Operator", "Work Type", "Date", "Value"}, {"Location", "Operator", "Work Type", "Date", "Value"}),
    #"Renamed Columns" = Table.RenameColumns(file.Expand,{{"Date", "DateString"}}),
    data.Type = Table.TransformColumnTypes(#"Renamed Columns",{{"DateString", type date}, {"Location", type text}, {"Operator", type text}, {"Work Type", type text}, {"Value", type number}}),
    date.Format = Table.AddColumn(data.Type, "Date", each Date.ToText([DateString], "dd-MMM-yyyy")),
    datestring.RemoveCol = Table.RemoveColumns(date.Format,{"DateString"}),
    dates.Pivot = Table.Pivot(datestring.RemoveCol, List.Distinct(datestring.RemoveCol[Date]), "Date", "Value", List.Sum)
in
    dates.Pivot


Hi Bob Philipps,
That's a very interesting code. I did a test on an example and it works very well.
I'm trying to understand the steps created automatically using the interface and those created manually.
Does the code in red correspond to a manual entry?

Thank you in advance
 
I'm trying to understand the steps created automatically using the interface and those created manually.
Does the code in red correspond to a manual entry?

Yes and no. I first created a query that read in one file, promoted the headers and unpivoted it, all via the interface, and then I turned that into a generic function (function signature, changed the file name to the parameter input), and embedded that manually into the query that grabbed all of the files.
 
Thanks Bob for your response,
I think I understand now how you did it,
it's actually easier than writing manually some code steps.

Have a nice day.
 
Back
Top