Remove top N rows from all files

Bernie

Member
Joined
Feb 9, 2018
Messages
34
Reaction score
0
Points
6
Location
Suburban NY
Excel Version(s)
Excel 2016 (Win)
Following the instructions in Chapter 5 of the "Monkey" book, I created a query to pull data from single sheet workbooks that don't have tables - the header row is row 10, so I removed the top 9 rows from the sheet and used the new row 1 as headers.

All the files are in a folder, so I used the From File / From Folder. I was able to correctly get the data from the single file I was working on, but when I added more files to the folder and refreshed, the other files did not have the top 9 rows removed, and the headers were still there.

What do I need to do to my query to remove 9 rows from the first file, create headers, and remove the first 10 rows from all subsequent files? I thought about checking for incorrect data types or blanks in certain columns, but the first 9 rows are pretty free form and I could not think of a column-based logic that would always find all the correct rows to delete.

Thanks,
Bernie
 
Last edited:
Bernie, Will you be appending all the sheets? If so, you could delete the rows that have null values in your primary column. You could then filter out the "header rows." Not sure if this is helpful or not. I don't have my "Monkey" book with me as I am at my summer house and it is home, so I cannot reference it. If my scenario is not the case, then lets hope Ken or someone more attuned to this will jump in.
 
Hi Bernie,

What you really need to do is use the Combine Files feature to do this. Then you can modify the "Sample File", and it will apply the same modifications to all files before they are appended. The only rub... there was a massive update to that feature AFTER we published the book. The good news though... I have a blog post on it. :)

Check this out and see if it gets you closer: https://www.excelguru.ca/blog/2016/12/22/combine-excel-files/

As I say, the big secret is to combine From Folder. When you do the combine you'll be asked to choose a Sample File. Then go into the Sample Transform query, make the modifications that you would normally make to a single file. At that point the main query should hopefully just work for you (although you might have to delete the final step of the main query if you change any column names or delete any.)
 
Thanks, Ken.

I used your

Combine Excel Files - Method 3

My file has 9 rows of formulas / meta data, the 10th row is headers, and the rest is the data that I want to combine into one data set. All columns are filled completely.

So I put the first attached file into a folder, followed the steps from your method, successfully getting the data I wanted, then added the second file and refreshed. The top 10 rows from the second file were not removed, so I must be doing something wrong.

Thanks again for all you do,
Bernie

If it comes to it, I know I can write a simple macro to wipe out the top ten rows from every file, which will fix my issue... I just hope to not have to do that so the files can be added to the folder without manipulation.

View attachment Example 1.xlsxView attachment Example 2.xlsx
 
Last edited:
If you change the generated function Transform File to the below code, that might do it.

Code:
let
    Source = (Parameter1) => let
        Source = Excel.Workbook(Parameter1, null, true),
        Sheet1_Sheet = Source{[Item="Sheet1",Kind="Sheet"]}[Data],
        #"Promoted Headers" = Table.PromoteHeaders(Sheet1_Sheet, [PromoteAllScalars=true]),
        #"Removed Top Rows" = Table.Skip(#"Promoted Headers",9)
    in
        #"Removed Top Rows"
in
    Source
 
Wow. :hail: That is some advanced data voodoo going on there.

It works - of course - now I just need to understand how you did it :smile:

Thanks,
Bernie

PS Is your avatar George Best?
 
Last edited:
When you combine files, it creates a number of queries, a function that processes each file, and a control query that gets all of the files from the folder and processes each through the function. As the function processes each file, you can add transformation code in there, I just modified it to add a step that removed the top 9 rows, which as I stated, happens for each file.

And yes that is the Belfast genius. It is actually an image of the eponymous Wedding Present album (George Best, not Belfast genius:loco:).
 
Last edited:
If I record the steps that I want, it all appears in one big block. Can I then cut the code from those steps and place them into the function with that internal Set (And a bit of editing, of course). I'm going to try that but I want to make sure I'm on the right path, or at least in the correct ball park (or football pitch, apparently ;) ).
 
What I did was to first build the query(ies) that open all the files in the folder and combine them. That created a number of queries, including the function Transform File. Then I built a query where I opened the single file, and then added a step in the UI to remove the rows. I then cut that extra code from that query and pasted into the function.

Glad that you said football pitch, not soccer pitch. Most Brits I know hate it being called soccer. :smile:
 
Cool - thanks for the tips. I will give that a go after the weekend. Enjoy yours and stay safe!
 
Blimey man, I have just seen your full signature, I didn't realise that Bernie was Bernie Dietrick :clap2:. I remember you from the old days, especially the NGs if I recall correctly (you, me, Dave Patterson, Tom Ogilvy and the rest). Did we ever meet at the summit, our MVP periods certainly overlapped?
 
I have just seen your full signature...

Bob,

One and the same. You didn't see my signature because I forgot to set it up when I first joined this group, and I corrected that oversight on Friday.

Dave Peterson (I think that was his name) almost immediately befriended me and pushed for MVP status for me, and then I met the rest of you - but never in real life, since my kids were young and travelling for the summit or other meetings was out of the question. When MS killed the newsgroups I switched to the forums, which they seem to not care about for MVP status, so I never regained it - no blog, book, or website. I still have some of the MS t-shirts, so I'm good. I dropped out of the private non-MS MVP forums after the trolling became too much, so I lost touch with everybody, basically.

But, the powers-that-be at work are splitting data responsibilities between two groups, so now data transfer is becoming a much bigger issue for me - hence the Power Query questions. Learning PQ is a lot easier than learning VBA - and it's really cool. I'm sure I'll be posting a lot more as I go, so you haven't heard the last from me ;)

Stay safe, stay healthy!

Bernie
 
Dave Peterson of course, thanks for reminding me. He and I were the most prolific in the NGs, I always hoped to meet him, but like yourself he never attended summits (Myrna Larson also, I would have loved to meet her). I got to meet many, Chip, Stephen, Rob, Walk, Debra, Andy Pope, Peltier et al, that was great.

I think the NGs were better than the forums, forums have advantages but the administration can be over-bearing. I agree about the old MVP groups, especially a short, angry, MAGA shouting Access MVP.

Anyways, good to come across you again, this is a small forum but a nice one.
 
Back
Top