This morning I logged in to check the solutions for Power Query Challenge 6 that were submitted and... Wow! There were a bunch, and some cool variety there. So now it's time to show you all what I came up with here.
What was Power Query Challenge 6?
The full description and source data can be found in yesterday's post, but in short it was to convert this:

Data table with multiple data points per cell
To this:

Data shown in First Normal Form (1NF)
So how do we do it?
Two Solutions for Power Query Challenge 6
Wait, two solutions? Why?
As it turns out, I put together two for this one. My first attempt was cooked up to solve the issue on short notice. Then I built another that seems a bit more elegant. So I'll show them both, although quickly.
Solution 1 - Splitting Individual Columns and Merging Back Together
The first of my solutions for Power Query Challenge 6 is actually quite similar to what Ali posted in the solution thread. It basically follows this method:
- Step 1:
- Create a Data query that connects to the source data, and load it as connection only
- Step 2:
- Create 3 new queries for ItemID, Quantity and Price which
- Reference the data query
- Keep the InvoiceID column and the other relevant column
- Split the relevant column by delimiter, ensuring it splits to rows (not columns as it defaults to)
- Add an Index column
- Step 3:
- Reference one of the Step 2 tables, and merge the other two tables to it, based on matching the Index column in each
So when complete the query chain looks like this:

And returns the table we're after:

The only real trick to this one is that - when you are building the Price query - the Price column will pick the decimal as the delimiter, so you have to force it to a line feed. So building the Price query would go through the following steps:
- Right click the Data query --> Reference
- Select the InvoiceID and Price columns --> Right click --> Remove Other Columns
- Right click the Price column --> Split column --> By Delimiter
- Clear the decimal from the Custom area
- Click the arrow to open the Advanced area
- Change the selection to split to Rows
- Check "Split using special characters"
- Choose to insert a Line Feed character
- Click OK
- Set the Data Types
- Go to Add Column --> Add Index Columns
Resulting in this:

The ItemID and Quantity queries follow the same steps, except that Power Query now correctly identifies the Line Feed as the character to split on.
Solution 2 - Group and Split
While the first solution to Power Query Challenge 6 worked, it left me less than satisfied as it took a bunch of queries. While effective, it didn't feel elegant. So I cooked up another solution that uses Grouping. It's actually quite similar to the first solution that Bill Szysz posted.
The basic method is as follows:
- Connect to the data
- Right click the InvoiceID column --> UnPivot Other Columns
- Right click the Value column --> Split Column --> By Delimiter --> OK
Following the steps above gets us to this state:

To unwind this, we group it:
- Go to Transform --> Group By
- Group By InvoiceID, Attribute
- Aggregate a "Data" column using the All Rows operation

At this point, we need to number these rows, so I just busted out the pattern to do that from our Power Query Recipe cards (recipe 50.125).
- Go to Add Column --> Custom
- Column Name: Custom
- Formula: =Table.AddIndexColumn( [Data], "Row", 1, 1)
- Right click the Custom column --> Remove Other Columns
- Expand all fields from the Custom column
Leaving us with this data:

The next step is to Pivot it:
- Select the Attribute column --> Transform --> Pivot Column
- Choose Value for the Values
- Expand the Advanced arrow
- Change the Aggregation to "Don't Aggregate"
- Click OK
- Select the "Row" column and Remove it. (Yes, it was needed to unpivot this correctly, but now adds no value.)
- Set the data types
- Load it to the desired destination
At this point, the query (again) looks perfect:

Now, I must admit, this felt far more professional and left me feeling good about it.
Which Solution to Power Query Challenge 6 is Better?
Naturally, solution 2 is better. It takes less queries, and looks way cooler. Right? Not so fast...
The real question is in the performance. And for this one I thought I'd test it. But I needed more data. I expanded the set to 11,000 rows and then used a tool we're working on to time the refreshes. Privacy was left on, and all times shown are in seconds:
- Solution 1: 1.43, 1.48, 1.11, 1.27 Avg ~1.32 seconds
- Solution 2: 2.77, 2.65, 2.63, 2.68 Avg ~2.68 seconds
I'll be honest, this surprised me. So I went back and added the GroupKind.Local parameter into the Grouped Rows step, like this (as that often speeds things up):
Table.Group(#"Changed Type1", {"InvoiceID", "Attribute"}, {{"Data", each _, type table [InvoiceID=number, Attribute=text, Value=number]}}, GroupKind.Local)
The revised timing for Solution 2 now gave me this:
- Solution 2A: 2.54, 2.49, 2.56, 2.61. Avg ~2.55 seconds
So while the local grouping did have a small impact, the message became pretty clear here. Splitting this into smaller chunks was actually way more efficient than building a more elegant "all in one" solution!
My solution (including 5,000 rows of the data), can be found in the solution thread here.