Task Tracking with Power Query

Did you know Power Query can be used as a task tracking tool? This might sound quite unusual but the method described here has been used for solving a real business case. The example I will use is rather simplified but still close to reality, and will demonstrate how to build task tracking with Power Query.

Laying out the Scenario

Vicky is a manager of a small team that is dealing with customer questions on various topics. One of her duties is to distribute various questions among her subordinates. After that, each of them should take some actions and report what is the status of each task.

The problem is – how can each employee can see what tasks are assigned to him/her and fill in the respective information for each task? At the same time, Vicky should at any moment be able to assign a new task and review the statuses of old ones. This is the table Vicky needs:
Task Tracking with Power Query

Unfortunately, she has no other tool at hand except Excel. Luckily, she can set up task tracking with Power Query right in Excel, which could work perfectly in this case.

Setting up Task Tracking with Power Query

So let's start building the solution.

1. Load the left table (in this example, called Filled by Manager) into Power Query.
Manager's table to assign tasks

2. Next, create one query for each employee by filtering the Employee column.
Create individual employee queries

3. Load each Employee table into a separate Excel sheet. (Of course they can be on different files linked to the source table).
Sample employee table showing tasks assigned

4. Then, create a table for each employee to fill in the actions and statuses.
Employee's task tracking worksheet

You can see in the above picture what each employee will have in his/her worksheet - a green table on the left with the tasks assigned to them, and a yellow table on the right where he/she has to fill in the respective information.

Creating the Filled By Employees Table

5. Load all the Employee tables into Power Query.
Load all the employee tables into Power Query

6. Append them in a new query (in this example, called Statuses).
Append all the employee tables into new Statuses query

You are probably guessing what the next step is – load the Statuses query into Excel right next to the Filled By Manager table

However, the result is not what we would expect.
The Filled by Manager table is not matching the newly loaded Fill by Employees table

Note that on first row of the Manager’s table is a task assigned to Ivan on 27.01.2019, but row 1 of the Employee’s table shows the task assigned to Maria on 09.02.2019.

In order to fix this mess, we need one additional query.

Building the Task Code Query

7. Once again, load the Manager’s table into Power Query and remove all columns except for Task Code.
Task Code Column

Task Code is a unique identifier of each task. In this example, it is simply composed of the employee's name and the number of occurrences of this name in the register up to the respective row. In Excel language, the formula is:
Use COUNTIIF to create unique task identifiers

The trick is that we fix the first row of column F (containing the employees' names) but the end row is not fixed.

8. Merge the Register Employees and Statuses queries together.
Merge the Register Employees and Statuses tables

9. Finally, expand the table and voila - it is in the required order. The only thing left is to load it back into the Manager’s table.
Final table for task tracking with Power Query

Now, any time she needs to, Vicky can refresh the Filled by Employees table and see the updated statuses of each task.

Likewise, each one of her subordinates can simply refresh the Manager’s table (the green one that is on left of his/her tab) to see any new tasks that have been assigned.

You could also automate the refresh operation VBA. For more details, refer to Chapter 16 of Ken's M is for (Data) Monkey book.

Final Words

This article presents nothing new and unusual as a Power Query technique. What is new and unusual is the way Power Query has been used for solving a typical business problem. This is just additional proof of how powerful and useful this tool is.

You can find the file with example here: Task tracking with PQ

Analyzing text with Power Query and Excel

We all know how good Power Query is for cleaning, shaping and analyzing numerical data. But this powerful tool opens doors for analysts that had been closed for long time, such as for analyzing text. In this article, I will give you some ideas how Power Query can be used for analyzing text that is unstructured. This is a new area for me as well so I’ll be grateful for any comments, thoughts, ideas and shared experience that will help to elaborate further on the methodology.

Our scenario

First, let’s put things in a context with simple but realistic example. Assume you are the newly appointed Marketing Manager of a 4-star SPA hotel. You want to analyze reviews about the hotel in sites like booking.com and tripadvisor.com to understand the major service issues. Thanks to Power Query, you will be able to extract, clean and shape data from those sites to receive a nice table like this:

Comments table

* Table contains ~300 real opinions for several Bulgarian SPA hotels in Hissarya, Velingrad, Sandanski and Bansko. Source: booking.com and tripadvisor.com

But how can you get some insights from this data? Obviously you will spend lots of time to read all reviews. Even if you do that, you need to summarize the information somehow. So let’s play a little within Power Query and see how it can help us analyze text.

Preparing data for analysis

First, we load the table in Power Query (From Table) and clean it of punctuation symbols like commas, semicolons, dashes, brackets, etc., as well as replace capital letters with lower cased one. Why do that last one? Because we want to count the most frequently used words and for Power Query ‘hotel’, ‘hotel,’, ‘Hotel’ are different words.

The cleaning can easily be done with tools like Replace Values, Trim, and Lowercase. I believe they are well-known to average user of Power Query so will I skip detailed operations. Instead, here some lessons learnt from practice:

  • First, duplicate the Opinions column so you can have original text.
  • Then, replace all punctuation symbols with a space.
    • Later we will use a period a separator to split opinions into sentences. And since a sentence can finish with exclamation mark or a question mark it is useful to replace ‘!’ and ‘?’ with ‘.’
  • Since this is a hotel, clients may use the symbol ‘*’ in their opinions (i.e. ‘nice 4* hotel’ or ‘definitely not a 4* hotel’). So don’t replace this symbol.

Above points are nice examples that when analyzing text or other data you need to do it in a context and should explore it first, but that is another story.

Here are the steps applied so far. As a final touch the text has been trimmed and cleaned.

Steps taken to clean data

Separating each comment

Our next step is to split the column using a period ('.') as a delimiter, and doing so at each occurrence of the delimiter.

Split columns by delimiter

Power Query creates a number of new columns, each one containing a sentence. I bet you are guessing what will follow. That’s right – select the original two columns and choose Unpivot Other Columns.

Unpivot other columns

After removing the Attribute column and filtering out rows that are blank, our table has a column for the original comments (Opinion) and a column for each sentence contained in the original comments (Sentence). Please be patient here, and you will understand why this is important.

Table containing full comments and each individual sentence

Let’s now duplicate the Sentence columns and repeat the above actions: split each sentence into its individual words, then unpivot. Below is the table after removing the Attribute column and filtering out blank rows:

Table containing full comments, each individual sentence, and each word

As a next step I will add custom column with a simple formula that calculates the length of each word. We can use it to remove common words that bring no value to our analysis – ‘a’, ‘the’, ‘is’, ‘or’, ‘but’ etc.

Custom function to calculate the length of each word

Finally, filter to exclude words that are 1, 2 or 3 letters long. We will use the resulting table as the basis for analyzing text.

Table filtered for words greater than 3 letters

Finding Lookup words

As a first step in analyzing text, let’s Duplicate the query (you may use Reference as well).

Duplicating the query

In the new Query, simply use the Group By function on the Words column to count the rows.

Grouping by the Word column

Thus, we will have a list with all words (above 3 letters long) and how many times they have been used. For convenience, we can sort this and then load it into Excel.

Word count table

Let’s take a look at the list. Obviously it contains words like ‘with’, ‘this’, ‘have’ that we need to ignore because they could be used in many different situations. On the other hand we see ‘good’ and ‘nice’ in the top 20. Those words are highly subjective – what is good for me might not be good for you. Besides, we don’t know whether the reviewer said ‘good food and nice service’ or ‘the food was not good and service far from nice’. So let’s skip them for now.

Apart from the above, we see that customers mention most often words like ‘room’, ‘pool’, staff’, ‘food’. These are our words of interest (I call them ‘Lookup words’). We may now filter all sentences that contain these words, allowing us to read much less information compared to original list of opinions.

Honestly speaking, if you have enough experience or even some common sense, you can skip this step. I mean it is more than expected that ‘room’, ‘staff’ and ‘pool’ are most often mentioned in reviews for a SPA hotel. However, identifying lookup words might be needed for cases like analyzing text of more open-ended questions.

Filtering Lookup words

So far we have identified our words of interest and listed them in a table:

List of lookup words

So how do we filter all sentences containing these lookup words? We can make it dynamic by creating a custom function. The process is similar to one used to load data from multiple internet pages, as described in M is for (Data) Monkey.

First, we create a query to Filter Rows of the Word column that contain one of our lookup words, ‘staff’ for example.

Filtering for a look up word

Then we go to Home > Advance Editor and change the M code a bit as shown below:

Updating the M code

After we create the function, we rename it as something more user-friendly (I used fxFilter).

Then we load our LookupWords table into Power Query (From Table):

LookupWords table

Next, we Add Custom Column. (Note: in Excel 2016, you have to Invoke Custom Function.)

Adding a custom column

And when we expand the table, we get all rows that contain the words from our lookup list.

Identifying Connected words

We now need to split the Sentence column again to identify the words that were most often used in sentences containing our Lookup words. I call these words ‘Connected words’.

There are some final steps such as eliminating words that appear rarely (i.e. less than 5 times), removing duplicate rows, rearranging columns etc. that I will skip.

Our final table is on the left below:

Final table for analyzing the comments

Analyzing text

As you see, I added two slicers to our table - one for our Lookup words and one for our Connected words. By using them in combination, we can get more insights quite easily. For example, these are the words used most frequently together with ‘staff’:

Slicers used to identify comments regarding staff

Here, I have filtered for the Connected word ‘friendly’, which gives us just 10 rows to read:

Analyzing text for the word 'friendly'

In a separate column, we can manually type “Y” for each positive opinion and “N” for each negative. Now we can calculate that 30% of mentions are actually negative. This is a simple way to quantify and measure our data. And it takes just few minutes to find out that the picture is similar with ‘helpful’ and ‘polite’ – positive feedback is prevailing.

When we filter ‘English’ it is easy to see that 8 people mentioned staff speaks no or poor English. That directly calls for some action - we need to train our staff in English. If you were reading through 300 opinions, you might miss this quite easily.

Analyzing text containing 'staff' and 'English'

It takes just few clicks and couple of minutes or reading to find out that clients are happy with the breakfast:

Analyzing text for comments on breakfast

But are disappointed by the prices and quality of food in the restaurant:

Analyzing text for comments on the restaurant

Of course this is just a starting point and each of the above observations needs further investigation and analysis. But the key takeaway here is that we can successfully use Power Query to summarize unstructured text data. We can then focus our attention on sentences that contain the most commonly used words instead of reading the whole bunch of reviews.

Applying the technique

Many other ideas come up into my mind of how we can use this for further analyzing text data. We can use the Stayed column to study trends in time, we can quantify some key words (as shown with 'friendly') or make a word cloud showing our top 20 words. What are your ideas - how can we use Power Query and Excel to analyse unstructured text?