Calculate Hours Worked

Chandoo posted an interesting challenge on his blog last Friday, challenging users to calculated hours worked for an employee name Billy.  This example resonated with me for a couple of reasons.  The first is that I’ve had to do this kind of stuff in the past, the second is because I’ve got a new toy I’d use to do it.  (Yup… that toy would be Power Query.)

It always blows my mind how many people respond on Chandoo’s blog.  As the answers were pouring in, I decided to tackle the issue my own way too.  I thought I’d share a bit more detailed version of that here as I think many users still struggle with time in Excel.

Background and Excel Formula Solution

Chandoo provided a sample file on his blog, so I downloaded it.  The basic table of data looks like this:

SNAGHTMLeb2f08

Now, for anyone who has done this a long time, there a few key pieces to solving this:

  • The recognition that all times are fractions of days,
  • The recognition that if you omit the day it defaults to 1900-01-01, and
  • The data includes End times that are intended to be the day following the Start time

The tricks we use to deal with this are:

  • Test if the End time is less than the start time.  If so, add a day.  (This allows us to subtract the Start from the End and get the difference in hours.
  • Multiply the hours by 24.  (This allows us to convert the fractional time into a number that represents hours instead of fractions of a day.)

Easy enough, and the following submitted formula (copied down from F4:F9 and summed) works:

=(D4+IF(C4>D4,1,0)-C4)*24

SNAGHTMLea6d3e

Also, there was a great comment that Billy shouldn’t get paid for his lunch break.  Where I used to work (before I went out on my own), we had a rule that if you worked any more than 4 hours you MUST take a lunch break.  Plumbing in that logic, we’d would need a different formula.  There’s lots that would work, and this is one:

=((D4+IF(C4>D4,1,0)-C4)*24)-IF(((D4+IF(C4>D4,1,0)-C4)*24)>4,1,0)

SNAGHTMLf008e9

So why Power Query?

If we can do this in Excel, why would we cook up a Power Query solution?  Easy.  Because I’m tired of having to actually write the formula every time Billy sends me his timesheet. Formula work is subject to error, so why not essentially automate the solution?

Using Power Query to Calculate Hours Worked

Okay, first thing I’m going to do is set up a system.  I’m set up a template, email to Billy and get him to fill it out and email it to me every two weeks.  I’ll save the file in a folder, and get to work.

  • Open a blank workbook –> Power Query –> From File –> From Excel
  • Browse and locate the file
  • Select the “Billy” worksheet (Ok, to be fair, it would probably be called Sheet1 in my template)

image

  • Click Edit

And now the fun begins…

  • Home –> Remove Rows –> Remove Top Rows –> 2
  • Transform –> Use First Row as Headers
  • Filter the Day column to remove (null) values
  • Select the Day:End columns –> right click –> Remove Other Columns

And we’ve now got a nice table of data to start with:

SNAGHTMLfb3d71

Not bad, but the data type for the Start and End columns is set to “any”.  That’s bad news to me, as I want to do some Date/Time math.  The secret here is that we need our values to be Date/Times (not just times), so let’s force that format on them, just to be safe:

  • Select Start:End –> Transform –> Date/Time

Next, we need to test if the Start Date occurs after the End Date.  Let’s use one step to test that and add one day if it’s true:

  • Add Column –> Add Custom Column
    • Name:  Custom
    • Formula:  =if [Start]>[End] then Date.AddDays([End],1) else [End]

So basically, we add 1 day to the End data if the Start time is greater than the end time.  Once we’ve done that, we can:

  • Right click the End column –> Remove
  • Right click the Custom column –> Rename –> End

And, as you can see, we’ve got 3 records that have been increased by a day (they are showing 12/31/1899 instead of 12/30/1899

image

Good stuff, let’s figure out the difference between these two. The order of the next 3 steps is important…

  • Select the End column
  • Hold down the CTRL key and select the Start column
  • Go to Add Column –> Time –> Subtract

Because we selected the Start column second, it is subtracted from the End column we selected first:

image

Now we can set the Start and End columns so that only show times, as we don’t need the date portion any more.  In addition, we want to convert the TimeDifference to hours:

  • Select the Start:End columns –> Transform –> Time
  • Select the TimeDifference column –> Transform –> Decimal Number

Hmm… that didn’t work quite as cleanly as we’d like:

image

Ah… but times are fractions of days, right?  Let’s multiply this column by 24 and see what happens:

  • With the TimeDifference column selected:  Transform –> Standard –> Multiply –> 24
  • Right click the TimeDifference column –> Rename –> Hours

Nice!

image

Oh… but what about those breaks?

  • Add Column –> Add Custom Column
    • Name:  Breaks
    • Formula:  =if [Hours]>4 then -1 else 0
  • Add Column –> Add Custom Column
    • Name:  Net Hours
    • Formula:  =[Hours]+[Breaks]

And here we go:

image

At this point I would generally:

  • Change the name of the query to something like:  Timesheet
  • Close and Load to a Table
  • Add a total row to the table

SNAGHTML12fe424

But just in case you only cared about the total of the Net Hours column, we could totally do that in Power Query as well.  Even though it’s not something I would do (I’m sure Billy would trust YOU implicitly and never want to see the support that proved you added things up correctly…), here’s how you’d do it:

  • Go to Transform –> Group By
  • Click the – character next to the Day label to remove that grouping level
  • Set up the Grouping column:
    • Name:  Net Hours
    • Operation:  Sum
    • Column:  Net Hours

Here’s what it looks like if you set the column details up first, indicating where to click to remove the grouping level:

image

And the result after you click OK:

SNAGHTML1344d4b

Holy Cow that’s a LOT of Work!?!

Not really.  Honestly, it took me about a minute to cook it up.  (And a LOT longer to write this post.)  But even better, this work was actually an investment.  Next time I get a timesheet, I just save it over the old one, open this file, right click the table and click Refresh.  Done, dusted, finished and time to move on to more challenging problems.

Even better, if I wanted to get really serious with it, I could implement a parameter function to make the file path relative to the file, and then I could pass it off to someone else to refresh.  Or automate the refresh completely.  After all, why write formulas every month if you don’t have to?

🙂

9 thoughts on “Calculate Hours Worked

  1. Really, you would write a formula every time for a regular process?

    More interestingly, what about the time when you only want the hours between a core time? For instance, 10/06/2012 14:00 - 12/06/2012 11:00 would return 45:00 hours in a straight calc, but if working hours were 08:00-17:00 it should only return 15:00.

    To get you started, here is an Excel formula, including allowing for holidays, as stolen from Barry Houdini

    =(NETWORKDAYS(A2,A3,holidays)-1)*("17:00"-"08:00")
    +IF(NETWORKDAYS(A3,A3,holidays),MEDIAN(MOD(A3,1),"17:00","08:00"),"17:00")
    -MEDIAN(NETWORKDAYS(A2,A2,holidays)*MOD(A2,1),"17:00","08:00")

  2. I'm trying this on my own, and either you left out a step or I just don't get it, but when I did the subtraction formula for the time difference, I end up with 3 rows of negative numbers. I can see the formula in chandoo's site that worked. But yours isn't the same. Soooo... what am I missing in following your instructions.

  3. Hi Melanie,

    There are three pieces that could be causing that issue.

    1) Did you add the if formula to check if the Start > End columns? =if [Start]>[End] then Date.AddDays([End],1) else [End]
    2) Did you remove the original End column and rename the newly created Custom column to "End"
    3) When you went to do the subtraction, did you select the End column first or the Start column first? (My guess would be you got it backwards)

  4. Bob, no... to be fair I'd probably set up a template and tell Billy to use it. But I've seen this kind of thing knocked up more times than I can count and seen people reproduce this stuff. Crazy.

    Regardless, that's an interesting question you posed. I'll have to noodle on that one a bit. 🙂

  5. So would I Ken, or maybe even automate it to pull the data in and create results. I do worry when people say things like that, especially people like you, because that is not a problem with the formula or with Excel, it is a problem with the process/process owner.

    If I get a chance, I will noodle it too 🙂

  6. The difficulty I have with array formula solutions to problems is that your standard end user is hosed if they need to understand what happened. From a level of complexity, I see Array Formulas near the top, VBA second, standard formulas third and Power Query at the bottom of the complexity level. The reason Power Query factors on that end to me is the ability to step through each line and see what it's doing. While we can do that with formulas too, the UI for it... sucks.

    Having said that, the fact that we have so many ways to solve the issue is pretty damn cool, and I wish I had more array formula knowledge!

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.