# Formula for Actual duration excluding duplication

Hi Community,

I am trying to calculate the actual duration starting from a list of Start and End dates.

What I am trying to achieve is to exclude duration overlaps in order to get the actual effort days only, but including not contiguous time ranges.

Please see example:

Start: 01/10/2019 End: 04/10/2019 = 4 working days effort October

Start: 01/11/2019 End: 15/11/2019 = 11 working days effort November

Start: 11/11/2019 End: 18/11/2019 = 6 working days effort November

Start: 15/11/2019 End: 22/11/2019 = 6 working days effort November

Start: 09/12/2019 End: 13/12/2019 = 5 working days effort December

Total effort = 32 days --> this is NOT what I need

Actual days = 4+11+1[=6-5 overlap]+4[=6-2 overlap]+5= 20 days --> this is what I need

Please consider that this requires to be completely automated through formula as it needs to be deployed into multiple very long backlogs.

I would really appreciate your help.

Thanks!

## Comments

You could try something like:

=networkdays(min(collect(Startdate:Startdate,Startdate:Startdate,and(@cell >= date(2019,1,1),@cell <= date(2019,31,1)))), Max(collect(Enddate:Enddate,Enddate:Enddate,and(@cell >= date(2019,31,1),@cell <= date(2019,1,31

This should get you what you want for the month of January. If you have any issues understanding the formula let me know and i can break it down.

[email protected],

So you would essentially have to replicate this formula for each month for the duration of the project, then sum those up?

I think what Alessandro is trying to do is figure out how many working days were used from the earliest start date to the latest end date, but only count each day once.

I also noticed the condition of being "completely automated", so from there you would need to be able to automate the dates within your formula to start at the first month of the project, end at the last month, and replicate for every month in between regardless of the number of months.

Also... What if we had a two day task at the beginning of the month and a two day task at the end? Your formula would pull the start for the first task and the end for the second which would give us the total number of working days for the entire month instead of just 4.

I feel like this has been worked out before. I just can't seem to find my notes on it. I am going to keep digging. In the mean time...

Thoughts?

Hi both, thank you for your kind feedback. Much appreciated you took time to respond.

The formula I am trying to generate should ideally be feeding one single entry of the sheet summary, Paul Newcome in this statement "

I think what Alessandro is trying to do is figure out how many working days were used from the earliest start date to the latest end date, but only count each day once" is perfectly right.To give you some more context I have a long sequential backlog for each project, some tasks are assigned to "content development team" some others to "production", "systems" and so on.

We have start and end dates for each task in the backlog, the difference between start and end dates provides the actual days count. I would need the net lead time for each team, hence, SUM actual days for each team but excluding the days (dates) when multiple activities performed by the same team are running in parallel. Start and end dates are already counting working days only, therefore, also the Actual days are not impacted by weekends or holidays. A possible approach would be to calculate the difference between the MIN start date and the MAX finish date for each team (this would eliminate actual days duplication) but automatically excluding the dates for tasks owned by other teams.

Hope the description above would make sense to you to better picture the context.

Many thanks in advance!

If my above assumptions are correct in that we need to figure out the number of working days (where each day is only counted once) across multiple rows for an unlimited amount of rows and variable start and end dates...

I believe I have a solution. It is built on the row level and then a final column is summed. I have it broken down into 6 helper columns.

.

The basics...

A column to designate a row number.

A column that pulls the lowest row number for all rows that overlap that particular row.

A column the pulls the highest row number for all rows that overlap that particular row.

A column to pull the Start Date based on the low number.

A column to pull the End Date based on the high number.

A final column to figure out the NETWORKDAYS between the new start and end dates.

.

Multiple rows that overlap are accounted for.

.

HEREis a published version of the sheet....

Feedback is appreciated.

Hi @Paul,

thank you for the time taken to address this solution.

I shared a comment yesterday to provide further context, however, it took some time for the community moderators to approve it. It is now visible though.

I confirm your assumptions are correct and perfectly consistent with the goal I am pursuing.

The published version is a great walk though to implement the solution. I will test it this afternoon and share a feedback with you at the earliest convenience.

I am very grateful for your valuable contribution.

I'll keep you posted.

Alessandro,

I saw where you wanted this grand total of days rolled up into a sheet summary field, but I also saw where you wanted to include the requirements of essentially filtering by team. As long as the team is established on each row, you could use a basic SUMIFS in the sheet summary to pull by team.

=SUMIFS(NETWORKDAYS:NETWORKDAYS, [Team Column]:[Team Column], "Team A")

.

I'm glad you found the published explanation helpful. Please don't hesitate to ask if you need any clarification on anything. I do hope it works for you.

Hi Paul,

all seems to work correctly. I will keep testing but it looks great.

I am so grateful for your brilliant solution and for your kind help.

I will keep you posted if any further clarification shall be needed.

For now, many thanks again and all the best!

Happy to help!

Hi Paul,

hope you are well.

I had time to test the formula in more detail on projects where multiple activities are running in parallel and I realised something doesn't look right, unfortunately.

As you can notice in the attached screenshot network days is including days when activities are running in parallel, this results in overestimating the total sum.

A good example is the number 8 highlighted, the date range for the number 8 is entirely covered by the row above, therefore, I assume that number 8 should be equal to 0.

Could you kindly suggest any amendment to the formula that may fix the issue?

(for your reference AS and AF stand for Actual start and Actual finish and the first two columns on the left are the data source for all the new columns on the right side)

Many thanks

That is a scenario I didn't even think about. My apologies. I will need to do some more testing, and then I will get back to you.

Any help would be greatly appreciated.

Many thanks in advance for your kind effort.

Alessandro,

Double check your formulas and layout. When I used those same dates from your screenshot with the formulas unchanged in my published sheet above, and my "New" start and end dates were not the same as your "New" start and end dates...

Ignore the above. I plugged an extra date in to my sheet to make the new dates match your new dates (should have thought of that first duh). Check the above published sheet. I update the formulas in the Min and Max columns.

Hi Paul,

Many thanks for your kind response.

The two new formulas for Min and Max seem to correctly produce the output to resolve the duplication issue where data source is found (Production AS and AF in the screenshot below). But where source data is missing the formula returns #NO MATCH.

No MATCH is, I assume, due to the fact that the data source (Production AS and AF) is linked to the main backlog through the following

=IF(Role26 = "Production", [Actual Start]26), this is meant to shortlist dates that are related to Production Team only and calculate NETWORK DAYS uniquely for one team at the time. In fact, my goal is to remove effort duplication for one team at the time as different teams are working in parallel.Now, the New Start and New End dates (in the screenshot below named New Production AS and New Production AF) seem not to respond to the new Min and Max formulas. Both columns entirely return #NO MATCH either when a Min & Max is found and when data source (Production AS and AF) is missing.

Do you believe some amendments to the New Start and New End dates may resolve the conflict?

I really feel you are getting very close to the solution, this should really be the last conflict.

As usual, many thanks in advance for your great help. Much appreciated.

Alessandro

Ok. No worries. This is actually pretty straight forward. The fact that the Production AS field is blank is what is causing the issue. We just need to put in an edit that says to only run the formula if the Production AS is a date.

.

IF(ISDATE([Production AS]@row),

.

Take the above (comma and all) and insert it immediately after the initial = in each of these columns...

Min

Max

New Production AS

.

I have also updated this in the Published Sheet as well for your reference.

Hi Paul,

this has sorted all the #NO MATCH conflicts.

I have tested the entire formula structure in one of the most complex backlogs and I realised that some duplication are still returned by the formula unfortunately. This produces a higher number of Network days when summed up. I have attached a screenshot that exemplify the output.

However, I tested the formulas in more simple backlogs and it correctly excludes duplicates and returns the correct network days summary output. No clue how this can be possible. I copied and pasted all formulas from the same sheet for both backlogs, therefore, I would exclude syntax errors.

Many thanks in advance if you might come up with any further recommendation on how to amend the syntax.

Did you pull the updated MIN and MAX formulas mentioned above that I posted in reference to your initial duplication concern?

Yes, sure thing, I replicated and validated the entire formula structure, including the new MIN and MAX. For the purpose of testing in further detail I have deployed the formula model across 9 different backlogs (carefully copying and pasting from original sheet), I am experiencing duplication issues on two backlogs only, the other seven are returning correct dates and correct network days output. The two backlogs that produce incorrect output are the most complex ones in terms of connected phases, although the template is identical in all its components.

Hmm...

There must be an overlap scenario we haven't thought of yet. If you are able to identify the rows that are not only being duplicated but are LINKED to the duplicated rows through some sort of overlap, try copying them to another sheet (just the dates and helper columns for this). Once you have that done, maybe share me to the new sheet so I can get in and take a look. I am unable to reproduce duplicates, so I am not sure where to start on figuring out a solution.

Hi Paul,

I would be happy to share a selected portion of the sheet, would you recommend the "publish" function? To preserve formulas I might use the "Edit by anyone" option.

Many thanks

HEREis some information on Publishing Sheets.For the testing sheets, we would only need the Primary column which can contain any data you want to put into it (I usually just enter the row number or some random letter), then the date columns and the helper columns. If you want to "Save as new" one of the sheets you are having issues with, then delete the extra data/remove sensitive data so that it is strictly what we need to figure out this particular solution, that would work just fine.

The edit by anyone option would allow me to edit the formulas themselves so that I can do some testing and whatnot, so that would be preferable.

Hi Paul,

Thank you for your confirmation, I have published the sheet selecting the edit by anyone option.

Please find it Here

I filtered the sheet (Production AS is not blank) to keep only relevant rows but the publish function seems not to preserve filter settings.

I have highlighted in different colours where I verified a conflict, I also highlighted some cells in the Max column when the Max seems not to be pulling the correct date.

Please let me know if any question or something is not clear.

Many thanks

I have found a few things I would like to test. Would you please add in a few blank columns? 4 text/number and 2 date types should be enough. That way I can compare things side by side without overwriting anything already on there.

All columns have been set up as requested, Paul. Please let me know if any further action on my end may facilitate your testing.

I am really grateful for your kind help. Thank you and Happy Friday!

Hi Paul,

hope you are well.

Have you found my amendments to the sheet I shared?

Were the columns created correctly?

Thanks

I haven't had a chance to work on it the past few days. Did you see my note I left in a blue cell in your published sheet?

I ended up creating my own copy duplicating your dates because I wanted to try a few things with additional date type columns and whatnot.

I am still working on it when I have time, but unfortunately I can't quite get it just right. I keep getting very close, but not quite solved yet.

Just found your note in blue and your comment. Smartsheet didn't notify your kind response on my inbox for some reason.

Many thanks for looking into this, hope the solution will come your way soon.

I will check again the community once back from Christmas leave.

I wish you a marry and joyful holiday time.

Thank you gain for your kind help.

Best,

Alessandro

Hi Paul,

do you have any update on the formula?

Thanks!