How to Create a MEDIANIFS User-Defined Function with VBA
Did you know that Excel does not have a built-in MEDIANIFS function? You know, like AVERAGEIFS or SUMIFS, just with a median calculation instead of an average or a sum? Your boy didn’t, that’s for sure.
Which is why I’d like to give a HUGE shout to Evan at Mosaic Ventures. Like a gangster, Evan:
- Identified the missing MEDIANIFS function in the first place
- Commissioned a VBA-based implementation
- Gave me the OK to release the code to everyone + dog
Fuck yeah Evan. You’re a renaissance man.
Full disclosure you guys — the ~200 lines below are advanced, but heavily commented. Read through the comments and the process will crystallize in your mind grapes — scout’s honor.
More importantly, the logic for identifying which cells to consider and which to throw away is universal. What does that mean for you?
If you want to, say, build a MODEIFS instead of a MEDIANIFS, you simply have to change ONE LINE. That’s it. One.
So with that in mind, let’s flex those code muscles a little bit:
Per usual, we’ll follow the 4-Step VBA Process as we walk through MEDIANIFS:
Step 1 – Setup
Step 2 – Exploration
Step 3 – Execution
Step 4 – Cleanup
To start, Step 1 – Setup is actually minimal. The only real “setup” that we do before getting into step 2 takes place on line 7, where we re-dimension varAccumulator, which will store the values we want to take the median of, to be a single element in size.
This is unfortunately a quirk of VBA — since we don’t know how big varAccumulator will eventually be, it will be repeatedly re-dimensioned, and this two-step declaration is required. (You can read more about the “Array Already Dimensioned” error that results from skipping this step on the Microsoft Support page.)
Nice! With that, we’re ready to move on.
Step 2 – Exploration starts with three guard clauses, occupying lines 10 through 36. Let’s cover each of them:
- If the user mistakenly passes in an empty Range, we want the function to return 0
- If the user mistakenly passes in a mismatched number of Range / Criteria pairs, we want the function to return 0
- If the user mistakenly passes in Ranges that have different numbers of rows or columns, we want the function to return 0
Check #1 takes place from line 15 to line 18. If the median_range, which is passed-in by the user, is Nothing, we assign the output to be 0 and exit the function. Boom.
Check #2 takes place on lines 22 through 25. Every Range the user passes in (like A2:A30) needs to have a Criteria (like “<250”) — this means that there should be an even number of elements in range_and_criteria_pairs. (The name is a giveaway, after all.) We validate that this is true by subtracting the lower bound of the passed-in ParamArray variable range_and_criteria_pairs from the upper bound and making sure the result is an ODD number.
Tricky, right!? Let’s do a quick thought experiment to see why this is correct.
Suppose the user passed in C2:C30 as the Range and “>=5000” as the Criteria. In that case, ParamArray(0) is the C2:C30 Range and ParamArray(1) is “>=5000”. The upper bound is 1 and the lower bound is 0. 1 minus 0 equals 1, which IS evenly divisible by 2 — meaning that our check passes!
On the other hand, suppose the user passed in C2:C30 as the first Range, “>=5000” as the first Criteria, and D2:D30 as the second Range, but forgot to enter a second Criteria. In that case, the upper bound is 2 and the lower bound is 0. 2 minus 0 equals 2, which IS evenly divisble by 2 — meaning our check fails! Phew!
Check #3, the final guard clause, is on lines 30 through 36. For the MEDIANIFS function to work correctly, all of the passed-in Ranges need to be the same size and shape. (This is the case for SUMIFS and AVERAGEIFS too.) Verifying that, fortunately, is a snap — on line 31 we simply validate that the number of rows and columns in the passed-in Ranges match the number of rows and columns in the median_range, which was passed-in by the user and holds the column he or she would like to see the median of.
Wahoo! You’ve already come a long way, so take a second to celebrate. Seriously — that was some heavy data validation! Hats off to ya 🙂
Our Step 2 – Exploration begins in a new direction on line 43, where we start looping through all of the rows that are potentially going to be added to our median calculation.
Upon entering this loop, we immediately initialize blnAllMatched to False. We will use blnAllMatched to quickly identify situations where the Criteria is not met — but before then, we need to figure out what kind of operation (like less than or equal to) we’ll be conducting, as well as the threshold (like 200 or “Apples”) for that operation, for each Range / Criteria pair! On line 49, we start looping through the range_and_criteria_pairs, since the user can pass in as many as he or she would like.
For each Range / Criteria pair, we run through the beast mode Select Case statement from line 52 to line 96 to identify the operation (like less than or equal to) and the threshold (like 200 or “Apples”).
Lines 53 to 58 are very straightforward — if the first two characters of the Criteria string are less than or equal to, greater than or equal to, the operator (stored in strOperator) is assigned and the threshold (stored in varThreshold) is extracted.
Lines 62 through 69 handle the not equal to sign. As indicated by the comments, this is a little tricky — the user could be trying to match a number (like “<>200”) or a string (like “<>Apples”)! To handle both cases, we use the handy IsNumeric built-in function (as well as the equally-handy IsEmpty function, since IsNumeric evaluates to true on empty cells) on lines 64 and 65.
Lines 74 through 79 function the same way as the first two clauses of the our original Select Case statement — if the first character of the criteria string is less than or greater than, we again assign the strOperator variable and extract the varThreshold.
Lines 83 through 89 handle the equal sign, which is tricky — just like the not equal to sign. We use the same IsNumeric / Not IsEmpty strategy here on lines 85 and 86 to tell the difference between numbers and strings.
Finally, lines 95 to 102 handle everything else. Just like the comments say, we’re going to bucket everything else as an equality check — fortunately, this means that we can use the exact same logic from the equal sign check! (The only difference is that in this case we do not need to chop off the equal sign character.) Nice!
Hell yeah! At this point, you have an operator (stored as strOperator) and a threshold (stored as varThreshold) — all that’s left to do is check the cell value against the operator and threshold! This is where the blnAllMatched variable comes in handy too.
At last y’all, it’s time for Step 3 – Execution, which occurs between lines 108 and 189. Let’s get it…
Once again, we’re going to use a Select Case statement from lines 111 to 160 — this time, though, we’re checking the strOperator variable, applying logic as needed.
On line 112, for example, we handle “<>” (does not equal) by checking the cell value to see if it is not equal to varThreshold — if so, blnMatched is set to True because that Range / Criteria pair was successful!
Line 118 handles “>=” (greater than or equal to) the exact same way — same for line 124 (“>”, greater than), line 130 (“<=”, less than or equal to), all the way through the end of the Select Case statement on line 161. In fact, you might even be thinking that this section is very similar to the last section of Step 2 – Exploration — and you’d be right!
Line 168 is critical here you guys, so listen up. If blnAllMatched is False at the end of a check for ANY Range / Criteria pair, we know that this cell will NEVER be added to median calculation! If that’s the case, we can break the Range / Criteria loop (which is using lngCriteriaIdx) and move on to the next row.
Almost there you guys! Lines 174 through 181 handle value accumulation — let’s break that down.
On line 174, we check to see that blnAllMatched is True. If so, we know that every Range / Criteria check was successful, so we’re PROBABLY going to add the median_range value to our accumulator array, varAccumulator.
On line 176, we make sure that the value from median_range is numeric (since you can only calculate a median on numbers) and not empty (again, since you can only calculate a median on numbers). If the value passes those checks, we add it to the end of varAccumulator, then increase the size of varAccumulator by one on lines 177-178.
Once we get to line 186, our code has worked its way through every row! Since the last time a value was added to varAccumulator we increased its size by one, we know that the last element is empty — which is why we chop it off.
Finally, we take advantage of WorksheetFunction.Median, which calculates the median of varAccumulator like a champ. This is where you can substitute any statistic you’d like — for example, if you want to get the mode instead of the median, simply call WorksheetFunction.Mode instead! So right, so tight…
(The last few lines, from 193 to 201, are “nice to haves” — they add a description to the MEDIANIFS function and drop it into the “Statistics” group.)
PHEW! That was a lot. In case you’d like a visual review, here’s a 12-minute video of me walking through the same steps and demo-ing the new functionality:
Is your MEDIANIFS function humming along? Are you remixing it to build a different