Thursday, October 20, 2011

Function: SUMIF()

Welcome to Learning to Excel! In coming days, I hope to start rolling out posts that help the average Excel user learn new skills, better ways to use Excel, and enjoy the experience of solving problems a little bit more.

The first topic on the agenda is the SUMIF() function. This little function is invaluable in adding lines based on a condition. For example, you may have a sheet of shipping data, hundreds or thousands of records, identified by customer name and you'd like to find out how many units you shipped to each customer. Let's set up a sample to work with first.


Optionally: Download the file here: SUMIF() example
As you can see, we're looking at shipments made to three companies: Apple, Google, and Microsoft. For this example, Item Type and Date are irrelevant (they'll be used in later examples). We want to select and add together only those shipments that belong to a particular company; in order to do this, we can use the function SUMIF(). The benefit of this method is that it does not modify the original data. It can stay organized in the same way it is now which is important as the data grows or becomes more complex. The drawback is that we cannot define multiple criteria -- we cannot select all Code shipped to Apple using this function without adding something to our data, another column in this case.

To begin, we can place our three company names to the side of our data. You want to avoid putting it underneath the data as you can run into problems with Circular References. This error is caused when you have a reference in a cell to another cell that is also referencing the original cell. Here is a visual representation of what is going on:

Circular Reference

Cell A depends on Cell B for its value while Cell B relies on Cell A for its value. Imagine if you were falling, a good reference would be grabbing onto a person standing on solid ground, a circular reference would be like grabbing onto a persons hand who is falling along with you. You certainly can put the summary data under the original data, you simply need to be aware of and avoid creating circular references.

Now that we've covered good practices regarding where to place our summarized data, we can begin to write the formulas.


We've added Apple, Google, and Microsoft in column F and added our formulas to column G. Let's take a look at the formula: what it is doing and why it is doing it. First, we'll look at the syntax of the function (or what you need to enter to make it work):

=SUMIF(range, criteria, [sum_range])

Required: range, criteria
Optional: sum_range

In our example above, the sum_range is not optional because when it is omitted, it uses the range field as the data to add together (and words can't be added together like numbers). If we wanted to find all occurrences of ship quantities of 9, we could enter the range into SUMIF() on row D, make the criteria 9, and leave sum_range blank. Likewise, if we wanted to sum up shipments greater than 5, we could write =SUMIF(D:D, ">5").

We have a couple of options when referring to a range, using: an entire column or row, a range of cells (say A1:A10 or A1:D1), or a named range (which we'll cover in more detail at a later time). All we need to know right now is that the data we want is in column A and we can use the entire column as our range in the function. When we click the A header or type in A:A, we are saying select the ENTIRE column, be careful that there is no data below the data you want. If, for example, we had another table of shipments beginning on row 100, it would also be included in the summary and return unexpected results.

The criteria can be static data, a cell reference, or it can make use of > (greater than), < (less than), <> (not equal to) characters if put inside of quotations: ">5" is valid but >5 is not...or it can be a combination of these joined together with &. For example, =SUMIF(D:D, ">" & D2) will sum all values greater than 9 (which is cell D2) in column D.

The sum_range need not be the exact same size as the range, any data that is above or below the range will be omitted, however. If the range is A1:A10 and the sum_range is D:D, the results will still be accurate in our example.

To put this information to use, we have the following formulas in cells G2, G3, and G4:

G2: =SUMIF(A:A,F2,D:D)
G3: =SUMIF(A:A,F3,D:D)
G4: =SUMIF(A:A,F4,D:D)

A:A tells SUMIF() to look at the entire A column. F2, F3, and F4 tells the function to look at the values in those respective cells. D:D tells the function to use column D as the sum column. It looks at each value individually. If A1 matches cell F2, then it adds the value of D1. Here is a visual description:


I hope this guide to using SUMIF() proved helpful! Please leave any feedback or questions you may have in the comments!

No comments:

Post a Comment