
I love M&M’s. I’m partial to the plain Milk Chocolate variety, but I’ve been known to have a Peanut from time to time in order to remind myself why I don’t like them that much. Often, while eating a pack, I’ll wonder how they’re made and how the colors are distributed.
I once took a factory tour at Ben & Jerry’s and saw that they make ice cream by making one flavor per production run and then storing them to be shipped out later. While that kind of production makes sense for ice cream since there are many different flavors and each flavor has many different ingredients, it doesn’t make sense for M&M’s since, except for the color of the candy shell, they are all the same. I assume that all the different colors are made at the same time and they’re combined together along the way into the different size packages.
After wondering about it a little more, I checked out M&M’s web site. According to it, each package of Milk Chocolate M&M’s should contain 24% blue, 13% brown, 16% green, 20% orange, 13% red, and 14% yellow M&M’s. I checked the next few packages of M&M’s that I ate and found that their percentages were not even close to the stated distribution. In my mind, this sort of confirmed my thoughts about how they produce M&M’s: When they make M&M’s, in any production run, they produce the stated percentage of each color and then just fill the packs off a conveyor line or some other weight-based method. This would mean that any single package could be way off from the stated percentage; but analyze the counts over a large number of packages, and they should converge towards the stated percentages.
That’s what I aim to do here.
Overview
I thought about taking a random sampling of packs by grabbing many different packs from different locations, but I ruled this out because it was entirely possible that I would just get the packages that were way off the stated percentages. In addition, they would all be from different production runs and that alone would skew the numbers a bit, and to get a true representative sampling, I’d have to purchase a whole boatload of them. Instead, I decided to get a single “case” of M&M’s since I would be assured that all inside came from the same production run.
M&M’s that are sold at retail come in a cardboard box containing 48 packages of M&M’s. After acquiring a “case”, I counted each package for the total number of M&M’s in the package. Next I counted each color, and then compared the sum of all the colors in each pack to the total of the pack as a form of error checking. All numbers were entered into a database to allow for easy analysis of the data.
Summary Results
48 packages of M&M’s Milk Chocolate [1], containing a total of 2620 M&M’s, were used in this project. On average, each package had 55 M&M’s.
| | Blue | Brown | Green | Orange | Red | Yellow | Total M&M’s |
|---|---|---|---|---|---|---|---|
| Percent expected | 24% | 13% | 16% | 20% | 13% | 14% | ––– |
| Percent observed | 18.36% | 14.16% | 18.44% | 20.76% | 14.20% | 14.08% | ––– |
| Qty. expected | 629 | 341 | 419 | 524 | 341 | 367 | ––– |
| Qty. observed | 481 | 371 | 483 | 544 | 372 | 369 | 2620 |
| Difference | -148 | +30 | +64 | +20 | +31 | +2 | ––– |
| Average per pack | 10.02 | 7.73 | 10.06 | 11.33 | 7.75 | 7.69 | 54.58 |
| Maximum in pack | 16 | 12 | 17 | 17 | 12 | 14 | 57 |
| Minimum in pack | 5 | 3 | 5 | 7 | 2 | 2 | 52 |
| Std. deviation | 2.82 | 2.19 | 2.59 | 2.54 | 2.62 | 2.65 | 1.32 |
| Variance | 7.98 | 4.80 | 6.70 | 6.44 | 6.87 | 7.03 | 1.74 |
The quantity expected row is based on the total numbers of M&M’s observed and calculated using the percent expected values from M&M’s web site.
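As a rough sketch of that calculation (the variable names are mine; the percentages and observed counts are copied from the table above):

```python
# Expected quantity per color = published percentage x total observed.
# Percentages and observed counts are copied from the summary table.
expected_pct = {"Blue": 24, "Brown": 13, "Green": 16,
                "Orange": 20, "Red": 13, "Yellow": 14}
observed = {"Blue": 481, "Brown": 371, "Green": 483,
            "Orange": 544, "Red": 372, "Yellow": 369}

total = sum(observed.values())

# Round to whole candies, as in the "Qty. expected" row.
expected = {c: round(total * pct / 100) for c, pct in expected_pct.items()}
difference = {c: observed[c] - expected[c] for c in observed}

for color in observed:
    print(f"{color:<7} expected {expected[color]:>4} "
          f"observed {observed[color]:>4} diff {difference[color]:+d}")
```

The same three dictionaries can be reused for a second case by swapping in its observed counts.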
Blue, the most populous color according to M&M, was observed to be the third most populous color, and was almost 25% less than its expected amount.
Brown, orange, red, and yellow were all within two percentage points of their expected percentages, with yellow coming closest.
After analyzing all the individual packs’ data, it seems like pack #22 is the closest to M&M’s published numbers, as well as being the most “average” pack in this project.
Each individual pack’s data is listed on a separate page.
This single case does not prove anything, but it does show that some colors were close enough to verify M&M’s claims, while others were off. A second case should be analyzed to confirm the findings.
All M&M’s were brought to my office and donated to sugar deprived co-workers.
Graphs

[Figure: M&M colors by percent]
[1] All packages were 1.69 ounces and were from the same 48-count box. This is the standard newspaper stand/vending machine size.

ocd much??? i never thought you would have finished this one, but you’ve proved me wrong. kudos
My young padawan…never doubt a master.
I’ll have to agree with Beau…OCD displayed in its finest fashion. But, I’m glad someone out there had the patience to complete such a project, I know I would never have made it. All the blue ones would be eaten, and the co-workers would have received half-full bags of brown M&M’s.
Is orange and blue dye less expensive than the others?
There are some obvious errors in the statistical analysis presented.
* Either variance or standard deviation are calculated wrongly, since if s is the standard deviation s^2 is the variance.
* The variance and standard deviation make no sense if given in absolute numbers (of M&Ms), since the packs are not all of the same size. Rather, they should be given as percent points.
* Obviously, the t test, p value and confidence intervals are missing to test the validity of your conclusions.
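The first bullet can be checked directly: squaring each published standard deviation gives the variance that should sit in the same column. A minimal sketch, with values copied from the Std. deviation row of the summary table (they are rounded to two decimals there, so the squares are approximate):

```python
# Variance is the square of the standard deviation.
# Values copied from the "Std. deviation" row of the summary table.
std_dev = {"Blue": 2.82, "Brown": 2.19, "Green": 2.59, "Orange": 2.54,
           "Red": 2.62, "Yellow": 2.65, "Total": 1.32}

implied_variance = {name: round(s ** 2, 2) for name, s in std_dev.items()}

for name, var in implied_variance.items():
    print(f"{name:<7} s = {std_dev[name]:.2f}  s^2 = {var:.2f}")
```

Any column where the printed s^2 differs from the Variance row by more than rounding points to a transcription or calculation slip in one of the two rows.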
I did the same analysis way back in the 1980s when I was working at DEC. Very similar results. Used to really look forward to the packs with outliers, like 18 Greens. Gained 10 pounds during the testing, but that’s another story.
That’s exactly why I brought all the M&M’s to work and let my co-workers deal with them.
Josh, your sampling was flawed. By using packs from the same box you ensured that they all would be the same lot and production is not a random process. If the machine is mixing incorrectly, it will continue to mix incorrectly until someone adjusts it. You need to sample different lots over a long period of time to generate a true population estimate. If you purchased the 48 packs at 48 different locations, you could make valid claims.
By the way, I think I saw a documentary on making M&M’s once, and they make each color on a separate production line, color them at the end in a tumbler and mix the colors together later.
I suspect, with my little industry knowledge, that the design percentage (as we can see, not necessarily the actual percentage) of blue and brown M&M’s per package was decided based on product research; specifically, how consumers ranked each color individually and how appealing the overall color mix was.
Say what you want about all M&M’s being the same, but I bet studies show differently. Your eyes are connected to your stomach. Remember when they came out with different colored ketchup? That didn’t go over so hot. Brown M&M’s are boring. Blue (a color rarely associated with food in the natural world) is exciting and fun.
Greetings and salutations,
I’m a QC Line Inspector for a company that makes specialty grinding wheels. I like your approach to analyzing the tantalizing question as to whether or not the Mars company is being totally honest with us about what their product actually contains. What you did here is EXACTLY the kind of thing we do to determine the quality of our product.
I bet you exceeded the Mars company QC standards with your testing. At the very least if you should need another day job they should be happy to offer you one. If they don’t I’ll be happy to recommend you to my company.
You’re good.
My minor grievance with your testing procedure involves the last line in your presentation. You stated:
“All packages were 1.69 ounces and were from the same 48-count box. This is the standard newspaper stand/vending machine size.”
That’s interesting from a Line QC position because of the aberrations between the number of M&M’s included in each bag vs. the weight of each bag.
A bag that has 57 M&M’s shouldn’t weigh the same as a bag containing 52, yet all bags are treated as weighing 1.69 oz. Perhaps the unit measurement of an ounce isn’t the best resolution if a gram gives a finer gradient.
Perhaps you decided to give the Mars company the benefit of the doubt (a bad idea) and took their word that each bag weighed 1.69 oz without measuring the weight of each bag yourself. And at the resolution of an ounce maybe 1.69 is close enough for government work, but is it really close enough for what you’re trying to do?
Perhaps, but considering the attention to detail you spent on this project you want to get it EXACTLY right and are willing to blow another twenty bucks (((?)do you get a rate for “professional” work? I love M&M’s)) on a “case” of M&M’s to do just that.
Overall I’d have to rate the work you’ve done as superlative. It’s certainly on a par with what we do with millions of dollars at stake, and you’ve done this just for fun.
Let me know if I can help. I love M&M’s.
Rikonjohn
Hi Rikonjohn,
Perhaps another QC check is to weigh the candies from each bag, then analyze the weights to see if there is a statistically significant difference from the net weight.
A student weighed little C&H sugar packets for his project. He used his company's calibrated scale, and drew samples from two lot numbers (two stores). His analysis showed that C&H, on average, was giving away 20% more sugar than the specified net weight.
Limitations – This was not a large sample and only two lot numbers were examined. Also, packet packaging weight was not considered. Since he was the user of the scale, his technical skills were not questioned.
Nameless Chicken: That’s an interesting way to think about the color distribution, but I tend to agree with Bridget and say that the distribution of colors is more likely related to focus groups than cost of materials. I can’t imagine food coloring would have a dramatic cost difference between colors.
Bridget: Agree 100%. I actually have trouble eating the blue ones since there is nothing in nature that is even close to that color.
Rikonjohn: The packages of M&M’s are marked with both ounces and grams, and they are all the same. I assume there are tolerances under and over the printed amount that each package is allowed to be within. I did think about weighing each package, but figured that would not really accomplish anything since I was going to count them any way.
I believe that if you provide your analysis to Mars along with any discrepancy in package weights Mars will send you a very nice letter coupons for replacement packages. I reported an issue to them once (there were several white M&M's in a standard bag) and they were very gracious.
I can’t believe you’ve been doing this for three plus years!
Love to read about one other fool then our selfs. Thank you for making our day Josh. Here in the Netherlands there are just the usual M&M.
Lucky one of our sons live in the US, so we buy the special collections as there are for Halloween, Valentines day, Christmas and so on. And yes. Than back in Holland, we are doing our very very small research!
Enjoy your day, with or without M&M. Josh.
Interesting. Now that you have the observed and expected numbers of colors, you could statistically analyze the data to see if significant differences in color number exist (i.e., is what is posted on their website the same as what you found in your survey). You could do this by performing a chi-square test.
I don’t see the one follow-up that the world really needs to know: have you contacted M&M’s marketing department, and asked them for a comment?
Oh, Josh, I love that you did this project. I think about color distribution every time I eat M&Ms, but I’ve never gone as far as carrying out a scientific study.
I love your site; I like knowing that there’s at least one other person “out there” who thinks about the minutiae of life like I do!
This one time, at math camp, this guy gave me some of his M&M’s. He shook the bag and out came 5, each a different color. (Back then there was light brown, dark brown, yellow, green, and orange.) I remarked that I wondered what the odds of that happening were. I was labeled a nerd, at math camp of all places! I am very happy to see your experiment.
uhhhh dude that made the site you may want to check the company’s numbers again because with the percents on here it equals up to 101% for the percent of each color.
The chi-squared analysis is the way to compare observed frequencies with expected frequencies. I can’t get to the M&M’s web site mentioned in your article. The only percent I could find was the preference percents.
Looks like M&M’s redesigned their web site. I’ll have to find that screencap of the original site I have.
thanks for doing this it helped my group a lot. we needed help with the whole “m&m’s ” thing.
so KUDOS!!!
Why are Yellow, Brown, Reg M&M’s so popular?
This thread has been closed down for reasons of National Security. The statistical distribution of M&M colors, like the quantity of toilet paper shipped to military installations (reveals troop strength, as well as odor), is highly classified information. The NSA is knocking on my door yelling "rendition team, open up". You have been warned.
Reg M&Ms? u mean brown. I dont think M&Ms has the apostrophe in it either
ive done a couple of experitments like this one ended up with the same out come.
i swear you have no life!
I am doing this project for my science fair project now in the 7th grade. I belive this project will help my brain mentally in math.
Hey, what did you do with the empty wrappers (vis-a-vis one of your other research projects)?
You have a lot of time on your hands
Just want you to know that several of my 7th grade students found your analysis of great benefit as they searched for research before an experiment similar to yours. They presented me with your URL in their findings in their lab reports. Also, they did use a 47.9 gram bag.
I attempted to access the official M&Ms website from the link in your analysis, but it seems to be broken or a non-working link. So right now at least, you’re the only site that has the expected percentages. I teach statistics at the undergrad level, and we do similar experiments in-class (math can be tasty, if not fun!). One thing I have noticed is that if you analyze the largest consumer bags, something like 2 lbs, the percentages are right in line with their expected values. However, if you go all the way down to the “Fun-Size”, or packages likely to be given out for Halloween, the percentages are wildly inaccurate; I’ve seen up to 7 of one color in a bag of 8 candies. Also, I’ve made it a point to purchase bags from different lots. Lot numbers are printed on the packaging. This would help randomize your samples. Overall, great website, and thanks for the data!
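The bag-size effect described above follows from basic sampling arithmetic. As a rough sketch, assuming each candy's color is an independent draw at the published 24% rate for blue (a simplification of the real filling process), the typical error in one bag's blue share shrinks with the square root of the number of candies. The 1000-candy figure below is a hypothetical stand-in for a very large bag, not a measured count:

```python
import math

# Standard error of an observed color share in a bag of n candies,
# under a simple binomial model (independent draws; an assumption).
def share_std_error(p: float, n: int) -> float:
    return math.sqrt(p * (1 - p) / n)

p_blue = 0.24  # published share of blue

# 8 comes from the comment above, 55 is the average pack in this project,
# and 1000 is a hypothetical count for a very large bag.
bag_sizes = {"fun size": 8, "1.69 oz pack": 55, "large bag (hypothetical)": 1000}

for label, n in bag_sizes.items():
    se = share_std_error(p_blue, n)
    print(f"{label:<26} n = {n:>4}  std. error = {se * 100:.1f} percentage points")
```

Under this model a fun-size bag can easily miss the published share by 15 percentage points or more, while a very large bag should land within a point or two.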
What does this tell us about random sampling?
Thanks
Why is the total percentage in your pie graph 99% and not 100%?
FYI the "proper" way to test to see how well the true to life data fits our expectations is a chi-squared goodness of fit test. I'm doing this in my statistics class!
@woo
Sometimes percentages don't add up correctly because of rounding error.
I'm a stat teacher who probably doesn't belong here, but …
Indeed, the chi-squared goodness-of-fit test is appropriate here, and doable on a TI-84 (and maybe on a TI-83) using the Observed and Expected lines in your analysis. With a p-value of about 10^(-9), we'd reject the "official" distribution for any reasonable significance level. (Even the Brookhaven physicists looking for a new form of matter are happy with 10^(-6).)
bobstat
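For readers without a TI-84, the goodness-of-fit test suggested in several comments can be sketched in plain Python. This is an illustration under stated assumptions, not the commenters' own working: the counts and percentages are copied from the summary table, and `chi2_sf` is a small helper written for this example (not a library function) that computes the upper-tail probability for integer degrees of freedom. If SciPy is installed, `scipy.stats.chisquare` should give the same statistic and p-value.

```python
import math

# Observed counts and published shares from the summary table,
# in the order blue, brown, green, orange, red, yellow.
observed = [481, 371, 483, 544, 372, 369]
expected_share = [0.24, 0.13, 0.16, 0.20, 0.13, 0.14]

total = sum(observed)
expected = [share * total for share in expected_share]

# Pearson chi-squared statistic: sum of (O - E)^2 / E over the categories.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-squared variable (integer df >= 1)."""
    if df % 2:                       # odd df: start from the df = 1 case
        q, k = math.erfc(math.sqrt(x / 2)), 1
    else:                            # even df: start from the df = 2 case
        q, k = math.exp(-x / 2), 2
    while k < df:                    # each step raises df by 2
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q

p_value = chi2_sf(chi2, df)
print(f"chi-squared = {chi2:.2f}, df = {df}, p = {p_value:.1e}")
```

Note that this treats all 2620 candies as one random sample, which is the same simplification several commenters question, since every pack came from a single box.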
I use this exercise to teach process variability and was told by a representative (that I called on the 800 # printed on the package) that:
a) M&Ms are sold by package weight not number of M&Ms.
b) The color variation is changed every once in awhile just to keep things fresh
c) They do indeed get angry customers who are unhappy because there were too many (or not enough) of a given color.
d) My contention that I received W's, 3's, and even E's instead of the advertised M's was not grounds for a lawsuit.
Jason: I asked that question of M&Ms and was told that individual colors are not what's popular rather its the overall aesthetic.
Phil La Duke
Rockford Greene International http://www.philladuke.wordpress.com http://www.rockfordgreeneinternational.wordpress.com
So I just crunched these numbers, and yea, there's a discrepancy, but it's probably true that you probably got basically one sample from one lot, even though you opened a lot of little bags. Anyway, the culprit is obviously less of the blue- and a little more across all the other colors… (p<.05)
Amazed.
I'm sitting here at my desk with three small bags of leftover Halloween M&Ms sorted on my notebook and thought: is this the expected distribution? I type "color distribution of…" into Google search, it autocompletes M&Ms, and this was the 2nd option.
The article, now almost 5 years old, is not complete or especially thorough, but told me what I wanted to know.
How did my parents/grandparents live without the power of the internet?!
Thanks for sharing your work Josh.
personally, i like the blue and green ones
The yellows taste better because the color makes me happy a split-second before I eat it.
I just conducted a statistical analysis similar to this on M&Ms packing. You might find it of interest to know that statistically, you cannot reject Masterfoods USA's claims regarding the proportions of the different colored candies in a given 1.69 oz. bag of M&Ms candies. They aren't exact, but they are within acceptable standard deviation to be allowed to make the claim.
Stat guy is correct. M&M counts aren't normally distributed, so you can't use a t-test. Chi-square is the appropriate method to measure distributions of discrete data like this. That said, I agree with your assessment that M&M Mars seems to have a significant variation between packages.
To improve the power of your test, I recommend increasing the sample size by measuring the large size bag instead of the standard 1.69 oz bag. Unless you have at least five items in each color category, you don't have enough data for a statistically valid analysis. This also provides you with extra chocolate to eat, which is in general a good thing to have.
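The "at least five" rule of thumb can be turned into a minimum sample size. A small sketch, reading the rule as applying to the expected count in each category (the usual form of the rule, and an assumption about what the commenter meant), with the published shares copied from the summary table:

```python
import math

# Rule of thumb: every category should have an expected count of at least 5.
shares = {"Blue": 0.24, "Brown": 0.13, "Green": 0.16,
          "Orange": 0.20, "Red": 0.13, "Yellow": 0.14}
min_expected = 5

# The rarest color sets the requirement.
smallest_share = min(shares.values())
min_candies = math.ceil(min_expected / smallest_share)

print(f"Candies needed so every expected count reaches {min_expected}: {min_candies}")
print(f"Smallest expected count in a 55-candy pack: {55 * smallest_share:.2f}")
```

Read this way, the rule is a floor on sample size; a larger bag mainly helps by tightening the estimate, not by satisfying the rule.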