Mark Stetzer's Blog

My semi-random thoughts on big data, data science, Legos, and just about anything else someone would label nerdy...

Applied combinatorics and Legos

28 May 2013

The following is a blogged version of a 5-minute lightning talk I gave recently at a Data Science Indy meetup. The title of the talk was Applied Combinatorics or: How much do I have to spend to get the Chicken Suit Guy?

Lego Series 9 Minifigures

I love Legos. More specifically, my 2 year-old son loves Legos, and I love playing with him. One of our favorite things about Legos are their minifigures. Lego has these neat rotating series of minifigs (as they're called) where they create some unique figs and sell them in "blind bags" so that you don't know which figure in the series you are getting. Pictured above is the Series 9 collection, which is comprised of 16 unique figures like Mr. Good and Evil, Heroic Knight, and Chicken Suit Guy.

A store near our house sells these bags for $3, and we wanted to get Chicken Suit Guy. The question is, how do we get him without looking like a creepy person in the store that is feeling up all of the lego bags? One option would be to just start buying bags until we get him. In fact, that's what his mom and Grandmother wanted to do. Me being the nerdy fun killer I knew there was a smarter, albeit more math-involved, way to get Chicken Suit Guy without breaking the bank. Math to the rescue! Here's the breakdown:

Problem

At first we wanted Chicken Suit Guy, but we didn't want to spend any more money than we had to. That eventually turned into us wanting all 16 figures with spending the least amount of money.

Known Facts

  • Blind bags cost $3 (each).
  • The figure inside a blind bag is random.
    • The degree of randomness is unknown (some figures appear to be more rare than others), but all figures in this series appear to be reasonably common.
  • You can go on eBay and buy figures like Chicken Suit Guy, typically for about $7.

Approach

In combinatorics, this is a variation of the classic coupon collector's problem. Simply put, given n minifigs, how many minifigs do I expect to buy before getting all of them. Since this was going to be presented at a data science meetup, my first thought was to talk about logarithms and harmonic numbers, but I was told there were a fair amount of people at the meetup who were more data science-curious, so I instead decided to base my talk on the output of running a large number of samples and taking the average as a determining factor. I would leave something for the stat/probability geeks in the house though, as I'd do my simulations in R.

Below is R code that defines a series of functions to approximate how many purchases it would take to collect a distinct set of all figures of size n, how many purchases would be needed to finish a set of size n given a starting size seen, and how many purchases it would take to see the next unique figure, again given set size n and starting size seen. At the bottom of the code, we do an example trial run.

An example run of the code yields the following results:

Money spent to acquire minifigs

Total Figures Unique Figures Money Spent
1 1 3
2 2 6
3 3 9
4 4 12
5 5 15
6 6 18
7 6 21
8 7 24
9 8 27
10 8 30
11 9 33
12 9 36
13 10 39
14 11 42
15 11 45
16 11 48
17 11 51
18 11 54
19 12 57
20 12 60
21 12 63
22 12 66
23 12 69
24 12 72
25 12 75
26 12 78
27 12 81
28 12 84
29 12 87
30 12 90
31 12 93
32 12 96
33 12 99
34 12 102
35 12 105
36 12 108
37 12 111
38 12 114
39 12 117
40 13 120
41 13 123
42 14 126
43 14 129
44 14 132
45 14 135
46 14 138
47 15 141
48 15 144
49 15 147
50 15 150
51 15 153
52 15 156
53 15 159
54 16 162

Conclusions

What conclusions did I draw from my simulations?

  • It will take about 54 purchases, or $162, on average to collect a complete set of all 16 minifigs.
    • Going to eBay and buying each fig for $7 each is actually cheaper, with a total cost of around $112.
  • If I switch to eBay purchasing after collecting 9 figures, I will optimize my total cost down to ~ $90.

$5.63 / figure is pretty good considering a starting cost of $3 for a random selection!

Addendum

As one meetup member pointed out, I am not assigning any value to duplicates that we collect. Honestly that's mainly a factor of my son being little and losing his toys all over the house, so duplicates are good for spare parts :)

Meetups I'm involved with

Personal Pages