This was bugging me over the weekend: What is a good way to solve those *Where's Waldo?* [*'Wally'* outside of North America] puzzles, using Mathematica (image-processing and other functionality)?

Here is what I have so far, a function which reduces the visual complexity a little bit by dimming some of the non-red colors:

```
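whereIsWaldo[url_] := Module[{waldo, waldo2, waldoMask},
  waldo = Import[url];
  (* white where a pixel is strongly red, black everywhere else *)
  waldo2 = Image[ImageData[waldo] /. {
      {r_, g_, b_} /; Not[r > .7 && g < .3 && b < .3] :> {0, 0, 0},
      {r_, g_, b_} /; (r > .7 && g < .3 && b < .3) :> {1, 1, 1}}];
  (* close small gaps in the red mask *)
  waldoMask = Closing[waldo2, 4];
  (* overlay the mask on the original image at 50% opacity *)
  ImageCompose[waldo, {waldoMask, .5}]
]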
```

And an example of a URL where this 'works':

```
whereIsWaldo["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"]
```

(Waldo is by the cash register.)

I've found Waldo!

**How I've done it**

First, I'm filtering out all colours that aren't red:

```
waldo = Import["http://www.findwaldo.com/fankit/graphics/IntlManOfLiterature/Scenes/DepartmentStore.jpg"];
red = Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[waldo];
```

Next, I'm calculating the correlation of this image with a simple black and white pattern to find the red and white transitions in the shirt.

```
corr = ImageCorrelate[red,
Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]],
NormalizedSquaredEuclideanDistance];
```

I use `Binarize` to pick out the pixels in the image with a sufficiently high correlation, and `Dilation` to draw white circles around them for emphasis:

```
pos = Dilation[ColorNegate[Binarize[corr, .12]], DiskMatrix[30]];
```

I had to play around a little with the level. If the level is too high, too many false positives are picked out.
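One way to see how sensitive the result is to the level is to count how many separate regions survive at a few different thresholds. This is only a sketch: `candidateCount` is a hypothetical helper, the list of levels is arbitrary, and `corr` is assumed to be the image computed above.

```
(* Hypothetical helper: number of connected regions kept at threshold t *)
candidateCount[corr_Image, t_?NumericQ] :=
  Max[MorphologicalComponents[ColorNegate[Binarize[corr, t]]]];

(* Assumes corr from the previous step; the levels tried here are arbitrary *)
Table[{t, candidateCount[corr, t]}, {t, {0.08, 0.10, 0.12, 0.14, 0.16}}]
```

A count that jumps sharply from one level to the next suggests the higher level is already letting in false positives.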

Finally, I'm combining this result with the original image to get the result above:

```
found = ImageMultiply[waldo, ImageAdd[ColorConvert[pos, "GrayLevel"], .5]]
```

My guess at a "bulletproof way to do this" (think CIA finding Waldo in any satellite image at any time, not just in a single image without competing elements such as striped shirts): I would train a Boltzmann machine on many images of Waldo, covering all variations of him sitting, standing, occluded, etc., with shirt, hat, camera, and all the works. You don't need a large corpus of Waldos (maybe 3-5 will be enough), but the more the better.

This will assign clouds of probabilities to the various elements occurring in the correct arrangement. Then establish (via segmentation) what an average object size is, and fragment the source image into cells of objects that most resemble individual people (considering possible occlusions and pose changes). Since Waldo pictures usually include a LOT of people at about the same scale, this should be a very easy task. Then feed these segments to the pre-trained Boltzmann machine. It will give you the probability of each one being Waldo. Take the one with the highest probability.

This is how OCR, ZIP code readers, and strokeless handwriting recognition work today. Basically you know the answer is there, you know more or less what it should look like, and everything else may have common elements but is definitely "not it", so you don't bother with the "not it"s; you just look at the likelihood of "it" among all possible "it"s you've seen before (in ZIP codes, for example, you'd train a BM for just 1s, just 2s, just 3s, etc., then feed each digit to each machine, and pick the one that has the most confidence). This works a lot better than a single neural network learning features of all numbers.
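The "one model per class, pick the most confident" step can be illustrated with a toy sketch. The 3x3 prototypes and the distance-based confidence below are assumptions made up for illustration; they stand in for trained models and are not a Boltzmann machine.

```
(* Toy stand-ins for trained per-class models: one flattened 3x3 prototype per digit.
   Both prototypes and the scoring rule are illustrative assumptions. *)
prototypes = <|
   "1" -> {0, 1, 0, 0, 1, 0, 0, 1, 0},
   "7" -> {1, 1, 1, 0, 0, 1, 0, 0, 1}|>;

(* Confidence of one model for input x: 1 for a perfect match, falling towards 0 *)
confidence[proto_List, x_List] := N[Exp[-SquaredEuclideanDistance[proto, x]]];

(* Ask every model, then keep the label of the most confident one *)
mostConfident[x_List] := Module[{scores = confidence[#, x] & /@ prototypes},
   First[Keys[scores][[Ordering[Values[scores], -1]]]]];

mostConfident[{0, 1, 0, 0, 1, 0, 0, 1, 1}]  (* "1": one cell off from "1", five off from "7" *)
```

For Waldo the same selection step would run over the image segments instead of over the classes: score every segment with the single Waldo model and keep the highest.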

I don't know Mathematica . . . too bad. But I like the answer above, for the most part.

Still, there is a major flaw in relying on the stripes *alone* to glean the answer (I personally don't have a problem with *one* manual adjustment). An example (listed by Brett Champion) shows that the puzzles, at times, break up the shirt pattern, so the pattern to match becomes more complex.

I would try an approach of shape id and colors, along with spatial relations. Much like face recognition, you could look for geometric patterns at certain ratios from each other. The caveat is that usually one or more of those shapes is occluded.

Get a white balance on the image, and read a red balance from the image. I believe Waldo is always the same value/hue, but the image may be from a scan, or a bad copy. Then always refer to an array of the colors that Waldo actually is: red, white, dark brown, blue, peach, {shoe color}.

There is a shirt pattern, and also the pants, glasses, hair, face, shoes and hat that define Waldo. Also, relative to other people in the image, Waldo is on the skinny side.

So, find random people to obtain the height of people in this pic. Measure the average height of a bunch of things at random points in the image (a simple outline will produce quite a few individual people). If the things are not within some standard deviation of each other, they are ignored for now. Compare the average of the heights to the image's height. If the ratio is too great (e.g., 1:2, 1:4, or similarly close), then try again. Run it 10(?) times to make sure that the samples are all pretty close together, excluding any average that is outside some standard deviation. Possible in Mathematica?
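The outlier-rejecting average described here is short to write in Mathematica. The following is a minimal sketch under assumptions: `blobHeights` and `typicalHeight` are hypothetical names, a plain `Binarize` is a crude stand-in for a real outline step, and the cutoff of one standard deviation is a guess.

```
(* Heights (in pixels) of connected blobs; Binarize with its default threshold
   is a crude stand-in for a proper outline/segmentation step *)
blobHeights[img_Image] := Module[{boxes},
   boxes = ComponentMeasurements[Binarize[img], "BoundingBox"][[All, 2]];
   (#[[2, 2]] - #[[1, 2]]) & /@ boxes];

(* Drop measurements more than k standard deviations from the mean, average the rest *)
typicalHeight[heights_List, k_: 1] := Module[{h = N[heights], m, s},
   m = Mean[h]; s = StandardDeviation[h];
   Mean[Select[h, Abs[# - m] <= k s &]]];

typicalHeight[{40, 42, 38, 41, 120, 39, 5}]  (* 40. : the values 120 and 5 are rejected *)
```

On a real scene, `typicalHeight[blobHeights[waldo]]` would give a first estimate that can then be compared against the image height as described.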

This is your Waldo size. Waldo is skinny, so you are looking for something 5:1 or 6:1 (or whatever) ht:wd. However, this is not sufficient. If Waldo is partially hidden, the height could change. So, you are looking for a block of red-white that is ~2:1. But there have to be more indicators.

1. Waldo has glasses. Search for two circles 0.5:1 above the red-white.
2. Blue pants. Any amount of blue at the same width within any distance between the end of the red-white and the distance to his feet. Note that he wears his shirt short, so the feet are not too close.
3. The hat. Red-white any distance up to twice the top of his head. Note that it must have dark hair below, and probably glasses.
4. Long sleeves. Red-white at some angle from the main red-white.
5. Dark hair.
6. Shoe color. I don't know the color.

Any of those could apply. These are also negative checks against similar people in the pic -- e.g., #2 negates wearing a red-white apron (too close to shoes), #5 eliminates light colored hair. Also, shape is only one indicator for each of these tests . . . color alone within the specified distance can give good results.

This will narrow down the areas to process.
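Combining the checks can be sketched as a simple score with vetoes. Everything named here is hypothetical: each candidate is assumed to be an Association of check results produced by detectors that are not shown, and the key names are made up for illustration.

```
(* Keys for the positive indicators; all names are hypothetical *)
positiveChecks = {"glasses", "bluePants", "hat", "sleeves", "darkHair", "shoes"};

(* Number of positive indicators found; a missing key counts as not found *)
score[c_Association] := Total[Boole[TrueQ /@ Lookup[c, positiveChecks, False]]];

(* Negative checks from the text: red-white reaching the shoes (an apron) or light hair *)
vetoed[c_Association] := TrueQ[c["redWhiteReachesShoes"]] || TrueQ[c["lightHair"]];

(* Drop vetoed candidates and sort the rest, best score first *)
rankCandidates[cs_List] := SortBy[Select[cs, ! vetoed[#] &], -score[#] &];

#["id"] & /@ rankCandidates[{
   <|"id" -> "a", "glasses" -> True, "hat" -> True, "darkHair" -> True|>,
   <|"id" -> "b", "bluePants" -> True, "redWhiteReachesShoes" -> True|>,
   <|"id" -> "c", "glasses" -> True|>}]
(* {"a", "c"} : candidate "b" is vetoed by the apron check *)
```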

Storing these results will produce a set of areas that *should* have Waldo in them. Exclude all other areas (e.g., for each area, select a circle twice as big as the average person size), and then run the process that @Heike laid out with removing all but red, and so on.

Any thoughts on how to code this?

Edit:

Thoughts on how to code this . . . exclude all areas but Waldo red, skeletonize the red areas, and prune them down to a single point. Do the same for Waldo hair brown, Waldo pants blue, Waldo shoe color. For Waldo skin color, exclude, then find the outline.

Next, exclude non-red, dilate (a lot) all the red areas, then skeletonize and prune. This part will give a list of possible Waldo center points. This will be the marker to compare all other Waldo color sections to.

From here, using the skeletonized red areas (not the dilated ones), count the lines in each area. If there is the correct number (four, right?), this is certainly a possible area. If not, I guess just exclude it (as being a Waldo center . . . it may still be his hat).

Then check if there is a face shape above, a hair point above, pants point below, shoe points below, and so on.

No code yet -- still reading the docs.
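For what it's worth, a minimal sketch of the center-point step might look like the following. It rests on several assumptions: the red filter is borrowed from @Heike's answer, the threshold of 0.3 and the dilation radius of 15 are guesses, and instead of pruning each skeleton down to a single pixel it simply takes the centroid of each skeleton piece.

```
(* Red-only image, as in @Heike's answer, with any alpha channel dropped first *)
redOnly[img_Image] :=
  Fold[ImageSubtract, #[[1]], Rest[#]] &@ColorSeparate[RemoveAlphaChannel[img]];

(* Possible Waldo center points; threshold and radius are untuned guesses *)
waldoCenterCandidates[img_Image, radius_: 15] := Module[{mask, merged, skeleton},
   mask = Binarize[redOnly[img], 0.3];            (* keep strongly red pixels *)
   merged = Dilation[mask, DiskMatrix[radius]];    (* merge nearby stripes into blobs *)
   skeleton = Thinning[merged];                    (* skeletonize each blob *)
   (* one point per skeleton piece: the centroid, swapped in for pruning *)
   ComponentMeasurements[skeleton, "Centroid"][[All, 2]]];
```

The returned coordinates could then serve as the markers against which the hair, pants, and shoe color points are compared.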

I agree with @GregoryKlopper that the *right* way to solve the general problem of finding Waldo (or any object of interest) in an arbitrary image would be to train a supervised machine learning classifier. Using many positive and negative labeled examples, an algorithm such as Support Vector Machine, Boosted Decision Stump or Boltzmann Machine could likely be trained to achieve high accuracy on this problem. Mathematica even includes these algorithms in its Machine Learning Framework.

The two challenges with training a Waldo classifier would be:

- Determining the right image feature transform. This is where @Heike's answer would be useful: a red filter and a striped pattern detector (e.g., wavelet or DCT decomposition) would be a good way to turn raw pixels into a format that the classification algorithm could learn from. A block-based decomposition that assesses all subsections of the image would also be required ... but this is made easier by the fact that Waldo is a) always roughly the same size and b) always present exactly once in each image.
- Obtaining enough training examples. SVMs work best with at least 100 examples of each class. Commercial applications of boosting (e.g., the face-focusing in digital cameras) are trained on millions of positive and negative examples.
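As a rough sketch of how the two pieces could fit together, the features below reuse the red filter and stripe correlation from @Heike's answer, and the classifier uses the built-in `Classify`. The two-number feature vector, the helper names, and the lists `waldoPatches` and `otherPatches` are all assumptions; the labeled patches would have to be collected separately.

```
(* Stripe pattern and the 0.12 level are taken from @Heike's answer *)
stripeKernel = Image@Join[ConstantArray[1, {2, 4}], ConstantArray[0, {2, 4}]];

(* Two features per patch: average redness, and the fraction of stripe-like pixels *)
patchFeatures[patch_Image] := Module[{red, corr},
   red = Fold[ImageSubtract, #[[1]], Rest[#]] &@
     ColorSeparate[RemoveAlphaChannel[patch]];
   corr = ImageCorrelate[red, stripeKernel, NormalizedSquaredEuclideanDistance];
   {ImageMeasurements[red, "Mean"],
    ImageMeasurements[ColorNegate[Binarize[corr, .12]], "Mean"]}];

(* waldoPatches and otherPatches are hypothetical lists of labeled image patches *)
trainWaldoClassifier[waldoPatches_List, otherPatches_List] := Classify[
   Join[(patchFeatures[#] -> "Waldo") & /@ waldoPatches,
    (patchFeatures[#] -> "NotWaldo") & /@ otherPatches],
   Method -> "SupportVectorMachine"];

(* Probability that a new patch shows Waldo, given a trained classifier c *)
waldoProbability[c_, patch_Image] :=
  c[patchFeatures[patch], "Probabilities"]["Waldo"];
```

Since Waldo appears exactly once per image, the block-based search would score every block with `waldoProbability` and keep the single highest-scoring one.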

A quick Google image search turns up some good data -- I'm going to have a go at collecting some training examples and coding this up right now!

However, even a machine learning approach (or the rule-based approach suggested by @iND) will struggle for an image like the Land of Waldos!
