Don’t you hate blog posts that have some stupidly audacious title like “How to Beat Google at Search for $50,000 in 10 Easy Steps” and then fail to deliver the goods? Well, get ready, because this isn’t going to be one of those posts. SetJam is the world’s best search engine for online TV shows and movies. We really did beat Google (and every other company that’s built a similar search engine). We really did it for $50,000 ($50,231 to be exact). And yes, I’m actually going to explain to you how we did it. And when I say “explain,” I don’t mean some vague platitudes about hard work and listening to your users. I’m actually going to give you “secret” information–the real details that let us pull this off–details so good you could copy us.
One word on why I’m doing this. Partly, it’s my belief in openness and sharing. I think we’re all in this “thing” together–call it technology, entrepreneurship, life, whatever. I owe a great deal to this most open of platforms, the internet, and if I can help you make it better, the better for all of us. I’m also doing this out of self-interest though. I don’t have the marketing power of Google (or any of our competitors). By giving up the goods, I expect you to:
- Use SetJam.
- Tell your friends, your family, the guy you buy coffee from–EVERYONE to use it too.
- Tell us how to make it better. We know it’s broken, but we’re too close to it to see HOW it’s broken. We need you to take the time to let us know.
Okay. I’m going to start with 3 of those platitudinous rules, because they apply to all startups. After that, I promise to get into the secret details (with video examples!) that are specific to how we built a better search engine than Google.
Step 1: Don’t try to beat Google. Try to solve a problem. We weren’t TRYING to build a search engine. We were just trying to watch TV shows and movies online. If we had started out with the goal “I’m going to beat Google at search,” I think we would have failed. That’s because we would have made our search TOO much like Google. We would have seen all the amazing things they do that make web search great and we would have ineluctably copied them. Since we were focused on solving a real-world problem, though, we were able to take a completely different approach.
Step 2: Don’t be afraid. I would never have set out to build a search engine, because I believed search engines were really “hard-core” technology that operated through “secret algorithms” and cost billions of dollars to make. Not true. Search is just like any other technology problem. The funniest thing to me is that I was so terrified of search that I didn’t even realize what we’d built until I tried to explain it to other people. I would use phrases like “TV guide for the web,” “A concierge for online video,” “Your DVR for online TV.” No one had a clue what I was talking about. I’d then try to explain it with “Just add your favorite shows, and we’ll bring you the best links, perfectly organized.” I kept thinking, if only there were some easy way to explain a magic text box that brings you the best links. I kid you not.
Step 3: Get your product out there and listen. We put SetJam out into the wild well before it was usable. We did this because I really wanted to see our mistakes early. My initial vision for SetJam was a way to build a TV queue (much like you do with a DVR) and then to bring you the best links to watch your shows. The problem with this is that to build a queue, we need to know who you are, which means registering–blech! To mitigate this issue, we used Facebook Connect to remember who you are instead of making you create another account. Instead of mitigating the issue, however, it exacerbated it. Not only did we get complaints that we were forcing people to authenticate before seeing any results, we also got complaints that we were forcing them to authenticate through Facebook so we could spam their friends!
These problems seem obvious now, but when you’re in the heat of the product battle, you miss obvious things. We KNEW our results were amazing, so we never thought that anyone would need PROOF before starting to build a queue. Why not save the step of having to manually add the search results? We KNEW we were using Facebook to PROTECT our users from creating another account and we KNEW we would never write anything to their News Feeds unless they told us to. Our users had been burned before, though, and they didn’t trust us yet. If we hadn’t gotten SetJam out and listened, we’d probably have lost most of our visitors BEFORE they could even see what we’d built.
That’s it for the platitudes. Now let’s get into the secret details. For most of these, I’m switching over to video mode, so I can show you what I’m talking about.
Step 4: Focus your corpus. Oh shit… the secrets are in Latin! Don’t worry, this is just search engine jargon. Your search “corpus” is the body of information you’re trying to search against. If you’re a small team with limited resources, I do not think you’re going to beat Google at searching the entire internet. By focusing on a smaller (yet still REALLY important) body of information, like TV shows and movies, you can do things they can’t, though. Check out the video:
Step 5: Index what Google doesn’t. Google doesn’t index all the data in the world. In fact this may surprise you, but Google doesn’t even index the MAJORITY of the data in the world. That’s because MOST of the data in the world is locked up behind password protected sites or not even networked. Bring that information to your search and you’ve gone a LONG way to beating Google. Check out the video:
Step 6: Cut out the clutter. One of the biggest problems we face as humans today is that there is just TOO much information. This means that getting the MOST information (by indexing things Google doesn’t) is just the beginning. Your next step is to get rid of the stuff that doesn’t matter. Check out the video:
Step 7: Organize your results for a purpose. Google has no specific purpose. Because of this, it organizes its search results in a very generic way. If you’ve followed step 1 and started by trying to solve a real problem, you can organize your results in a way that serves that SPECIFIC purpose. Check out the video:
Step 8: Prove the negative. Okay. I can’t BELIEVE I’m giving this one up because this is some secret mojo that makes SetJam great. Basically this rule means that sometimes it’s just as valuable to show what is NOT available as it is to show what IS available. You all OWE me for this one. Check out the video:
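To make that idea concrete, here is a minimal Python sketch. It is an illustration only, not SetJam’s actual code: the `EpisodeResult` class, the `results_with_gaps` and `render` functions, and the example.com link are all made-up names. The assumption is that you have a full catalog of episodes from somewhere, plus a separate set of links you actually found, and you report on every episode in the catalog rather than only the ones with links.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EpisodeResult:
    """One row in the results list (hypothetical shape)."""
    season: int
    episode: int
    link: Optional[str]  # None means "we looked and found nothing"


def results_with_gaps(catalog, links):
    """Return one result per catalog episode, including the ones with no link.

    catalog: iterable of (season, episode) pairs that are known to exist.
    links:   dict mapping (season, episode) to a URL we found.
    """
    return [EpisodeResult(s, e, links.get((s, e))) for (s, e) in sorted(catalog)]


def render(result):
    """Show the link if we have one, otherwise say so explicitly."""
    label = f"S{result.season:02d}E{result.episode:02d}"
    if result.link:
        return f"{label}: {result.link}"
    return f"{label}: not available online"
```

The design choice is that a missing link is a result in its own right: the user sees “not available online” and can stop hunting, instead of wondering whether the search just missed it.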
Step 9: Keep it simple. This rule is straight out of Google’s handbook, but it bears repeating because if you do NOT follow it, Google will kick your ass. I know you want cool features X, Y, and Z, but let me just tell you that your users will not get them. Never underestimate how complicated and confusing new software is to your average person. If you can do one thing and do it in a way that people actually get, you are WAY ahead of 99% of your competition. Check out the video:

(Btw, if you’d like to see how I reacted when I found out that Clicker, an $8 million startup, launched 3 weeks before us, check out this post.)
Step 10: Test Driven Development. This last rule is not one that I can really explain in a video, but it is absolutely critical if you want to beat Google at search. Here’s a scenario you’ve probably never considered unless you’ve tried to build a search engine. You’ve just finished your prototype and you’re looking at your results. You type in a query that gives you bad results. You figure out the problem and change your code to fix that problem. How do you know that you didn’t just ruin the results for the 9 million OTHER things you index? Ah-ha… you don’t unless you’ve been building unit tests and data integrity tests all along the way. Without these tools you will be in a never-ending cycle of fixing problems while simultaneously causing them elsewhere. If you haven’t been a believer in test driven development before, you’d better GET that religion before you try to become a search company.
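Here is a rough Python sketch of what such a safety net could look like. It is not SetJam’s test suite: the tiny `INDEX`, the toy `search` function, the `GOLDEN` table, and the example.com URLs are all invented for illustration. The idea is to keep a list of “golden” queries with the results you expect, and re-run all of them after every change, so that fixing one query cannot quietly break another.

```python
from typing import Callable

# Hypothetical miniature index: normalized title -> link.
INDEX = {
    "the office s01e01": "https://example.com/office/1",
    "lost s01e01": "https://example.com/lost/1",
}


def search(query: str) -> list[str]:
    """Toy search: return links whose title contains every query word."""
    words = query.lower().split()
    return [
        url
        for title, url in INDEX.items()
        if all(word in title.split() for word in words)
    ]


# Golden queries: results we have checked by hand and never want to lose.
GOLDEN = {
    "office": ["https://example.com/office/1"],
    "lost s01e01": ["https://example.com/lost/1"],
    "seinfeld": [],  # known to be absent, so an empty result is correct
}


def run_golden_queries(
    search_fn: Callable[[str], list[str]],
    golden: dict[str, list[str]],
) -> list[str]:
    """Run every golden query and return a description of each mismatch."""
    failures = []
    for query, expected in golden.items():
        actual = search_fn(query)
        if actual != expected:
            failures.append(f"{query!r}: expected {expected}, got {actual}")
    return failures


if __name__ == "__main__":
    problems = run_golden_queries(search, GOLDEN)
    print("all golden queries pass" if not problems else "\n".join(problems))
```

Every time a user reports a bad result, add the query and the correct answer to the golden table before you touch the ranking code. The table grows into a record of every mistake you have already fixed once.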
That’s it. 10 steps to do the impossible. Now get out there and start making the world better. Before you do, though, sign up for our private preview at www.setjam.com. To be fair, SetJam still has a LOT of issues. We’re still battling dupes, merged series, and missing links. We still don’t index half of what we want to. Despite all this, SetJam is still the easiest way to find and watch full-length TV shows and movies online, and with your help, it will only get better!
