How Spiders Are NOT Like People
Last week, I talked a bit about how search spiders are like people, and the positive impact this gradual convergence has had on usability and SEO. Of course, it would be overly simplistic and even potentially harmful to suggest that designing for spiders and designing for people are one and the same task. Doing usability or SEO right also requires understanding the ways in which spiders and people still differ.
A Picture Is Worth 3-5 Words
Spiders are blind, plain and simple. While they've improved incrementally in their ability to penetrate some rich media (images, Flash, etc.), spiders have a diet that is 99% text. The most they'll see of the pictures on any website is those 3-5 words of ALT text you use to describe them.
Humans, on the other hand, are drawn to images. Of course, people are complicated, and sometimes we develop habits such as banner blindness, but generally speaking, pictures attract our attention. We're also visually attracted to change, whether it's blinking red text or dancing hamsters.
Spider Vision: The Best SEO Trick Ever
Do you want to see what spiders see? I'll share one of the best SEO tricks I've ever learned, one that's doubly great because it's so simple. Go to your website and select all of the content (Ctrl-A, if you're a PC user). Now, copy (Ctrl-C) that content, open up any text editor (Notepad works well) and paste (Ctrl-V). Don't try this trick with Word, as you'll get the whole formatted web page unless you tell it to paste only the unformatted text.
See that mess? That's what the spiders see. It's not pretty, it's not organized, and all of your carefully crafted buttons and painstakingly selected photos are nowhere to be seen. By the way, this trick also doubles as an accessibility test, as what the spiders see isn't that different from how a visually impaired person experiences your website.
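If you'd rather script the trick than copy and paste, here is a minimal sketch in Python using only the standard library. It is an approximation of the low-tech approach above, not a description of how any real search spider works. The names SpiderView and spider_view are made up for illustration, and the choice to include ALT text and skip script and style contents is an assumption about what a text-only reader would see.

```python
from html.parser import HTMLParser


class SpiderView(HTMLParser):
    """Collects visible text and image ALT text, roughly as a text-only reader would."""

    # Assumption: contents of these tags are never shown to a reader.
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []       # pieces of text in document order
        self._skip_depth = 0   # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "img":
            # The image itself is invisible here; only its ALT text survives.
            alt = dict(attrs).get("alt")
            if alt:
                self.chunks.append(alt.strip())

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def spider_view(html):
    """Return the text-only view of an HTML string, one chunk per line."""
    parser = SpiderView()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# A made-up sample page, used only to demonstrate the function.
sample = """
<html><head><title>Hamster Dance</title><style>p { color: red; }</style></head>
<body><img src="hamster.gif" alt="Dancing hamster">
<p>Welcome to <b>my site</b>!</p></body></html>
"""
print(spider_view(sample))
```

Run it against your own page's HTML and you get much the same mess as the Notepad paste, with the ALT text standing in for each picture.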
Spiders Are Patient Critters
Spiders have one virtue that people don't: they'll keep coming back to your website day after day, forgiving your bad layout, design flaws, and even long load times (up to a point). Reading through your site day after day is their job, and they do it well. People aren't nearly so forgiving. Our attention spans seem to be getting shorter by the day, and unless your website has a strong brand or a visitor was recommended to it, you've often got seconds to capture a person's interest.
So, Who Do You Design For?
Well, I'm a usability specialist, so I think you know my answer. Design for people and, most of the time, you'll keep the spiders happy. At the same time, though, remember that search results are an important part of many visitors' journeys to your site, so try to play nice with the spiders.
Dr. Pete
· Tuesday, November 6
I'm not aware of a strict ALT text limit, but I've seen suggestions that you run into display problems after more than one line (one author mentioned 150 characters, another 250 characters; I'd suggest even less). The newest version of HTML apparently supports a "longdesc" attribute, but reportedly it isn't supported by most search engines. At any rate, I wouldn't keyword load an ALT tag; keep it short and sweet.
There are some tools that are slightly beefed up versions of the trick I mentioned. I just find the low-tech approach fascinating, for some reason. Maybe because it was one of the first SEO tricks I learned.
Sarven Capadisli
· Wednesday, November 7
@Dr. Pete:
"Design for people and, most of the time, you'll keep the spiders happy."
"Design" what? Do you mean to say how to mark up a document?
"The newest version of HTML apparently supports a "longdesc" attribute"
The HTML 4.01 Strict specification has been a recommendation since 1999. The longdesc attribute was originally meant to be used with accessibility in mind. However, it has not taken off in the real world as predicted.
The img element is a replaced inline element, meaning that its content is pulled in from the resource it references (an image). The alt attribute can then contain plenty of information (quality, not quantity, of characters: a single meaningful sentence will do) in place of the image. It is meant to be used in cases where a user (or a machine) cannot experience the image visually, or where the image is unavailable or broken.
There is a difference between spidering and indexing information. Spiders don't simply extract the "text" from a document. A spider's main purpose is to collect (new) documents and return the structure (the connectivity between resources) to the indexer. A document that is written properly surely contains semantic information that is useful to indicate a number of things, for instance: a logical order, emphasis, nature of content. It is up to the indexer to make sense of that information and serve it back to the user (or other machines).
As far as "seo" is concerned I've written an article on internal seo guidelines (good web development practices) that you might find interesting.
Dr. Pete
· Wednesday, November 7
Sarven, I just meant "design" in the broad sense, less in the details of the markup and more in the sense of information architecture, navigation, layout, etc. It's true that the search engines do take more into account than just the plain text (emphasis cues, such as bold or header tags, are certainly one, as you mention), but I think the exercise is still very valuable as a gross approximation of just how different the spider's field of vision is.
Montreal hotel
· Wednesday, December 5
Spiders are also not like people in that most people will ask you to crush a spider that came over uninvited! ;)



Mike Maddaloni
· Tuesday, November 6
A couple of thoughts:
* Is there a limit to the amount of ALT text a spider will read?
* Have you seen the SEO Text Browser at Domain Tools? Here's an example - http://whois.domaintools.com/thehotiron.com - is this similar to your trick?
mp/m