(From "Today on The World")
One of the first questions people ask when they're putting up a Web page is "How do I get it listed in all the popular search engines?"
You don't have to fool around with <META> tags or use a special program or pay anyone money. Really. Search engines are designed to list everyone who wants to be listed -- just go to the search engine's front page and look for a link named "Add a URL", "Index My Site", "Please List Me", etc.
For instance, let's go to AltaVista:
At the bottom of the main page is a set of links:
> About AltaVista | Set your Preferences | Add a Page | Text-Only Version
Following the "Add a Page" link will show you a page of policies ("Please don't spam us") and give you a little form where you can type in your URL.
That's all you have to do. Give it the URL of your front page, and the "crawler" from the search engine will go there the next time it runs (usually in a day or two) and follow all your links to the other pages on your site. (Most search engines work this way.)
Let's try it with HotBot:
Again, the link we want is way down in the bottom right corner in the tiniest text imaginable:
> About Wired Digital | Our privacy policy | Text-only version | Add URL
This also presents a simple form where they ask for the URL of your site (as well as the common "Check this box if you want us to spam you" item that shows up on everyone's feedback form these days).
So just go to each search engine you know and look for something you can click on to add your site. (Sometimes it's pretty well hidden.)
I love Google (www.Google.com), but Google's site does not make it easy to find its "Add A URL" link (I had to search Google's site to find it): "Add A URL to Google". Note that Google will likely find your site by "crawling" links from other people's pages, even if you don't manually add your site to Google.
There are also other services you can use, and programs you can buy, that go to a dozen search engines and send in that URL for you, but why bother with that when it's so trivial to do it yourself?
Different search engines have different amounts of crawling on their to-do lists, and they crawl at different levels of aggressiveness, so depending on the search engine it may be a matter of hours or days or weeks before your listing is added. Also, some update their database more frequently than others, so when one of your pages changes, they may still list the old content for a while (of course, you can always re-submit your URL when you change your pages...)
And, because the search engines work by following links from page to page, they may well have found your site (or at least part of it) already if you have friends who have linked to your site from theirs. The more links there are TO your site, obviously, the more likely it is that both human beings and robots will visit you.
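Here is a toy Python sketch of that link-following idea. It is only an illustration, not how any particular search engine works: the `PAGES` dictionary stands in for the Web, and the example URLs and the `crawl` helper are made up for this sketch.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical mini-Web: URL -> HTML. A real crawler would fetch pages over HTTP.
PAGES = {
    "http://friend.example/": '<A HREF="http://bees.example/">Fred\'s bees</A>',
    "http://bees.example/": '<A HREF="http://bees.example/honey.html">Honey</A>',
    "http://bees.example/honey.html": "All about honey.",
    "http://bees.example/secret.html": "Nobody links here.",
}


class LinkCollector(HTMLParser):
    """Collects the HREF of every <A> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(start_url):
    """Return the URLs reachable from start_url, in the order they were found."""
    seen = [start_url]
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        collector = LinkCollector()
        collector.feed(PAGES.get(url, ""))
        for link in collector.links:
            if link not in seen:
                seen.append(link)
                queue.append(link)
    return seen


found = crawl("http://friend.example/")
```

Starting from the friend's page, the crawl reaches the bee site's front page and its honey page, but never sees `secret.html`, because nothing links to it.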
You can stop here if you're happy just being listed and don't feel the urge to tweak your site in minor ways to be more competitive with other sites.
So what IS the deal with all those pages that have fifty zillion copies of their keywords at the bottom of the page in semi-visible black-on-black lettering? What IS the deal with the mysterious <META> tag?
Some people want to make it more likely that their site is listed near the top of the search engine's results (usually because they're selling something or because they're offering something that is also available from several better/more popular sites) so they take advantage of the search engines' page-ranking technique to improve their site's score. Know how some search engines give you a little "confidence rating" (100%, 90%, 20%...) for each site? If you understand the algorithm the search engine is using for these calculations, you can improve your score.
Each search engine uses a different technique (with some overall similarities) and in many cases we can only speculate on how they work (the search engine makers don't want to tell the people advertising porn how to get their site listed on everyone's search results!) So please take this advice with a grain of salt of unknown size and shape.
Let's assume that your page is "Fred's library of Bee-Keeping tips". In general, search engines may be scoring your site based on these things:
- Do the words the people are searching for occur on your page? (This is usually what determines whether or not your page is shown at all in the search results.) So if your page is about bees and someone is looking for "bees", they'll find your page if it says "bees" on it somewhere... but not if it only calls them "yellow buzzy bumble-things." Similarly, if your page consists entirely of pictures of bees, with no text, it won't be found.
- Do the words occur relatively often in the text? If your page is very long and says "bees" a dozen times, it may be listed above a page of equal length that mentions "bees" once... but below a shorter page that also mentions "bees" a dozen times. In other words, some search engines look for "keyword density".
- Do the words occur close to the top of the page? A page that begins with the sentence "Bees are fun." could be listed above a page that begins "There are few things more fun than bees."
- Do the words occur in the <TITLE> of the page? (And, as above, is their density high relative to the length of the <TITLE>? Do they occur near the start of the <TITLE>?) A title of "Bees" may rank higher than "Bees and other stuff", which may rank higher than "Other stuff and bees". (Bear in mind, also, that some people keep their bookmarks alphabetized, which is another reason your bee site's title should start with "B" for "Bees" and not "W" for "Welcome To" or "T" for "The Home Page Of". Also, many Web browsers chop off the end of the <TITLE> if it's more than about 64 characters!)
- Some search engines index the ALT="..." text attached to your images, and some don't. Some index the filenames of the image files, and some don't. (Some even let you search for images, such as HotBot and GifWizard.) So if you have a picture of some bees, you might want to change <IMG SRC="pic1.gif"> to <IMG SRC="bees1.gif" ALT="Bees">. (Besides, using ALT="..." is always a good idea if you want your page to be viewable when the reader hasn't loaded the graphics... or can't load the graphics.)
- Some search engines award bonus points to shorter URLs, under the assumption that they're the "top" of the site. So "http://www.mysite.com" would rank above "http://www.mysite.com/root/home/index.html".
- <META> tags can give search engines more hints. Read on...
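To make the first four checks in the list above concrete, here is a rough Python sketch. It is pure guesswork in the same spirit as the list: the `toy_score` function and its weights are invented for illustration, and no real search engine is known to use this formula.

```python
import re


def words(text):
    """Lowercase the text and split it into words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def toy_score(keyword, title, body):
    """Score a page for one keyword; 0.0 means the page would not be listed."""
    keyword = keyword.lower()
    body_words = words(body)
    title_words = words(title)
    if keyword not in body_words and keyword not in title_words:
        return 0.0

    score = 0.0
    if keyword in body_words:
        # Keyword density: share of body words that are the keyword.
        score += body_words.count(keyword) / len(body_words)
        # Position: a keyword nearer the top of the page earns more.
        score += 1.0 / (1 + body_words.index(keyword))
    if keyword in title_words:
        # Title: same idea, density plus position within the <TITLE>.
        score += title_words.count(keyword) / len(title_words)
        score += 1.0 / (1 + title_words.index(keyword))
    return score


a = toy_score("bees", "Bees", "Bees are fun.")
b = toy_score("bees", "Other stuff and bees",
              "There are few things more fun than bees.")
c = toy_score("bees", "Insects", "Yellow buzzy bumble-things.")
```

With these made-up weights, page `a` outranks page `b`, and page `c` scores zero because it never says "bees" at all.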
<META> is the all-purpose "do with it what thou wilt" tag left purposefully undefined in the HTML specification. In the words of the W3C HTML 4.0 specification:
> The META element is a generic mechanism for specifying meta data.
Its reason for existence is that it gives you a place to put special-purpose items that people aren't meant to see; in other words, the people who write a program for creating or editing HTML can store their own special stuff there without affecting anything else. Some search engines have decided to encourage the use of <META> for lists of keywords, so that your page can be counted as containing keywords without having to have them all in the "clear text" of your page. It is not clear how many search engines can see <META>, and there is differing advice on how to format the keyword list (commas, spaces, or commas followed by spaces?), but the form used most often is:
<META NAME="keywords" CONTENT="word,word,word,word">
...with commas between the words and no spaces after the commas. An example for the front page of the bee site could be:
<META NAME="keywords" CONTENT="bees,bee,beekeeping,bee-keeping,bee keeping,
hive,hives,apiary,apiaries">
Because some search engines, when you search for "bee", will find "bees" -- but some won't -- it makes sense to list both the plural and singular versions of nouns. (Especially in the case of "apiary"/"apiaries"!) Putting "apiary" here is a good idea because some people might refer to your hives with that fifty-cent Latin-derived word instead of the "hives" you've mentioned all over your page. The <META> keywords give you a chance to include the words you didn't say in the body of the page.
This doesn't mean you should pack every word in the dictionary onto your "Keywords" list. Most search engines weight things so that if your keyword list is "bees,spatulas,pork rinds,Pez,nougat,elastic, pants,bubble gum,socks" the keywords will be considered less important than if the list was just "bees,spatulas". Also, the keywords at the start of the list may be considered of greater importance than the ones at the end of the list. I recommend listing first the keywords that differentiate this page from the other related pages on your site, followed by the keywords which are the same on all your pages, with the really general stuff last. For instance, if you have a page about honey and a page about queen bees, you might use:
<META NAME="keywords" CONTENT="honey,clover,flowers,bees,bee,
beekeeping,apiaries">
<META NAME="keywords" CONTENT="queen,queens,queen bee,queen bees,
bees,bee,beekeeping,apiaries">
Why do we have "queen", "bees", AND "queen bees"? Because the person looking for bee information might be searching for "The phrase 'queen bees'" or "'queen' and 'bees'". Compound words (including hyphenated ones) and phrases can be included both as individual words and as things with spaces (or hyphens) in them to ensure that you cover all the variants. (Don't forget variant spellings! A page about colors might want to list "colours" for the British people.)
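If you have many pages, the ordering advice above (page-specific keywords first, site-wide ones after, no repeats) can be automated. The following is a minimal Python sketch; the `keywords_meta` helper is a made-up name for illustration, not part of any standard tool.

```python
from html import escape


def keywords_meta(page_specific, site_wide):
    """Build a keywords <META> tag: page-specific words first, no repeats."""
    ordered = []
    for word in list(page_specific) + list(site_wide):
        word = word.strip()
        # Skip blanks and anything already listed (repeats don't help).
        if word and word not in ordered:
            ordered.append(word)
    # Commas between the words, no spaces after the commas.
    content = escape(",".join(ordered), quote=True)
    return '<META NAME="keywords" CONTENT="%s">' % content


site_wide = ["bees", "bee", "beekeeping", "apiaries"]
honey_tag = keywords_meta(["honey", "clover", "flowers"], site_wide)
queen_tag = keywords_meta(
    ["queen", "queens", "queen bee", "queen bees", "bees"], site_wide
)
```

Both results match the hand-written tags for the honey and queen bee pages, except that each comes out on a single line.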
Note that if the list included "bees,bees,bees,bees" that would probably not help you much if someone were looking for "bees". Anyone who really wants to find pages about bees should be able to find your page by default even if you don't use every trick in the book to lure them in -- that's what the search engines are for. Using repeated words or enormous lists of words unrelated to the topic at hand will just set off the search engines' bozo detector. As HotBot says:
> If HotBot recognizes any spoofing technique, it will severely penalize a
> page's ranking.
There are a zillion other things you can do with <META>, but the only other kind of <META> relevant to the search engines (or so I'm told) is "description":
<META NAME="description" CONTENT="Beekeeping tips and tricks from
a bee person who has been stung over 1,000,000 times.">
Most search engines don't use it when generating their summary of the sites they found, but a few will show it as the description of your site (whereas most will show either the <TITLE> or the first couple of lines of the body text).
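As a sketch of how a search engine might pick the summary it shows, here is a small Python example using the standard `html.parser` module. The `HeadReader` class and the `summary` function are hypothetical names, and the "description first, otherwise the title" rule is just one of the behaviors described above, not a documented algorithm.

```python
from html.parser import HTMLParser


class HeadReader(HTMLParser):
    """Pulls the <TITLE> text and named <META> values out of a page."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta = {}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content"):
            # Store META values under a lowercased name, e.g. "description".
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def summary(html_text):
    """Prefer the META description; fall back to the page title."""
    reader = HeadReader()
    reader.feed(html_text)
    return reader.meta.get("description") or reader.title.strip()


with_description = (
    '<HTML><HEAD><TITLE>Bees</TITLE>'
    '<META NAME="description" CONTENT="Beekeeping tips and tricks.">'
    '</HEAD><BODY>Bees are fun.</BODY></HTML>'
)
without_description = (
    '<HTML><HEAD><TITLE>Bees</TITLE></HEAD><BODY>Bees are fun.</BODY></HTML>'
)
first = summary(with_description)
second = summary(without_description)
```

The page with a description is summarized by that description, and the page without one is summarized by its title.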
Note that there are a lot of free or shareware or overpriced commercial programs that claim to generate <META> tags to improve your site's ranking. They basically work like this:
PLEASE TYPE IN YOUR KEYWORDS HERE:
> bee,bees,beekeeping
GREAT! NOW ADD THIS TO YOUR PAGE AND GIVE US $50:
<META NAME="keywords" CONTENT="bee,bees,beekeeping">
Big deal!
There are ways to REMOVE your listing from search engines, but they vary, and it's not always possible. Most search engines follow the "robots.txt" standard for keeping robots off your site; however, the file has to sit at the top level of the Web server, so you need your own domain name (www.yourname.com) to use it. (Home Page Alone customers on The World can't make a robots.txt file for world.std.com!)
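To see what a robots.txt file does from the robot's side, here is a short Python sketch using the standard `urllib.robotparser` module. The file contents, the `/private/` directory, and the robot name "ExampleBot" are all made up for the example.

```python
from urllib.robotparser import RobotFileParser

# Example robots.txt contents; the real file would live at
# http://www.yourname.com/robots.txt (top level of the site).
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved robot asks before fetching each URL.
allowed_home = parser.can_fetch(
    "ExampleBot", "http://www.yourname.com/index.html"
)
allowed_private = parser.can_fetch(
    "ExampleBot", "http://www.yourname.com/private/notes.html"
)
```

Here the front page may be fetched and anything under `/private/` may not. Keep in mind that this only works for robots that choose to obey the file.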
Some search engines (not all) can be told to NOT index your site with a <META> tag:
<META NAME="robots" CONTENT="noindex">
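A robot that honors this tag has to look for it before indexing the page. Here is a minimal Python sketch of such a check; `RobotsMetaReader` and `may_index` are hypothetical names, and a real crawler may interpret the tag differently.

```python
from html.parser import HTMLParser


class RobotsMetaReader(HTMLParser):
    """Collects the directives from any <META NAME="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            # CONTENT may hold several comma-separated directives.
            for item in (attrs.get("content") or "").split(","):
                if item.strip():
                    self.directives.add(item.strip().lower())


def may_index(html_text):
    """Return False if the page asks robots not to index it."""
    reader = RobotsMetaReader()
    reader.feed(html_text)
    return "noindex" not in reader.directives


blocked = may_index('<META NAME="robots" CONTENT="noindex">')
blocked_mixed = may_index('<META NAME="ROBOTS" CONTENT="NOINDEX, NOFOLLOW">')
open_page = may_index("<TITLE>Bees</TITLE>")
```

The check ignores upper and lower case, so both forms of the tag block indexing, while a page without the tag is fair game.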
If you don't want your page to ever be listed in the search engines, don't submit your page to them, and don't tell any of your friends to make links to your page (remember, the search engine crawlers follow all the links they see) and that will greatly decrease the chances that anything (or anyone) will blunder across your pages without knowing the exact URL.
Search Engine Watch's "How To Use <META> Tags":
...has a brief tutorial on using <META> for keywords and description, as well as a very good set of links to related tutorials. (Also poke around the rest of the Search Engine Watch site for other useful tips!)
General information on <META> tags of all kinds:
http://vancouver-webpages.com/META/
http://www.webdeveloper.com/categories/html/html_metatags.html
Official W3C HTML specification for <META>:
And the W3C has notes on helping search engines index your site:
(that last one also covers robots.txt.)
Robot Exclusion Standard (for robots.txt):
http://info.webcrawler.com/mak/projects/robots/norobots.html
Page last modified September 5, 2003. Web site contents & design Copyright © 2009 Software Tool & Die.