search engines - submissions advice

[please note -- this page has not had a major update since 2004 -- however much of the advice it contains is still relevant]

common sense | search engines vs directories | how search engines rank the results of a query | preparing web pages | theme-based indexing | submissions | purchasing query terms | paying for search engine inclusion | Yahoo! and dmoz (The Open Directory) | feedback | gateway (doorway/bridge) and hallway pages | popularity-based indexing | warning! | problems | visitors | so what do i know about it?

common sense

the best search engines are the ones which return the most relevant results for search queries - when people use search engines they are looking for the best site(s) on the subject they are interested in - if a search engine doesn't return relevant results, people won't use it - it follows that, other things being equal, the most reliable way of getting to the top of the results page on the major search engines for a particular query term is to have the best page on that subject

search engines vs directories

search engines index individual pages - directories list sites

search engines use programs (called robots, spiders or crawlers) to actively search the web, downloading page content for indexing as they go - people looking for sites on search engines typically enter a query term (one or more words, or a phrase), and after a few seconds a list of links to sites containing the query term, ordered by relevance, is displayed

the information held by directories is entered by their visitors, who supply details of either their own site or sites they wish to see included in the directory's database - web sites are arranged by category, and are typically found by following a series of hierarchical links until the appropriate sub-category is reached, where the sites are listed alphabetically

search engines contain links to large numbers of pages, but queries can return many results which are not relevant to the search request - directories contain links to a smaller number of pages (usually restricting the number of submitted pages from any one site), and sites are often reviewed to ensure categories only contain relevant entries

in practice, most directories have an internal search facility and use back-up info from one or more of the search engine indexes; and many search engines incorporate site details from one or more of the directory databases

how search engines rank the results of a query

one or more of the following are used - analysis of page content, theme-based indexing and popularity-based indexing (each covered below)
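as a rough illustration of how content-based ranking works, a page can be scored by counting query-term occurrences, with title matches counting for more than body matches - the function, sample pages and the 3x weight below are purely hypothetical, not any real engine's values:

```python
def score_page(query, title, body):
    """toy content-based ranking - count query-term hits,
    weighting title matches more heavily than body matches
    (the 3x weight is illustrative only)"""
    terms = query.lower().split()
    title_words = title.lower().split()
    body_words = body.lower().split()
    score = 0
    for term in terms:
        score += 3 * title_words.count(term)  # title hits count triple
        score += body_words.count(term)       # body hits count once
    return score

# two hypothetical pages competing for the query "animal farm"
pages = {
    "animal-farm.com": ("Animal Farm", "a novel by george orwell about a farm"),
    "farmsupplies.com": ("Farming Supplies", "buy farm equipment and supplies"),
}
ranked = sorted(pages, key=lambda p: score_page("animal farm", *pages[p]),
                reverse=True)
print(ranked[0])  # → animal-farm.com
```

a real engine combines many more signals than this - term position, page length, link popularity and so on - which is why the sections below matter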
preparing web pages

the most important part of the page for indexing by search engines is the <title>.....</title> tag - place it immediately after the <head> tag - not more than 7 (or 8?) words - search engines will display between 60 and 115 characters of the title - use different titles on different pages - the title should contain the (one or) two query terms relevant to the page content that are most likely to be used by people looking to find the page via search engines (usually this is not a company name!) - words or phrases used in the title should also appear in the page text

potential query terms (keywords) that people may use for finding sites via search engines should be scattered throughout the text, especially the first 25 words - worth ensuring that they also occur within header tags - eg <H1>query terms here</H1> - but overuse of the same keyword in a small amount of page text may be penalised (not more than 7 times?) - at least some of the pages on the site should be rich in plain text (200-600 words is sometimes recommended)

description meta tags are worth adding - even when not used directly by the search engines, the text is sometimes indexed as though it were part of the page - often used intact as a description of the site when it appears on the results page for a search engine query - optimum number of characters 100, maximum 150

the keywords meta tag is almost defunct, because it provides a way of attempting to spam the search engines - meta keywords are now only used by Inktomi & Teoma - most of the following info is probably no longer relevant - the first letter of keywords is possibly worth capitalising, so that they can be found by a query term starting with either a lower or upper case character (but the trend now is for search engines not to distinguish - only AltaVista is fully case-sensitive) - maximum number of characters 1024 minus the number of characters in the meta description - it may be better to separate individual words or phrases with commas (rather than with spaces, the other option) as phrases may then be indexed as phrases rather than as individual words (??) - ensure everything is on one line (no line breaks) - query terms at the beginning of the list are likely to be considered more important than those at the end - it is possible that more than 3 occurrences of the same keyword may be penalised (though this may depend on how it is used... eg whether it is just repeated, or whether it is used in conjunction with phrases, as with hyphenated words) - rumoured that keywords may sometimes be counter-productive if not also encountered in the page text, but also possible that keywords not in the page text will be indexed (Oct '02)

alt text (as used with image/sound files) is usually indexed as though it were standard page text - comments text may be indexed by some search engines (but not by Google or AltaVista)

potential query terms can be used as file names, directory names or domain names, but domain names are not given much weight (far less than titles and page text) - if using query terms for file names then hyphens are better than run-together words or underscores - eg animal-farm.com is better than animalfarm.com or animal_farm.com - domain names (excluding suffixes such as ".com") are best kept to fewer than 55 characters (Jun '02)

the amount of text on a page affects search engine rankings (sometimes a maximum of 250 words is recommended for the home page) - each page should be centred on 2 or 3 potential query terms, and these should each be repeated 3 times if this can be done without disturbing the flow and sense of the text

submissions are most effective when they are for simple static text-based html pages - frames and redirection pages may not be indexed - it is best to avoid refreshes and redirects - if redirects have to be used, they should not be at domain name level

javascript is best placed in a separate plain text file using the .js file extension - otherwise it should be moved to the bottom of the page - this ensures that keyword-rich page text is the first thing the search engine robots come across

pages optimised for search engines should be linked to directly from the home page and if possible be placed in the root directory

dynamic URLs containing "?", "=" and other query-string characters may not be indexed by some search engines - for ways round this see MarketPosition, NetMechanic.net, ASP 101, Apache, High Rankings, and Digital Web Magazine - dynamic page content (eg from a database) is best generated from a page with a static URL, however spiders will only index one set of information per visit

the noscript tag should be used for dhtml menus - also a text version of any Flash info can be placed in an ALT tag within the noscript tags

in an attempt to stop people spamming its index, Google introduced new algorithms in November 2003 which removed or penalised many sites with previously high rankings - over-optimising a site may result in pages getting lower rankings than if they had not been optimised - two of the signs of over-optimisation are thought to be filenames which include a particular keyword/phrase for which a page has been optimised, and overusing a keyword/phrase - the solution is to design the page content for human visitors rather than for search engine robots, but not to do anything which might prevent robots from indexing the page appropriately

theme-based indexing

there have been suggestions that some search engines may take into account the content of all the pages of a site when indexing individual pages (reducing the essence of the site to a couple of query terms) - if so, then important keywords would need to appear on all the pages of the site, and any links would need to point to sites which have a similar theme (that is, contain the same keywords) - but current opinion does not support the idea that search engines make use of theme-based indexing

submissions

most of the popular english language search engines and directories are listed at http://www.wussu.com/search/ - Google, MSN, Teoma, Yahoo and
Gigablast are the biggest ones - a good search engine produces relevant results in a clear format, has a fast response time, keeps cached copies of pages, and provides a translation facility

all the emboldened ones in the list are either search engines generally worth submitting to, or sites of special interest and relevance - those which have an "add URL **" link are very easy (ie quick) to submit to; those which have "add URL *" are fairly easy to submit to; those which just have "add URL" can take quite a while, to enter all the details requested and/or find the appropriate categories for entries to be placed - those sites where the "add URL" is not linked have no general page for submissions but (eg) might require a category to be selected before the submission procedure can begin

the average time after submission before sites appear in indexes varies - except for paid listings, it's best to allow 2 months for submitted sites to appear in the search engine results

most search engines claim that they will reindex sites automatically (sooner or later) if content changes - it is also sometimes claimed that the more often page content changes, the more frequently robots will revisit the page to reindex it - resubmitting may speed up the process, but if a site already has a good ranking the resubmission may result in a re-evaluation, and the site's position might drop

if changing the domain for a site, some search engine consultants think it is best to ensure the old site's pages are first removed from the search engines' databases before resubmitting (removing pages is usually done by resubmitting pages which no longer exist) - otherwise, if the same page is present in a search engine under 2 URLs, the search engine might think it is being spammed and drop both pages

purchasing query terms

also called "paid placement", "paid listings", "featured listings" & "sponsored listings" - "ppc" = pay per clickthru

query terms can be purchased from some search engines - payment is so much per clickthru, with a minimum monthly spend - results are ranked on how much people are prepared to pay to have their sites at the top of the results table when their query term is matched exactly - results that have been paid for (and which therefore appear at or near the top) are sometimes indicated, together with the amount paid

a list of ppc search engines where keywords can be purchased is available from PayPerClickSearchEngines.com - Overture has been the most popular, supplying results (especially its top 3 paid listings) to several other search engines

info re the popularity of various keywords can be obtained from WordTracker and Google (though the Google info has a cut-off, only giving details of keywords appearing more than 200(?) times a day) - there's a good article re WordTracker at 1st Search Engine Ranking - it now has a database of 350M keywords taken from the last 60 days of queries from AltaVista, Dogpile and MetaCrawler (Jun 01)

Google provides an AdWords service where query terms are bid on to determine frequency of occurrence and position within the featured listings (which are kept separate from their main results) - KeywordSpy and NicheBot also provide keyword lookup services, but do not offer a free trial

Yahoo! and dmoz (The Open Directory)

Yahoo operates a policy of "paid review" - commercial sites are charged annually for submissions to the main (US) directory - however payment is only for the site to be reviewed, and does not guarantee that the site will be accepted

there is free submission to non-commercial categories and to local Yahoo directories (at least in theory) - each site is manually reviewed to ensure it is worthy of being included in their database (hmmm)... only "high quality" sites are accepted (top level domains given preference) - most (all?)
sites using the free submissions procedure are rejected - according to Yahoo because they are not of good enough quality; others say it is because Yahoo doesn't have time to review them - however this may be a deliberate policy to try to ensure the paid review option is used

commercial sites may need to have a postal address displayed before being accepted

if a submission is not successful after 1 month, then resubmit, and if not found after another month complain politely by email to url-support@yahoo-inc.com - include the URL, but categories and previous submission dates are not needed (?) (Jan 99) (previously it was suggested that all initial submission data be included)

to change Yahoo entries a) submit the change, b) wait 10 days, c) email them with the URL + date of the change request, d) repeat from b) until they make the change [flying pigs link to go here]

dmoz, also known as The Open Directory, does not charge for submissions - it uses volunteer editors and has built up a database of over 3M sites - it supplies many other search engines with their first listed result(s), and being listed there helps get good rankings on Google - however submissions to categories which don't have editors (or where the editors are not so conscientious) may take longer to be processed, and requests for changes are not always dealt with speedily - that said, it is fast and clean to use, and straightforward to submit to

paying for search engine inclusion

the advantage is speedy entry into databases and regular reindexing of content - however if a site is already well ranked, then it is likely to be counter-productive to pay for inclusion, because factors contributing to the high ranking which are built up over time (such as linkage and popularity) may no longer be taken into account

paid inclusion programs are likely to have their own spiders - when the subscription ends the URL may be dropped from the search engine results unless/until it occurs in the default database (fed by the standard spiders)

paid inclusion can be used to get search engines to index dynamic URLs - it is also possible to take advantage of the regular reindexing facility to tweak pages on a trial & error basis in order to get better search engine rankings for particular query terms

AltaVista, Ask Jeeves/Teoma, FAST and Inktomi offer one or more means of paying for individual URLs to be spidered

feedback

need to check at intervals to see if submissions have been successful

newer sites often take a few months to climb the rankings because link analysis is used, and newer sites usually have fewer links pointing to them

gateway (doorway/bridge) and hallway pages

some submission agents try to ensure highly ranked results in search engine searches by writing different entry pages (often called gateway pages) for each of the major search engines - this takes account of the fact that search engines use different indexing methods - each entry page is then adjusted and resubmitted until it comes top of the rankings for specific query terms - the rankings are then continually monitored, to accommodate changes in search engine indexing algorithms (which are continually changing in an effort to prevent people trying to circumvent them), and also to prevent the entry pages being usurped by newly submitted sites (which may also be using gateway pages)

for most sites there will be several relevant query terms which potential visitors might use, in which case it would be necessary to provide a separate gateway page for each query word for each search engine!
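whichever approach is taken, checking rankings at intervals can be partly automated - a minimal sketch (the function and sample data are hypothetical; fetching and parsing each engine's actual results page is omitted) that reports where a domain first appears in a list of result URLs:

```python
from urllib.parse import urlparse

def rank_of(domain, result_urls):
    """return the 1-based position of the first result hosted on
    `domain` (or one of its subdomains), or None if it is absent"""
    for pos, url in enumerate(result_urls, start=1):
        host = urlparse(url).netloc.lower()
        if host == domain or host.endswith("." + domain):
            return pos
    return None

# hypothetical results page for some query term
results = [
    "http://www.example.org/page.html",
    "http://www.wussu.com/search/",
    "http://other.example.net/",
]
print(rank_of("wussu.com", results))  # → 2
```

run against each engine in turn, a check like this shows which gateway pages still hold their positions and which need adjusting and resubmitting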
in some cases, pages may be ranked higher if they are found indirectly by the search engine robot via a link from a submitted page - sometimes a hallway page is submitted - a specially designed page containing links to gateway or content pages

however, there is an ongoing debate about the validity of using these techniques, and search engines prefer to index what the visitor to the page sees, rather than a page specifically customised for their own robots

popularity-based indexing

following the success of Google, search engines are increasingly using popularity to influence the results of searches

generally, the more links to a site the better, but links from web sites which themselves have a lot of sites linking to them may be of greater importance than links from less popular sites - keywords included in link text may boost the ranking of the linked page for that keyword, especially if the link text is the same as the title of the page being linked to - links to domain names (rather than subdirectories) are more likely to be taken into account (?) - further info from Search Engine Watch

Free For All (FFA) pages are long lists of links, often in submission date order - Google, Inktomi and Fast have all stated that they disapprove of them - Google bans sites which are believed to be members of link exchanges or are involved in artificial ways of boosting links to themselves - presumably this includes some forms of reciprocated linking - Google is reported to have banned sites which offer incentives for other sites to link to them

warning!

search engines take steps to prevent people fooling them by using various spamming techniques - it may be possible to fool the search engines for a while using a variation of one of the many tricks already tried, but the major search engines change their ways of ranking sites almost daily, and they penalise sites which deliberately flout either the letter or the spirit of their rules of fair play

a simple text-based page containing high quality original content, with keywords appearing once in the title and two or three times in the page text, is still the best free long-term method of getting high rankings in search engines which analyse page content

problems

some reasons why submitted pages don't appear in search engine results -
if all else fails, a new domain should be used, at the same time moving one's hosting to a new service provider (who will have a different set of IP numbers)

visitors

the following is from research at Penn State (June '03) -

users typically visit only the first three results from a search query - once they have reached a web page, one in five searchers stays for 60 seconds or less - 40% of searchers will have left the page within three minutes

54% of users view only one page of results in each session - 19% go on to the second page, and 10% look at the third page of results - about 55% of users look at one result only - more than 80% stop after looking at three of the listed web pages

the description of the site in the search engine results needs to be as clear as possible about the purpose of the site - pages need to be well designed, easy to load, and relevant to a searcher's needs

so what do i know about it?