Thursday, July 24, 2003

The history of the way we search for IT
A brief but descriptive overview of how the human race
came by search engine IT, with no voodoo.


Where I got my information from: “(Mecklermedia) Web Developer.com
Guide to Search Engines” by Wes Sonnenreich and Tim Macinta,
a Wiley publication. More information may be accessed online at
http://www.wiley.com/compbooks and
http://www.wiley.com/legacy/compbook/sonnenreich/webdev/index.htm.


It begins with the MIT Media Lab in the early 1990s, which set
cornerstone trends for the popularization of Internet practices
like homepages and homepage directories, then inevitably homepage
search requirements and listing initiatives.

“Archie” was created in 1990 by Alan Emtage, a student
at McGill University in Montreal.

At this time there was no World Wide Web
(hard to recall, but yes, it’s true). The Web is still less
than ten years young, but it is maturing steadily and benefiting
tremendously from all the collaboration efforts.
(Something to think about off-line as well.)

There was, however, the Internet, which was made up
crudely of files scattered throughout vast networks. This
collection of networks was primarily made up of a series
of FTP (File Transfer Protocol) servers and the transfers
between them.

How FTP works, as described directly by the source:

“The primary method of storing and retrieving files was
via the File Transfer Protocol (FTP). This was (and still is)
a system that specified a common way for computers to
exchange files over the Internet. It works like this: Some
administrator decides that he wants to make files available
from his computer. He sets up a program on his computer,
called an FTP server. When someone on the Internet wants to
retrieve a file from this computer, he or she connects to it via
another program called an FTP client. Any FTP client program
can connect with an FTP server program as long as the client
and server programs both fully follow the specifications set
forth in the FTP protocol.

Initially, anyone who wanted to share a file had to set up an FTP
server in order to make the file available to others. Later,
“anonymous” FTP sites became repositories for files, allowing
all users to post and retrieve them.

Even with archive sites, many important files were still scattered
on small FTP servers. Unfortunately, these files could be located
only by the Internet equivalent of word of mouth: Somebody
would post an e-mail to a message list or a discussion forum
announcing the availability of a file.

Archie changed all that. It combined a script-based data gatherer,
which fetched site listings of anonymous FTP files, with a regular
expression matcher for retrieving file names matching a user query.
In other words, Archie’s gatherer scoured FTP sites across the Internet
and indexed all of the files it found. Its regular expression matcher
provided users with access to its database.”
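
To make the quoted description a little more concrete, here is a
minimal sketch in Python of the two halves it names: a gatherer
that collects file listings, and a regular expression matcher that
answers queries against them. This is only an illustration, not
Archie's actual code. The host names, file paths, and function
names are all made up, and the "gathering" is faked with a
hard-coded table instead of real FTP connections.

```python
import re

# Hypothetical site listings, standing in for what a gatherer script
# would fetch from anonymous FTP servers (hosts and paths are made up).
SITE_LISTINGS = {
    "ftp.example.edu": ["/pub/tools/gzip.tar", "/pub/docs/readme.txt"],
    "ftp.example.org": ["/mirror/gzip-1.2.tar", "/mirror/games/rogue.tar"],
}


def build_index(listings):
    """Flatten per-site listings into (host, path, file name) records."""
    index = []
    for host, paths in listings.items():
        for path in paths:
            name = path.rsplit("/", 1)[-1]  # keep only the file name
            index.append((host, path, name))
    return index


def search(index, pattern):
    """Return (host, path) pairs whose file name matches the regex."""
    matcher = re.compile(pattern)
    return [(host, path) for host, path, name in index if matcher.search(name)]


index = build_index(SITE_LISTINGS)
results = search(index, r"^gzip.*\.tar$")
for host, path in results:
    print(host, path)
```

The point of the design is that the slow part (visiting every
server) happens ahead of time, so a user's query only has to run
against the index that was already built.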


“Gopher” servers are like FTP servers, only for plain text documents
(no images or hypertext), and thus extremely limited outside of
business-related intranet environments. In 1993, Archie found a
wife, and her name was Veronica (Very Easy Rodent-Oriented
Netwide Index to Computerized Archives), running a system
similar to Archie but for Gopher servers. Like any good marriage,
there were offspring, humorously named to follow suit [FYI: Jughead].

The next area of strong contribution would be the Wanderer,
the first robot-enabled web tracking device, which served
initially to keep track of web servers and then expanded to
capture the URLs regularly springing up, storing them in a
web database.
*The encompassing metadata used to define search topics and
organize search results, along with access requirements, was
turned over to web databases in the spirit, or form, of open
source databases.

The database was finally named Wandex. Another note: as a newbie
to the historical facts of search engine development, I ran a
search for notations or remnants of Wandex and was unable to locate
any additional information to add to this resource.

If you are more scientific in thinking and are following a time line,
“robot” became a new term for a web or Internet application that
could perform specific functions with greater dependability and
consistency than its human counterpart, and it therefore became
a vital creation for integration. Bit of trivia: the term “robot”
comes from the Czech word “robota”, meaning forced labor or work.
Robots are now what we think of as Internet “spiders”, because
their function is not only to constantly seek out new occurrences
but also to monitor pre-existing ones for new changes. With such
obvious and immediate advantages, “robots”, then “spiders”, became
a widely popular and readily accepted solution for those
developing search engines.
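
For the curious, the "seek out new occurrences and monitor
pre-existing ones" idea can be sketched in a few lines of Python.
This is a toy under loud assumptions: the URLs and page contents
are invented, nothing is actually fetched from the network, and
comparing content fingerprints is just one simple way a spider
might notice changes, not a description of how any particular
engine did it.

```python
import hashlib


def fingerprint(content):
    """Return a short, stable fingerprint of a page's content."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def classify_pages(seen, fetched):
    """Compare freshly fetched pages with fingerprints from earlier visits.

    seen:    dict of url -> fingerprint from previous visits (updated in place)
    fetched: dict of url -> page content from the current visit
    Returns two lists: URLs never seen before, and URLs whose content changed.
    """
    new_urls, changed_urls = [], []
    for url, content in fetched.items():
        digest = fingerprint(content)
        if url not in seen:
            new_urls.append(url)
        elif seen[url] != digest:
            changed_urls.append(url)
        seen[url] = digest  # remember the latest version either way
    return new_urls, changed_urls


# Hypothetical pages: two visits by the same spider.
seen = {}
first_visit = classify_pages(seen, {
    "http://example.com/a": "hello",
    "http://example.com/b": "world",
})
second_visit = classify_pages(seen, {
    "http://example.com/a": "hello",
    "http://example.com/b": "world, updated",
    "http://example.com/c": "brand new page",
})
print(first_visit)
print(second_visit)
```

On the first visit everything counts as new. On the second, only
the page that was added and the page that was edited are reported,
which is exactly the kind of tireless bookkeeping a robot does
better than a person.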

“ALIWEB”, “JumpStation”, the “World Wide Web Worm”, and the
Repository-Based Software Engineering (RBSE) spider were some
of the key contributors to the modern success of search engines
and to the future horizons they are striving to reach.

Researching these terms produced one of the few “no results”
findings for me. I can only deduce that they came, got
gobbled up for everything useful they had to offer, and then
disappeared in their original state, only to become fragments
of the current definition of the search engine.

The year 1994 saw the launch of “Galaxy”, which contained
Gopher and Telnet search features and focused on URL
description-based spidering and collecting.

This brings us to April of 1994. Personally, I was in Paris partying
my brain out and learning to “parler en français” avec a cute
Frenchman named Jean Francois, not giving the Internet a second
thought. April of 1994 also brought the news that Nirvana front man
Kurt Cobain had been found shot to death in a Washington home,
but in the world of search engines, what we now know as Yahoo!
was in the throes of passionate conception.

Stanford University Ph.D. candidates David Filo and Jerry Yang
created some popular pages called Yahoo! because they
considered themselves to be a couple of “yahoos”.

As the number of links grew and the number of “hits”
began to reach the thousands, the team created ways to
better organize the data. Thus www.yahoo.com became a
searchable directory which included data descriptions now
considered to be the standard for SEO (search engine optimization).

1994 also produced the beginnings of “WebCrawler”, which went
live in April of that year. AOL gobbled it up in 1995, and in
1997 it passed on to Excite.

Some other well-known search engines came out in 1994: Lycos,
Infoseek, and OpenText.

In 1995, DEC’s AltaVista arrived on the scene, followed by
“HotBot”, powered by Inktomi (Inktomi itself was acquired by
Yahoo! in 2003). MetaCrawler was also released in 1995, marking
a separation between meta searches and the more common
data searches.

On the browser front: about the time that Yahoo! had been
transferred to Netscape servers, Microsoft released its free
competitor, “Internet Explorer”. Note: the majority of the
evolution in how the Internet is being formed has to do
with survival-of-the-fittest popularity negotiations, which
is good to remember for developers and engineers hoping to
make their mark.

Conclusions on tracing Search Engine History:

Clearly, it took a lot of effort and contribution to get us where
we are. Now that we are here, we can study how to improve and
refine the rather tedious and awkward aspects in order to make
the next generation of events easier and better. Likewise, while
learning what to avoid, it is also important to learn what to embrace.
When stepping into search engines comprehensively, it is vital for
the developer and netizen alike to realize that search engines are
essential, and that improving them means as much to communication
as it does to information.

What I did not delve into:
Boolean as a search tool, and in-depth feature analysis of the
search engine pioneers that are being gobbled up by more
profitable ones. Primarily, the reason is that the
present remains undefined, but this is an excellent time for
introspection before the next wave crashes down, in order to
make it a better one for all of us surfers.

Hopefully, after this article, you as the reader have been given a
fun yet reasonably substantial understanding of the evolution of
IT over the past 10 years, and have some sound questions and
concerns to begin voicing in developer ears for the future. You
should also have a reason to feel comfortable researching the
modern search tool for yourselves, either to better understand
what you are working with now or to begin working up the nerve
to make your own.

It is only natural that the user side be given a voice and a vote in
search engine developments, as we will ultimately be the ones
subjected to their faults as well as their brighter sides. The most
important thing is to keep IT moving on.

Another version:
http://www.allbusinessnews.com/archives/2003/july/31prt.html
