Lab 12 -  Our final lab. 

 

Part I.  MAKING PAGES THAT SEARCH ENGINES FIND.

 

Background:   How do search engines work?  They work by sending out automated programs (loosely related to "hacker" technology) that crawl or burrow through the millions upon millions of web pages.  These programs go by the following names:  Crawlers, Worms, Robots, Spiders.  I will use the term "worm", but you may hear people refer to them under these other names ("crawler" and "spider" are the most common; outside of search engines, "worm" usually means a self-replicating malicious program).

 

"Worms", although related to hacker technologies, are benign - they do not attempt to hurt your machine.  The worms work by burrowing through the Internet, requesting pages from web servers much as a browser would, following links from page to page, and accumulating information about each page as they go.  So not all the visitors to your website are human!  The worms can read certain parts of the page that help them describe the page when they "report" back to the search engine.  Worms can also take a snapshot of your page when they visit.  This is why Google search results can offer a cached copy (did you notice that?).

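To make the "moving from page to page" idea concrete, here is a minimal sketch in Python.  It is only an illustration, not how any real search engine is written: the three pages in PAGES are invented and live in memory (so nothing is fetched over the network), and the names LinkAndTitleParser and crawl are made up for this example.  The "worm" starts at one page, records its title, and follows every link it has not seen yet.

```python
from collections import deque
from html.parser import HTMLParser

# Hypothetical three-page site held in memory so the sketch needs no network.
PAGES = {
    "/index.html": '<title>Home</title><a href="/about.html">About</a> <a href="/lab.html">Lab</a>',
    "/about.html": '<title>About</title><a href="/index.html">Home</a>',
    "/lab.html": '<title>Lab 12</title><a href="/about.html">About</a>',
}


class LinkAndTitleParser(HTMLParser):
    """Collects the page title and every link (href) found on one page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def crawl(start):
    """Visit pages breadth-first from `start`; return {url: title}."""
    seen = {start}
    queue = deque([start])
    report = {}
    while queue:
        url = queue.popleft()
        parser = LinkAndTitleParser()
        parser.feed(PAGES[url])          # a real crawler would download the page here
        report[url] = parser.title       # what the "worm" reports back
        for link in parser.links:
            if link in PAGES and link not in seen:
                seen.add(link)
                queue.append(link)
    return report


report = crawl("/index.html")
```

The `seen` set is what keeps the worm from going around in circles when pages link back to each other.
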
 

What do worms read on your page?  One thing they can read is the so-called META tag.  The META tags don't speak to humans, but instead are there to speak to the "worms" (I know, this is weird). 
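
As a rough illustration of what a worm sees, the sketch below uses Python's standard html.parser module to pull the META tags out of a page.  The sample page, its description and keywords, and the class name MetaTagParser are all invented for this example; real crawlers are far more elaborate.

```python
from html.parser import HTMLParser

# Invented sample page: the META tags sit in the <head>, invisible to human readers.
SAMPLE_PAGE = """<html>
<head>
<title>Lab 12</title>
<meta name="description" content="Notes on search engines, server logs, and cookies.">
<meta name="keywords" content="search engines, META tags, cookies">
</head>
<body><p>Visible text for human visitors.</p></body>
</html>"""


class MetaTagParser(HTMLParser):
    """Collects name/content pairs from every <meta name=... content=...> tag."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            attr_map = dict(attrs)
            name = attr_map.get("name")
            if name:
                self.meta[name.lower()] = attr_map.get("content", "")


parser = MetaTagParser()
parser.feed(SAMPLE_PAGE)
meta = parser.meta  # e.g. meta["description"] holds the description text
```

Notice that the paragraph in the body never shows up in `meta`: the tags carry a separate message meant for the worm, not for the person looking at the page.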

 

META tags, at least in the late ’90s and early ’00s, were really important because they could increase the likelihood that a search engine would rank your site highly for relevant searches, which led to increased site traffic (very important!).

 

So, how do you use these tags to make it more likely that your site is found by a search engine?

 

Read here for more:

http://searchenginewatch.com/webmasters/article.php/2167931

Also read this for a little more information about why META tags aren’t quite as essential as you might have thought:

http://spider-food.net/meta-tags.html

 

Please try out adding META tags to your pages.  You can add these in Dreamweaver as follows (this may be slightly different in CS3):

http://www.umbc.edu/oit/sans/helpdesk/Macromedia/Dreamweaver/HOWTO_Meta_Tags.html

Please actually play around with adding Meta tags to either a page you (quickly) make, or your existing project pages.

 

PART II.  How do you know who visits your website?

 

Web servers log the requests that visitors make to your site.  What exactly do they log?  The answer depends on which webserver you are running.  Apache, one of the most commonly used webservers, keeps a set of logs, including the Access Log.  Here is more information on Apache logs:

http://httpd.apache.org/docs/1.3/logs.html#accesslog

 

Outputs from logs usually look like this:

127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

 

What does this mean?  The first item gives the host that requested the file, either as a hostname or, as in this example, as an IP address (127.0.0.1).  Remember a hostname is a computer name, not the name of a person.  The hostname often has information in it about where the person came from --- since hostnames often reflect a domain (e.g. “Colorado.EDU”), which helps localize where the request came from.  The “dash” after the host means that the client's identity (as reported by an ident service on the client's machine) was not available – don’t worry about it – that information is almost never captured.  The next item (“frank”) is the name of the user IF AND ONLY IF there was a login mechanism for the user to access the page (username/password system).  In almost all cases you won’t get a username (you will see another dash instead), and the username may not reflect the identity of the real user (obviously).  The next item, in square brackets, is the date and time when the server finished processing the request, followed by the offset from GMT.  The quoted item is what the client requested: the method (GET), the file (/apache_pb.gif), and the protocol (HTTP/1.0).  The last two items (200 and 2326) are the status code (200 means success) and the size in bytes of the returned item.
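
If you want to pick a log line apart yourself, the sketch below splits the example line into its items with a regular expression.  It assumes the log is in exactly the format of the example above (logs can be configured differently), and the names LOG_PATTERN and parse_line are made up for this illustration.

```python
import re

# One named group per item in the example line (assumed format; real logs may differ).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_line(line):
    """Return a dict of the items in one access-log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    # A "-" in the size position means no content was returned; treat it as 0 bytes.
    entry["size"] = 0 if entry["size"] == "-" else int(entry["size"])
    return entry


line = '127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326'
entry = parse_line(line)
# entry["host"] is "127.0.0.1", entry["user"] is "frank", entry["status"] is 200
```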

 

Weekly logs can often be megabytes in size, representing a huge number of individual hits.  How do you summarize all this information?  With log summarizing programs of course!
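
To give a feel for what a log summarizer does under the hood, here is a tiny sketch that counts hits per page, per host, and per status code.  The four log lines are invented sample data in the same format as the Apache example, and the names SAMPLE_LOG, REQUEST_PATTERN, and summarize are hypothetical; commercial tools do far more than this.

```python
import re
from collections import Counter

# Invented sample entries in the same format as the Apache example.
SAMPLE_LOG = """\
127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326
127.0.0.1 - - [10/Oct/2000:13:55:40 -0700] "GET /index.html HTTP/1.0" 200 1500
192.0.2.7 - - [10/Oct/2000:13:56:02 -0700] "GET /index.html HTTP/1.0" 200 1500
192.0.2.7 - - [10/Oct/2000:13:56:10 -0700] "GET /missing.html HTTP/1.0" 404 210
"""

# Captures the host, the requested path, and the status code from one line.
REQUEST_PATTERN = re.compile(r'^(\S+) .*"(?:\S+) (\S+)[^"]*" (\d{3}) ')


def summarize(log_text):
    """Count hits per page, per host, and per status code."""
    hits_per_page = Counter()
    hits_per_host = Counter()
    hits_per_status = Counter()
    for line in log_text.splitlines():
        match = REQUEST_PATTERN.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        host, path, status = match.groups()
        hits_per_page[path] += 1
        hits_per_host[host] += 1
        hits_per_status[int(status)] += 1
    return hits_per_page, hits_per_host, hits_per_status


pages, hosts, statuses = summarize(SAMPLE_LOG)
most_popular = pages.most_common(1)  # the single most requested page and its hit count
```

Counting 404 responses in `statuses` is a quick way to spot broken links on your own site.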

 

Programs that summarize your weblogs for you:

http://www.weblogexpert.com/

 

Log summaries are very important in giving you some feedback about what is popular on your site and some very rough guides to the demographics of site users.  More sophisticated information requires some form of further tracking.

 

Here is more information on web-analytics software for determining what your users are doing:  http://en.wikipedia.org/wiki/Web_analytics

 

Part III.  Cookies.

 

The web suffers from one serious problem.  It has no short term or long term memory: each request to a web server is handled on its own.  When you surf the web, the page doesn’t remember that you have been there before!  Each time you visit, the page thinks you are there for the first time.  But it is important for the page to remember who you are, since if it can, the page can tailor itself to what it knows about you.

 

There are a couple of ways to make a web page remember you.  One is to get you to fill out a form and then store information about you in a database, like what you buy and what you looked at.  This is, in part, how sites like Amazon work.

 

Another way, which is slightly more insidious, is to use cookies.  What are cookies?  Cookies are very small pieces of information that a website can ask your browser to save on your computer.  You read correctly --- web sites can put information on your computer.  These very tiny pieces of information can include things like what you clicked on the page and what information you put into a form.  The next time you visit, your browser sends the cookies back to the site that wrote them, and the site can tailor the page based on what it reads.
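
Here is a minimal sketch of that round trip using Python's standard http.cookies module.  The cookie name "favorite_topic", its value, and the 30-day lifetime are invented for illustration, and no real web server or browser is involved: the code only builds the header a server would send and then reads back the header a browser would return.

```python
from http.cookies import SimpleCookie

# First visit: the server builds a Set-Cookie header to send with its response.
outgoing = SimpleCookie()
outgoing["favorite_topic"] = "search-engines"   # hypothetical cookie name and value
outgoing["favorite_topic"]["path"] = "/"        # send it back for every page on the site
outgoing["favorite_topic"]["max-age"] = 60 * 60 * 24 * 30  # keep for about 30 days
set_cookie_header = outgoing.output(header="Set-Cookie:")

# Later visit: the browser sends the stored value back in a Cookie header,
# which the server parses to "remember" the visitor.
incoming = SimpleCookie()
incoming.load("favorite_topic=search-engines")
remembered = incoming["favorite_topic"].value

# The site can now tailor the page to what it read from the cookie.
greeting = f"Welcome back! More on {remembered}?"
```

Because the cookie lives on the visitor's machine, the visitor can delete or change it at any time, so a site should not rely on it for anything important.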

 

Read more about cookies:  http://www.cookiecentral.com/faq/#1.1

To learn how to use cookies as a web developer:  http://www.peachpit.com/articles/article.aspx?p=31661&seqNum=6

 

AND FINALLY --- We didn’t have time to delve into installing web servers.  Let me know if you have questions about that, and I can help steer you to some resources…

 

PROJECT TIME!