Basic Internet

 




Let's get back to basics

My original article, "The Internet - What Is It?" (July 1994 issue of Digital Rag) dealt with things like the hardware you needed, what a BBS (bulletin board system) was, free-nets, and costs. We'll deal with some of that in future articles, but I'm going to assume for now that the reader is already connected - otherwise, how are you reading this, eh?

The problem many people have today is that they don't really understand what they are connected to and why it works the way it does (or doesn't work the way they think it should). Herein is a little bit of light shed upon the topic. Of necessity it might appear a bit technical - but then so is figuring out where to put the gas in a car, and which pedal is the gas and which is the brake. The point is that understanding a very few items can make many things more obvious and easier to deal with later.

The Internet is a network of networks.

All of the various smaller networks use a technical standard agreed upon by a technical committee called the Internet Engineering Task Force or IETF for short. This standard is called TCP/IP - Transmission Control Protocol/Internet Protocol. The 3 main concepts of TCP/IP that the neophyte should understand are Addresses, Names and Services. The one thing most are familiar with is the URL which ties all of these together but actually confuses people about what is really going on. There are certainly lots of other things to learn, but only these three things are necessary to grasp what is really going on.

Addresses

Each computer directly connected to the Internet must have a unique numeric address. The address is usually expressed as a series of 4 numbers separated by dots such as: 24.113.126.213
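To make the "4 numbers separated by dots" a little more concrete, here is a minimal sketch in Python using the standard ipaddress module. It assumes an IPv4 address like the one above; the variable names are mine and only for illustration.

```python
import ipaddress

# The example address from the text, written as four numbers separated by dots.
addr = ipaddress.ip_address("24.113.126.213")

# Each of the four numbers is one byte (0-255) of a single 32-bit value.
octets = [int(part) for part in str(addr).split(".")]
as_integer = int(addr)

print(octets)       # the four numbers as a list
print(as_integer)   # the same address as one large integer
```

The dotted form is just a friendlier way of writing that one large number, which is why no two machines directly on the Internet can share it.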

These numbers are not randomly assigned. There is a registration body that hands out blocks of numbers to ISPs all over the world, who in turn hand out smaller blocks or even individual addresses to their customers, one of whom is you. If you are running on a Windows based system you can look at the number you are assigned by running the c:\windows\winipcfg.exe program on your machine (Windows 95, 98 and ME; on later versions of Windows, type ipconfig at a command prompt instead). If you are running a Linux or other Unix type box you will see it if you type ifconfig -a at the command line prompt.

If your system is connected to a network in your home or business it actually may not have a truly unique address - but instead will have one that is unique in the local network only. Chances are that it will be hidden behind a firewall which will hide its non-unique address from the rest of the Internet behind one or more assigned addresses that are unique. See Firewalls for more discussion on this.
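As a rough sketch of the difference between a local-only address and a truly unique one, Python's standard ipaddress module can report whether an address falls in a range it classifies as private. The address 192.168.1.10 is my assumption of a typical home LAN address and does not come from any real network described here; is_local_only is a hypothetical helper name.

```python
import ipaddress

def is_local_only(address):
    """Return True if the ipaddress module classifies the address as private
    (for example 192.168.x.x or 10.x.x.x), meaning it is not globally unique."""
    return ipaddress.ip_address(address).is_private

print(is_local_only("192.168.1.10"))     # assumed home LAN address
print(is_local_only("24.113.126.213"))   # the example address from the text
```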

Your ISP assigns an address to your system each time you connect to them if you dial in periodically. If you are "permanently" connected via a leased line, ADSL connection, or cable modem, you may have been assigned a permanent or "static" address when you were first connected. If your business has many computers, it may in fact have been given a block of addresses which your system administrator doles out one at a time to each workstation and server in your facility.

Regardless of how you end up with the number though, understand that it (or at least the one that is on the gateway to the rest of the world) is unique. Any other computer on the Internet can talk to yours simply by using its numeric address - kind of like a phone number.

Names

Because humans don't deal all that well with long strings of numbers, a facility to allow text names to refer to the numbers was created. The Domain Name System (DNS) maps names such as www.pacdat.net to specific machines so that we humans don't have to keep track of the numbers. It also allows some flexibility in that it allows more than one name for a particular computer, and in some instances allows more than one computer to answer to a given name.

Note that domain names are case insensitive. This means that Pacdat.net is the same as PACDAT.NET.
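Here is a toy model of the idea, and only a toy: a plain Python dictionary standing in for the DNS. The two names and the address are the ones I use for my own machine in this article; TOY_DNS and lookup are hypothetical names, and the real DNS is a distributed system, not a single table.

```python
# A toy name table, NOT the real DNS: just a dictionary from names to addresses.
# More than one name may point at the same computer.
TOY_DNS = {
    "pacdat.pacdat.net": "24.113.126.213",
    "cr554487-a.poco1.bc.wave.home.com": "24.113.126.213",
}

def lookup(name):
    """Look a name up in the toy table, ignoring upper/lower case."""
    return TOY_DNS[name.lower()]

print(lookup("PACDAT.PACDAT.NET"))
print(lookup("pacdat.pacdat.net"))
```

A real lookup goes through your operating system's resolver, for example with Python's socket.gethostbyname, and needs a working network connection.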

While there is currently a lot of discussion in various places as to the evolution of the Internet naming scheme, you will probably recognize the fact that most computer names end in only a small number of suffixes: "Dot COM", "Dot NET", "Dot ORG", "Dot EDU", "Dot MIL", and "Dot GOV". Outside of the U.S. we can add a two-letter suffix that is unique to our country. Here in Canada, ours is "Dot CA".

The "Dot" is actually written as a period which separates the "second level name" from the "top level name" to produce for example: pacdat.net
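A small sketch of how the dots divide a name into levels, reading from the right. The function name name_levels is hypothetical, and the example name is the one used earlier in this article.

```python
def name_levels(domain):
    """Split a domain name into its levels, top level first."""
    # Names are case insensitive, and a trailing dot (if any) is not a level.
    labels = domain.lower().rstrip(".").split(".")
    return list(reversed(labels))

levels = name_levels("www.pacdat.net")
print(levels)   # top level, then second level, then third level
```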

Each "Dot" in a name adds a level to the name, and each level may be assigned to be administered by a different individual or company. The top level domains are administered by what once was a U.S. government agency and has recently been changed to a commercial contract currently held by Network Solutions Inc.

If you go to the Network Solutions site you can use their "whois" service to look up and see who owns a particular domain. If you look up pacdat.net you will see information about me and my company.

Here in Canada the "Dot CA" domain is administered currently by www.cdnnet.ca 

There is a lot of controversy over the "ownership" of domain names, including trademark and such. Each registry of Top Level Domains (TLDs) has its own policies regarding this. The TLD registry for the .COM, .ORG, .NET and other original TLDs is in the U.S. and goes by U.S. laws, even though many of the owners of second level names in these TLDs are in fact outside of the U.S. 

The CADomain people have their own set of policies on who can have what names. The prime example currently is the denial of a second level name (such as YOURNAME.CA) to any but federally registered companies or companies with offices in more than one province, regardless of whether such a name is otherwise not used. If you have a company incorporated in BC with only a BC address, you may register only a third level name such as "YOURNAME.BC.CA". Another policy example is the limit of only one name to each business with the exception of a second one that is for the French name of the same business. This is supposed to change under a new registration body soon.

A second level name (xyz.com) may be the only entity for a domain. The DNS (name to number) translation system may only have one address to translate it to, and that may be the end of it. This is the simplest, but also the least likely scenario. Usually there are one or more third level domain names, and sometimes fourth and fifth and deeper. These are used for example to describe other computers the company might own (desktop.xyz.com), or to describe a computer dedicated to a particular service (mail.xyz.com or ftp.xyz.com). The most recognizable of these is WWW.xyz.com since the www is the most typical third level name for the computer that hosts a company's web pages.

The DNS name for my computer as given to me by Rogers Wave is cr554487-a.poco1.bc.wave.home.com

Note that I also have it named as pacdat.pacdat.net - both names are valid, and both point to the same IP address (currently 24.113.126.213 as I write this). The difference is only in who has authority in a particular name space. In fact, if you have control of a domain and DNS server, you could give my computer a completely different name (e.g. richard.yourdomain.ca) and refer to it that way if you want.

There is another aspect to the DNS system that maps numbers back to names. Note that this is always a one to one map, and is referred to as the "Reverse DNS" name. This is governed by the organization that has control of the IP number block. In my case, Rogers maps my IP number (24.113.126.213) back to cr554487-a.poco1.bc.wave.home.com - my "real" DNS name. It is worth noting here that many ISPs will not allow a computer that doesn't have a reverse DNS entry set up for it to talk to their systems or send E-mail. This has mainly to do with security and their ability to determine who controls a computer that might attempt to do something they don't like.
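Reverse lookups work by turning the address itself into a special name under in-addr.arpa, with the four numbers in reverse order. A minimal sketch using Python's standard ipaddress module shows the name that would be looked up for my address; it does not contact any DNS server.

```python
import ipaddress

addr = ipaddress.ip_address("24.113.126.213")

# reverse_pointer builds the special name used for reverse DNS (PTR) lookups:
# the four numbers in reverse order, followed by in-addr.arpa
reverse_name = addr.reverse_pointer
print(reverse_name)
```

An actual reverse lookup, for example with Python's socket.gethostbyaddr, needs a network connection and returns whatever the owner of the address block has published.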

One thing that some people get confused over: a domain name is not exactly the same as a URL. See the section below on Tying it all together.

Services

The TCP/IP standard adds an extra number to the IP address described above. These extra address numbers, formally called port numbers, are used to select what type of service the computer addressed by the IP address is to perform. There are a number of standard service numbers but the most recognizable one for most people these days is service 80 - the one for Hypertext Transfer Protocol (HTTP); in other words, the Web Server.

Other well known services include Simple Mail Transfer Protocol (SMTP) for E-mail which is 25, and File Transfer Protocol (FTP) which is complex enough to use two ports, 20 and 21.
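To picture how the service number sits alongside the address, here is a small sketch. The table holds only the numbers mentioned above (for FTP, 21 carries the commands and 20 carries the data); WELL_KNOWN_PORTS and endpoint are hypothetical names of my own.

```python
# Service numbers ("ports") mentioned in the text.
WELL_KNOWN_PORTS = {
    "http": 80,       # the Web
    "smtp": 25,       # E-mail
    "ftp-data": 20,   # FTP data transfer
    "ftp": 21,        # FTP commands
}

def endpoint(address, service):
    """Pair an IP address with the port number for a named service."""
    return (address, WELL_KNOWN_PORTS[service])

print(endpoint("24.113.126.213", "http"))
```

An (address, port) pair like this is the form that Python's socket.connect expects when one program opens a TCP/IP connection to another.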

Any computer may handle any number of services for the address it has. My local server handles most of the main services for my home LAN but there are a couple that it hands out to another computer to work on. One is service 8080 which points to my camera server.

Tying it all together

The Uniform Resource Locator (URL) ties it all together.

HTTP://WWW.PACDAT.NET/progressions.htm

A URL describes not only a computer, but a service on that computer (see previous section) and possibly even a very specific piece of information on that computer.

The URL is made up of a number of pieces and is worthy of a bit of discussion here. The 3 most common pieces that have been codified into the URL include:

  1. the service name followed by a colon
  2. a double slash followed by the name of a computer to indicate that the service is somewhere out on the network
  3. a single slash followed by the file name on the computer or a string of commands which will bring up the desired information or page.
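The three pieces can be pulled apart with Python's standard urllib.parse module. This is only a sketch of the idea using the example URL from this article; the variable names are mine.

```python
from urllib.parse import urlsplit

parts = urlsplit("HTTP://WWW.PACDAT.NET/progressions.htm")

service = parts.scheme      # piece 1: the service name before the colon
computer = parts.hostname   # piece 2: the computer name after the double slash
path = parts.path           # piece 3: the file name after the single slash

print(service, computer, path)
```

Note that urlsplit reports the service and computer name in lower case, since neither is case sensitive, but leaves the file name exactly as typed.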

There are defaults for each of these if they are left out, and the creators of some Internet browsers have taken it upon themselves to put in place some extra assistance that might or might not make it easier to find things.

The most typical service is HTTP, or "The World Wide Web". If you don't put the service name into most browsers they will default to HTTP.

The next piece is the name of the computer. Usually this is out on the network somewhere, so it is preceded by what has become the convention for "out on the network somewhere" - the double slash. 

After the double slash is either a domain name or a computer numeric address. If you put in a domain name that doesn't begin with the conventional WWW, some browsers will first try for a computer without the WWW, and then try the name with WWW at the beginning. The WWW itself does not mean anything - it is simply the conventional name for the computer that hosts a company's web site.

Some browser creators are also adding a second lookup facility to take typical key words and find a domain name from a search engine. This has nothing to do with an Internet standard and is not yet widespread.

If there is no computer name at all, the URL points at the local computer. 

Sometimes you will see a second colon and a number after a computer name. This is a case where the service number is different from the standard one. e.g. http://www.pacdat.net:8080/test.htm
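A sketch of how a program might decide which service number to use: take the explicit number if the URL has one, otherwise fall back to the standard one for the service. DEFAULT_PORTS and service_number are hypothetical names, and the table covers only two of the services discussed in this article.

```python
from urllib.parse import urlsplit

# Standard service numbers, used when the URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "ftp": 21}

def service_number(url):
    """Return the explicit port in the URL, or the standard one for its service."""
    parts = urlsplit(url)
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]

print(service_number("http://www.pacdat.net:8080/test.htm"))
print(service_number("http://www.pacdat.net/progressions.htm"))
```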

The final item is the explicit piece of data on the computer that is wanted, the file name or other command. This is the only part of the URL that might be case sensitive. While MS Windows computer file names are not generally case sensitive, other operating systems such as Unix and Linux have files whose names are case sensitive. You should assume that anything after the computer domain name is case sensitive.
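One way to picture the rule is a comparison that ignores case in the service and computer name but not in the file name. This is a simplified sketch (same_url is a hypothetical helper, and it ignores details such as default service numbers and query strings).

```python
from urllib.parse import urlsplit

def same_url(a, b):
    """Compare two URLs: service and computer name ignore case, the path does not."""
    pa, pb = urlsplit(a), urlsplit(b)
    # urlsplit already lower-cases the scheme and hostname; the path is untouched.
    return (pa.scheme, pa.hostname, pa.port, pa.path) == \
           (pb.scheme, pb.hostname, pb.port, pb.path)

print(same_url("HTTP://WWW.PACDAT.NET/progressions.htm",
               "http://www.pacdat.net/progressions.htm"))
print(same_url("http://www.pacdat.net/Progressions.htm",
               "http://www.pacdat.net/progressions.htm"))
```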

If no file name is given on the end of the URL, then the server may provide a default file, typically called index.html or index.htm. If there is no such file the server will either simply show a list of files, or show an error page. The same is true if a directory name but no actual file name is given.

So, to summarize:

http://www.pacdat.net/progressions.htm
  Goes to the computer called www.pacdat.net
  Uses the HTTP service when it gets there
  To ask for the file progressions.htm

ftp://www.pacdat.net/pub/
  Will ask the www.pacdat.net computer to give a default file using the FTP service. Typically this will be a list of the available files. Note that even though the domain name starts with www, the computer uses the service (FTP in this case) specified before the colon in the URL.

www.pacdat.net
  Your browser will put the http:// in front
  The server at www.pacdat.net will give you the default file, typically named index.html

pacdat.net/directory/
  Your browser will first try to see if there is a computer with the URL http://pacdat.net/directory/ that will answer the HTTP service and deliver a default file from the directory. If not, then some recent browsers will try http://www.pacdat.net/directory/
  Only if that fails will they tell you that nothing can be found.
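The guessing described in the last two examples can be sketched as a short function. This is my own illustration of the behaviour, not the actual logic of any particular browser, and candidate_urls is a hypothetical name.

```python
def candidate_urls(typed):
    """List the URLs a browser might try, in order, for what the user typed."""
    # Put the default service in front if none was given.
    url = typed if "://" in typed else "http://" + typed
    candidates = [url]

    # If the computer name does not already start with www, some browsers
    # will also try the same name with www added at the beginning.
    scheme, rest = url.split("://", 1)
    if not rest.lower().startswith("www."):
        candidates.append(scheme + "://www." + rest)
    return candidates

print(candidate_urls("www.pacdat.net"))
print(candidate_urls("pacdat.net/directory/"))
```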

Next month we deal with the most widely used service, and in fact the oldest; and no, it isn't "the Web"; it is E-mail.

Let me know if this or other articles help you.

richard

 

 


Copyright © 1993-2007 Richard C. Pitt - all rights reserved
Updated June 17, 2005