08 Lesson 4 Phase 3 - Testing

###########################
## Phase Three - Testing ##
###########################

Before adding the final touch in which we set things up to talk to the
internet, we need to make sure the computers on our network can talk amongst
themselves.

You can either test each machine as you configure it (make sure you activate
the changes first!) or you can test them all in one go after you've finished
the whole configuration process.

The philosophy behind testing is quite simple. There are five places at
which a network connection can go wrong and you need to systematically test
each one. The order in which you perform these tests is important. If you
perform each one in turn, then you know immediately which part of your setup
is faulty. If you just pick any old test and it fails, you've not actually
learnt anything of value. For instance, say you try contacting another
computer by name, and it doesn't work. You know something is wrong, but have
no idea which cog in the machine is actually broken. Is it the network card
itself? Something wrong with the name resolution stage? It's only if you
work through something logically that you gain any benefits from your
efforts.

Having said that, don't worry: you're not going to be saddled with assistants
in white coats carrying clipboards, wearing pocket protectors and checking
off a list with 512 items. It's only five steps. The key to troubleshooting
any problem is working out at which stage things break.
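
To make the "work through the steps in order" idea concrete, here's a minimal
Python sketch. The function name first_failure and the stand-in checks are my
own inventions for illustration; each lambda simply pretends a step passed or
failed, so nothing here touches a real network.

```python
def first_failure(checks):
    """Run the checks in order and return the name of the first one that
    fails, or None if every check passes."""
    for name, check in checks:
        if not check():
            return name
    return None


# Hypothetical stand-ins for the five steps: True means "passed".
checks = [
    ("1: the network card itself", lambda: True),
    ("2: talking to yourself", lambda: True),
    ("3: knowing your own name", lambda: False),
    ("4: contact another by number", lambda: False),
    ("5: contact another by name", lambda: False),
]

print(first_failure(checks))
```

Because the checks run in order, the first failure names the stage that's
broken. Steps 4 and 5 would fail here too, but there's no point looking at
them until step 3 is fixed.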

# Step One - The Network Card Itself. #

The command we use here you've already seen when you activated the network
interface during Phase 2:

ifconfig -a

Let's have a look at some of the output that command produces when run on my
computer.

eth1      Link encap:Ethernet  HWaddr 00:10:5A:9D:76:C1
          inet addr:10.0.0.1  Bcast:10.255.255.255  Mask:255.0.0.0
          BROADCAST MULTICAST  MTU:1500  Metric:1
          RX packets:4314 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:315981 (308.5 KiB)  TX bytes:0 (0.0 b)
          Interrupt:9 Base address:0xfc00

eth2      Link encap:Ethernet  HWaddr 00:10:5A:74:B4:5E
          inet addr:80.56.aa.dd  Bcast:255.255.255.255  Mask:255.255.254.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:54713 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9912 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:9607076 (9.1 MiB)  TX bytes:1406394 (1.3 MiB)
          Interrupt:5 Base address:0xf480

All of the information included here can come in useful at some stage, but
we're just looking for two key points. The first one is a simple sanity
check. Look at the second line and make sure the IP address and subnet mask
are what they're supposed to be. If not, fix your typing mistake in the file
we edited at Step Two of Phase Two and activate the change.

The second thing to look for is found in the third line. You'll see above
that eth2 starts with UP whereas eth1 doesn't. "UP" is synonymous with
"working". If an interface is not marked as UP then you can try bringing it
up with the command:

ifup eth1

(obviously replace eth1 with the right interface name). If this command
fails, read through the file containing the IP address settings. Make sure
they're correct, and that you don't have duplicate lines telling the
computer conflicting information. If you still can't see the problem, the
best thing to do is write down any error messages and ask the list for help.
It's easier to troubleshoot these things on a case by case basis.

If you run ifconfig -a and the only interface listed is "lo", then your
network card hasn't been installed properly. Perhaps the correct module
(driver) isn't being loaded. Once again, email the list with the specifics of
your system.
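
If you'd rather not check these things by eye, the following Python sketch
pulls the same details out of ifconfig-style text. It assumes the older output
layout shown above ("inet addr:", "Mask:", and a flags line containing "MTU:"),
with continuation lines indented the way ifconfig normally prints them. The
function name parse_ifconfig is my own, and newer versions of ifconfig lay
their output out differently, so treat this as an illustration, not a tool.

```python
import re

# Trimmed copy of the ifconfig -a output shown above.
SAMPLE = """\
eth1      Link encap:Ethernet  HWaddr 00:10:5A:9D:76:C1
          inet addr:10.0.0.1  Bcast:10.255.255.255  Mask:255.0.0.0
          BROADCAST MULTICAST  MTU:1500  Metric:1

eth2      Link encap:Ethernet  HWaddr 00:10:5A:74:B4:5E
          inet addr:80.56.aa.dd  Bcast:255.255.255.255  Mask:255.255.254.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
"""


def parse_ifconfig(text):
    """Return {interface: {"addr": ..., "mask": ..., "up": bool}}."""
    interfaces = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        # A line starting in column one begins a new interface block.
        if not line[0].isspace():
            current = line.split()[0]
            interfaces[current] = {"addr": None, "mask": None, "up": False}
        if current is None:
            continue
        info = interfaces[current]
        addr = re.search(r"inet addr:(\S+)", line)
        if addr:
            info["addr"] = addr.group(1)
        mask = re.search(r"Mask:(\S+)", line)
        if mask:
            info["mask"] = mask.group(1)
        # The flags line is the one that also carries the MTU.
        if "MTU:" in line:
            info["up"] = "UP" in line.split()
    return interfaces


for name, info in parse_ifconfig(SAMPLE).items():
    print(name, info["addr"], info["mask"], "UP" if info["up"] else "not UP")
```

If the only interface in the result is "lo", you're in the "network card
hasn't been installed properly" situation described above.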

# Step Two - Being Able to Talk to Yourself. #

For a computer this is not a sign of madness. It's something they do an
awful lot in day to day operations.

As well as the IP address you assign to it, every computer also answers to a
second IP address. This second address is called a loopback address, and is
the primary method a computer uses to talk to itself.

The loopback address is always 127.0.0.1, and is the same on every machine.
The first thing to do is ping the computer by its loopback address. The
command and its resulting output are below. If you're wondering, the -c4
option means "ping only 4 times".

[hamster@bob hamster]$ ping -c4 127.0.0.1
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.053 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.048 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.049 ms
64 bytes from 127.0.0.1: icmp_seq=4 ttl=64 time=0.049 ms

--- 127.0.0.1 ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.048/0.049/0.053/0.008 ms
[hamster@bob hamster]$

The next thing to try is pinging the machine by the IP address you assigned
it in Step 2 of Phase 1. In my case the command looks as follows:

constantinople:~# ping -c4 10.0.0.1
PING 10.0.0.1 (10.0.0.1): 56 data bytes
64 bytes from 10.0.0.1: icmp_seq=0 ttl=255 time=0.8 ms
64 bytes from 10.0.0.1: icmp_seq=1 ttl=255 time=0.4 ms
64 bytes from 10.0.0.1: icmp_seq=2 ttl=255 time=0.4 ms
64 bytes from 10.0.0.1: icmp_seq=3 ttl=255 time=0.3 ms

--- 10.0.0.1 ping statistics ---
4 packets transmitted, 4 packets received, 0% packet loss
round-trip min/avg/max = 0.3/0.4/0.8 ms
constantinople:~#

If either of these two tests fails, make sure the interface is UP (re-check
the output of Step One). On my computer, both of these ping commands work
successfully even with the network cable unplugged, so a failure of either
command points to trouble on this machine itself (its network configuration
or hardware) and not to the cables, the hub or another computer.
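
When reading ping output, the summary line is the bit that matters. As a small
sketch (the function name packet_loss is mine, and it assumes the
"N% packet loss" wording that both ping variants above happen to use), this
Python snippet extracts the loss percentage from that line:

```python
import re


def packet_loss(ping_output):
    """Return the packet-loss percentage from ping's summary line,
    or None if no such line is present."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    return float(match.group(1)) if match else None


# The two summary lines from the examples above.
print(packet_loss("4 packets transmitted, 4 received, 0% packet loss, time 2997ms"))
print(packet_loss("4 packets transmitted, 4 packets received, 0% packet loss"))
```

Anything other than 0 for the loopback address or your own IP means this step
has failed.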

# Step Three - Knowing Your Own Name. #

This step makes sure the computer is aware of its own name and responds to
it.

Check first what the computer thinks its name is by running the command:

hostname

It will answer with its name. In addition to this method, you can often find
out a computer's name by looking at the bash prompt (in the example below
the computer's name is bob). Using the prompt isn't always accurate though,
'cause it's possible to customise your prompt to print anything you want. If
the name it tells you is wrong, fix it by editing the file from Phase 2 Step
1 and activate the changes.

Ping the computer by its name:

[hamster@bob hamster]$ ping -c4 bob
PING localhost (127.0.0.1) 56(84) bytes of data.
64 bytes from localhost (127.0.0.1): icmp_seq=1 ttl=64 time=0.097 ms
64 bytes from localhost (127.0.0.1): icmp_seq=2 ttl=64 time=0.048 ms
64 bytes from localhost (127.0.0.1): icmp_seq=3 ttl=64 time=0.049 ms
64 bytes from localhost (127.0.0.1): icmp_seq=4 ttl=64 time=0.049 ms

--- localhost ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 2997ms
rtt min/avg/max/mdev = 0.048/0.060/0.097/0.023 ms
[hamster@bob hamster]$

What's interesting in this output is the first line - it's resolved the name
"bob" to the loopback (127.0.0.1) address. The reason behind this is the way
we wrote our /etc/hosts file. When trying to resolve a name, the computer
reads the /etc/hosts file one line at a time until it finds the name it's
looking for. In our case, the very first line in the file is the initial
mention of the computer's name, and the IP address given on this line is
127.0.0.1. Even though "bob" is mentioned further down the /etc/hosts file
(being matched with 10.0.0.106), the computer never gets that far.
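
The "first match wins" behaviour is easy to see in a few lines of Python. This
is only a sketch of the lookup as described above, not the real resolver. The
HOSTS text is a made-up file in the spirit of ours (I've assumed the first
line reads "127.0.0.1 localhost bob"), and resolve is a name I've chosen for
the example.

```python
# Hypothetical /etc/hosts contents; "bob" appears on two lines.
HOSTS = """\
127.0.0.1    localhost bob
10.0.0.105   sooty
10.0.0.106   bob
"""


def resolve(name, hosts_text):
    """Return the IP on the first line that mentions name, or None."""
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0]  # ignore comments
        fields = line.split()
        # First field is the IP; the rest are names for that IP.
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


print(resolve("bob", HOSTS))    # the 127.0.0.1 line is reached first
print(resolve("sooty", HOSTS))
```

The search stops at the first line containing "bob", so the later 10.0.0.106
entry is never consulted.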

If this test fails, check both /etc/hosts and the file you edited in Step 1
of Phase 2. Make sure you've spelt the computer's name the same way in both
places.

# Step Four - Contact Another by Number #

This next test will confirm whether you can communicate with another computer.
This is done initially by IP, as we need to ensure we have basic
connectivity before we try using names. It may seem obvious, but it's worth
stating that these next two steps won't work until you've got a second
machine up and running to communicate with!

The test at this stage is simply:

ping -c4 <ip address of another computer>

I shan't bore you with the output of a successful ping. Suffice it to say that
if you get errors such as "Destination Host Unreachable" or no output except a
line saying "4 packets transmitted, 0 received, 100% packet loss", the
problem could be with the hub, cables or the other machine. Check for:

* Cable not plugged in (check BOTH machines).
* Hub not working.
* Interface on the other machine not in the "UP" state.
* The IP address you tried to ping doesn't exist on your network (check for
typing mistakes).

If you get an error message "Destination Network Unreachable" then you need
to check the subnet masks on all your computers and make sure you followed
the rules from Step 2 Phase 2 about which octets in a given range are yours
to alter.
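
If you're not sure whether two addresses belong to the same network under a
given subnet mask, Python's standard ipaddress module can do the sum for you.
A minimal sketch follows; the function name same_network and the second
example's addresses are mine, chosen purely for illustration.

```python
import ipaddress


def same_network(my_ip, my_mask, other_ip):
    """True if other_ip falls inside the network implied by my_ip/my_mask."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{my_ip}/{my_mask}", strict=False)
    return ipaddress.ip_address(other_ip) in network


# With the mask from the ifconfig output above, these two can talk directly.
print(same_network("10.0.0.1", "255.0.0.0", "10.0.0.105"))

# A mistyped, narrower mask puts the other machine on a "different" network.
print(same_network("10.0.0.1", "255.255.255.0", "10.0.1.5"))
```

Run it from each machine's point of view, using that machine's own IP and
mask. Both directions need to come out True.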

# Step Five - Contact Another by Name #

Success at step 4 proves that you've got basic connectivity, so any problems
encountered here must involve the process of name resolution.

The command to use at this last step is:

ping -c4 <name of other computer>

If the ping fails, check the very first line of output to see which IP
address your computer resolved the name to. In the example below, it's
resolved the name "sooty" to the address 10.0.0.105. Make sure the IP is the
correct one for that name.

[hamster@bob hamster]$ ping sooty
PING sooty (10.0.0.105) 56(84) bytes of data.

Troubleshooting problems here means examining your /etc/hosts file to make
sure the entries are correct according to the plan you drew up at the end of
Phase 1.

If you can't get steps 4 or 5 to work, you should also check you haven't
assigned the same name or IP to two different machines. When all else fails,
ask for help.
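
Spotting a duplicate name or IP is tedious by eye once you have more than a
few machines. Here's a small Python sketch that checks a plan for both. The
PLAN list is hypothetical ("sweep" is an invented machine that clashes with
sooty's address), and the function name duplicates is my own.

```python
from collections import Counter


def duplicates(plan):
    """plan is a list of (name, ip) pairs, one per machine.
    Return (names used more than once, IPs used more than once)."""
    names = Counter(name for name, _ in plan)
    ips = Counter(ip for _, ip in plan)
    return (
        sorted(n for n, count in names.items() if count > 1),
        sorted(i for i, count in ips.items() if count > 1),
    )


# Hypothetical plan: one entry per machine on the network.
PLAN = [
    ("bob", "10.0.0.106"),
    ("sooty", "10.0.0.105"),
    ("sweep", "10.0.0.105"),
]

dup_names, dup_ips = duplicates(PLAN)
print("duplicate names:", dup_names)
print("duplicate IPs:", dup_ips)
```

Two empty lists mean every machine in the plan has its own name and its own
address.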

We're now ready to move on to the last phase, in which we set up the gateway
machine to allow internet connection sharing.