03 Supplement: More on symbolic links

Introduction

When I initially posted Lesson 2, I got a lot of feedback requesting an explanation of how symbolic links actually worked. I was intending to cover this at a later stage, but decided that it could be handled now without losing too many of you. If you don't particularly need/want to know how symbolic links are actually implemented, and are content just using them, that's fine. If you stop reading right here and wait for Lesson 3, you won't lose out. This is just a digression for those who are interested, and isn't necessary to understand the rest of the course.

Q. What is a symbolic link?

A symbolic link is a file (or directory) which behaves to all intents and purposes like its target - the file or directory to which it points. For example, if I have a symlink called /home/meredydd, pointing to the directory /export/meredydd/, then the path /home/meredydd will start acting exactly like the path /export/meredydd/. You can list its contents, read and write files in there, everything you can do to a normal directory. But it isn't a normal directory: what you would actually be doing is manipulating /export/meredydd/.

Directory listing with symbolic links using ls

Directory listing with symbolic links using pwd
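
If you want to see this for yourself without touching your real home directory, here is a minimal sketch that works in a throwaway directory. The names export_dir, home_link and note.txt are made up for illustration and stand in for /export/meredydd/ and /home/meredydd.

```shell
# Work in a scratch directory so nothing real is touched.
demo=$(mktemp -d)

# "export_dir" plays the part of the real directory,
# "home_link" the part of the symlink pointing at it (hypothetical names).
mkdir "$demo/export_dir"
ln -s "$demo/export_dir" "$demo/home_link"

# Write a file *through the link*...
echo "hello" > "$demo/home_link/note.txt"

# ...and read it back from the real directory: it is the same file.
via_target=$(cat "$demo/export_dir/note.txt")
echo "$via_target"

# Clean up the scratch directory.
rm -rf "$demo"
```

The file written via the link turns up in the target directory, because the target directory is the only place it ever existed.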

Q. How does it work?

Oh. Now there's a question, and one that can be answered on many levels. Because this is a basic course, I'm going to go only so far in answering it. If you're interested in further implementation details, you can ask on the list and I will attempt to rustle up some links for you. Failing that, Google is your friend. But anyway...

A symbolic link can be thought of as a small, very special file containing just one string: the target path. That's all it is. For example, the symbolic link I created in the Lesson 2 demonstration is just a file, marked specially as a symbolic link, containing the string "/home/meredydd/web/". Whenever the kernel (that's the part of the operating system which handles things like file accesses, networking, etc. for programs; we'll see a lot of it in Lesson 3) gets asked to do something to the link, for example list the contents of the directory /home/meredydd/backups/web/, it reads the file, finds that it's a symbolic link, and retries the same request on the target path (/home/meredydd/web/). This time, it hits an ordinary directory, and so lists its contents in the normal way.
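
You can look at the string stored inside a link with the readlink command, which prints the target without following it. The sketch below rebuilds a layout like the Lesson 2 one inside a scratch directory (so the paths are an assumption, not my real home directory) and inspects the link.

```shell
# Scratch copy of the Lesson 2 layout: web/ and backups/web -> web/
demo=$(mktemp -d)
mkdir "$demo/web" "$demo/backups"
ln -s "$demo/web/" "$demo/backups/web"

# readlink prints the string stored in the link, exactly as it was given to ln.
stored=$(readlink "$demo/backups/web")
echo "$stored"

# "test -L" is true only for symbolic links; it does not follow the link.
[ -L "$demo/backups/web" ] && echo "backups/web is a symbolic link"

# "test -d" does follow the link, so it sees the directory at the other end.
[ -d "$demo/backups/web" ] && echo "and it resolves to a directory"

rm -rf "$demo"
```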

This, by the way, is the real reason for the odd behaviour of ../ in symlinks (see the green box in Lesson 2). In my example, we now see what's going on - when ls asks the kernel to list the contents of backups/web/../, the kernel realises that backups/web is a symbolic link, and so converts the path to /home/meredydd/web/../, with predictable results.
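
Here is a small sketch of that ../ behaviour, again in a scratch directory rather than a real home directory. The two marker files are hypothetical names I've added purely so you can tell which directory got listed.

```shell
demo=$(mktemp -d)
mkdir "$demo/web" "$demo/backups"

# Marker files (hypothetical names) to show which directory we end up in.
touch "$demo/marker_in_top" "$demo/backups/marker_in_backups"

# backups/web is a symlink to the web/ directory, as in Lesson 2.
ln -s "$demo/web/" "$demo/backups/web"

# The kernel follows the link first and only then applies "../",
# so this lists the parent of web/, not backups/.
listing=$(ls "$demo/backups/web/../")
echo "$listing"

rm -rf "$demo"
```

The listing contains marker_in_top rather than marker_in_backups, because ../ was applied to the link's target and not to the link itself.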

Q. What do you mean by "Symbolic links are relative"?

Simply that the target of a symbolic link can be a relative path, and a relative target is worked out from the directory containing the link, not from wherever you happen to be. This is explained by the way that the operating system's kernel handles any request made of a symbolic link. For example, if I have a symlink called /usr/X11, with a target of "X11R6/", and try to list the contents of /usr/X11/, the process I described in the previous section occurs. The kernel notices that /usr/X11 is a link, and substitutes the target path for the one that was originally requested. Here, however, there is a difference. The target path is not absolute (it does not begin with a /). So, instead of replacing the original path entirely with the target, and trying again, the kernel backs up to the directory containing the symlink (in this case, /usr/), appends the new relative path, and then tries again. So /usr/X11/ becomes /usr/X11R6/. All well and good.
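
To try this safely, the sketch below builds a pretend usr/ directory inside a scratch directory instead of touching the real /usr/. The file example_file is a made-up name, there only so the listing has something to show.

```shell
demo=$(mktemp -d)

# A pretend usr/ tree with an X11R6 directory in it.
mkdir -p "$demo/usr/X11R6"
touch "$demo/usr/X11R6/example_file"

# The target "X11R6/" is relative: it is stored as-is in the link.
ln -s X11R6/ "$demo/usr/X11"

# The stored string has no leading directory at all...
stored=$(readlink "$demo/usr/X11")
echo "$stored"

# ...yet the link works, because the kernel resolves the target
# starting from the directory that contains the link (usr/).
listing=$(ls "$demo/usr/X11/")
echo "$listing"

rm -rf "$demo"
```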

You can also, however, use the ../ element in your path, which is when things get slightly unintuitive. If, for example, the sample symlink I created in Lesson 2 had ../web/ as a target (rather than the absolute path /home/meredydd/web/), it still would have worked. The reason? When accessing the path /home/meredydd/backups/web/, the kernel would realise that it was looking at a symlink, and do the normal thing - take the directory containing the symlink (/home/meredydd/backups/), and add on the link's target, producing /home/meredydd/backups/../web/, which would work perfectly.
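
A quick sketch of the ../web/ variant, once more in a scratch directory. The file index.html is a hypothetical name. Note that the listing is deliberately run from / to show that your current directory plays no part in resolving the link.

```shell
demo=$(mktemp -d)
mkdir "$demo/web" "$demo/backups"
touch "$demo/web/index.html"

# Same link as before, but with a relative target that climbs out of backups/.
ln -s ../web/ "$demo/backups/web"

# Run ls from / : the target is still resolved from backups/,
# giving backups/../web/, which is the web/ directory.
listing=$(cd / && ls "$demo/backups/web/")
echo "$listing"

rm -rf "$demo"
```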

Demonstration

The important thing to grasp is that the string you give to the ln command is not interpreted at all. It just gets written into the symlink, and it is only when you access the link that anything takes notice of it. This means that if you move or rename a symbolic link with a relative target, the stored string stays the same (you don't expect the contents of other files to change when you move them, do you?), but what the link points to can change.

Try this:

meredydd@rhodium:~$ mkdir tmp/
meredydd@rhodium:~$ ln -s ../ tmp/parent_dir
meredydd@rhodium:~$ ls -l tmp/parent_dir/

This will list the contents of your home directory. Why? The symbolic link parent_dir has a target of ../, so in fact you are listing the contents of the directory tmp/../ - your home directory. Now try this:

meredydd@rhodium:~$ mv tmp/parent_dir ./
meredydd@rhodium:~$ ls -l parent_dir/

You will now see the contents of ../ - in this case, the directory /home/.
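
If you would rather not experiment in your real home directory, the same demonstration can be sketched in a scratch directory, with home_dir (a made-up name) standing in for your home. It also uses readlink to confirm that the stored string never changes.

```shell
demo=$(mktemp -d)

# "home_dir" is a hypothetical stand-in for your home directory.
mkdir -p "$demo/home_dir/tmp"
ln -s ../ "$demo/home_dir/tmp/parent_dir"

# Before the move: the link lives in tmp/, so ../ means home_dir.
before=$(readlink "$demo/home_dir/tmp/parent_dir")
listing_before=$(ls "$demo/home_dir/tmp/parent_dir/")

# Move the link itself up one level (mv does not follow it).
mv "$demo/home_dir/tmp/parent_dir" "$demo/home_dir/"

# After the move: same stored string, but ../ now means the scratch root.
after=$(readlink "$demo/home_dir/parent_dir")
listing_after=$(ls "$demo/home_dir/parent_dir/")

echo "stored before: $before   lists: $listing_before"
echo "stored after:  $after   lists: $listing_after"

rm -rf "$demo"
```

The stored string is ../ both times. Only the directory it is resolved from has changed, and so the listing changes with it.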

Q. What's a "hard link", then?

A hard link is where two paths (specifically, two files - it doesn't work for directories) refer to the same data on a volume. They can also be created with the ln command (this time without the -s option), but are very rarely used.
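
For the curious, here is a minimal sketch of a hard link in a scratch directory. The file names original.txt and alias.txt are made up for illustration.

```shell
demo=$(mktemp -d)
echo "first" > "$demo/original.txt"

# No -s: this creates a hard link, i.e. a second name for the same data.
ln "$demo/original.txt" "$demo/alias.txt"

# Appending through one name is visible through the other.
echo "second" >> "$demo/alias.txt"
contents=$(cat "$demo/original.txt")
echo "$contents"

# ls -i shows each name's inode number; the two names share one.
ls -i "$demo/original.txt" "$demo/alias.txt"

# Removing one name leaves the data reachable through the other.
rm "$demo/original.txt"
survivor=$(cat "$demo/alias.txt")
echo "$survivor"

rm -rf "$demo"
```

Unlike a symbolic link, neither name is "the real one" and neither contains a path to the other. Both are simply names for the same data.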