Hard Links, Soft (Symbolic) Links and Junctions in NTFS: What Are They (For)?

by Helge Klein on 04/19/2009

This is an attempt at demystification. In the Windows world, links in the file system are often regarded as obscure, except for the infamous .LNK files, of course. But file system links are neither freaky UNIX/Linux command line stuff, nor are they new: Microsoft's OS has offered two types of links since Windows 2000 and a third type since Vista/Server 2008. And boy, can they come in handy!

Hard Links

Hard links are everywhere. Every single file is a hard link! Think of it this way: a file really consists of two logically separate parts. One is the actual data (e.g. the contents of an MP3 file, the actual music), the other is a directory entry in the master file table (MFT) pointing to the data. MP3 music has no file name. That sounds silly, but it is the simple truth. In order to have a name for a file of MP3 music, a separate entity is needed, which is not related to the music in any way. That "entity" is your hard link.

Looking at it that way, hard links are easy to understand. Data is stored in clusters on your hard disks. The master file table stores links to the data and puts names on the links. That gives us file names. But are we limited to one MFT entry per data set (file)? No! NTFS has no problems whatsoever with multiple entries all pointing to the same clusters and thus the same data. Having two names for the same MP3 file is perfectly all right with NTFS, like C:\Fun\Boss talking bullshit.mp3 and C:\Work\Important speech of the boss.mp3. By clicking on either of those files the same speech is played, of course.
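The "two names, one set of data" idea can be tried out in a few lines. What follows is a minimal sketch in Python, assuming Python 3 and a temporary folder on a file system that supports hard links (NTFS does); the file names are made up for illustration. It changes the data through one name and reads it back through the other.

```python
import os
import tempfile

def demo_two_names():
    """Create a file, give it a second name, and read it back via that second name."""
    with tempfile.TemporaryDirectory() as folder:
        first = os.path.join(folder, "speech.mp3")       # hypothetical file name
        second = os.path.join(folder, "important.mp3")   # hypothetical second name

        with open(first, "wb") as f:
            f.write(b"original data")

        os.link(first, second)  # add a second hard link to the same data

        # Change the data through the first name ...
        with open(first, "ab") as f:
            f.write(b" plus an edit")

        # ... and read it back through the second name.
        with open(second, "rb") as f:
            content = f.read().decode()

        # samefile() reports whether both names refer to the same file
        return content, os.path.samefile(first, second)

if __name__ == "__main__":
    print(demo_two_names())
```

Because both names lead to the same data, the edit made through the first name shows up when reading through the second.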

Since hard links are implemented directly in the MFT, they are limited to one volume. You cannot create a hard link that points to another volume, partition or drive, let alone to a file server on the network. But hard links are completely invisible to applications, very much in contrast to .LNK files, which are only used and resolved by Explorer. Applications do not even know they are accessing some data via a hard link. How could they? Since every file is a hard link, it is only possible to determine how many hard links exist per file. There is no "first" or "real" one. Every single MFT entry is just one hard link among, potentially, many.
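That there is no "first" link, only a count, can be illustrated with a short sketch. It assumes Python 3, where `os.stat()` reports the number of hard links as `st_nlink`, and it uses made-up file names in a temporary folder. Deleting the name the file was created under leaves the data reachable through the remaining name.

```python
import os
import tempfile

def demo_link_count():
    """Return the link counts at three points in time plus the surviving content."""
    with tempfile.TemporaryDirectory() as folder:
        first = os.path.join(folder, "first.txt")     # hypothetical names
        second = os.path.join(folder, "second.txt")

        with open(first, "w") as f:
            f.write("still here")

        counts = [os.stat(first).st_nlink]       # a fresh file has exactly one link
        os.link(first, second)
        counts.append(os.stat(first).st_nlink)   # now two names share the data

        os.remove(first)                         # removes one name, not the data
        counts.append(os.stat(second).st_nlink)

        with open(second) as f:
            survivor = f.read()

        return counts, survivor

if __name__ == "__main__":
    print(demo_link_count())
```

The data is only released once the last remaining name has been deleted.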

The downside of this invisibility is that determining the size of directories becomes difficult. Looking at a folder's properties in Explorer does not always yield the real size on disk, since data with multiple hard links is counted once per link! I do not know of any solution to this problem. If you have one, please let me know.
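One possible approach, offered as a sketch only: remember the identity of every file already counted and skip further names that lead to the same data. The code below assumes Python 3 and assumes that the pair `st_dev`/`st_ino` returned by `os.lstat()` identifies a file uniquely, which is the case on POSIX systems and, as far as I know, on NTFS as well. It adds up logical file sizes, not allocated clusters, and the helper names `folder_size` and `demo_folder_size` are made up for this example.

```python
import os
import stat
import tempfile

def folder_size(root):
    """Sum file sizes below root, counting data with several hard links only once."""
    seen = set()   # (device, file id) pairs that have already been counted
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            info = os.lstat(os.path.join(dirpath, name))  # lstat: do not follow symbolic links
            if not stat.S_ISREG(info.st_mode):
                continue                                  # only regular files carry data
            key = (info.st_dev, info.st_ino)
            if key in seen:
                continue                                  # another name for known data
            seen.add(key)
            total += info.st_size
    return total

def demo_folder_size():
    """Compare a naive per-name sum with the de-duplicated sum."""
    with tempfile.TemporaryDirectory() as folder:
        original = os.path.join(folder, "a.bin")
        with open(original, "wb") as f:
            f.write(b"x" * 1000)
        os.link(original, os.path.join(folder, "b.bin"))  # second name, same data

        naive = sum(os.path.getsize(os.path.join(folder, n)) for n in os.listdir(folder))
        return naive, folder_size(folder)

if __name__ == "__main__":
    print(demo_folder_size())
```

Note that a file with additional hard links outside the folder is still counted in full, so the result answers "how much data is reachable from here", not "how much space would deleting this folder free".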

Junctions

Junctions are counterparts to hard links in that they work on directories instead of files. Implemented as reparse points stored as metadata in the file system, junctions can point to other directories or volumes on the same computer, but not to folders on other computers. Unlike hard links, junctions point to a fixed path, the target. If the target is moved, deleted or renamed, you get the error "File not found" when attempting to list the contents of a junction. That means junctions can become stale, while hard links cannot.

Soft or Symbolic Links

While hard links and junctions have been present since Windows 2000, symbolic links were only recently added with Vista and Server 2008. They are similar in nature to junctions, but can also point to files and even to remote systems on the network, provided that the target machine runs Vista or later, too. As with junctions, moving, renaming or deleting a link's target results in a stale link. There is no built-in mechanism that notifies the source about target changes.
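How a link becomes stale can be shown with a symbolic link to a file. This is a minimal sketch assuming Python 3 and made-up file names. On Windows, creating symbolic links typically requires an elevated prompt or a similar privilege, so the example may fail there without one. The link stores a path, so renaming the target leaves the link in place but pointing nowhere.

```python
import os
import tempfile

def demo_stale_link():
    """Create a symbolic link, rename its target, and report what is left."""
    with tempfile.TemporaryDirectory() as folder:
        target = os.path.join(folder, "target.txt")   # hypothetical names
        link = os.path.join(folder, "link.txt")

        with open(target, "w") as f:
            f.write("hello")

        os.symlink(target, link)             # the link stores the target's path
        before = os.path.exists(link)        # exists() follows the link to the target

        os.rename(target, os.path.join(folder, "renamed.txt"))

        still_a_link = os.path.islink(link)  # the link itself has not gone away
        resolves = os.path.exists(link)      # but its stored path no longer resolves
        return before, still_a_link, resolves

if __name__ == "__main__":
    print(demo_stale_link())
```

Nothing updates the link when the target is renamed; it has to be deleted and recreated by hand.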

Some also regard junctions as soft links. Although that is technically correct, I prefer to distinguish between junctions and "real" soft aka symbolic links for practical reasons.

Link Creation and Manipulation

From the command line, hard links can be created with fsutil hardlink create (2000 to XP) or mklink /h (Vista and newer). For programmers, the API function CreateHardLink has been available since Windows 2000.

Junctions are best manipulated with the Sysinternals tool of the same name. Programmatic creation is, to my knowledge, undocumented.

Beginning with Vista, symbolic and other links can be manipulated with the mklink command. For programmers, the function CreateSymbolicLink has been added to the Win32 API.

Notes and References

All this applies to NTFS partitions only, of course.

Examples and practical tips of when to use which kind of link are outside the scope of this article, but a good topic for a future post.

Wikipedia has relatively good articles about hard links, soft/symbolic links and junctions.

MS KB #205524 How to create and manipulate NTFS junction points
MS KB #315688 How to locate and correct disk space problems on NTFS volumes in Windows XP

12 responses for "Hard Links, Soft (Symbolic) Links and Junctions in NTFS: What Are They (For)?"

Thanks for the post - maybe you can clarify something for me. When you have multiple hardlinks to a file, does backup software see this as 1 lump of data plus multiple links in the MFT and consequently only back up the 'data lump' just once, or does the software see it as multiple separate files and back up the data many times? The latter being an obviously inefficient use of space.

drtg,
whether a backup program backs up data that is pointed to by multiple entries in the MFT (i.e. hard links) once or multiple times depends on how clever the backup program is. Although I do not have evidence to back this up (haha) I suppose most programs are rather dumb when it comes to hard links.

@drtg:
I'll add a little to Helge's response: the MFT isn't read at the program/application layer, it's read below that, so no, your software cannot see that there is more than one entry in the MFT. Only the OS can.

I seem to just have accidentally deleted a comment. Sorry! I got confused by the mass of spam comments...

Hi Helge - perhaps it was me?

I was asking how ACLs are evaluated when an object has multiple hardlinks.

Regards

Lee

Hi Lee,
thanks for posting your question again.
I might be wrong with this, but it should be like this:
NTFS ACLs are stored per MFT entry. A hard link basically is an MFT entry. So if you have two hard links pointing to the same data, you can set different permissions.

Hi Helge

Thanks - so presumably that means that to create a hardlink, you would need to have 'full control' rights, probably on the current hardlink. Though it would mean that if your rights to the original link were later tightened up, you would still have access to the data via your link?

Lee

Lee,
uhh, I have to admit I do not know which permissions you need to create a hard link. But once you have two hard links pointing to the same data, you should be able to set different permissions on each link and thus have different users/groups that are allowed to access the data.
Excellent questions, by the way. It might be a good topic for another post to research and describe permissions in conjunction with hard links. So, thanks for indirectly suggesting the topic ;-)

Many thanks

It's something that I have been researching for a few days and haven't found the answers to. The reason I need to know is that I am developing a system that can have entities located via various paths (a la NTFS hardlinks). We need to implement security and I wanted to make it work the same way as NTFS does.

Lee

Lee,
My earlier assumptions about permissions proved to be wrong. Please see this post for details: http://blogs.sepago.de/helge/2009/05/14/hard-links-and-permissions-acls/

Thanks for the article. It helped to systematize everything. I've also found a very good video: http://www.tubesfan.com/watch/the-endless-application-data-symlinks-dire... about the new implementation of Symlinks, Hard Links, and Directory Junctions in the latest Windows Operating Systems and many other issues.

heyo
I felt your explanation of Hard-links was very subtle and easy to understand if you already understand the concept; to the people who are reading about them for the first time, I think your explanation is insufficient.