The directory have a logical structure that defines how files are organized inside the directory. There are many organizing schemes such as single level, two level, tree structure, and acyclic-graph structure. In this article, you will learn about various logical directory structures and need for them.
Single Level Directory
It is the simplest directory structure with all files in the same directory which easy to access, support and understand.
Few disadvantages with a single level directory are:
- When files increase, they are very difficult to manage.
- The files must have unique names because multiple users can have same filenames.
- A user may not remember the filenames because of too many files in the same directory.
Two level directories
In single level directory, it is difficult to manage each user files. The two level directory system allows a directory
User File Directory(UFD) which only list files of a single user.
When user logs or starts a job, a
Master File Directory (MFD) is searched for a user name or account number. The MFD is indexed on user name or account number and each entry points to a UFD.
When user searches for a file he refers to his own UFD. A file name is unique within the UFD even with the same
name by different users.
User can create files, delete files, rename files, and so on by searching for an existing file before performing a file
operation. Administrators can delete the UFD and create a new one and adds an entry into the MFD.
Disadvantages of Two-Level Directory
We list here the disadvantages of a two-level directory structure.
- Isolates one user from another which is a problem if the system needs cooperation between users on certain projects or tasks.
- One user cannot access files of other users unless permission is given explicitly.
The MFD and UFD can be thought of as a tree where files are leaves of the tree. The UFD itself is a file in the tree and
make a path with a file name.
A file name “test” in user B directory has path
In MS-DOS, a volume is specified as a letter followed by a colon,
In VMS, u:[sst.jdeck]login.com;1 where u: is volume, sst is a directory, jdeck is subdirectory and login.com is file
Some other systems treat volume as directory,
/u/pub/test where u is a volume.
The system files are part of a loader, compilers, utility routines, libraries, and assemblers. The operating system receives commands with filenames, the command line interpreter search for requested filenames in UFD, then load and execute the files.
This means that we need to keep a copy of system files for each user. If system files are 5 Mb in size, then the total space required for 12 users would be 12 * 5 = 60 MB. This would be a waste of space to keep multiple copies of the same files.
The standard solution would be to create a special directory for system files. If search could not find any system files
in the local UFD, then search in the special directory.
A sequence of directories are searched when a file is named is called
search path. This search path can be extended with unlimited directories to search. This a popular method of search in MS-DOS and UNIX.
The two-level directory structure could be extended to create a tree-structure with arbitrary height. The user can create subdirectories and keep their files.
The tree has a root directory and every file in the tree has a unique path name. A directory (or subdirectory) contains list of files or subdirectories. A single bit in each directory entry defines the content as file(0) or subdirectory(1).
Each process has a current directory which contains files that the process is currently interested in. If a referenced file is not in the current directory, the user must specify the path or change the current directory to be the directory holding the file.
Now, the user can change the directory using change directory system calls to another directory. The search path does not include an entry that stands for “the current directory”.
An initial current directory is assigned to the user when a job starts or when the user logs in. The accounting file contains entry to this initial current directory, therefore, a pointer for the operating system for the initial current directory. The current directory of any subprocess is the current directory of the parent.
The path name are of two types –
absolute path and relative
path in the tree structure.
- Absolute path – It is path to a file from the root directory given names of all directories and subdirectories in the path.
- Relative path – It is path from the current directory to the file including names of all directories and subdirectories in the path.
There are two approaches to deleting directories and files from a tree-structure. The first approach used in MS-DOS, a directory cannot be deleted unless it has no files, and this applied to subdirectory as well.
The second approach used in UNIX operating system, using rm command you can delete a directory along with subdirectories and files. All with a single command. This option is dangerous if you do not have a backup of files to restore it later.
Tree structure directory allows users to access other user’s files and directories. They can do that by specifying absolute path or relative path to the file.
A tree-structure does not allow sharing of files. The acyclic-graph (graph with no cycle) directories allow common shared directories for two or more users.
The shared file or directory will appear in two different places at the same time, but they are not copies and changes are done to a single file. Usually, teamwork requires such shared directories and subdirectories. One new file from a person in a shared subdirectory is replicated in all shared subdirectories.
Implementing Shared Files and Sub-directories
In UNIX system, a new directory entry is created called
link. A link is a pointer to another file or subdirectory and implemented as an absolute path or relative path to the file or subdirectory.
When a reference is made to the file, then we search the directory, but the file is marked as a link with a path name. The system will resolve the link pathname to get the real file. While traversing the tree, the link entries are ignored to preserve the acyclic-graph structure.
Another approach is to duplicate all information about files in both sharing directories. The problem with this approach is consistency when the files are modified.
The acyclic-graph structure is flexible than tree structure but more complex. We have listed some disadvantage of
this structure below.
- A File have multiple absolute paths. Traversing the file structure will cause a problem with multiple entries to the same file.
- Once the original file is deleted its space is deallocated, but it can leave dangling pointers which are pointing to the deleted file. If we keep a list of associated links and when the count of references becomes 0, it is a lot easier to remove link rather than searching links and remove them.
- We can leave the link dangling, only an attempt to resolve them will result in error.
An acyclic-graph directory ensures that there are no cycles in the structure. We start with two level directories, then add more subdirectories and files resulting in a tree structure. However, the tree-structure is broken when links are
added and it becomes a simple graph directory structure.
An acyclic graph traverses the graph to determine that there is no reference to a file. We want to avoid searching the
acyclic directory again for shared files or subdirectories if an attempt to find the file failed. This is for performance
What if we allow cycles (self-referencing) in the directory structure? The search would never end. The possible
solution might be to avoid searching shared directories. When the link count becomes 0, the file is deleted. But the when cycle exist the count never becomes 0 and the search algorithm will go to a never terminating loop.
The solution to this problem would be garbage collection, in which two passes is made:
- The first pass will mark everything that is accessible.
- The second pass will add the rest of the items including links to free space list.
It is effective, but a time-consuming effort. The best solution, therefore, is to avoid traversing links.
- Abraham Silberschatz, Peter B. Galvin, Greg Gagne (July 29, 2008) Operating System Concepts, 8 edn., Wiley.
- Ramez Elmasri, A Carrick, David Levine (February 11, 2009) Operating Systems: A Spiral Approach, 1st edn., McGraw-Hill Education.
- Tanenbaum, Andrew S. (March 3, 2001) Modern Operating Systems, 2nd edn., : Prentice Hall.