|
|
3.3. Manipulating files3.3.1. Viewing file properties3.3.1.1. More about lsBesides the name of the file, ls can give a lot of other information, such as the file type, as we already discussed. It can also show permissions on a file, file size, inode number, creation date and time, owners and amount of links to the file. With the -a option to ls, files that are normally hidden from view can be displayed as well. These are files that have a name starting with a dot. A couple of typical examples include the configuration files in your home directory. When you've worked with a certain system for a while, you will notice that tens of files and directories have been created that are not automatically listed in a directory index. Next to that, every directory contains a file named just dot (.) and one with two dots (..), which are used in combination with their inode number to determine the directory's position in the file system's tree structure. You should really read the Info pages about ls, since it is a very common command with a lot of useful options. Options can be combined, as is the case with most UNIX commands and their options. A common combination is ls -al; it shows a long list of files and their properties as well as the destinations that any symbolic links point to. ls -latr displays the same files, only now in reversed order of the last change, so that the file changed most recently occurs at the bottom of the list. Here are a couple of examples:
On most Linux versions ls is aliased to color-ls by default. This feature allows to see the file type without using any options to ls. To achieve this, every file type has its own color. The standard scheme is in /etc/DIR_COLORS: Table 3-5. Color-ls default color scheme
More information is in the man page. The same information was in earlier days displayed using suffixes to every non-standard file name. For mono-color use (like printing a directory listing) and for general readability, this scheme is still in use: Table 3-6. Default suffix scheme for ls
A description of the full functionality and features of the ls command can be read with info coreutils ls. 3.3.1.2. More toolsTo find out more about the kind of data we are dealing with, we use the file command. By applying certain tests that check properties of a file in the file system, magic numbers and language tests, file tries to make an educated guess about the format of a file. Some examples:
The file command has a series of options, among others the -z option to look into compressed files. See info file for a detailed description. Keep in mind that the results of file are not absolute, it is only a guess. In other words, file can be tricked.
3.3.2. Creating and deleting files and directories3.3.2.1. Making a mess...... Is not a difficult thing to do. Today almost every system is networked, so naturally files get copied from one machine to another. And especially when working in a graphical environment, creating new files is a piece of cake and is often done without the approval of the user. To illustrate the problem, here's the full content of a new user's directory, created on a standard RedHat system:
On first sight, the content of a "used" home directory doesn't look that bad either:
But when all the directories and files starting with a dot are included, there are 185 items in this directory. This is because most applications have their own directories and/or files, containing user-specific settings, in the home directory of that user. Usually these files are created the first time you start an application. In some cases you will be notified when a non-existent directory needs to be created, but most of the time everything is done automatically. Furthermore, new files are created seemingly continuously because users want to save files, keep different versions of their work, use Internet applications, and download files and attachments to their local machine. It doesn't stop. It is clear that one definitely needs a scheme to keep an overview on things. In the next section, we will discuss our means of keeping order. We only discuss text tools available to the shell, since the graphical tools are very intuitive and have the same look and feel as the well known point-and-click MS Windows-style file managers, including graphical help functions and other features you expect from this kind of applications. The following list is an overview of the most popular file managers for GNU/Linux. Most file managers can be started from the menu of your desktop manager, or by clicking your home directory icon, or from the command line, issuing these commands:
These applications are certainly worth giving a try and usually impress newcomers to Linux, if only because there is such a wide variety: these are only the most popular tools for managing directories and files, and many other projects are being developed. Now let's find out about the internals and see how these graphical tools use common UNIX commands. 3.3.2.2. The tools3.3.2.2.1. Creating directoriesA way of keeping things in place is to give certain files specific default locations by creating directories and subdirectories (or folders and sub-folders if you wish). This is done with the mkdir command:
Creating directories and subdirectories in one step is done using the -p option:
If the new file needs other permissions than the default file creation permissions, the new access rights can be set in one move, still using the mkdir command, see the Info pages for more. We are going to discuss access modes in the next section on File Security. The name of a directory has to comply with the same rules as those applied on regular file names. One of the most important restrictions is that you can't have two files with the same name in one directory (but keep in mind that Linux is, like UNIX, a case sensitive operating system). There are virtually no limits on the length of a file name, but it is usually kept shorter than 80 characters, so it can fit on one line of a terminal. You can use any character you want in a file name, although it is advised to exclude characters that have a special meaning to the shell. When in doubt, check with Appendix C. 3.3.2.2.2. Moving filesNow that we have properly structured our home directory, it is time to clean up unclassified files using the mv command:
This command is also applicable when renaming files:
It is clear that only the name of the file changes. All other properties remain the same. Detailed information about the syntax and features of the mv command can be found in the man or Info pages. The use of this documentation should always be your first reflex when confronted with a problem. The answer to your problem is likely to be in the system documentation. Even experienced users read man pages every day, so beginning users should read them all the time. After a while, you will get to know the most common options to the common commands, but you will still need the documentation as a primary source of information. Note that the information contained in the HOWTOs, FAQs, man pages and such is slowly being merged into the Info pages, which are today the most up-to-date source of online (as in readily available on the system) documentation. 3.3.2.2.3. Copying filesCopying files and directories is done with the cp command. A useful option is recursive copy (copy all underlying files and subdirectories), using the -R option to cp. The general syntax is cp [-R] fromfile tofile As an example the case of user newguy, who wants the same Gnome desktop settings user oldguy has. One way to solve the problem is to copy the settings of oldguy to the home directory of newguy:
This gives some errors involving file permissions, but all the errors have to do with private files that newguy doesn't need anyway. We will discuss in the next part how to change these permissions in case they really are a problem. 3.3.2.2.4. Removing filesUse the rm command to remove single files, rmdir to remove empty directories. (Use ls -a to check whether a directory is empty or not). The rm command also has options for removing non-empty directories with all their subdirectories, read the Info pages for these rather dangerous options.
On Linux, just like on UNIX, there is no garbage can - at least not for the shell, although there are plenty of solutions for graphical use. So once removed, a file is really gone, and there is generally no way to get it back unless you have backups, or you are really fast and have a real good system administrator. To protect the beginning user from this malice, the interactive behavior of the rm, cp and mv commands can be activated using the -i option. In that case the system won't immediately act upon request. Instead it will ask for confirmation, so it takes an additional click on the Enter key to inflict the damage:
We will discuss how to make this option the default in Chapter 7, which discusses customizing your shell environment. 3.3.3. Finding files3.3.3.1. Using shell featuresIn the example on moving files we already saw how the shell can manipulate multiple files at once. In that example, the shell finds out automatically what the user means by the requirements between the square braces "[" and "]". The shell can substitute ranges of numbers and upper or lower case characters alike. It also substitutes as many characters as you want with an asterisk, and only one character with a question mark. All sorts of substitutions can be used simultaneously; the shell is very logical about it. The Bash shell, for instance, has no problem with expressions like ls dirname/*/*/*[2-3]. In other shells, the asterisk is commonly used to minimize the efforts of typing: people would enter cd dir* instead of cd directory. In Bash however, this is not necessary because the GNU shell has a feature called file name completion. It means that you can type the first few characters of a command (anywhere) or a file (in the current directory) and if no confusion is possible, the shell will find out what you mean. For example in a directory containing many files, you can check if there are any files beginning with the letter A just by typing ls A and pressing the Tab key twice, rather than pressing Enter. If there is only one file starting with "A", this file will be shown as the argument to ls (or any shell command, for that matter) immediately. 3.3.3.2. WhichA very simple way of looking up files is using the which command, to look in the directories listed in the user's search path for the required file. Of course, since the search path contains only paths to directories containing executable programs, which doesn't work for ordinary files. The which command is useful when troubleshooting "Command not Found" problems. In the example below, user tina can't use the acroread program, while her colleague has no troubles whatsoever on the same system. The problem is similar to the PATH problem in the previous part: Tina's colleague tells her that he can see the required program in /opt/acroread/bin, but this directory is not in her path:
The problem can be solved by giving the full path to the command to run, or by re-exporting the content of the PATH variable:
Using the which command also checks to see if a command is an alias for another command:
3.3.3.3. Find and locateThese are the real tools, used when searching other paths beside those listed in the search path. The find tool, known from UNIX, is very powerful, which may be the cause of a somewhat more difficult syntax. GNU find, however, deals with the syntax problems. This command not only allows you to search file names, it can also accept file size, date of last change and other file properties as criteria for a search. The most common use is for finding file names: find <path> -name <searchstring> This can be interpreted as "Look in all files and subdirectories contained in a given path, and print the names of the files containing the search string in their name" (not in their content). Another application of find is for searching files of a certain size, as in the example below, where user peter wants to find all files in the current directory or one of its subdirectories, that are bigger than 5 MB:
If you dig in the man pages, you will see that find can also perform operations on the found files. A common example is removing files. It is best to first test without the -exec option that the correct files are selected, after that the command can be rerun to delete the selected files. Below, we search for files ending in .tmp:
Later on (in 1999 according to the man pages, after 20 years of find), locate was developed. This program is easier to use, but more restricted than find, since its output is based on a file index database that is updated only once every day. On the other hand, a search in the locate database uses less resources than find and therefore shows the results nearly instantly. Most Linux distributions use slocate these days, security enhanced locate, the modern version of locate that prevents users from getting output they have no right to read. The files in root's home directory are such an example, these are not normally accessible to the public. A user who wants to find someone who knows about the C-shell may issue the command locate .cshrc, to display all users who have a customized configuration file for the C shell. Supposing the users root and jenny are running C shell, then only the file /home/jenny/.cshrc will be displayed, and not the one in root's home directory. On most systems, locate is a symbolic link to the slocate program:
User tina could have used locate to find the application she wanted:
Directories that don't contain the name bin can't contain the program - they don't contain executable files. There are three possibilities left. The file in /usr/local/bin is the one tina would have wanted: it is a link to the shell script that starts the actual program:
In order to keep the path as short as possible, so the system doesn't have to search too long every time a user wants to execute a command, we add /usr/local/bin to the path and not the other directories, which only contain the binary files of one specific program, while /usr/local/bin contains other useful programs as well. Again, a description of the full features of find and locate can be found in the Info pages. 3.3.3.4. The grep command3.3.3.4.1. General line filteringA simple but powerful program, grep is used for filtering input lines and returning certain patterns to the output. There are literally thousands of applications for the grep program. In the example below, jerry uses grep to see how he did the thing with find:
All UNIXes with just a little bit of decency have an online dictionary. So does Linux. The dictionary is a list of known words in a file named words, located in /usr/share/dict. To quickly check the correct spelling of a word, no graphical application is needed:
Who is the owner of that home directory next to mine? Hey, there's his telephone number!
And what was the E-mail address of Arno again?
find and locate are often used in combination with grep to define some serious queries. For more information, see Chapter 5 on I/O redirection. 3.3.3.4.2. Special charactersCharacters that have a special meaning to the shell have to be escaped. The escape character in Bash is backslash, as in most shells; this takes away the special meaning of the following character. The shell knows about quite some special characters, among the most common /, ., ? and *. A full list can be found in the Info pages and documentation for your shell. For instance, say that you want to display the file "*" instead of all the files in a directory, you would have to use less \* The same goes for filenames containing a space: cat This\ File 3.3.4. More ways to view file content3.3.4.1. GeneralApart from cat, which really doesn't do much more than sending files to the standard output, there are other tools to view file content. The easiest way of course would be to use graphical tools instead of command line tools. In the introduction we already saw a glimpse of an office application, OpenOffice. Other examples are the GIMP (start up with gimp from the command line), the GNU Image Manipulation Program; xpdf to view Portable Document Format files (PDF); GhostView (gv) for viewing PostScript files; Mozilla/FireFox, links (a text mode browser), Konqueror, Opera and many others for web content; XMMS, CDplay and others for multimedia file content; AbiWord, Gnumeric, KOffice etc. for all kinds of office applications and so on. There are thousands of Linux applications; to list them all would take days. Instead we keep concentrating on shell- or text-mode applications, which form the basics for all other applications. These commands work best in a text environment on files containing text. When in doubt, check first using the file command. So let's see what text tools we have that are useful to look inside files.
3.3.4.2. "less is more"Undoubtedly you will hear someone say this phrase sooner or later when working in a UNIX environment. A little bit of UNIX history explains this:
You already know about pagers by now, because they are used for viewing the man pages. 3.3.4.3. Head and tailThese two commands display the n first/last lines of a file respectively. To see the last ten commands entered:
head works similarly. The tail command has a handy feature to continuously show the last n lines of a file that changes all the time. This -f option is often used by system administrators to check on log files. More information is located in the system documentation files. 3.3.5. Linking files3.3.5.1. Link typesSince we know more about files and their representation in the file system, understanding links (or shortcuts) is a piece of cake. A link is nothing more than a way of matching two or more file names to the same set of file data. There are two ways to achieve this:
The two link types behave similar, but are not the same, as illustrated in the scheme below: Note that removing the target file for a symbolic link makes the link useless. Each regular file is in principle a hardlink. Hardlinks can not span across partitions, since they refer to inodes, and inode numbers are only unique within a given partition. It may be argued that there is a third kind of link, the user-space link, which is similar to a shortcut in MS Windows. These are files containing meta-data which can only be interpreted by the graphical file manager. To the kernel and the shell these are just normal files. They may end in a .desktop or .lnk suffix; an example can be found in ~/.gnome-desktop:
This example is from a KDE desktop:
Creating this kind of link is easy enough using the features of your graphical environment. Should you need help, your system documentation should be your first resort. In the next section, we will study the creation of UNIX-style symbolic links using the command line. 3.3.5.2. Creating symbolic linksThe symbolic link is particularly interesting for beginning users: they are fairly obvious to see and you don't need to worry about partitions. The command to make links is ln. In order to create symlinks, you need to use the -s option: ln -s targetfile linkname In the example below, user freddy creates a link in a subdirectory of his home directory to a directory on another part of the system:
Symbolic links are always very small files, while hard links have the same size as the original file. The application of symbolic links is widespread. They are often used to save disk space, to make a copy of a file in order to satisfy installation requirements of a new program that expects the file to be in another location, they are used to fix scripts that suddenly have to run in a new environment and can generally save a lot of work. A system admin may decide to move the home directories of the users to a new location, disk2 for instance, but if he wants everything to work like before, like the /etc/passwd file, with a minimum of effort he will create a symlink from /home to the new location /disk2/home. |


