|
|
...making Linux just a little more fun! Inter-Process Communication - Part II
I hope everyone enjoyed Part I of this article, where we had an introductory discussion of IPC mechanisms and then continued with detailed descriptions of mechanisms such as pipes, fifos and shared memory. We shall now continue with the other IPC mechanisms such as memory mapping, semaphores, message queues, and sockets. Memory MappingWe have seen that fifos are basically named pipes. Similarly, mapped memory is like shared memory with a name. With this little bit of information, we can conclude that there is some sort of relationship between files in the file system and memory mapping techniques - and, in fact, there is. Simply put, the memory mapping mechanism deals with the mapping of a file in the file system to a portion of memory. Say, for example, we have a file that contains some data in a specific format that the user knows. That file may be required by some application. We will have to write some parsing function for the application that will search specific data in the file. To be more specific, we will have the parsing function return some data to the application whenever any requests for specific data come from the application. Then, our parsing function will have to access the file for every request. This is only acceptable if the application makes requests for data a few times. But this is time-consuming when the application sends frequent requests for data and processing. One may wonder what makes it time-consuming. Let us look at the operations involved in reading data from a file in the file system in the hard drive. The major operations involved are as follows:
The two operations mentioned above consume considerable system services as compared to simply reading the data from memory. A program working in user space, when invoking a system call, transfers control from the user program to the kernel, which places it in the kernel space. That is, kernel space is consumed in the execution of the system calls. Also, accessing the disk (hard drive) frequently will be comparatively slow, since it requires I/O operations on a physical device. The overhead caused by frequently invoking system calls and accessing disk to obtain data can be avoided by the memory mapping technique, which involves logically associating the file with a portion of the virtual memory or the program's user space. Any file I/O operations can then be replaced with simple memory access. Linux provides system calls such as mmap and munmap for the purpose of using the memory mapping mechanism, which allows the sharing of mapped files. We have to associate the file in such a manner that the mapping is accessible to any process. We may have an application or a program running several processes, and one or more of the processes may access data from one or more files. So if we have a mapping for a file with sharing rights for any process, we can use the same mapping in all processes. See the manual page for mmap for more details. Again, synchronization problems can arise. Will the reads happen before or after the writes? An entirely unrelated process might have memory-mapped the file and might be modifying it, unknown to us, at the same time as our process. A number of problems may come from the design of the application and the usage of the mechanism as well. So, let's take a look at the proper usage of this mechanism. (See listing1.c.txt)
The logic behind the program is simple. The code first creates a temporary
file with name test_file and writes the string
HELLO_STR (defined as macro) into the file. We then map
the file test_file to memory using mmap. The
format for this call, as described in the manual page for
mmap is So, just how much benefit do we gain from memory mapping versus reading a file? We can demonstrate the gains by comparing this program against the use of the open, read and close calls that we mentioned above. (See listing2.c.txt) This program gives the output in the same format as the previous one. One can compare the performance of both the programs by running their respective executables with the time command, that is, if test1 is the executable for listing1.c then run the program as time ./test1. Do the same for listing2.c. Note the difference between the performance of the two programs. My Pentium III machine with a processor speed of 733MHz and a 128MB SD RAM showed following results: #Time for listing1.c real 0m20.360s user 0m1.600s sys 0m18.670s and #Time for listing2.c real 0m24.325s user 0m1.990s sys 0m22.300s It is obvious that the results may vary from machine to machine. It is also possible that a machine with a similar configuration like that of my test machine gives different results. One can also do the testing with different values of NTHLOOP, which will help you to get a better understanding of the difference in performance as the number of accesses to a file increases. What I have shown here is a very simple scenario. In practical cases, we will often deal with files that are larger in size and have more and more complex data. The efficiency of the mapping method becomes much more apparent as files sizes go up and the complexity of parsing increases. One thing to note is that when the process terminates the mapped region is automatically unmapped. Closing the file descriptor does not unmap the region automatically. To unmap a memory region we can use the system call munmap. The address and the offset are the two arguments needed to unmap a memory region. Experimenting with the third and the fourth arguments to mmap can be very useful, and will promote a deeper understanding of where the use of memory mapping mechanism will be most effective. Message QueuesMessage queues are one of the three different types of System V IPC mechanisms. This mechanism enables processes to send information to each other asynchronously. The word asynchronous in the present context signifies that the sender process continues with its execution without waiting for the receiver to receive or acknowledge the information. On the other side, the receiver does not wait if no messages are there in the queue. The queue being referred to here is the queue implemented and maintained by the kernel. Let us now take a look at the system calls associated with this mechanism.
Let's take a look at an example of using message queues. (See send.c.txt) First, we create a message queue using msgget with the flag set to IPC_CREAT and permissions set to read/write. Once we get the message queue identifier, we send the messages that in the above code are read from standard input. Note the output of the command ipcs before and after running the above program. Try to interpret the output. One command that will be of use in doing this is ipcrm; see its man page for more details. The code for receiving the sent messages is given below. (See recv.c.txt) Note that here we are using msgget without the field IPC_CREAT. This is because we have not destroyed the message queue associated with the key 9999. This can be seen in send.c. When we call msgrcv, we get the messages that were typed into the send program. This is confirmed by writing the received messages to the standard output. Experimenting with message queues can be fun, but you have to realize their limitations. Only repeated experimentation with this mechanism will enable one to actually learn it. Using message queues in applications has to be done only after careful consideration. Before thinking more on this, let us move on to the last of the System V IPC mechanisms, the Semaphores. SemaphoresA semaphore is best defined as a synchronization variable among processes communicating with each other. The processes may communicate data among themselves, or it may be communicated by a shared data object. The application should then see that the shared data object is not being written simultaneously by two or more process. A simple solution will be to have a locking mechanism that will prevent simultaneous changes in the shared data object. To be more precise, if we have a semaphore variable - say, S - then give a process access to the shared data object if S is not set. Once access is granted, S is set, and is reset once operations on the shared data object are over. The setting and resetting of the semaphore variable are commonly known as wait and post operations. The pseudo code for the wait and post operations are given below:
wait(S) {
while(S <= 0)
; /* some operations. */
S--;
}
post(S) {
S++;
}
Linux provides three system calls that enable the use of the semaphore mechanism.
If you take a look at the 'keyboard-hit' program that I used in my previous article to explain the use of pipes, fifos and shared memory mechanisms, the code above is essentially the same. We accept input from the keyboard and check whether the input is a digit between 0 and 9 inclusive; however, we now have two processes trying to take inputs from the keyboard. Since this can lead to conflict, we keep a lock, so that when one process is active, the other one waits. First, we register a signal SIGINT, so that whenever the signal is raised, the signal handler process_exit is executed, bringing about the termination of the process for which the signal is registered. We then have init_semaphore, which creates a semaphore set with NUM members in the set (here it is 1). On successful creation of the semaphore set, semget returns a semaphore set identifier, which will be used for further operations on the semaphore set. We then invoke set_sem_val. Initially, this may seem useless, but it can actually be used to confirm that you can run your program with the code above and with the code that does not call set_sem_val. The manual page for semctl gives a brief description on the union semun. The union as defined is
union semun {
int val; /* value for SETVAL */
struct semid_ds *buf; /* buffer for IPC_STAT, IPC_SET */
unsigned short int *array; /* array for GETALL, SETALL */
struct seminfo *__buf; /* buffer for IPC_INFO */
};
We set the field val in the union to 1. This signifies that access to the critical resource can be provided only once (or one process). It the value is set as 2, then only two process can hold access to the shared resource at a time. Next, we initialize two processes that take the same action. We fork it in such a manner that only the child processes run. If semaphore operation for the lock is successful, then we invoke critical_resource; otherwise we try again (much like an infinite loop). If a lock is successful, then that process is provided access to critical_resource. Once it comes out of the critical_resource, we remove the lock. Note that we have a sleep(1) in the process, which enables the scheduling of the two process to provide enough time slices to both the process. Confusing, right? Try running the code without that sleep; the results are educational. The critical_resource simply reads keyboard characters other than '\n' and when a digit between 0 and 9 is read, it asks the user to raise the keyboard interrupt. This may look as simple as the concept explained in almost all textbooks on Operating Systems. But this is because we are dealing with only one member in the semaphore set. The complexity increases as the number of semaphores increases. One may also wonder, why not use a simple flag kept in shared memory instead? That would be fine if we were dealing with programs as simple as the one given as example here, but would restrict the usage and features that are available with semaphores. I suggest experimenting with this mechanism, with more complex programs where you deal with more than just one semaphore in the set. SocketsSockets are the most popular of all the IPC mechanisms for a simple reason: other mechanisms do not support communication between two machines. Sockets provide a full-duplex communication channel between processes that do not necessarily run on the same machine.Generally, sockets are associated with the concept of network communication in the form of client-server programming; a pair of processes of which one will be a client and one a server. The client process will send requests to the server, and the server process will send replies as an acknowledgment to the client requests. Of course, when creating a socket, we have to specify the type of communication that will be needed, since different modes of communication requires different protocols. The different system calls associated with this form of IPC mechanism are given below:
The structure of the two programs client.c and server.c are similar; both programs initially create sockets. The sockets are created using the AF_INET family of protocols, which is same as PF_INET. The field SOCK_STREAM indicates that it will be a sequenced, reliable, full-duplex and connection-based socket. The protocol specified during creation of socket is '0'. This can also be IPPROTO_TCP for tcp sockets and IPPROTO_UDP for udp sockets. In the case of the server, we name the sockets using the bind system call. We have a structure sockaddr and sockaddr_in. The former specifies the generic socket address, whereas the latter describes the internet socket address. The structure as one can see in the manual page for ip is
struct sockaddr_in {
sa_family_t sin_family; /* address family: AF_INET */
u_int16_t sin_port; /* port in network byte order */
struct in_addr sin_addr; /* internet address */
}
and in_addr is
struct in_addr {
u_int32_t s_addr; /* address in network byte order */
}
Here, the sin_family field is same as the family of protocols being used, that is, AF_INET. The sin_port field is the port number (in network byte order), which will be used by the socket. Finally, sin_addr field is the internet address (again in network byte order), which here I have given as the loopback address. The socket is bound to these details, once a successful call is made on bind. The listen system call specifies that the created socket is willing to accept incoming connections, but with a restriction that there will be a limitation for the number of connections. In our code, the limitation is given as '1', which specifies the queue size for the number of pending connections. An accept call returns the descriptor of the accepted connection. If the server needs to serve n > 1 number of connections, then we have to make repeated calls to accept. After that, we can do the usual file operations (e.g., read and write) on the descriptor returned by the accept system call. Alternate read and write operations will help the server to communicate with the client. Coming to the structure of the client program, once a socket is created, the client tries to connect to the server using the connect system call. This, again, is done after setting the internet socket address before binding the socket to a name. Once the call to connect is successful, the client program can start communicating with the server. Note: obviously, the client program can only connect to a running server. However, for testing purposes, please run the client program without running the server. The example shown here illustrates the use of sockets for loopback communication. The client and server programs may be run on different machines by changing the internet address to the required ones. There are also several other families of protocols which support communication via sockets. Some of the interesting ones are PF_LOCAL for local communication, PF_PACKET for low-level packet interface, PF_NETLINK for communication between the kernel and the user, etc. It's a good idea to try out these options to understand the flexibility and domains (in the application level) where one can use this mechanism. The example above is for connection oriented communication. Exploring the use of sockets in an connectionless-oriented communication is left as an exercise to the readers. ConclusionFinally, I come to an end of my article. I have tried to present the concepts of Inter-Process Communication in a manner which will make it easier for novice programmers to write their own code whenever they come across any of the problems that I have mentioned here. This two-part article will not, of course, make you an expert in this field, but should form a good base for future exploration.
|
I completed my B. Tech in Computer Science & Engineering from a small town
called Trichur, in Kerala, God's Own Country in India. Presently I am
working in Naturesoft Pvt. Ltd, Chennai, India as a Programmer. I spend my
free time reading books on Linux and exploring the same. Also, I have a
good appetite for Physics. My motive in life is to go forward, onward and
upward.