The following program creates two groups of Pthreads, a READ group and a WRITE group, to create an exact copy of a source file passed as a command-line argument.
The program is invoked as follows:
copy <total-reader-threads> <total-writer-threads> <input-file> <output-file> <buffer-size> <reader-log> <writer-log>
<total-reader-threads> | is the number of READ threads to create. There should be at least 1. |
<total-writer-threads> | is the number of WRITE threads to create. There should be at least 1. |
<input-file> | is the pathname of the file to be copied. It should exist and be readable. |
<output-file> | is the name to be given to the copy. If a file with that name already exists, it should be overwritten. |
<buffer-size> | is the capacity of the shared buffer, measured in BufferItems. This should be at least 1. |
<reader-log> | is the file to which the READ threads write trace information. If a file with that name already exists, it should be overwritten. |
<writer-log> | is the file to which the WRITE threads write trace information. If a file with that name already exists, it should be overwritten. |
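The constraints in the table above can be checked before any thread is created. The following is a minimal sketch, not the required implementation: the names CopyConfig, parse_positive and parse_copy_args are hypothetical, and the upper bound of INT_MAX on the numeric arguments is an assumption made here for convenience. Checking that the input file exists and is readable is left to the later open() call.

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical container for the seven command-line arguments. */
typedef struct {
    long readers, writers, buffer_size;
    const char *input, *output, *reader_log, *writer_log;
} CopyConfig;

/* Parse a decimal integer that must be at least 1; return -1 on any error. */
static long parse_positive(const char *s)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < 1 || v > INT_MAX)
        return -1;
    return v;
}

/* Fill cfg from argv. Returns 0 on success, -1 if the arguments are invalid. */
int parse_copy_args(int argc, char *argv[], CopyConfig *cfg)
{
    assert(cfg != NULL);
    if (argc != 8)                      /* program name + 7 arguments */
        return -1;
    cfg->readers     = parse_positive(argv[1]);
    cfg->writers     = parse_positive(argv[2]);
    cfg->input       = argv[3];
    cfg->output      = argv[4];
    cfg->buffer_size = parse_positive(argv[5]);
    cfg->reader_log  = argv[6];
    cfg->writer_log  = argv[7];
    if (cfg->readers < 0 || cfg->writers < 0 || cfg->buffer_size < 0)
        return -1;
    return 0;
}
```

A main() function could call parse_copy_args(argc, argv, &cfg) first and print the usage line shown above when it returns -1.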
The original main thread is not part of either group. The main() function should open the source file, create and initialize a circular buffer, and create all READ and WRITE threads. The main thread then waits for all these threads to finish. All READ and WRITE threads share the circular buffer. Each buffer slot stores two pieces of information: one data byte read from the source file and its offset in the source file.
typedef struct { char data; off_t offset; } BufferItem;
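One possible layout for the shared circular buffer built on BufferItem is sketched below. The CircularBuffer type and the buffer_* function names are hypothetical. The non-blocking "try" design, where a call returns -1 when the buffer is full or empty and the caller retries after its next sleep, is an assumption chosen to stay within the mutex functions; condition variables or semaphores would be an equally valid way to wait.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct { char data; off_t offset; } BufferItem;

/* Hypothetical circular buffer guarded by a single mutex. */
typedef struct {
    BufferItem *slots;
    int capacity;           /* number of slots, at least 1 */
    int head;               /* index of the oldest filled slot */
    int tail;               /* index of the next empty slot */
    int count;              /* number of filled slots */
    pthread_mutex_t lock;
} CircularBuffer;

/* Returns 0 on success, -1 on failure. */
int buffer_init(CircularBuffer *b, int capacity)
{
    assert(capacity >= 1);
    b->slots = malloc((size_t)capacity * sizeof *b->slots);
    if (b->slots == NULL)
        return -1;
    b->capacity = capacity;
    b->head = b->tail = b->count = 0;
    if (pthread_mutex_init(&b->lock, NULL) != 0) {
        free(b->slots);
        return -1;
    }
    return 0;
}

void buffer_destroy(CircularBuffer *b)
{
    pthread_mutex_destroy(&b->lock);
    free(b->slots);
}

/* Store item in the next empty slot. Returns the slot index, or -1 if full. */
int buffer_try_put(CircularBuffer *b, BufferItem item)
{
    int index = -1;
    pthread_mutex_lock(&b->lock);
    if (b->count < b->capacity) {
        index = b->tail;
        b->slots[index] = item;
        b->tail = (b->tail + 1) % b->capacity;
        b->count++;
    }
    pthread_mutex_unlock(&b->lock);
    return index;
}

/* Remove the oldest item into *out. Returns the slot index, or -1 if empty. */
int buffer_try_get(CircularBuffer *b, BufferItem *out)
{
    int index = -1;
    pthread_mutex_lock(&b->lock);
    if (b->count > 0) {
        index = b->head;
        *out = b->slots[index];
        b->head = (b->head + 1) % b->capacity;
        b->count--;
    }
    pthread_mutex_unlock(&b->lock);
    return index;
}
```

Returning the slot index from both operations is convenient because that index is exactly the value each thread later records in its log line.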
Upon being created, each READ thread sleeps for a random time between 0 and 0.01 seconds. It then reads the next single byte from the file and saves that byte and its offset in the file to the next available empty slot in the circular buffer. The READ thread then sleeps again for a random time between 0 and 0.01 seconds and goes back to read the next byte, repeating until the end of the file.
Similarly, upon being created, each WRITE thread sleeps for a random time between 0 and 0.01 seconds. It then reads a byte and its offset from the next available nonempty buffer slot and writes the byte to that offset in the target file. It then sleeps again for a random time between 0 and 0.01 seconds and goes back to copy the next byte, repeating until nothing is left.
For debugging purposes, each thread writes trace information to one of two log files (one for the READ group, one for the WRITE group), so we can better trace the execution of the program.
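The random pause between 0 and 0.01 seconds could be implemented along the following lines. This is a sketch under stated assumptions: random_delay_ns and random_sleep are hypothetical names, rand_r() with a per-thread seed is one choice of random source among several, and nanosleep() is assumed to be acceptable for sub-second sleeps.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose rand_r() and nanosleep() */
#endif
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define MAX_DELAY_NS 10000000L    /* 0.01 seconds in nanoseconds */

/* Pick a delay in [0, MAX_DELAY_NS] using a caller-owned seed, so that
   each thread can keep its own random state. */
long random_delay_ns(unsigned int *seed)
{
    assert(seed != NULL);
    return (long)((double)rand_r(seed) / RAND_MAX * MAX_DELAY_NS);
}

/* Sleep for a random time between 0 and 0.01 seconds.
   An interrupted sleep is simply cut short in this sketch. */
void random_sleep(unsigned int *seed)
{
    struct timespec ts = { .tv_sec = 0, .tv_nsec = random_delay_ns(seed) };
    nanosleep(&ts, NULL);
}
```

Each thread might seed its own state from its thread number and the current time, for example with seed = (unsigned int)time(NULL) + number, so that threads do not all sleep for the same durations.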
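Because every buffer item carries its own offset, a WRITE thread can place its byte in the target file no matter in which order the items are consumed. The sketch below uses pwrite(), which is a choice made here and is not named in this description; calling lseek() followed by write() inside a critical section would be an alternative. The names open_output and write_item are hypothetical, and the permission bits 0644 are an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* expose pwrite() */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct { char data; off_t offset; } BufferItem;

/* Open the target file, creating it or overwriting an existing one.
   Returns a file descriptor, or -1 on error. */
int open_output(const char *path)
{
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
}

/* Write item->data at item->offset without moving the shared file position.
   Returns 0 on success, -1 on error. */
int write_item(int fd, const BufferItem *item)
{
    assert(item != NULL && item->offset >= 0);
    ssize_t n;
    do {
        n = pwrite(fd, &item->data, 1, item->offset);
    } while (n == -1 && errno == EINTR);   /* retry if interrupted by a signal */
    return n == 1 ? 0 : -1;
}
```

Since pwrite() takes the offset as an argument, several WRITE threads can share one file descriptor without racing on its current position.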
Since all threads access common data, synchronization is required. The following pthread API functions are used:
pthread_attr_init | The pthread_attr_init() function initializes the thread attributes object pointed to by attr with default attribute values. After this call, individual attributes of the object can be set using various related functions, and then the object can be used in one or more pthread_create() calls that create threads. |
pthread_create | The pthread_create() function starts a new thread in the calling process. The new thread starts execution by invoking start_routine(); arg is passed as the sole argument of start_routine(). |
pthread_exit | The pthread_exit() function terminates the calling thread and returns a value via retval that (if the thread is joinable) is available to another thread in the same process that calls pthread_join(3). |
pthread_join | The pthread_join() function waits for the thread specified by thread to terminate. If that thread has already terminated, then pthread_join() returns immediately. The thread specified by thread must be joinable. |
pthread_mutex_destroy | The pthread_mutex_destroy() function shall destroy the mutex object referenced by mutex; the mutex object becomes, in effect, uninitialized. An implementation may cause pthread_mutex_destroy() to set the object referenced by mutex to an invalid value. |
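pthread_mutex_init | The pthread_mutex_init() function shall initialize the mutex referenced by mutex with the attributes specified by attr. If attr is NULL, the default mutex attributes are used. |
Both READ and WRITE threads contain critical sections: code that accesses the shared buffer, the file position, or a log file must run while holding a mutex.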
The two log files have no part in the file copying, but they are used to trace the execution of the program.
Each of the READ threads should be given a different number in the range 0 … <total-reader-threads>-1. Each of the WRITE threads should be given a different number in the range 0 … <total-writer-threads>-1. Each thread should know its own number. (This number is different from the thread id.)
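One common way to give each thread its own number is to pass a small per-thread argument structure to pthread_create(). The sketch below illustrates this under stated assumptions: ThreadArg, worker and run_group are hypothetical names, and the visited array exists only so the effect of the numbering can be observed.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread argument: the thread's number within its group. */
typedef struct {
    int number;      /* 0 .. total-1, distinct from the pthread_t id */
    int *visited;    /* demo array with one slot per thread */
} ThreadArg;

static void *worker(void *p)
{
    ThreadArg *arg = p;
    arg->visited[arg->number] = 1;   /* each thread touches only its own slot */
    return NULL;                     /* same effect as pthread_exit(NULL) */
}

/* Create `total` threads numbered 0..total-1 and wait for all of them.
   Returns 0 on success, -1 on failure. */
int run_group(int total, int *visited)
{
    assert(total >= 1);
    pthread_t *tids = malloc((size_t)total * sizeof *tids);
    ThreadArg *args = malloc((size_t)total * sizeof *args);
    if (tids == NULL || args == NULL) {
        free(tids);
        free(args);
        return -1;
    }
    pthread_attr_t attr;
    pthread_attr_init(&attr);        /* default attributes: joinable threads */
    int created = 0, status = 0;
    for (; created < total; created++) {
        args[created].number = created;
        args[created].visited = visited;
        if (pthread_create(&tids[created], &attr, worker, &args[created]) != 0) {
            status = -1;
            break;
        }
    }
    pthread_attr_destroy(&attr);
    for (int i = 0; i < created; i++)
        pthread_join(tids[i], NULL); /* wait only for threads that started */
    free(tids);
    free(args);
    return status;
}
```

Each thread receives a pointer to its own array element, so the number stays valid for the lifetime of the thread; passing the address of the loop counter instead would let threads observe a value that has already changed.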
When a READ thread reads the next unread byte from the file, it can obtain the offset using the lseek system call. When the READ thread saves the byte and its offset to the buffer, it writes to a particular index in the buffer. Each time READ thread number n reads a byte from offset x in the file and writes it to index i in the buffer, it should write the line n x i to the <reader-log> file. More exactly, it should write its thread number, followed by a single blank, followed by the offset in the file, followed by a single blank, followed by the index in the buffer, followed by a newline character ‘\n’.
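The offset lookup and the log line format described above might be sketched as follows. The names read_next_byte, format_log_line and file_lock are hypothetical. The statically initialized mutex is used here only for brevity, and a mutex set up with pthread_mutex_init() would work the same way. The lock keeps the lseek() and read() calls together so that the recorded offset really belongs to the byte that was read.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct { char data; off_t offset; } BufferItem;

/* Hypothetical lock protecting the shared position of the source file. */
static pthread_mutex_t file_lock = PTHREAD_MUTEX_INITIALIZER;

/* Read the next unread byte from fd and record its offset.
   Returns 1 if a byte was read, 0 at end of file, -1 on error. */
int read_next_byte(int fd, BufferItem *item)
{
    assert(item != NULL);
    pthread_mutex_lock(&file_lock);
    off_t offset = lseek(fd, 0, SEEK_CUR);          /* current position */
    ssize_t n = (offset == (off_t)-1) ? -1 : read(fd, &item->data, 1);
    pthread_mutex_unlock(&file_lock);
    if (n == 1)
        item->offset = offset;
    return (int)n;
}

/* Format one log line: thread number, blank, offset, blank, buffer index,
   newline. Returns the number of characters written, as snprintf() does. */
int format_log_line(char *out, size_t size, int n, off_t x, int i)
{
    return snprintf(out, size, "%d %lld %d\n", n, (long long)x, i);
}
```

For thread number 2, offset 5 and buffer index 1, format_log_line produces the line "2 5 1" followed by a newline, which can then be written to the log with fputs(). In a complete program the log write would likely need to happen inside the same critical section as the buffer insertion, so that the order of log lines matches the order of buffer operations.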
Similarly, each WRITE thread also writes n x i to the <writer-log> file. More exactly, it writes its thread number, followed by a single blank, followed by the offset in the file where it writes its byte, followed by a single blank, followed by the index in the buffer where it read its byte, followed by a newline character ‘\n’.