Victor Diaz | Multithreading - File Copy ~ Victor Diaz

Project Name

Multithreading – File Copy

Project Goal

To create an exact copy of a source file, passed as a command-line argument, using multithreaded programming.

Details

Overview

The program creates two groups of Pthreads, a READ group and a WRITE group, which together produce an exact copy of the source file named on the command line.

The program is invoked as follows:

copy <total-reader-threads> <total-writer-threads> <input-file> <output-file> <buffer-size> <reader-log> <writer-log>

<total-reader-threads>	is the number of READ threads to create. There should be at least 1.
<total-writer-threads>	is the number of WRITE threads to create. There should be at least 1.
<input-file>	is the pathname of the file to be copied. It should exist and be readable.
<output-file>	is the name to be given to the copy. If a file with that name already exists, it should be overwritten.
<buffer-size>	is the capacity of the shared buffer, counted in BufferItem entries. This should be at least 1.
<reader-log>	is the file to which the READ threads write trace information. If a file with that name already exists, it should be overwritten.
<writer-log>	is the file to which the WRITE threads write trace information. If a file with that name already exists, it should be overwritten.
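
The argument list above implies a few checks before any thread is started: exactly seven arguments, and a value of at least 1 for both thread counts and for the buffer size. A minimal sketch of that validation follows. The names CopyOptions, parse_positive, and parse_options are hypothetical and are not taken from the project's source.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a decimal integer that must be >= 1.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
static int parse_positive(const char *text, long *out)
{
    char *end = NULL;
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno != 0 || end == text || *end != '\0' || value < 1)
        return -1;
    *out = value;
    return 0;
}

/* Hypothetical container for the seven command-line arguments. */
typedef struct {
    long readers;
    long writers;
    const char *input_path;
    const char *output_path;
    long buffer_size;
    const char *reader_log_path;
    const char *writer_log_path;
} CopyOptions;

/* Fills *opts from argv. Returns 0 on success, -1 on a usage error. */
static int parse_options(int argc, char *argv[], CopyOptions *opts)
{
    if (argc != 8)  /* program name plus seven arguments */
        return -1;
    if (parse_positive(argv[1], &opts->readers) != 0 ||
        parse_positive(argv[2], &opts->writers) != 0 ||
        parse_positive(argv[5], &opts->buffer_size) != 0)
        return -1;
    opts->input_path = argv[3];
    opts->output_path = argv[4];
    opts->reader_log_path = argv[6];
    opts->writer_log_path = argv[7];
    return 0;
}
```

A main() function would call parse_options(argc, argv, &opts) first and print a usage message if it returns -1. Whether the input file exists and is readable is left to the later open() call in this sketch.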

Documentation

The original main thread is not part of either group. The main() function should open the source file, create and initialize a circular buffer, and create all READ and WRITE threads. The main thread then waits for all of these threads to finish. All READ and WRITE threads share the circular buffer. Each buffer slot stores two pieces of information: one data byte read from the source file and the offset of that byte in the source file.

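#include <sys/types.h>  /* off_t */

typedef struct {
    char data;      /* one byte read from the source file */
    off_t offset;   /* position of that byte in the source file */
} BufferItem;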
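
The shared circular buffer described above can be sketched as a fixed array of BufferItem slots plus head, tail, and count fields, all guarded by one mutex. The sketch below is an assumption rather than the project's actual code: the RingBuffer type and the ring_* functions are hypothetical names, and the non-blocking try_put/try_get design (the caller retries after its sleep) is chosen because only mutex functions are listed in this document. Condition variables or semaphores would be a common alternative.

```c
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>

typedef struct {
    char data;
    off_t offset;
} BufferItem;

/* Hypothetical ring buffer shared by all READ and WRITE threads. */
typedef struct {
    BufferItem *slots;
    size_t capacity;
    size_t head;    /* index of the oldest filled slot */
    size_t tail;    /* index of the next empty slot */
    size_t count;   /* number of filled slots */
    pthread_mutex_t lock;
} RingBuffer;

static int ring_init(RingBuffer *rb, size_t capacity)
{
    rb->slots = malloc(capacity * sizeof *rb->slots);
    if (rb->slots == NULL)
        return -1;
    rb->capacity = capacity;
    rb->head = rb->tail = rb->count = 0;
    if (pthread_mutex_init(&rb->lock, NULL) != 0) {
        free(rb->slots);
        return -1;
    }
    return 0;
}

/* Stores item in the next empty slot.
 * Returns the slot index used, or -1 if the buffer is full. */
static long ring_try_put(RingBuffer *rb, BufferItem item)
{
    long index = -1;
    pthread_mutex_lock(&rb->lock);
    if (rb->count < rb->capacity) {
        index = (long)rb->tail;
        rb->slots[rb->tail] = item;
        rb->tail = (rb->tail + 1) % rb->capacity;
        rb->count++;
    }
    pthread_mutex_unlock(&rb->lock);
    return index;
}

/* Removes the oldest item into *out.
 * Returns the slot index it came from, or -1 if the buffer is empty. */
static long ring_try_get(RingBuffer *rb, BufferItem *out)
{
    long index = -1;
    pthread_mutex_lock(&rb->lock);
    if (rb->count > 0) {
        index = (long)rb->head;
        *out = rb->slots[rb->head];
        rb->head = (rb->head + 1) % rb->capacity;
        rb->count--;
    }
    pthread_mutex_unlock(&rb->lock);
    return index;
}

static void ring_destroy(RingBuffer *rb)
{
    pthread_mutex_destroy(&rb->lock);
    free(rb->slots);
}
```

Returning the slot index from both operations is a deliberate choice here: it gives each thread the buffer index it needs for its log line.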

Upon being created, each READ thread sleeps for a random time between 0 and 0.01 seconds. It then reads the next single byte from the file and saves that byte, together with its offset in the file, to the next available empty slot in the circular buffer. The READ thread then sleeps again for a random time between 0 and 0.01 seconds and goes back to read the next byte, repeating until the end of the file.
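
One step of the READ loop described above can be sketched as a random sleep followed by a read of one byte and its offset. This is a sketch under assumptions: random_nap and read_next_byte are hypothetical names, rand_r() with a per-thread seed is one possible source of randomness, and the file_lock mutex is assumed so that the lseek() and read() calls of different READ threads on the shared descriptor do not interleave.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <pthread.h>
#include <stdlib.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Sleeps for a pseudo-random time of at most 0.01 seconds.
 * seed is per-thread state for rand_r(). */
static void random_nap(unsigned int *seed)
{
    struct timespec ts;
    ts.tv_sec = 0;
    ts.tv_nsec = (long)(rand_r(seed) % 10000001);  /* 0 .. 10,000,000 ns */
    nanosleep(&ts, NULL);
}

/* Reads the next unread byte and the offset it was read from.
 * Returns 1 on success, 0 at end of file, -1 on error. */
static int read_next_byte(int fd, pthread_mutex_t *file_lock,
                          char *byte, off_t *offset)
{
    pthread_mutex_lock(file_lock);
    off_t where = lseek(fd, 0, SEEK_CUR);           /* current position */
    ssize_t n = (where == (off_t)-1) ? -1 : read(fd, byte, 1);
    pthread_mutex_unlock(file_lock);

    if (n == 1) {
        *offset = where;
        return 1;
    }
    return (n == 0) ? 0 : -1;
}
```

A READ thread would call random_nap(), then read_next_byte(), then store the result in the shared buffer, and repeat until read_next_byte() returns 0.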

Similarly, upon being created, each WRITE thread sleeps for a random time between 0 and 0.01 seconds. It then reads a byte and its offset from the next available nonempty buffer slot and writes the byte to that offset in the target file. It then sleeps again for a random time between 0 and 0.01 seconds and goes back to copy the next byte, repeating until nothing is left.
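
Writing a byte "to that offset" can be sketched with the POSIX pwrite() call, which writes at an explicit offset without changing the descriptor's file position. Using pwrite() is an assumption made for this sketch; the project may instead use lseek() followed by write() inside a critical section. The name write_byte_at is hypothetical.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes one byte at the given offset of the target file.
 * Returns 0 on success, -1 on error. */
static int write_byte_at(int fd, char byte, off_t offset)
{
    ssize_t n;
    do {
        n = pwrite(fd, &byte, 1, offset);
    } while (n == -1 && errno == EINTR);  /* retry if interrupted by a signal */
    return (n == 1) ? 0 : -1;
}
```

Because each byte carries its own offset, WRITE threads can store bytes in any order and the target file still ends up identical to the source.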

For debugging purposes, each thread writes trace information to one of two log files (one for the READ threads and one for the WRITE threads), so that the execution of the program can be traced.

Since all threads access common data, synchronization is required. The following pthread functions are used:

pthread_attr_init	The pthread_attr_init() function initializes the thread attributes object pointed to by attr with default attribute values. After this call, individual attributes of the object can be set using various related functions, and then the object can be used in one or more pthread_create(3) calls that create threads.
pthread_create	The pthread_create() function starts a new thread in the calling process. The new thread starts execution by invoking start_routine(); arg is passed as the sole argument of start_routine().
pthread_exit	The pthread_exit() function terminates the calling thread and returns a value via retval that (if the thread is joinable) is available to another thread in the same process that calls pthread_join(3).
pthread_join	The pthread_join() function waits for the thread specified by thread to terminate. If that thread has already terminated, then pthread_join() returns immediately. The thread specified by thread must be joinable.
pthread_mutex_destroy	The pthread_mutex_destroy() function shall destroy the mutex object referenced by mutex; the mutex object becomes, in effect, uninitialized. An implementation may cause pthread_mutex_destroy() to set the object referenced by mutex to an invalid value.
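pthread_mutex_init	The pthread_mutex_init() function shall initialize the mutex referenced by mutex with the attributes specified by attr. If attr is NULL, the default mutex attributes are used.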

Both the READ and the WRITE threads execute their accesses to shared data inside critical sections guarded by a mutex.
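
The functions listed above fit together roughly as follows. This is a minimal sketch, not the project's code: several threads increment a shared counter inside a mutex-protected critical section, and the creating thread joins them before destroying the mutex. The names SharedCounter, increment_worker, and run_workers, and the two constants, are hypothetical.

```c
#include <pthread.h>

#define MAX_WORKERS 16              /* hypothetical upper bound for this demo */
#define INCREMENTS_PER_WORKER 1000

typedef struct {
    pthread_mutex_t lock;
    long counter;
} SharedCounter;

static void *increment_worker(void *arg)
{
    SharedCounter *shared = arg;
    for (int i = 0; i < INCREMENTS_PER_WORKER; i++) {
        pthread_mutex_lock(&shared->lock);    /* enter critical section */
        shared->counter++;
        pthread_mutex_unlock(&shared->lock);  /* leave critical section */
    }
    pthread_exit(NULL);
    return NULL;  /* not reached */
}

/* Starts count workers, waits for them, and returns the final counter
 * value, or -1 if count is out of range or a thread could not be created. */
static long run_workers(int count)
{
    SharedCounter shared;
    pthread_t tids[MAX_WORKERS];
    pthread_attr_t attr;
    int started = 0;

    if (count < 1 || count > MAX_WORKERS)
        return -1;
    shared.counter = 0;
    if (pthread_mutex_init(&shared.lock, NULL) != 0)
        return -1;
    if (pthread_attr_init(&attr) != 0) {
        pthread_mutex_destroy(&shared.lock);
        return -1;
    }
    while (started < count &&
           pthread_create(&tids[started], &attr, increment_worker, &shared) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    pthread_attr_destroy(&attr);
    pthread_mutex_destroy(&shared.lock);
    return (started == count) ? shared.counter : -1;
}
```

Without the lock and unlock calls around the increment, updates from different threads could be lost, which is the same hazard the READ and WRITE threads face when they update the buffer indices.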

Log files

The two log files have no part in the file copying, but they are used to trace the execution of the program.

Each of the READ threads should be given a different number in the range 0 … <total-reader-threads>-1. Each of the WRITE threads should be given a different number in the range 0 … <total-writer-threads>-1. Each thread should know its own number. (This number is different from the thread id.)
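
One way to give each thread its own number is to pass a small per-thread structure as the argument of pthread_create(). The sketch below assumes that approach; ThreadArg, numbered_thread, and spawn_numbered are hypothetical names, and the seen array exists only so the demo can show that every number was handed out exactly once.

```c
#include <pthread.h>
#include <stdlib.h>

typedef struct {
    int number;   /* logical thread number, 0 .. total-1 (not the pthread_t id) */
    int *seen;    /* demo only: the thread sets seen[number] to 1 */
} ThreadArg;

static void *numbered_thread(void *p)
{
    ThreadArg *arg = p;
    arg->seen[arg->number] = 1;  /* each thread touches only its own element */
    return NULL;
}

/* Creates total threads numbered 0 .. total-1 and waits for them.
 * The caller is expected to pass total >= 1.
 * Returns 0 if every thread was created, -1 otherwise. */
static int spawn_numbered(int total, int *seen)
{
    pthread_t *tids = malloc((size_t)total * sizeof *tids);
    ThreadArg *args = malloc((size_t)total * sizeof *args);
    int started = 0;

    if (tids != NULL && args != NULL) {
        for (; started < total; started++) {
            args[started].number = started;
            args[started].seen = seen;
            if (pthread_create(&tids[started], NULL,
                               numbered_thread, &args[started]) != 0)
                break;
        }
        for (int i = 0; i < started; i++)
            pthread_join(tids[i], NULL);
    }
    free(tids);
    free(args);
    return (started == total) ? 0 : -1;
}
```

Each thread receives a pointer to its own ThreadArg element, so the argument stays valid and unchanged for the lifetime of the thread. Passing the address of a single loop variable instead would let later iterations overwrite the number before the thread reads it.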

When a READ thread reads the next unread byte from the file, it can obtain the offset using the lseek system call. When the READ thread saves the byte and its offset to the buffer, it writes to a particular index in the buffer. Each time READ thread number n reads a byte from offset x in the file and writes it to index i in the buffer, it should write the line n x i to the <reader-log> file. More exactly, it should write its thread number, followed by a single blank, followed by the offset in the file, followed by a single blank, followed by the index in the buffer, followed by a newline character ‘\n’.

Similarly, each WRITE thread also writes n x i to the <writer-log> file. More exactly, it writes its thread number, followed by a single blank, followed by the offset in the file where it writes its byte, followed by a single blank, followed by the index in the buffer where it read its byte, followed by a newline character ‘\n’.
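
The n x i line format described above can be produced with a single format string. A minimal sketch follows; format_log_line and write_log_line are hypothetical names, and the caller is assumed to serialize access to the log stream if the project requires it.

```c
#include <stdio.h>
#include <sys/types.h>

/* Formats "n x i\n": thread number, file offset, buffer index.
 * Returns the line length, or -1 if the output buffer is too small. */
static int format_log_line(char *out, size_t size,
                           int number, off_t offset, size_t index)
{
    /* off_t has no dedicated printf conversion, so cast to long long. */
    int n = snprintf(out, size, "%d %lld %zu\n",
                     number, (long long)offset, index);
    return (n >= 0 && (size_t)n < size) ? n : -1;
}

/* Appends one trace line to an open log stream.
 * Returns 0 on success, -1 on error. */
static int write_log_line(FILE *log, int number, off_t offset, size_t index)
{
    char line[64];
    if (format_log_line(line, sizeof line, number, offset, index) < 0)
        return -1;
    return (fputs(line, log) == EOF) ? -1 : 0;
}
```

For example, thread number 3 handling the byte at offset 42 through buffer index 7 yields the line "3 42 7" followed by a newline.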

Development

API

References

The C Programming Language, 2nd Edition. Brian W. Kernighan, Dennis Ritchie.
Beej’s Guide to Unix IPC (also available as PDF). Brian “Beej Jorgensen” Hall.
Operating System Concepts, 8th Edition. Abraham Silberschatz, Peter B. Galvin, Greg Gagne.
Linux man pages. Michael Kerrisk.
