Sunday, August 17, 2008

OS Concepts Part-II


The previous OS Concepts post gave some idea about threads.
To understand threads better, it is important to understand how they differ from processes.
Consider threads of different colors (red, blue, white, purple) coming together to give shape to a hand-knit embroidery.

Each thread has a unique existence, yet all of them co-exist and use the resources provided in the process of carving out a multi-threaded embroidery.

In the world of operating systems, many threads map to one process and share the resources of that single process in a multi-threaded environment.
IPC mechanisms exist for communication between two processes, whereas two threads of the same process can communicate easily because they share the same address space, including the code and data segments. Each thread maintains its own stack, its own PC and SP registers, and some storage space for local variables.
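
The sharing described above can be sketched with POSIX threads. This is only a minimal illustration, assuming a system that provides pthreads; the names shared_counter, add_to_counter and run_two_threads are made up for this example. The global counter lives in the data segment and is visible to both threads, while local_steps lives on each thread's own stack.

```c
#include <pthread.h>
#include <stddef.h>

/* Data segment: one copy, visible to every thread of the process. */
static int shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each thread runs this function on its own stack, so local_steps
 * and i are private to the thread that is executing it. */
static void *add_to_counter(void *arg)
{
    int local_steps = *(int *)arg;
    for (int i = 0; i < local_steps; i++) {
        /* The lock keeps the two threads from updating at the same time. */
        pthread_mutex_lock(&counter_lock);
        shared_counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Starts two threads and returns the counter both of them updated,
 * or -1 if a thread could not be created. */
int run_two_threads(int steps_a, int steps_b)
{
    pthread_t a, b;

    shared_counter = 0;
    if (pthread_create(&a, NULL, add_to_counter, &steps_a) != 0)
        return -1;
    if (pthread_create(&b, NULL, add_to_counter, &steps_b) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

Calling run_two_threads(1000, 2000) from main (compiled with gcc -pthread) should return 3000, because both threads add to the same variable without any IPC.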

Traditional UNIX supports multiple user processes but only one thread per process, whereas Solaris supports multiple threads per process. Threads are often used to achieve a high degree of parallelism cheaply. In an embedded SQL application, each session denotes a thread. The program creates as many sessions as there are threads. This way we can see that each thread opens up a unique point of execution. The examples below attempt to explain this further:

Example 1 : A file server on a LAN

  • It needs to handle several file requests over a short period
  • Hence it is more efficient to create (and destroy) a single thread for each request
  • Multiple threads can execute simultaneously on different processors
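
The thread-per-request idea can be sketched as follows. This is not a real file server: struct file_request, handle_request and serve_requests are hypothetical names, and the handler only marks the request as done where a real one would read a file and send it over the network.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical request record; a real server would carry a socket and a path. */
struct file_request {
    int id;
    int handled;   /* set to 1 by the thread that served the request */
};

/* Thread body: serves exactly one request, then the thread ends. */
static void *handle_request(void *arg)
{
    struct file_request *req = arg;
    /* A real handler would read the file and send it back here. */
    req->handled = 1;
    return NULL;
}

/* Creates one thread per request, waits for all of them and returns
 * how many requests were handled (-1 if memory could not be allocated). */
int serve_requests(struct file_request *reqs, int count)
{
    if (count <= 0)
        return 0;

    pthread_t *workers = malloc(sizeof *workers * (size_t)count);
    if (workers == NULL)
        return -1;

    int started = 0;
    for (int i = 0; i < count; i++) {
        if (pthread_create(&workers[i], NULL, handle_request, &reqs[i]) != 0)
            break;              /* stop creating threads on the first failure */
        started++;
    }

    int handled = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(workers[i], NULL);   /* the thread is destroyed after this */
        handled += reqs[i].handled;
    }

    free(workers);
    return handled;
}
```

On a multi-processor machine the scheduler is free to run these threads on different processors at the same time; the code does not need to change for that.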

Example 2: Matrix Multiplication

Matrix multiplication essentially involves taking each row of the first matrix, multiplying it element by element with each column of the second matrix, and adding up the products, i.e.:

[Figure: Matrix Multiplication (3x3 example)]

Note that each element of the resultant matrix can be computed independently, that is to say by a different thread.
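
To make that independence concrete, here is a rough sketch that computes each element of a 3x3 product in its own POSIX thread. The names MAT_N, struct cell_task, compute_cell and multiply_3x3 are invented for this illustration. One thread per element is chosen only to mirror the statement above; for a matrix this small, creating the threads costs far more than the arithmetic.

```c
#include <pthread.h>
#include <stddef.h>

#define MAT_N 3

/* Everything one thread needs to compute a single element c[row][col]. */
struct cell_task {
    int (*a)[MAT_N];
    int (*b)[MAT_N];
    int (*c)[MAT_N];
    int row;
    int col;
};

/* Thread body: row of a times column of b, summed into one element of c. */
static void *compute_cell(void *arg)
{
    struct cell_task *t = arg;
    int sum = 0;
    for (int k = 0; k < MAT_N; k++)
        sum += t->a[t->row][k] * t->b[k][t->col];
    t->c[t->row][t->col] = sum;   /* no lock needed: each thread writes its own cell */
    return NULL;
}

/* Computes c = a * b with one thread per element.
 * Returns 0 on success, -1 if a thread could not be created. */
int multiply_3x3(int a[MAT_N][MAT_N], int b[MAT_N][MAT_N], int c[MAT_N][MAT_N])
{
    pthread_t workers[MAT_N * MAT_N];
    struct cell_task tasks[MAT_N * MAT_N];
    int started = 0;
    int status = 0;

    for (int i = 0; i < MAT_N && status == 0; i++) {
        for (int j = 0; j < MAT_N; j++) {
            struct cell_task *t = &tasks[started];
            t->a = a;
            t->b = b;
            t->c = c;
            t->row = i;
            t->col = j;
            if (pthread_create(&workers[started], NULL, compute_cell, t) != 0) {
                status = -1;
                break;
            }
            started++;
        }
    }

    /* Wait for every thread that was actually started. */
    for (int n = 0; n < started; n++)
        pthread_join(workers[n], NULL);

    return status;
}
```

No mutex is used because the threads only read a and b, and every thread writes a different element of c.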

Let's try to understand the attributes that further differentiate one thread from another. These attributes can be set when the thread is created, and some of them can be changed while the thread is running:

1. Priority:

As the name suggests, this affects how much processing time the system gives the thread before letting another thread interrupt it.

2. Stack Size:
This parameter decides how much stack space the thread has, which in turn limits how deeply its function calls can nest and how much room its local variables get.

3. Name:
Each thread is associated with a unique name, something that helps in debugging or tracking the thread in its workspace.

4. Scheduling Policy:
The policy decides how various threads are scheduled within the system.

5. Thread State:
A thread's state indicates what the thread is doing and what it is capable of doing at a particular instant: is it running, waiting for resources, or sleeping?

6. Thread Stack Guard Size:
Most thread implementations add a region of protected memory to a thread's stack, commonly known as a guard region, as a safety measure to prevent a stack overflow in one thread from corrupting the contents of another thread's stack.

7. Scope:
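This attribute defines the contention scope of the thread, that is, whether it competes for the CPU with all threads in the system or only with the other threads of its own process.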

8. Detach State:
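This attribute defines how a thread releases its associated resources on termination: a detached thread releases them automatically, whereas a joinable thread holds them until another thread joins it.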
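
Several of the attributes listed above map onto the POSIX pthread_attr_* calls. The sketch below sets the stack size, guard size, detach state and scope before creating a thread; the function names worker and start_configured_thread are made up for this example. Priority and scheduling policy are left out because changing them usually needs extra privileges, and thread names are left out because the call for setting them is not portable.

```c
#include <pthread.h>
#include <stddef.h>

/* Trivial thread body; the attributes are the point of this sketch. */
static void *worker(void *arg)
{
    (void)arg;
    return NULL;
}

/* Creates and joins one thread with explicit attributes.
 * The stack size stored in the attribute object is read back into
 * *actual_stack. Returns 0 on success or a pthread error number. */
int start_configured_thread(size_t stack_size, size_t guard_size,
                            size_t *actual_stack)
{
    pthread_attr_t attr;
    pthread_t tid;
    int rc;

    rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    /* Stack Size and Thread Stack Guard Size, both in bytes. */
    rc = pthread_attr_setstacksize(&attr, stack_size);
    if (rc == 0)
        rc = pthread_attr_setguardsize(&attr, guard_size);

    /* Detach State: joinable, so its resources are held until pthread_join. */
    if (rc == 0)
        rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_JOINABLE);

    /* Scope: compete for the CPU with every thread in the system. */
    if (rc == 0)
        rc = pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);

    if (rc == 0)
        rc = pthread_attr_getstacksize(&attr, actual_stack);

    if (rc == 0)
        rc = pthread_create(&tid, &attr, worker, NULL);
    if (rc == 0)
        rc = pthread_join(tid, NULL);

    pthread_attr_destroy(&attr);
    return rc;
}
```

A call such as start_configured_thread(256 * 1024, 4096, &size) asks for a 256 kB stack with a 4 kB guard region. The values are only examples; the system may reject a stack size below its minimum.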

The next task is to understand how a thread is created in Linux:

On Linux, kernel threads are created with the clone system call. The clone API lets the caller specify which resources should be shared, such as the memory space, file descriptors and signal handlers.


The first step is to decide the optimum stack size to be used by the thread.
The SP (stack pointer) passed to clone must refer to the top of the chunk of memory, since on most processors the stack grows downward.
To avoid a memory leak, the stack must be freed once the thread has exited. For better understanding, here is a snippet of a simple clone program from Linux magazine:

___________________________________________________________________

#define _GNU_SOURCE   // needed for clone() in <sched.h>
#include <sched.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

// 64kB stack
#define FIBER_STACK (1024*64)

// The child thread will execute this function
int threadFunction( void* argument )
{
    printf( "child thread exiting\n" );
    return 0;
}

int main()
{
    void* stack;
    pid_t pid;

    // Allocate the stack
    stack = malloc( FIBER_STACK );
    if ( stack == 0 )
    {
        perror( "malloc: could not allocate stack" );
        exit( 1 );
    }

    printf( "Creating child thread\n" );

    // Call the clone system call to create the child thread.
    // The stack pointer passed is the top of the allocated block.
    pid = clone( &threadFunction, (char*) stack + FIBER_STACK,
        SIGCHLD | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_VM, 0 );
    if ( pid == -1 )
    {
        perror( "clone" );
        exit( 2 );
    }

    // Wait for the child thread to exit
    pid = waitpid( pid, 0, 0 );
    if ( pid == -1 )
    {
        perror( "waitpid" );
        exit( 3 );
    }

    // Free the stack
    free( stack );

    printf( "Child thread returned and stack freed.\n" );

    return 0;
}

_____________________________________________________________________

Simple clone Thread Example

Let's have a look at the clone API in more detail.

clone( &threadFunction, (char*) stack + FIBER_STACK,
SIGCHLD | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_VM, 0 );

Remember, the main use of clone is to create multiple threads in a program that run concurrently in a shared memory space.

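When a thread is created, it starts executing the function (&threadFunction).
The final 0 is the argument passed to that function. It is a single void* pointer, and 0 here means that no argument is passed. clone itself accepts further optional arguments after it, but those are not passed to the function.
When the function returns, the thread terminates.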

(char*) stack + FIBER_STACK specifies the location of the stack used by the thread. The calling process must allocate memory for the thread stack and pass a pointer to the top of that memory to clone(), because the stack grows downward.

SIGCHLD | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_VM
These represent the flags. The lower byte of the flags contains the number of the termination signal sent to the parent when the thread dies.
Here that signal is SIGCHLD: SIG stands for signal and CHLD stands for child, signifying that the child has terminated.

The remaining flags are bit-wise ORed to specify the shared resources:
CLONE_FS: It specifies that the file-system information (root directory, current working directory and umask) is shared.

CLONE_FILES: It specifies that the file-descriptor table is shared.

CLONE_SIGHAND: It specifies that the table of signal handlers is shared.

CLONE_VM: It specifies that the created thread runs in the same memory space as the calling process.
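
The effect of CLONE_VM can be observed with a small experiment, sketched below for Linux with glibc. The names DEMO_STACK_SIZE, write_marker and value_after_child are hypothetical. The child writes 42 through a pointer into the parent's memory; the parent only sees that write when the memory space is actually shared.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE            /* needed for clone() in <sched.h> */
#endif
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

#define DEMO_STACK_SIZE (64 * 1024)

/* Runs in the child: writes through the pointer it was given. */
static int write_marker(void *arg)
{
    *(int *)arg = 42;
    return 0;
}

/* Creates a child with the given sharing flags, waits for it and returns
 * the value the parent sees afterwards, or -1 on error. */
int value_after_child(int share_flags)
{
    int marker = 0;
    char *stack = malloc(DEMO_STACK_SIZE);
    if (stack == NULL)
        return -1;

    /* The stack grows downward, so pass the top of the block. */
    pid_t pid = clone(write_marker, stack + DEMO_STACK_SIZE,
                      SIGCHLD | share_flags, &marker);
    if (pid == -1) {
        free(stack);
        return -1;
    }

    if (waitpid(pid, NULL, 0) == -1) {
        free(stack);
        return -1;
    }

    free(stack);               /* safe: the child has exited */
    return marker;
}
```

With value_after_child(CLONE_VM) the parent should read 42, since both run in one memory space. With value_after_child(0) the child gets its own copy of the memory, as after fork, so the parent should still read 0.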


Time for a Question-Answer session again:

1. A process has 100 threads.
Two child processes are formed, one with 40 of those threads and the other with 60.
What is the probability that:

a. 2 threads enter the same child process?
b. 2 threads enter different child processes?

2. What are the disadvantages of using threads over processes?

3. What are the 3 main states in which a thread can be?

4. What role does a thread play in contributing to parallel processing on a multi-processor system?

5. How can one ensure that all the threads of a process co-exist at a particular instance?

6. What are the tools required to program with threads?

7. Name some operating systems that support thread programming.

8. What makes threads such an interesting read?


