Sunday, 27 January 2013

Complete multithreaded Matrix multiply program in C

I am not the author of this code. I put it together from other sources while doing an assignment and have since forgotten the links, but to be clear, it is not my code.
If you think posting this is good or bad, please comment. That would be good feedback, so I can improve things in future.
//starting of code
#include <stdio.h>
#include <stdlib.h>   /* malloc, free */
#include <pthread.h>
#define ARRAY_SIZE 3
       
    typedef int matrix_t[ARRAY_SIZE][ARRAY_SIZE];   
    matrix_t MAm,MBm,MCm;
    /* work order handed to each peer thread: which element of MC to compute */
    typedef struct {   
    int       id;   
    int       size;   
    int       row;   
    int       column;   
    matrix_t  *MA;   
    matrix_t  *MB;   
    matrix_t  *MC;   
    } matrix_work_order_t;   

void mult(int size,             /* size of the matrix */   
              int row,              /* row of result to compute */   
              int column,           /* column of result to compute */   
              matrix_t MA,          /* input matrix */   
              matrix_t MB,          /* input matrix */   
              matrix_t MC) {       /* result matrix */   
       
          int position;   
       
          MC[row][column] = 0;   
          for(position = 0; position < size; position++) {   
                MC[row][column] = MC[row][column] +   
                  ( MA[row][position]  *  MB[position][column] ) ;   
          }   
    }
    /*
    * Routine to start off a peer thread: computes one element of MC
    */

    void *peer_mult(void *arg)
    {
    matrix_work_order_t *work_orderp = arg;

    mult(work_orderp->size,
         work_orderp->row,
         work_orderp->column,
         *(work_orderp->MA),
         *(work_orderp->MB),
         *(work_orderp->MC));

    free(work_orderp);   /* allocated in main, owned by this thread */
    return NULL;
    }
   
    extern int   
    main(void) {   
        int size = ARRAY_SIZE, row, column;   
          /* Fill in matrix values, currently values are hardwired */
  for (row = 0; row < size; row++) {
    for (column = 0; column < size; column++) {
      MAm[row][column] = 1;
    }
  }
  for (row = 0; row < size; row++) {
    for (column = 0; column < size; column++) {
      MBm[row][column] = row + column + 1;
    }
  }
  printf("MATRIX MAIN THREAD: The A array is:\n");
  for(row = 0; row < size; row ++) {
    for (column = 0; column < size; column++) {
      printf("%5d ",MAm[row][column]);
    }
    printf("\n");
  }
  printf("MATRIX MAIN THREAD: The B array is:\n");
  for(row = 0; row < size; row ++) {
    for (column = 0; column < size; column++) {
      printf("%5d ",MBm[row][column]);
    }
    printf("\n");
    }
       
        matrix_work_order_t *work_orderp;   
        pthread_t peer[size*size];       
        /* Process Matrix, by row, column */   
        int id=0 ;
        for(row = 0; row < size; row++)     {   
          for (column = 0; column < size; column++) {   
               id = column + row*size;   /* was row*3; only correct when ARRAY_SIZE is 3 */
               work_orderp =(matrix_work_order_t *)malloc(sizeof(matrix_work_order_t));   
               work_orderp->id = id;   
               work_orderp->size = size;   
               work_orderp->row = row;   
               work_orderp->column = column;   
               work_orderp->MA = &MAm;   
               work_orderp->MB = &MBm;   
               work_orderp->MC = &MCm;   
       
               pthread_create(&(peer[id]), NULL, (void *(*)(void *))peer_mult,
                               (void *)work_orderp);

          }   
        }   
        int i=0;   
               /* Wait for peers to exit */   
        for (i = 0; i < (size * size); i++) {   
           pthread_join(peer[i], NULL);   
        }       
          printf("MATRIX: The resulting matrix C is:\n");
          for(row = 0; row < size; row ++) {   
            for (column = 0; column < size; column++) {   
            printf("%5d ",MCm[row][column]);   
            }   
          printf("\n");   
          }
        return 0;   
    }
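
One way to sanity-check the threaded result is to compare it against a plain single-threaded triple loop. This is only a minimal sketch, not part of the original program: it assumes the same ARRAY_SIZE and matrix_t as above, and the names mult_serial and matrix_equal are hypothetical helpers I made up for illustration.

```c
#include <stdio.h>

#define ARRAY_SIZE 3   /* same size as in the program above */

typedef int matrix_t[ARRAY_SIZE][ARRAY_SIZE];

/* Hypothetical helper: plain triple-loop multiply, C = A * B. */
void mult_serial(matrix_t A, matrix_t B, matrix_t C)
{
    for (int row = 0; row < ARRAY_SIZE; row++) {
        for (int column = 0; column < ARRAY_SIZE; column++) {
            int sum = 0;
            for (int k = 0; k < ARRAY_SIZE; k++)
                sum += A[row][k] * B[k][column];
            C[row][column] = sum;
        }
    }
}

/* Hypothetical helper: returns 1 if both matrices hold the same values. */
int matrix_equal(matrix_t X, matrix_t Y)
{
    for (int row = 0; row < ARRAY_SIZE; row++)
        for (int column = 0; column < ARRAY_SIZE; column++)
            if (X[row][column] != Y[row][column])
                return 0;
    return 1;
}
```

With the hardwired inputs above (A all ones, B[row][column] = row + column + 1), each element of C is the sum of one column of B, so every row of the result should be 6 9 12. Calling mult_serial on MAm and MBm and comparing with MCm through matrix_equal after the joins would confirm the threads produced the same thing.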

Tuesday, 8 January 2013

x86 and the Linux process kernel stack

I had this doubt myself until I figured it out. When I tried to google the answer I found none, so this post could help others who have similar doubts.
x86 has few registers to store data, so efficient ways are used to consume as few of the available registers as possible. One of them concerns finding the currently running process. Many RISC architectures store the address of the current process in a register, but on x86 Linux instead locates struct thread_info, which holds a pointer to the task_struct of the process that is running.
I read about this in the book Linux Kernel Development, in the topic "Storing the process descriptor" (page 26 in the new edition), but I did not get it.
I understood that when a process's block is created in RAM, it can be placed at any physical address, so there is no exact knowledge of where it starts. Thus we can't say anything about the stack's address either. So I did not get how masking just 13 bits of the stack pointer of the process's kernel stack could give the address of current_thread_info.
That reasoning was wrong; even now I do not know how paging is done in the physical address space.
But then I tried to find out how it happens, and I had some idea that if the structure sits at a predefined location, then such a trick becomes possible.
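I found a good explanation in Understanding the Linux Kernel, in topic 3.2.2.1 "Process descriptor handling" (page 4 of chapter 3).
It's simple: 2 pages are assigned to hold both the kernel stack and the thread_info struct, and that area is aligned to a multiple of 2 to the power 13 (8192 bytes). This is the point I did not know. Because the area starts on such a boundary, clearing the low 13 bits of any address inside it gives the start of the area, which is where thread_info lives. The rest you can understand from the diagram in the book.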
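
The masking idea can be tried out in user space. This is only a sketch under my own assumptions, not kernel code: the struct names task_model and thread_info_model, and the functions thread_info_from_sp, current_from_sp and alloc_stack_area, are hypothetical stand-ins, and the 8 KB area is simulated by over-allocating with malloc and rounding up to the next 8192-byte boundary.

```c
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192UL   /* 2 pages of 4 KB = 2^13 bytes */

/* Simplified stand-ins for the kernel structures; not the real definitions. */
struct task_model { int pid; };
struct thread_info_model { struct task_model *task; };

/* Clear the low 13 bits of a stack address to get the base of its 8 KB area,
 * which is where the thread_info stand-in is stored. */
struct thread_info_model *thread_info_from_sp(uintptr_t sp)
{
    return (struct thread_info_model *)(sp & ~(uintptr_t)(THREAD_SIZE - 1));
}

/* Follow the task pointer, the way "current" is derived from thread_info. */
struct task_model *current_from_sp(uintptr_t sp)
{
    return thread_info_from_sp(sp)->task;
}

/* Hypothetical user-space replacement for the kernel's aligned allocation:
 * over-allocate, then round up to the next 8 KB boundary.
 * *raw receives the pointer that must later be passed to free(). */
void *alloc_stack_area(void **raw)
{
    *raw = malloc(2 * THREAD_SIZE);
    if (*raw == NULL)
        return NULL;
    uintptr_t p = ((uintptr_t)*raw + THREAD_SIZE - 1)
                  & ~(uintptr_t)(THREAD_SIZE - 1);
    return (void *)p;
}
```

To use it, store a thread_info_model at the base returned by alloc_stack_area, then pass any address inside the 8 KB area to current_from_sp as if it were the stack pointer. Whatever address is chosen, the mask lands on the same base. On 32-bit x86 this mask is the constant 0xffffe000 that the book shows being ANDed with esp.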

Every process has memory reserved for kernel functions to use when the process makes a system call. This memory is the process's kernel stack.