Monday, December 14, 2009

Read firefox bookmarks using Java

I've often seen people asking in various forums "how do we read firefox bookmarks from Java/C# ? ". It seemed like an interesting question and I ended up solving the question in around 2 hrs. A little bit of reading about the firefox database will be necessary. You can get that here. Now for the implementation you will need to add the jar file. You can get it here. You can add the external jar to your project and then compile the following code. The implementation only reads and lists out some fields in places.sqlite. You can modify the query and do various other things.


import java.sql.*;

public class firefoxBookmarksReader {
public static void main(String[] args) throws Exception {
Class.forName("org.sqlite.JDBC");
Connection conn = DriverManager.getConnection("jdbc:sqlite:/home/deepak/.mozilla/firefox/yvf7p20d.default/places.sqlite//"); // replace the path appropriately depending on the location of places.sqlite
if(conn==null)
{
System.out.println("ERROR");
}
System.out.println(conn.toString());

Statement stat = conn.createStatement();

ResultSet rs = stat.executeQuery("select * from moz_bookmarks;");
while (rs.next()) {
System.out.println("id = " + rs.getString("id"));
System.out.println("keyword = " + rs.getString("keyword_id"));
System.out.println("title = " + rs.getString("title"));
}
rs.close();
conn.close();
}
}


Thanks,
Layman

Tuesday, December 8, 2009

Go on cygwin

I have a crap windows machine at my workplace and I wanted to do some work on Go during my free time. I had cygwin running and I could successfully install Google go on cygwin. So I thought I'll put this as a post.

You have to get the repository. But you wont have hg command installed(in most cases)
If you have already installed hg command for cygwin skip to step 2

1. Do the folowing commands
$ wget http://peak.telecommunity.com/dist/ez_setup.py
$ python ez_setup.py
$ easy_install -U mercurial
$ chmod +x /usr/lib/python2.5/site-packages/mercurial-1.0-py2.5-cygwin-1.5.25-i686.egg/mercurial/*.dll

2. Now that you have hg command on your machine. Do the following command.
$ hg clone https://golang-on-cygwin.googlecode.com/hg/ golang-on-cygwin

3. Now you have to setup certain environment variables.
export GOROOT=/path/to/golang-on-cygwin
export GOARCH=386
export GOOS=linux
export GOBIN=/path/to/your/local/bin
export PATH=$PATH:$GOBIN

4. Change to the source directory and run the all.bash script
cd $GOROOT/src
./all.bash

Installation will succeed and you can start with Google go

Thanks,
Layman

Thursday, December 3, 2009

Thread, semaphore, signals example in C

I got this question by chance. It was about creating 2 threads that would run in such a way that if one thread prints a value i, the next thread should print a value i+1.

The pthread_create() function is used to create a new thread, with attributes specified by attr, within a process. If attr is NULL, the default attributes are used. If the attributes specified by attr are modified later, the thread's attributes are not affected. Upon successful completion, pthread_create() stores the ID of the created thread in the location referenced by thread.

The pthread_join() function suspends the execution of the calling thread until the thread identified by the thread identifier terminates, either by calling pthread_exit() or by being canceled.

For our given problem we need the increment and print action for each thread to be done as atomic actions. No context switch should happen these operations. If that happens we may not get the desired result. In case a context switch happens after a read in thread one, then thread 2 might print the incremented value and thread 1 will print the old value, and there are other possibilities as well.Hence the use of semaphore cannot be avoided.

Use of SIGINT (Cntr+C) is used to quit the program.


#include < stdio.h >
#include < pthread.h >
#include < semaphore.h >
#include < signal.h >

sem_t semname;
int qval=1;

void *func1(void *arg) {
while(qval) {
sem_wait(&semname);
printf("Thread one prints # %d\n",(*(int *)(arg))++);
sem_post(&semname);
sleep(1);
}
}
void *func2(void *arg) {
while(qval) {
sem_wait(&semname);
printf("Thread two prints # %d\n",(*(int *)(arg))++);
sem_post(&semname);
sleep(1);
}
}
void quit() {
qval=0;
}


main() {
signal(SIGINT,quit);
sem_init(&semname,0,1);
int i=0;
pthread_t thread1,thread2;
pthread_create(&thread1,NULL,func1,&i);
pthread_create(&thread2,NULL,func2,&i);
pthread_join(thread1,&i);
pthread_join(thread2,&i);
printf("\nQuiting...\n");
return 0;
}

Thanks
Layman

Wednesday, December 2, 2009

Trie datastructure C example

This post is about the trie datastructure. Here we are trying to do a lexicographic trie. It can store all the words of a dictionary in an efficient manner. In the lexicographic trie datastructure each node has 26 pointers. Each pointers represent a letter. So to store "lay" , the system will first create a path from pointer for letter l to a new node(call it ptr). Then it makes a new node and makes the ptr's pointer for letter a point to this new node and so on. The advantage is that many words start with the same substring (eg new and neo). So here the common part "n" & "e" is stored only once. Hence lesser storage space and efficiency. In terms of computational speed searching is much faster as each time we can eliminate 25 links and hence it must be log n to the base 26. This is used commonly to implement dictionaries etc.
See the code now.

CODE

#include < stdio.h >
#include < stdlib.h >
#include < string.h >

struct diction
{
struct diction *alpha[26];
int flag;
};
typedef struct diction dic;

dic *root;
int maptoptr(char a)
{
char j='a';
int i;
for(i=0; i<26 a="'a';" root="(dic" i="0;i<26;i++,a++)" t="maptoptr(a);">alpha[t]=NULL;
}
}


void insert(char ip[])
{
dic *ptr=root;
int t;
int f=0;//false
int i=0;

t=maptoptr(ip[i]);
while(ptr->alpha[t]!=NULL && f==0) //not finished searchin and if there is a path with current letter
{
if(ip[i]=='\0' && ptr->flag==1)
{
f=1;//true
}
else
{
ptr=ptr->alpha[t];
i++;
}
t=maptoptr(ip[i]);
}
if(f==1)
{
printf("Already exists\n");
}
else
{
dic *ptr1=NULL;
char temp='a';
int j;
while(ip[i]!='\0')
{
temp='a';
ptr1=(dic *)malloc(sizeof(dic));
for(j=0;j<26;j++,temp++) t="maptoptr(temp);">alpha[t]=NULL;
}
ptr1->flag=0;
t=maptoptr(ip[i]);
ptr->alpha[t]=ptr1;
ptr=ptr1;
i++;
}
ptr->flag=1;
}
}
void search(char ip[])
{
int i=0,f=1;
int t;
dic *ptr=root;
while(ip[i]!='\0'&& f==1)
{
t=maptoptr(ip[i]);
if(ptr->alpha[t]==NULL)
{
f=0;
i++;
break;
}
ptr=ptr->alpha[t];
i++;
}
if(f==1 && ip[i]=='\0' && ptr->flag==1)
printf("\nFound\n");
else
printf("\nNot found\n");
}

void bt_free(dic *ptr)
{
int i=0,t;
char a='a';
if(ptr==NULL)
return;
for(i=0;i<26;i++,a++) t="maptoptr(a);">alpha[t]!=NULL)
{
bt_free((ptr->alpha[t]));
}
}
free(ptr);
}
main()
{
char a[25];
printf("Enter the word to insert : ");
gets(a);
createroot();
insert(a);
printf("Enter another word to insert : ");
gets(a);
insert(a);
printf("Enter the word to be searched : ");
gets(a);
search(a);
bt_free(root);
}




OUTPUT

$ ./a.exe
Enter the word to insert : layman
Enter another word to insert : laymansviews
Enter the word to be searched : layman

Found

Wednesday, November 25, 2009

Art of computer programming -Sorting #4

Bubble sort
Bubble sort is the most well known sort. The idea of bubble sort is very straight forward. We compare adjacent elements and rearrange them. So element 1 and element 2 are compared and if element 1 is > element 2 then they are swapped, next element 2 and element 3 are compared and so on. In the first pass the largest element will be swapped to position n-1. In the second pass second largest element will reach n-2 etc. So in the nth pass ith largest element will reach n-i. There are a total of n passes. Now there are very little interesting things about this sort.

Point to note
The algorithm is inefficient for small and large lists. For small lists insertion sort maybe used and O(n^2) sorting is not recommended for large lists.
Knuth says that "the bubble sort seems to have nothing to recommend it, except a catchy name and the fact that it leads to some interesting theoretical problems"
Bubble sort is 40% slower than selection sort and 5 times slower than insertion sort.

Algorithm
procedure bubbleSort( A : list of sortable items ) defined as:
do
swapped := false
for each i in 0 to length(A) - 2 inclusive do:
if A[i] > A[i+1] then
swap( A[i], A[i+1] )
swapped := true
end if
end for
while swapped
end procedure

Worst case performance : O(n^2)
Best case performance : O(n)
Average case performance : O(n^2)
Worst case space complexity: O(1)

Thanking you,
Layman

Thursday, November 19, 2009

C/C++ teaser

This is not very difficult. But just have a look.

1. Predict the outcome
#include < stdio.h >
main()
{
int a=0;
printf("%s",a);
}

2. Predict the outcome
#include < stdio.h >
main()
{
int arr[]={2,3,4,5,6};
int *iptr=arr;
iptr++;
printf("%d",*(iptr+i));
arr++;
printf("%d",*(arr+i));
}


3. Predict the outcome
#include < stdio.h >
main()
{
int arr[]={2,3,4,5,6};
int *iptr=arr;
int i=0;
printf("%d,%d%d,%d",*(iptr+i),i[arr],arr[i],*(arr+i));
}


Please answer the questions as comments. The answers will be posted soon.

Thanks
Layman

Wednesday, November 18, 2009

Art of computer programming -Sorting #3

Shell sort
For any sorting algorithm that moves only one position at a time , then the running time will always be proportional to n^2. Hence the idea was struck to make larger leaps rather than small ones. The small leaps is one of the prime reasons for insertion sort's low efficiency. And Donald Shell proposed shell sort.
Now you will be wondering how to choose he leap/step size. In shell sort we can start with an arbitrary step size in the first pass, after every pass the step size decreases. Hence shell sort is also known as decreasing increment sort. We'll take an example. In the first pass, let the step size be 8.
So during first pass element 0 is compared with element 8, element 16 , element 24.....element n.....element( n+8) etc. During second pass the increment is reduced to 4, so element 0 is compared with element 4, element 8........element n, element(n+4) etc. In the third pass increment size be 2, comparison will be between element 0, element 2, element 4....element n,element( n+2) etc. And finally increment becomes 1 in the last pass. In the last pass element 0, element 1, element 2 ....element n, element (n+1) etc is compared.

This is how the shell sort operates. During each pass the elements being compared are re-arranged in sorted order,ie, if element i and j are compared the position of i and j are rearranged so that the series i,j is in sorted order.

Points of interest
Like the insertion sort it has best case of O(n) hence its suitable to check whether an array is sorted or not.
Choosing the increment sequence is a tricky question. If you are concerned about the efficiency I have few recommendations for you:
The Sedgewick increment sequence..
9\times4^i - 9\times2^i + 1, or 4^{i} - 3\times2^i + 1 ......
The best known sequence according to research by Marcin Ciura is 1, 4, 10, 23, 57, 132, 301, 701, 1750


Algorithm:
input: an array a of length n with array elements numbered 0 to n − 1

inc ← round(n/2)
while inc > 0 do:
for i = inc .. n − 1 do:
tempa[i]
ji
while jinc and a[jinc] > temp do:
a[j] ← a[jinc]
jjinc
a[j] ← temp
inc ← round(inc / 2.2)

Worst case performance : Depends on the leap/step size sequence
Best case performance : O(n)
Average case performance : Depends on the leap/step size sequence
Worst case space complexity: O(n)
Thanking you,
Layman







Tuesday, November 17, 2009

Art of computer programming -Sorting #2

Insertion sort
Insertion sort is one of those very traditional sorting methods. In this method, we scan through the list element by element and we assume that all elements preceding the present element is sorted. We start with the assumption that first element is sorted and we need to place the 2nd element to the left or the right of the first element based on whether its greater than or less than the 1st element. So we can generalize that for the i'th element we assume that the list from 0...i-1 is sorted and we can insert the ith element at the the correct position in this sorted sublist. So the problem reduces to inserting an element in a sorted list. But its not as simple as it sounds. Inserting in a sorted list may need shifting the elements of the sublist. Its a reasonably expensive operation. Hence using arrays will be quite expensive for insertion sort. Whereas if you are using linked lists, then shifting operations become much more easier(as it involves only 2-3 pointer operations).
On an average we will need a total of 0.5(1+2+...+n) shiftings ~ 0.25n^2

A method to improve this will be to use binary search to find the position of the element in the sorted list. This will reduce the position search time to log n. Hence the algorithm improves.


Points of interest
To check whether a list is already sorted insertion sort is highly recommended.
It can receive data as it is working. This is particularly useful when you have only an incomplete piece of data.
While insertion sort typically makes fewer comparisons than selection sort, it requires more writes because the inner loop can require shifting large sections of the sorted portion of the array.


Algorithm
insertionSort(array A)
begin
for i := 1 to length[A]-1 do
begin
value := A[i];
j := i - 1;
while j >= 0 and A[j] > value do
begin
A[j + 1] := A[j];
j := j - 1;
end;
A[j + 1] := value;
end;
end;


Worst case performance О(n^2)
Best case performance O(n)
Average case performance О(n^2)
Worst case space complexity O(1)

Friday, November 13, 2009

Rectangle intersection problem with tree datastructure

The problem here is to write a program that will find all the rectangles that intersects with a given rectangle once the left bottom and top right of the rectangles are given.

#include < stdio.h>
#include < stdlib.h>
#include < sys/stat.h>


struct rect {
int x,y;
struct rect *left,*right,*rtpoint;
};

int xmin,ymin,xmax,ymax;

typedef struct rect rec;
rec *root=NULL,*newptr=NULL;

void find(int x1,int cnd) {
rec *parent=NULL,*tptr=NULL;
rec *ptr=root;
while(ptr!=NULL) {
parent=ptr;
if(cnd==0) {
//if( ((ptr->x > x1) && (ptr->rtpoint)->x < x1) || ((ptr->x < x1) && ((ptr->rtpoint)->x>x1) ) )
if(ptr->rtpoint!=NULL) {
if(check(ptr)) {
printf("\nIntersects (%d,%d) - (%d, %d)\n",ptr->x,ptr->y,(ptr->rtpoint)->x,(ptr->rtpoint)->y);
(ptr->rtpoint)->rtpoint=NULL;
ptr->rtpoint=NULL;
}
}
}
if(ptr->x > x1)
ptr=ptr->left;
else
ptr=ptr->right;
}
if(cnd==1) {
if(parent->x > x1 )
parent->left=newptr;
else
parent->right=newptr;
}
}

int check(rec *temp) {
rec *temp2=temp->rtpoint;
int ys=temp->y,xs=temp->x;
int yr=temp2->y,xr=temp2->x,t;
if(ys>yr) {
t=ys;
ys=yr;
yr=t;
}
if(xs>xr) {
t=xs;
xs=xr;
xr=t;
}
return !( (yr <= ymin) || (ys >= ymax) || (xr <= xmin) || (xs >= xmax) );
}

void preorder(rec *ptr) {
if(ptr!=NULL) {
preorder(ptr->left);
printf("x=%d y=%d\n",ptr->x,ptr->y);
preorder(ptr->right);
}
}
void bt_free() {
if(root==NULL)
return;
bt_free(root->left);
bt_free(root->right);
free(root);
}

int main() {

FILE * fp;
int temp;
rec *temp_ptr=NULL;
struct stat stFileInfo;
char filename[50];
do {
printf ("Enter the filename to get data : ");
gets(filename);
temp=stat(filename,&stFileInfo);
if(temp!=0) {
printf("File doesnt exist!!! \n");
}
}while(temp != 0);
fp = fopen(filename,"r");
while(fscanf(fp,"%d %d %d %d",&xmin,&ymin,&xmax,&ymax) > 0) {
if(root==NULL) {
root=(rec *)malloc(sizeof(rec));
root->x=xmin;
root->y=ymin;
root->left=NULL;
root->right=NULL;
temp_ptr=root;
newptr=(rec *)malloc(sizeof(rec));
newptr->left=NULL;
newptr->right=NULL;
newptr->rtpoint=temp_ptr;
newptr->x=xmax;
newptr->y=ymax;
find(xmax,1);
temp_ptr->rtpoint=newptr;
}
else {
newptr=(rec *)malloc(sizeof(rec));
newptr->left=NULL;
newptr->right=NULL;
newptr->x=xmin;
newptr->y=ymin;
temp_ptr=newptr;
find(xmin,1);
newptr=(rec *)malloc(sizeof(rec));
newptr->left=NULL;
newptr->right=NULL;
newptr->rtpoint=temp_ptr;
newptr->x=xmax;
newptr->y=ymax;
find(xmax,1);
temp_ptr->rtpoint=newptr;
}
}
fclose(fp);
printf("Tree created successfully\nPreorder traversal follows\n");
preorder(root);
printf("\nEnter the test rectangle co-ordinates seperated by comas\n");
scanf("%d,%d,%d,%d",&xmin,&ymin,&xmax,&ymax);
if(xmin>xmax) {
temp=xmin;
xmin=xmax;
xmax=temp;
temp=ymin;
ymin=ymax;
ymax=temp;
}
find(xmin,0);
find(xmax,0);
bt_free();
return 0;
}


Sample input file
6 6 11 11
0 0 2 2
9 9 12 12
60 0 70 100
7 2 8 12

COM/Serial communication in C

This program will help you to understand how to do serial communication in C.
The C program will illustrate the basics of serial communication.


/////////////////////////////////////////////////
// Serial port interface program //
/////////////////////////////////////////////////

#include < stdio.h > // standard input / output functions
#include < string.h > // string function definitions
#include < unistd.h > // UNIX standard function definitions
#include < fcntl.h > // File control definitions
#include < errno.h > // Error number definitions
#include < termios.h > // POSIX terminal control definitionss
#include < time.h > // time calls
#include < stdlib.h >
#include < sys/types.h >
#include < netinet/in.h >
int fd;
int open_port(void)
{
int fd; // file description for the serial port

fd = open("COM2", O_RDWR);

if(fd == -1) // if open is unsucessful
{
perror("open_port: Unable to open /dev/ttyS0 - ");
}
else
{
printf("Opened %d\n",fd);
}

return(fd);
}

int configure_port(int fd) // configure the port
{
struct termios oldtio,port_settings; // structure to store the port settings in
tcgetattr(fd,&oldtio);
//cfsetispeed(&port_settings, B115200); // set baud rates
//cfsetospeed(&port_settings, B115200);

bzero (&port_settings, sizeof (port_settings));
port_settings.c_cflag = B115200 | CLOCAL | CREAD;
port_settings.c_cflag &= ~PARENB; // set no parity, stop bits, data bits 8N1
port_settings.c_cflag &= ~CSTOPB;
port_settings.c_cflag &= ~CSIZE;
port_settings.c_cflag |= CS8;
port_settings.c_iflag = IGNPAR | ICRNL;
port_settings.c_oflag = 0;
port_settings.c_lflag &= ~ICANON;
tcflush(fd, TCIFLUSH);
tcsetattr(fd,TCSANOW,&port_settings);
return(fd);
}

int query_modem(int fd) // query modem with an AT command
{

int k=0;
char str[5];
str[0]=2;
str[1]=str[2]='f';
str[3]=str[4]=3;
k=write(fd,str,strlen(str)); // send an AT command followed by a CR
printf("command %d\n",k);

}
int dsadas()
{
printf("In fm");
int bytesRead;
char *str;
char *buff;
//str=(char *)malloc(100*sizeof(char));
//buff=(char *)malloc(1024*sizeof(char));
while(1)
{

//printf("In while");
if(read(fd,str,10)>0)
{
printf("in if");
buff="";
printf("yesyeysyes %s\n",str);
//strcpy(buff,str);
do
{
//sleep(100);
//append
printf("Buffer = %s",buff);
strcat(buff,str);
bytesRead=read(fd,str,10);
}while(bytesRead>0);
//send bytes to ip
printf("Buffer = %s",buff);
}
}
}

int main(void)
{
pthread_t thread1,thread2;
fd = open_port();
configure_port(fd);
query_modem(fd);
int iret1, iret2;
//iret1 = pthread_create( &thread1, NULL, dsadas);
//pthread_join(thread1, NULL);
return(0);
}

Create tree from preorder traversal

#include < stdio.h >
#include < stdlib.h >

int current=0;

struct bstree {
int val;
struct bstree *left,*right;
};

typedef struct bstree node;

node* tree_build(int* inarr,int* prearr,int left,int right);
void sort(int inarray[], int begin, int end);
void swap(int *a,int *b);
void inorder_traverse(node *ptr);
void bt_free(node *ptr);

main() {
int i;
int prearr[]={10,4,3,2,5,6,11,13,12};
int inarr[10];
for(i=0;i<9;i++)
inarr[i]=prearr[i];
sort(inarr,0,9);
node* root = tree_build(inarr,prearr,0,9);
printf("\n\nTREE CREATED. NOW DOING AN INORDER TRAVERSAL ON THE TREE...\n\n");
inorder_traverse(root);
printf("\n\n");
bt_free(root);
}

void bt_free(node *ptr) {
if(ptr==NULL)
return;
bt_free(ptr->left);
bt_free(ptr->right);
free(ptr);
}

void swap(int *a,int *b) {
int temp=*a;
*a=*b;
*b=temp;
}

void sort(int inarray[],int begin,int end) {
if (end > begin + 1) {
int pivot=inarray[begin],l=begin+1,r=end;
while(l if (inarray[l] <= pivot)
l++;
else
swap(&inarray[l],&inarray[--r]);
}
swap(&inarray[--l],&inarray[begin]);
sort(inarray,begin,l);
sort(inarray,r,end);
}
}

node* tree_build(int inarr[],int prearr[],int left,int right) {
int i,position;
if(left==right)
return NULL;
node *temp=(node *)malloc(sizeof(node));
temp->val=prearr[current];
if(left+1==right)
return temp;
i=left;
while(i if(inarr[i]==prearr[current])
break;
i++;
}
for(position=0;inarr[position]!=inarr[i];position++);
if(position>left) {
current++;
temp->left=tree_build(inarr,prearr,left,position);
}
if(position+1 current++;
temp->right=tree_build(inarr,prearr,position+1,right);
}
return temp;
}

void inorder_traverse(node *ptr) {
if(ptr==NULL)
return;
inorder_traverse(ptr->left);
printf(" %d - ",ptr->val);
inorder_traverse(ptr->right);
}

Friday, October 23, 2009

TCPIP communication in C

Server.c
/* A simple server in the internet domain using TCP
The port number is passed as an argument */
#include < stdio.h >
#include < sys/types.h >
#include < sys/socket.h >
#include < netinet/in.h >

void error(char *msg)
{
perror(msg);
exit(1);
}

int main(int argc, char *argv[])
{
int sockfd, newsockfd, portno, clilen;
char buffer[256];
struct sockaddr_in serv_addr, cli_addr;
int n;
if (argc < 2) {
fprintf(stderr,"ERROR, no port provided\n");
exit(1);
}
sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0)
error("ERROR opening socket");
bzero((char *) &serv_addr, sizeof(serv_addr));
portno = atoi(argv[1]);
serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = INADDR_ANY;
serv_addr.sin_port = htons(portno);
if (bind(sockfd, (struct sockaddr *) &serv_addr,
sizeof(serv_addr)) < 0)
error("ERROR on binding");
listen(sockfd,5);
clilen = sizeof(cli_addr);
newsockfd = accept(sockfd,
(struct sockaddr *) &cli_addr,
&clilen);
if (newsockfd < 0)
error("ERROR on accept");
bzero(buffer,256);
n = read(newsockfd,buffer,255);
if (n < 0) error("ERROR reading from socket");
printf("Here is the message: %s\n",buffer);
n = write(newsockfd,"I got your message",18);
if (n < 0) error("ERROR writing to socket");
return 0;
}




Client.c
#include /* for printf() and fprintf() */
#include /* for socket(), connect(), send(), and recv() */
#include /* for sockaddr_in and inet_addr() */
#include /* for atoi() and exit() */
#include /* for memset() */
#include /* for close() */
#include
#define RCVBUFSIZE 200024 /* Size of receive buffer */

void DieWithError(char *errorMessage); /* Error handling function */

int main(int argc, char *argv[])
{
int sock; /* Socket descriptor */
struct sockaddr_in echoServAddr; /* Echo server address */
unsigned short echoServPort; /* Echo server port */
char *servIP; /* Server IP address (dotted quad) */
char *echoString; /* String to send to echo server */
char echoBuffer[RCVBUFSIZE]; /* Buffer for echo string */
unsigned int echoStringLen; /* Length of string to echo */
int bytesRcvd, totalBytesRcvd; /* Bytes read in single recv()
and total bytes read */

servIP = "10.64.122.26"; /* First arg: server IP address (dotted quad) */
echoServPort = atoi("9000"); /* Use given port, if any */

/* Create a reliable, stream socket using TCP */
if ((sock = socket(PF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0)
DieWithError("socket() failed");

/* Construct the server address structure */
memset(&echoServAddr, 0, sizeof(echoServAddr)); /* Zero out structure */
echoServAddr.sin_family = AF_INET; /* Internet address family */


echoServAddr.sin_addr.s_addr = inet_addr(servIP);



echoServAddr.sin_addr.s_addr = inet_addr(servIP); /* Server IP address */
echoServAddr.sin_port = htons(echoServPort); /* Server port */

/* Establish the connection to the echo server */
if (connect(sock, (struct sockaddr *) &echoServAddr, sizeof(echoServAddr)) < 0)
DieWithError("connect() failed");




/* Receive the same string back from the server */
totalBytesRcvd = 0;
printf("Received: "); /* Setup to print the echoed string */
while (1)
{
/* Receive up to the buffer size (minus 1 to leave space for
a null terminator) bytes from the sender */
if ((bytesRcvd = recv(sock, echoBuffer, RCVBUFSIZE - 1, 0)) > 0)
{
echoBuffer[bytesRcvd] = '\0'; /* Terminate the string! */
printf("%s\n", echoBuffer); /* Print the echo buffer */
}

}

printf("\n"); /* Print a final linefeed */

close(sock);

exit(0);
}//-- end main --//



void DieWithError(char *errorMessage)
{
perror(errorMessage);
exit(1);
}

Thanking you
Layman

Tuesday, October 13, 2009

C code to check for the existance of a file

Recently I came across this question to check if a file exists in a given directory. The question looks pretty simple, but most of those guys just out of college will have trouble doing this, its for them that I write this down. Actually, you are supposed to know all this. But nothing happens as expected during college. So lets look at it at least now. There are many ways to do the same.
You can do a fopen and if it fails then you can probably assume that the file doesn't exist. The standard practice in the software industry is to use a stat() function.

Includes:
#include
Code:
int intStat;
struct stat stFileInfo;
intStat = stat(filename,&stFileInfo);
if(intStat==0)
printf("File already exists!!!\n ");
else
printf("File doesnt exist\n ");


So,this is how you will check whether a file exists or not with the stat function. From now on use this method rather than checking whether fopen fails
Thanking you,
Layman

Saturday, September 5, 2009

Art of computer programming -Sorting #1

Long since Ive posted something. This time we will learn sorting. Sorting is familiar to you and to me for many years. But there are aspects of it which I haven't heard about ever since I first knew it. Those will be the points of discussion here.

Sorting by counting :
I cant say this is the best method. But every method has its advantages and disadvantages when used appropriately.

set count[i]=0 for all i from 0 to n-1
for(i=n;i>1;i++) {
for(j=i-1;j>=1;j++) {
if element[i]>element[j] count[i]++;
else count[j]++;

Now we have an auxiliary table within which we have the position of the elements in the sorted list.
Distribution counting sort:
This technique is applicable when we have to sort the numbers which belongs to a small range.
That is all the values to sort should be less than v(upper limit) and greater than u(lower limit) and v-u is small(<10 i="0;i). In such cases we can use the sorting by counting method to its full use and it'll give much better efficiency than any other method.
Thanks
Layman

Sunday, July 5, 2009

Mini/Main CS Project ideas

Its been sometime since I wanted to post some project ideas.

  • Develop a debugger for C/CPP : gdb is a well known debugger in linux. On similar lines you can write a debugger for C/CPP. A debugger should contain functionalities to setup break points , view variable status etc. You can use Eclipse and see how ideally a debugger should work. I recommend you to do a mini-version of gdb. You will have many references since many debuggers including gdb are open-source.

  • Virtual private server : VPS will provide multiple virtual hosting environments on the same physical server. This will eliminate the need for separate physical servers for hosting different websites. It will also provide clients with an economical solution for hosting their website and allow web hosting providers (WHPs) to use their resources efficiently. Each VPS acts as a dedicated server with full root access. The benefits from such products are efficient use of existing resources by sharing them between various customers, provides more security than other sharing hosting techniques, provides root access inside each VPS though with limited capabilities, faster context switching, and minimum modifications to kernel.
  • Octave Gtk : Octave-Gtk aims to bring a full-featured toolkit like GTK to GNU Octave, to make scientific computing tools with a GUI front-end scripted from Octave itself. This directly translates to end users as a new and powerful paradigm for constructing GUIs with minimal knowledge of the constructs and easy interface for scientific programs in a really short time. Octave-Gtk is an Octave Gtk binding, which helps you access the GTK C API from the GNU Octave, an interpreted language. This cross language interoperability is achieved by Octave-Gtk binding code, which enables type safe, idempotent access of functions and objects from the either domains of Octave and C in a clean, and transparent manner, hiding the details to the end user.
  • CrossTalk : CrossTalk is a VoIP client for Windows and Linux. It allows users to talk in real time (like telephone calls) and explores the exciting world of Internet Telephony. CrossTalk will implement a simple user interface allowing users to easily make calls anywhere. It will support the encryption of the data stream, low latnecy, use of minimum bandwidth and maybe multiple conferencing as well. I intend to develop CrossTalk clients for both Linux and Windows. All of the above 'abstracts' are what I intend to do, whether I can do it, or, maybe even better it is to be seen.
  • A Highly secure steganographic file system : Magikfs is a highly secure steganographic file system with the plausible deniability feature being implemented on the Linux platform particularly over an ext2/ext3 partition without structural modifications. The data which is already present in the file system is not harmed at all. Magikfs will be user friendly,easy to install and easy to give your files the highest kind of security ever offered by file systems that exists today.
  • Secure PC to PC communication : PC to PC communication is essential to share the information. The SENDTO command is used to directly send the message over the network. So, it is not possible to send the secured information. The sent message is not secured and can be mis utilized or eavesdropped. RSA is a public key algorithm. Public key cryptography uses 2 keys a "public key" and "private key" to implement an encryption algorithm that does not require two parties to first exchange a secret key in order to conduct secure communication. Two types of primitive are specified in this document, organized in pairs: encryption and decryption. An encryption primitive produces a cipher text representative from a message representative under the control of a public key, and a decryption primitive recovers the message representative from the cipher text representative under the control of the corresponding private key. Its security is based on difficulty of factorizing large integers.
  • Mobile based collaboration : Mobiles are increasingly becoming high-end not only in terms of hardware but also in operating system cum software support. Hence, there is much demand of feature packed applications which are as good or even better than their PC based counterparts. Our application, a mobile based collaboration tool, beautifully packs features from different popular domains (like google calendar, maps, drawing tools, text chat etc.) into an application which allows collaboration with other users. Online document creation made popular by Google Docs has paved the way for creation of similar utilities to make a users experience wholesome. The very idea of an amateur internet user being able to share and communicate with his/her friends, relatives and business associates over the internet through simplistic yet powerful tools is very encouraging. Realizing this fact, we have developed an application that can collaboratively share information and data among a group of trusted users. Our main utility is a drawing tool for Android (Open Source Mobile SDK) which is based on the principle that inherent capability of human beings to express and understand is much better in terms of colorful figures than through plain text. In addition to this, we have incorporated several other features in our application, Google calendar, maps, advanced drawing features, text chat etc., that allow users to use the shared information enhancing the overall experience.
Courtesy: Red hat scolarships
Thanking you,
Layman

Tuesday, June 30, 2009

Regular expressions quick list

One of the most essential things you need to know as a software engineer or as a software engineering student is regular expressions. This post will brief you about the basics of regular expressions, again remember this is just a layman's view and not a geeks perspective, so read this and move on.....

Metacharacters -------------------------Matches
  1. . -------------------------------Anything except newline character
  2. \n -----------------------------Newline
  3. * ------------------------------Zero or more copies of preceding expression
  4. + -----------------------------One or more occurrence of preceding expression
  5. ? ------------------------------Zero or one copies of preceding expression
  6. ^ -----------------------------Beginning of a line
  7. $ ------------------------------End of line
  8. a|b ---------------------------a or b
  9. (ab)+ -----------------------One or more copies of ab (grouping)
  10. "a+b" -----------------------Literal "a+b"
  11. [] ------------------------------Character class
Examples :-
  1. abc------------------------------------- abc
  2. abc* -----------------------------------ab abc abcc abcc abccc ......
  3. abc+ ----------------------------------abc abcc abccc.....
  4. a(bc)+ -------------------------------abc abcbc abcbcbc....
  5. a(bc)? --------------------------------a abc
  6. [abc] ----------------------------------one of:a,b,c
  7. [a-z] -----------------------------------any letter, a-z
  8. [a\-z] ----------------------------------one of : a,-,z
  9. [-az] ---------------------------------- one of : - , a, z
  10. [a-zA-Z0-9]+ ----------------------one or more alphanumeric characters
  11. [ \t\n]+ -------------------------------whitespace
  12. [^ab] ---------------------------------anything except : a,b
  13. [a^b] ---------------------------------one of : a, ^, b
  14. [a|b] ----------------------------------one of : a, | , b
  15. a|b ------------------------------------one of : a, b
  16. ^[a-z] --------------------------------line starting with a character from a-z
  17. [a-z]$ --------------------------------line ending with a character from a-z
  18. ^[a-z]$ -----------------------------line containing only a character from a-z
Thanking you,
Layman

Monday, June 29, 2009

Linux Commands: Quicklist

    1. Who…….whoami
    2. date
    3. use of ; to separate commands
    4. man
    5. ls………. –F folder /…………. –l line by line
    6. cat Types contents onto screen …… -n option for line no -b for line no excluding blank lines
    7. wc word count -l for line count -w for word count -c for character count
    8. cp copy source file to destination -i ask before overwrite

      -r option for copying directories ……multiple directories may also be specified

      More than one sources can be specified. Last one will be taken as destination

    1. mv ……. –i option is available for preventing the overwrite.

      mv can also be used along with directories(moving)

    1. rm removes the files
    2. rmdir removes a directory
    3. ls –l gives the permissions …………..

    1st 3 charas give owner permissions

    2nd 3 gives group permissions

    3rd 3 gives other privileges

    1. chmod

      expression is either symbolic or octal. In symbolic letters are used. In octal nos from 0-7 are used to set the permissions.

      In symbolic form the syntax is (who) (action) (permission)

Letter Represents

u Owner

g Group

o Other

a All

Symbol Represents

+ Adding permissions to the file

- Removing permission from the file

= Explicitly set the file permissions

Letter Represents

r Read

w Write

x Execute

eg. $ chmod guo+rx *

gives read and execute permissions to group, owner and others(all).More than one set of permissions can be set by separating by commas.

    Octal methods

Read permission value = 4; Write permission value=2; Execute permission value=1;

Adding the value of the required permission will give us a value b/w 0-7. We can use this no to specify the permission for the owner, group, and others in the respective order.

$ chmod 0777 *

Will mean to add rwx permissions to all

$chmod 0770 *

will add rwx permissions to owner and group members whereas others will be prevented from reading , writing or executing the file.

Here * specifies that the change of permission is applicable for all the files in the pwd (present working directory)

  1. chown option user : group files

    chown is used to change the ownership of a file or group of files.

    Option is basically all crap. Don’t worry over that.

    User will be a an existing user . This will specify the new owner of the file(s)

    Group files will be a path. It can be a single file or maybe a directory path.

    In recent systems chgrp is a new command available.

    For portability sake we better skip this option and stick on to chown.

    1. ps

    This command lists all the presently running processes. Each process has a unique id called process id or PID. It will be unique during a runtime. Its usually a 5 digit number. Max value of PID is 32767.

    Each process will have a PID and a Parent PID attached to it. Each process will have a parent process. We will note that most processes have the PPID as the PID of shell.

    With all commands we may run them in foreground or background. If we run smthing on foreground we cannot do any other process until it completes. So to save time we can simply put them on to the background. To run a command as a background process just add a ‘&’ sign at the end of the command. Eg. $ps &

    TIP : stty –a command will give you the list of all key combinations for various functions.

    1. bg %job number

    To move a foreground process to background we can use the ^Z key combination so that the process is suspended and then using the bg command. By default if no job number is specified bg command resumes the most recently suspended process to background.

    1. fg %job number

    To move a suspended process or maybe a background process to foreground we can use this command. Job number maybe specified. In case it’s not supplied, the shell takes it as the last job suspended.

    1. nohup

    This command can be prefixed with any other command. It suppresses the action of the hang up signal and thus causes a command to run even after the user logs out. This command is usually used to run commands in background. The results if not redirected will go to a file called nohup.out . Otherwise it can be suitably redirected

    Eg. $nohup ls & will redirect the op to nohup.out

    1. wait

    wait can be used to put the shell to wait state until background process(es) proceed to completion. The optional parameter maybe a process id , a process number prefixed by % sign , it maybe blank. In case the parameter is blank(absent) the shell waits until all background processes attain completion.

    1. jobs

    This command shows the list of all processes that the user has suspended and also the ones running in the background. In the op we can see – and + signs appear. The job holding + sign will be the most recent one.

    1. kill

    The word says it all. It can kill a process. The parameter maybe a PID or it may be the process no prefixed by the %sign. Actually the kill command sends a TERM signal to the process. The process may ignore this signal or otherwise it can perform an orderly shutdown. In case it ignores we can go for the kill -9 (or kill –KILL) command followed by the parameter so that it will be forcibly killed.

    1. exec

    This command replaces the current process with a new process specified in the command. The real mechanism is overlaying the new process on the older running one.

    Variables

    Scalar variables are available in unix. The variable assignment is fairly simple.

    Variable name=value

    The lexical rules for variable names are same as that of C. In case we need to get the value of a variable we need to prefix the variable name by a $ sign. When the variable name prefixed by the $ sign is parsed, the idea is to textually substitute for the $variable by its value . Use $ along with the variable name only when the value is needed and not anywhere else. When assigning a string better use quotes.

    So far we have seen how to use scalar variables. Now we shall move on to the more organized arrays. The concept of arrays is same as in C. Array initialization is fairly simple.

    Array name[index]=value

    Or

    Array name =(element1 element2 element3 …….elementn)

    Or

    Array name=([index1]=element1 [index5]=element5…….)

    Accessing an array element is a different matter. The syntax will be…

    {Arrayname[index]}

    Unlike C ,here the shell doesn’t keep track of all the positions in an array. Rather it keeps track of those elements that are assigned values. To access all elements in an array 2 options are available.

    {arrayname[*]}

    Or

    {arrayname[@]}

    If we assign a value to the arrayname, its equivalent of assigning to the 0th element in the array.

    We may set a variable as read-only by the following command

    readonly variablename

    Once a variable has been set as readonly, then we can’t overwrite it.

    Just like we declared variables we can unset or rather remove them from the computer’s memory. The command format is shown below. This is same for both arrays as well as scalar variables. But we cannot use this command format to remove variables that have been set readonly.

    unset variablename

    So far we have discussed about variables. Now we shall see the different types of variables.

    • Local variable : Those variables which we had used until now. These are user modifiable and can be set by the user. These are the variables that are currently available within the shell and are not available to other processes of the shell.
    • Environment variable : These are variables that are accessible to every child process of the shell. These are also user-modifiable. Only those environment variables that are required for the proper functioning of the child program is usually reserved.
    • Shell variable : These are special types of variables that are set by the shell (and none else) for the proper functioning of itself. They maybe of either local or environment type.

    Now we will see how to export a variable to the environment.

    export varname=value varname1=value1……….varnamen=valuen

    Substitutions

    There are basically 4 types of substitutions

    • Filename substitution or globbing
    • Value based variable substitution
    • Command substitution
    • Arithmetic substitution

    In globbing we basically use three wildcards to match filenames. They are *(which matches any no of any characters),? (which matches any 1 character), [character list] (which matches any 1 character from the character list). Rules that are to be used in the character list are same as those that we have learnt in Perl (minor differences yet exist .for eg negation is specified by ! and not by ^).

    In value based substitutions, a value is substituted or replaced for a parameter according to certain conditions. There are basically 4 forms shown below.

    ${parameter:-word} Will mean to substitute the word for the parameter in case the parameter is either unset or is null. Here its blind substitution of the word instead of the parameter as in case of a macro and it’s the word that matters, not its value.

    ${parameter:=value}

    ………………………………….

    Command substitution is yet another form. Its fairly simple. The basic idea is that the command itself gets replaced by the output of a command. The usual context of implementation is in the case of assigning an output to a variable.

    Variable=’command’

    Arithmetic substitution basically deals with an expression as shown below. Here all the basic operations like +,-,*,/ exists. Also its possible to change the precedence by using braces. Arithmetic is integer arithmetic. Hence floating point values maybe truncated.

    $((expression))

    Flow control

    If statement:

    If list1

    then

    list2

    elif list3

    then

    list4

    else

    list5

    fi

    if statement finds application in error handling. List1 can be any operation, it maybe a encode operation or maybe a file creation operation. In any case if the operation fails(ie it returns nonzero) list2 is not executed.

    There is yet another representation.

    if list1 ; then list2; elif list3 ;then list4; else list5; fi

    test is a command used in association with if statement. It returns 0 or 1 based on the expression it receives as argument. Syntax is shown below.

    test expression

    or

    [ expression ]

    Now we shall see the various kind of tests.

    • File tests
    • String comparisons
    • Numerical comparisons

    File tests are used for testing files. The syntax is shown below.

    [ option filename ]

    Some of the options are:

    -b if file exists and if it’s a block special file

    -c if the file exists and if it’s a character special file

    -d if the file exists and id it’s a directory

    -e if file exists

    -f if file exists and if it’s a regular file

    -r if file exists and it is readable

    -w if file exists and it is writable

    -x if file exists and its executable

    Now we shall take a look at test operations on strings. There are basically 2 things we would like to find out.

    • Check whether a string is empty
    • Check whether 2 strings are equal

    [ -z string ] checks whether a string is of zero length

    [ -n string ] checks whether a string is of nonzero length

    [ string1=string2 ] if the strings are equal returns 0

    [ string1 != string2 ] if the strings are unequal returns 0

    Test can be done on integers by the following format.

    [ int1 operator int2 ]

    Operator can be:

    -eq equal to

    -le less than or equal to

    -lt less than

    -gt greater than

    -ge greater than or equal to

Here I would like to mention a very important point.

When we try a command in the prompt we get the status of the command immediately. Errors will be immediately reported. Whereas in a shell script this won’t happen. So we must have some provision to check whether the last command was successful.

The exit status of the last command will be stored in a special variable $?. We can simply check this variable for the value 0 to make sure the last command was successful.

Continuing with our discussion, now we will see compound expressions. These include combining simple test expressions into a compound expression. The general form is shown below.

[ expression1 ] operator [ expression2 ]

Operator maybe

-a and

-o or

Its also possible to negate the outcome of an expression by using the following format

[ ! expression ]

Some important points need special mention. The spaces in the formats can’t be ignored. Also all these test expressions are used in association with if expression.

Case statement:

Switch case statements serve the same purpose as in C. The syntax is shown below.

case word in

pattern1)

list1

;;

pattern2)

list2

;;

esac

or

case word in

pattern1) list1 ;;

pattern2) list2 ;;

esac

Here the word is matched with each pattern provided. If it matches the corresponding list is executed.

Each pattern may include the wildcards.

Loops
while loop:

All of us know how a while loop works. The syntax is shown below.

while command

do

    list

done

or

while command ; do list ; done

Command is usually a test expression that we have already dealt with.

List is any set of commands.

It’s possible to nest loops as in any other language.

There exists yet another loop. The until loop runs the list as long as the command(test expression) is false. Syntax is shown below.

until command

do

list

done

for loop:

The for loop is a slightly different concept from the for loop we know. It repeats a set of commands for each item in a list after assigning the item in the list to a variable. This is not the exact functioning of the loop. But for simplicity’s sake we shall look at it that way. Whatever be the loop deals with repeating a set of statements for each value in a list specified. Syntax is shown below.

for name in word1 word2 word3……….wordn

do

list

done

or

for name in word1 word2 word3……wordn ; do list ; done

Here the loop assigns each word to the name and then performs the list for each word.

Select loop:

This loop finds application in menu driven programs. The syntax is shown below.

select name in word1 word2 word3……….wordn

do

list

done

Here the bash prints a numbered list in the order word1…….wordn. Then it waits for the user to input. The input from the user is expected to be a number preceding any item that was displayed. If it is then the corresponding word is stored into the name.

Loop control

There are basically 2 loop control instructions.

  • break This command will break the control out of the loop.
  • continue This command will move the control out of the loop for the current iteration.




Input and output

Output to the terminal can be done 2 ways.

  • echo
  • printf

The usage of printf is the same as that in C.

Output of a command or a list of commands can be redirected to a file using the redirection operator.

Syntax is shown below.

{ command1; command2;…..commandn; } > filename

If no such file exists a new file is created. If it exists the file content is overwritten.

Appending to a file needs another redirection operator. Syntax is shown below.

{ command1; command2; command3; …………commandn; } >> filename

Sometimes it is necessary to write the result of commands onto the screen as well as a file. In that case we can use the following format.

command |tee filename


In input redirection, a text file can be redirected as an input to a command. The syntax is shown below.

command <>

There is yet another possibility.

command <<>

input1

input2

.

.

Inputn

delimiter

Here the redirection operator reads the input from the user until the specified delimiter is met with again. It then feeds the output to the command as input.

Now we will learn how to read an input from the user. The syntax is very simple.

read variable

Pipelining is yet another concept available.

command1 | command2……………

Here the output of command1 is fed to the input of command2 and so on.

File handles

We can associate a number with each filehandle and then use it to do operations on files.

3 file descriptors are always open along with each and every command.

  • STDIN value 0
  • STDOUT value 1
  • STDERR value 2

We can associate a filehandle with a file using the following command.

'exec n> filename'

or

'exec n>>filename'

Here n is any integer. If n is 1 then STDOUT is explicitly redirected to the specified file. In the second form the file is open in append mode.

Now to redirect the output of a command to a file we can use the following format.

'command n> filename' (Append mode is also available)

Also to read the contents from a file to the standard input of a command the following syntax can be used.

command n<>

Here n is the filehandle.

We may redirect STDIN, STDOUT and STDERR to any of the file by simply choosing 0,1,2 as the values for n.

It’s possible to redirect the input of STDIN/STDOUT and STDERR to different files.

'command 1>file1 2>file2'

The /dev/null file requires special mention. Anything redirected to this file will be discarded.

Sometimes we may want to send both STDOUT and STDERR to the same file. In that case the syntax is shown below.

'command > filename 2>&1'

Reading input from a file

Now we shall see how to read input from a file line by line and then use it for text filtering.

while read variablename

do

…………..

…………..

…………

'done <>'


Now let’s move on to the most important section as far as Y! is concerned. Most questions are asked from the text filtering section.









Text Filtering


Head and tail commands

We can extract the first/last n lines of a file by using head and tail commands respectively

head [ -n ] filename

tail [ -n ] filename

n is any integer. Head will return the first n lines and tail returns the last n lines. In case no integer is specified a default of 10 lines will be returned.

Grep command

This is yet another powerful text filtering command. Syntax is shown below.

grep option word filename

Option maybe

-i for case independent match

-v for unmatched set to be returned.

-n for line no’s to be attached

-l for the list of files including the word.

transliterate(tr) command :

This is yet another powerful text filtering command. The function performed by tr is to transliterate one set into another. The syntax is shown below.

tr 'set1' 'set2'

Here the characters included in set1 in the file are searched and replaced by the ones in set2. In case the filename is not specified the shell waits for the input from the keyboard. We must be careful while using characters like [ & ]. These have a special meaning when used along with tr. Thus we must quote them (using backslash) suitably. To capitalize or to do vice versa we can use set1 as 'A-Z' and set2 as 'a-z' .

Sometimes you might feel that all characters of set1 are being mapped to a single character of set2. In fact some versions of tr behaves differently. So to be on the safer side its always better to use the same number of characters in set1 and set2 (just in case tr behaves differently).

There is yet another option available for transliterate. This is the SQUEEZE option. The syntax is shown below.

tr -s 'set1'

Here if the shell finds multiple consecutive occurrences of any character in set1 then it gets replaced by a single occurrence.

Sort and uniq commands :

Sort command as we know sorts each line of input. We can use the tr command to transliterate the each word in the text input to a new line. Then we can sort the lines.

Syntax is shown below.

sort filename for ascending order sort

sort -r filename can be used for reversing the result.
























Text filtering

Sed

Its the short for stream editor ,all the input we feed into it goes through it and finally reaches the STDOUT. It doesn't change the input file.

With each line of the input file specified it makes a copy of the line and then pattern matches the line with the pattern specified. If a match is found then the corresponding action is performed. Then it proceeds to the next pattern and repeats the same procedure. Since the matching is done on the copy of the line and not the real line the changes that are done on the line doesn't affect the real line.

Syntax is shown below.

sed [option] '/pattern1/' action1

'/pattern2/' action2

.......

...................... filenames

patterns are usually regular expressions. We are already familiar with regular expressions. The only major difference is that there is no + meta character here. Although we are already familiar with regular expressions I would like to add a few important points here.

[^chars] matches character set not specified by chars.

^[chars] matches character set that starts with any one specified in chars

[chars]$ matches character set that ends with a character specified in chars.

sed '/pattern/p' file prints the lines that can be matched by pattern

sed '/pattern/d' file prints the lines after deleting those lines that matches pattern

sed '/pattern/s/pattern1/pattern2/' file

First a list of all the lines that match pattern are created then in these lines those characters that match pattern1 are replaced by pattern2. Then the entire lines are printed.

sed 's/pattern1/pattern2/' file

here all those characters that match pattern1 in a line are substituted by pattern2.

But there is a problem with the last 2 cases. Only the first character set in a line that matches the pattern1 is substituted. All other patterns in the same line are not substituted. In case we need to replace all the character sets that match pattern1 then the following syntax is to be used.

sed 's/pattern1/pattern2/g' file

The following syntax replaces the 4th pattern match in a line by the substitute provided.

sed 's/pattern1/pattern2/4' file

Sometimes we may need to replace the a pattern by another only if it doesn't contain a particular pattern. In that case we can use the following syntax.

sed '/pattern1/!s/pattern2/pattern3/g'

Here sed replaces all pattern2 by pattern3 for all the lines that doesn't match pattern1.

Deleting leading white spaces

sed 's/^[ \t]*//g'

Deleting trailing white spaces

sed 's/[ \t]*$//g'


Sometimes it becomes necessary to reuse a value in an expression. Suppose that we need to append a $ sign in front of all the numbers(floating) that is in a textfile. Then we can use the following syntax.

sed 's/[0-9][0-9]*\.[0-9][0-9]*/\$&/'


Its also possible to redirect the output of the sed command to a file by the following pattern.

sed '/pattern/' action filename >destination filename

Sometimes there occur situations in which there is a need to do more than one substitutions. In that case we can use the following syntax to perform the substitution.

sed 's/pattern1/pattern2/g' -e 's/pattern3/pattern4/g' ......................

Here the shell searches for pattern1 and substitutes all occurrences of pattern1 by pattern2 also it does the same thing with pattern3 and 4.

So far we have dealt with the easy ones that sed can give you. Now we will consider pattern space as a buffer. Just imagine that there are 2 buffers pattern buffer that holds the pattern to be printed and hold buffer that holds the current line. With this idea in mind read along.

= To print the current line number

sed = filename

x To exchange contents of pattern and hold

sed '/pattern/x' filename

Initially hold has null and pattern has 1st line(that matches the pattern) then it exchanges their content and then prints the content of pattern. Next pattern gets next pattern matched. Hold has 1st pattern matched. They exchange the content and then prints content of pattern.

n or N To read the next line in to the pattern space

sed '/pattern/n' filename

Here whenever a pattern is matched the next line is read into the pattern buffer.

g or G Copy/append hold buffer to pattern buffer.

sed '/pattern/g' filename

h or H Copy/append to pattern buffer to hold buffer.

sed '/pattern/h' filename

w filename Write the current pattern buffer to file

sed '/pattern/w filename' filename1

r filename Append text read from file

sed '/pattern/r filename' filename1

Double spacing a line that matches pattern

sed '/pattern/G' filename

Double space a file that has already blank lines in it.

sed '/^$/d;G' filename here the blanklines are first removed and then doublespaced

Delete double spacing

sed 'n;d' filename

Insert a blank line above every line that matches a pattern

sed '/pattern/{x;p;x}' filename

Insert a blank line below every line that matches a pattern.

sed '/pattern/G' filename

Insert a blankline above and below a pattern.

sed '/pattern/{x;p;x;G}'

Inserting the line number at the beginning of the every line

sed = filename | sed 'N;s/\n/\t/'

The first part of the command will give us the line nos attached to the lines but the line nos and lines are on separate lines. Hence we append every line (along with \n to the pattern buffer which initially contains the line number. Then we substitute tab for newline.

Inserting line number along with lines but print the linenos only if line is not blank]

sed '/./=' filename | sed 'N;s/\n/\t/'

Count the number of lines in a file

sed -n '$=' filename


Thanking you,

Layman