COME ON CODE ON

A blog about programming and more programming.

Longest Common Subsequence (LCS)

with 20 comments

The longest common subsequence (LCS) problem is to find the longest subsequence common to all sequences in a set of sequences (often just two).

First we look into only finding the length of LCS. We can easily construct an exponential time recursive algorithm to compute the length of the LCS. But using Dynamic Programming (DP) to compute the solution bottom up the same job can be done in O(mn) time where m and n are the lengths of the subsequences. Here is a C++ code to do so. Note that the code also uses very less space.

int LCS(string X,string Y)
{
     if (Y.length() > X.length())
        swap(X,Y);
     int m = X.length(),n=Y.length();
     vector< vector<int> > c(2, vector<int>(n+1,0));
     int i,j;
     for (i=1;i<=m;i++)
     {
         for (j=1;j<=n;j++)
         {
             if (X[i-1]==Y[j-1])
                c[1][j]=c[0][j-1]+1;
             else
                 c[1][j]=max(c[1][j-1],c[0][j]);
         }
    	 for (j=1;j<=n;j++)
    		 c[0][j]=c[1][j];
     }
     return (c[1][n]);
}

If we also wish to print the subsequence we use a space of size m x n to get the LCS. Here’s the code

string X,Y;
vector< vector<int> > c(101, vector<int>(101,0));
int m,n,ctr;

void LCS()
{
     m = X.length(),n=Y.length();

     int i,j;
     for (i=0;i<=m;i++)
         for (j=0;j<=n;j++)
             c[i][j]=0;

     for (i=1;i<=m;i++)
         for (j=1;j<=n;j++)
         {
             if (X[i-1]==Y[j-1])
                c[i][j]=c[i-1][j-1]+1;
             else
                 c[i][j]=max(c[i][j-1],c[i-1][j]);
         }
}

void printLCS(int i,int j)
{
    if (i==0 || j==0)
       return;
    if (X[i-1]==Y[j-1])
    {
       printLCS(i-1,j-1);
       cout<<X[i-1];
    }
    else if (c[i][j]==c[i-1][j])
         printLCS(i-1,j);
    else
        printLCS(i,j-1);
}

int main()
{
    while(cin>>X>>Y)
    {
        LCS();
        printLCS(m,n);
        cout<<endl ;
    }
}

This post is a prequel to posts on similar topics that i wish to write.

NJOY
-fR0D

About these ads

Written by fR0DDY

August 7, 2009 at 5:29 PM

Posted in Programming

Tagged with , , , , , , ,

20 Responses

Subscribe to comments with RSS.

  1. Can you find all LCS?

    I mean if there are some long strings.
    How to return the number of all the longest strings.

    bobi

    March 4, 2010 at 5:38 PM

  2. thank u so much for this set of codes.

    deepika

    April 6, 2011 at 8:22 PM

  3. Hi,

    I was testing this code with the inputs:
    X = CABC
    Y = BCAB

    and it printed the wrong sequence. It printed ABC. Do you know why this happened and how to fix it??? Thanks!

    M

    May 20, 2011 at 5:43 AM

    • Hi,

      Yes there was a bug in the program. I have corrected it now.

      fR0DDY

      May 20, 2011 at 10:05 AM

  4. I’d recommend losing those global variables… not the best way of going about things. It’s easy for a small program like this.

    Rishi B

    November 1, 2011 at 3:27 AM

  5. how to write a simpler code .. i don’t know about template .. plz help me

    yogesh

    November 27, 2011 at 8:50 PM

  6. gaurav how to improve programming skill

    divyanshu kumar

    April 5, 2012 at 3:54 PM

    • Practice. Learn. Practice.

      fR0DDY

      April 6, 2012 at 10:32 PM

      • I don’t know more about lcs algorithm.But i’m asking,how to implement lcs algorithm under OpenMp API.If you have some code.publish it.

        nishant kumar

        August 13, 2012 at 2:44 AM

  7. How to use this code for substrings? Example for minimal substrings of length 3: ABBBCDE and ACCDDBBB returns BBB !

    johnny

    May 2, 2012 at 5:27 PM

  8. I meant how to get indexes of the substring in first string which is the same as the substring in other string..
    Example : AAAABBBCCC and BBBCCCTTTT . REsult : 4 6 , 0 2 (BBB) and 7 9, 3 5 (CCC) ?

    johnny

    May 2, 2012 at 6:44 PM

    • This is completely different question. But it can be solved in linear time using suffix trees. May be I will write a post about it sometime.

      fR0DDY

      May 7, 2012 at 8:41 PM

      • can you help me to do it? suffix tree? What is that? I use it for large inputs and **L is unusable…

        johnny

        May 7, 2012 at 8:43 PM

  9. http://pastebin.com/bnWSxFXj i have suffix tree, and i need to change it just like i m described above.. Can you help me?

    johnny

    May 8, 2012 at 1:10 AM

    • You are directly printing the LCS. What if i need to store the LCS in a variable? I tried a few modifications but wasn’t able to do it..

      Magnum

      June 13, 2012 at 2:02 PM

  10. When there is 3 or more string how can i compute LCS..Can you help me to do that ????thanks in advance ..

    ioi

    July 22, 2012 at 6:15 PM

  11. this code is not displaying output can u give exact code both to find optimal value, solution of lcs

    vgs

    November 10, 2012 at 12:30 AM

  12. if size of input sequences are of order 10^5 then 2-D array will not be created and method fails ..then what to do,,,and also is there any algo faster than this????

    raunak talwar

    March 3, 2013 at 3:55 AM

  13. 1:the quick brown fox
    2:the fast brown dogs

    Does not print any sequence though the length of LCS is 12

    Farhad

    March 4, 2014 at 11:01 AM


Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.

Join 244 other followers

%d bloggers like this: