## Posts Tagged ‘**subsequence**’

## Longest Common Increasing Subsequence (LCIS)

**Given 2 sequences of integers, we need to find a longest sub-sequence which is common to both the sequences, and the numbers of such a sub-sequence are in strictly increasing order.**

For example,

2 3 1 6 5 4 6

1 3 5 6

the LCIS is 3 5 6.

The sequence *a*_{1}, *a*_{2}, …, *a*_{n} is called increasing, if *a*_{i}<*a*_{i+1} for *i*<*n*.The sequence *s*_{1}, *s*_{2}, …, *s*_{k} is called the subsequence of the sequence *a*_{1}, *a*_{2}, …, *a*_{n}, if there exist such a set of indexes 1 ≤ *i*_{1} < *i*_{2} < … < *i*_{k} ≤ *n* that *a*_{ij} = *s*_{j}. In other words, the sequence *s* can be derived from the sequence *a* by crossing out some elements.

A nice tutorial on the algorithm is given on CodeChef blog here. If you read the blog you can see that instead of looking for LCIS in a candidate matrix, we can keep an array that stores the length of LCIS ending at that particular element. Also we keep a lookup previous array that gives the index of the previous element of the LCIS, which is used to reconstruct the LCIS.

For every element in the two arrays

1> If they are equal we check whether the LCIS formed will be bigger than any previous such LCIS ending at that element. If yes we change the data.

2> If the jth element of array B is smaller than ith element of array A, we check whether it has a LCIS greater than current LCIS length, if yes we store it as previous value and it’s LCIS length as current LCIS length.

Here’s a C++ code.

#include<iostream> using namespace std; #include<vector> void LCIS(vector<int> A, vector<int> B) { int N=A.size(),M=B.size(),i,j; vector<int> C(M,0); vector<int> prev(M,0); vector<int> res; for (i=0;i<N;i++) { int cur=0,last=-1; for (j=0;j<M;j++) { if (A[i]==B[j] && cur+1>C[j]) { C[j]=cur+1; prev[j]=last; } if (B[j]<A[i] && cur<C[j]) { cur=C[j]; last=j; } } } int length=0,index=-1; for (i=0;i<M;i++) if (C[i]>length) { length=C[i]; index=i; } printf("The length of LCIS is %d\n",length); if (length>0) { printf("The LCIS is \n"); while (index!=-1) { res.push_back(B[index]); index=prev[index]; } reverse(res.begin(),res.end()); for (i=0;i<length;i++) printf("%d%s",res[i],i==length-1?"\n":" "); } } int main() { int n,m,i; scanf ("%d", &n); vector<int> A(n,0); for (i = 0; i < n; i++) scanf ("%d", &A[i]); scanf ("%d", &m); vector<int> B(m,0); for (i = 0; i < m; i++) scanf ("%d", &B[i]); LCIS(A,B); }

NJOY!

-fR0DDY

## Subsequence Frequency

The question is to find the number of times a string appears as a subsequence in an another string. A similar question was asked in Google Code Jam 09. Here is the link.

So how do we do it. If you think the general way the first brute force method which comes to mind is to get a matrix for the general subsequence problem and then backtrack all paths that lead to the subsequence. But the backtracking will be too time consuming. Now the second thought which comes to mind is can’t we have a similar dynamic programming approach to find the frequency rather than length. Yes, thankfully we can do that. How?

First lets start with a string of two letters. Suppose we need to find the occurence of ab in abbacb. We can calculate the answer to be four. How did we do it, there are three b’s after first ‘a’ and only one after the second ‘a’, all in all four. So if somehow we manipulate the table to deal with the frequency of characters we can take out the total numbers of times the string appears as subsequence. If you wan’t to try out the algorithm for yourself, now is the right time. What follows next is the algorithm.

We have the table for the above example as :

a | b | b | a | c | b | ||

0 | 1 | 1 | 1 | 1 | 1 | 1 | |

a | 0 | 1 | 1 | 1 | 2 | 2 | 2 |

b | 0 | 0 | 1 | 2 | 2 | 2 | 4 |

As seen above we set the row[0] values to 1 and column[0] values to 0. Here’s the algorithm.

Let’s call the string whose frequency is to be found as str1 and given string str2.

for i from 1 to length(str1) { for j from 1 to length(str2) { if (str1[i-1]==str2[j-1]) table[i][j]=table[i-1][j]+table[i][j-1]; else table[i][j]=table[i][j-1]; } }

Here’s a C++ code for the Code Jam Question :

#include using namespace std; char str[1000],cj[]="welcome to code jam"; int M[30][1000]={0}; int main() { int t,i,j,k,l,lcj=strlen(cj);; scanf("%d\n",&t); for (j=1;j<=1000;j++) M[0][j]=1; for (j=1;j<=lcj;j++) M[j][0]=0; for (k=1;k<=t;k++) { gets(str); l=strlen(str); for (i=1;i<=lcj;i++) { M[i][0]=0; for (j=1;j<=l;j++) if (cj[i-1]==str[j-1]) M[i][j]=(M[i-1][j]+M[i][j-1])%10000; else M[i][j]= M[i][j-1]; } printf("Case #%d: %04d\n",k,M[lcj][l]); } }

Happy Coding!

-fR0D

## Longest Increasing Subsequence (LIS)

The longest increasing subsequence problem is to find a subsequence of a given sequence in which the subsequence elements are in sorted order, lowest to highest, and in which the subsequence is as long as possible. This subsequence is not necessarily contiguous.

For example for the sequence -7 10 9 2 3 8 8 1 the longest (strictly) increasing subsequence is -7 2 3 8 of length 4.

There are few methods to solve this question. The very first method is to sort the given sequence and save it into another array and then take out the longest common subsequence of the two arrays. This method has a complexity of O(*n*^{2}) which should be good for most question but not all.

We will see here an algorithm which take O(n log k) time where k is the size of the LIS. For explanation on this algorithm see Faster Algorithm on this page. Here’s a small code to calculate only the length of the LIS.

#include<iostream> #include<set> #include<vector> using namespace std; int LIS(vector<int> A) { int N = A.size(),i; set<int> s; set<int>::iterator k; for (i=0;i<N;i++) { if (s.insert(A[i]).second) { k = s.find(A[i]); k++; if (k!=s.end()) s.erase(k); } } return s.size(); }[/sourcecode] To also get the LIS we need to maintain a previous array which stores the index of the previous element in the LIS. Here's a C++ implementation. It returns the LIS as an array. [sourcecode language='cpp']#include<iostream> #include<map> #include<vector> using namespace std; typedef pair < int , int > PI; vector<int> LIS(vector<int> A) { int N = A.size(),i,j=-1,t; vector<int> pre(N,-1),res; map<int,int> m; map<int,int>::iterator k,l; for (i=0;i<N;i++) { if (m.insert(PI(A[i],i)).second) { k = m.find(A[i]); l = k; k++; if (l==m.begin()) pre[i]=-1; else { l--; pre[i]=l->second; } if (k!=m.end()) m.erase(k); } } k=m.end(); k--; j = k->second; while (j!=-1) { res.push_back(A[j]); j = pre[j]; } reverse (res.begin(),res.end()); return res; }

Note that if there are more than one LIS the above code prints the last one which occured in the input array. Also to get the LDS we just need to change the lines :

set<int> s; to set<int,greater<int> > s; and map<int,int> m; to map<int,int,greater<int> > m;

Also little changes need to be made to the above codes if you dont want the LIS to be strictly increasing and rather be Longest Non-Decreasing Subsequence or Longest Non-Increasing Subsequence.

Happy Coding!

-fR0D

## Longest Common Subsequence (LCS)

The longest common subsequence (LCS) problem is to find the longest subsequence common to all sequences in a set of sequences (often just two).

First we look into only finding the length of LCS. We can easily construct an exponential time recursive algorithm to compute the length of the LCS. But using Dynamic Programming (DP) to compute the solution bottom up the same job can be done in O(mn) time where m and n are the lengths of the subsequences. Here is a C++ code to do so. Note that the code also uses very less space.

int LCS(string X,string Y) { if (Y.length() > X.length()) swap(X,Y); int m = X.length(),n=Y.length(); vector< vector<int> > c(2, vector<int>(n+1,0)); int i,j; for (i=1;i<=m;i++) { for (j=1;j<=n;j++) { if (X[i-1]==Y[j-1]) c[1][j]=c[0][j-1]+1; else c[1][j]=max(c[1][j-1],c[0][j]); } for (j=1;j<=n;j++) c[0][j]=c[1][j]; } return (c[1][n]); }

If we also wish to print the subsequence we use a space of size m x n to get the LCS. Here’s the code

string X,Y; vector< vector<int> > c(101, vector<int>(101,0)); int m,n,ctr; void LCS() { m = X.length(),n=Y.length(); int i,j; for (i=0;i<=m;i++) for (j=0;j<=n;j++) c[i][j]=0; for (i=1;i<=m;i++) for (j=1;j<=n;j++) { if (X[i-1]==Y[j-1]) c[i][j]=c[i-1][j-1]+1; else c[i][j]=max(c[i][j-1],c[i-1][j]); } } void printLCS(int i,int j) { if (i==0 || j==0) return; if (X[i-1]==Y[j-1]) { printLCS(i-1,j-1); cout<<X[i-1]; } else if (c[i][j]==c[i-1][j]) printLCS(i-1,j); else printLCS(i,j-1); } int main() { while(cin>>X>>Y) { LCS(); printLCS(m,n); cout<<endl ; } }

This post is a prequel to posts on similar topics that i wish to write.

NJOY

-fR0D