## Thursday, March 21, 2013

### [leetcode] Palindrome Partition II

Problem is stated here .
It is easy to find out that it's a dynamic programming problem. Certainly we can use recursion to solve it, but we will see that there're a lot of repeated computations, and therefore recursion is inefficient in this kind of situation.
DP approach isn't difficult, a good and brief explanation can be found here.

Let dp[i] = min cut for substring [i...n], where n is the length of input string.
Then dp[i] = 0, if substring[i..n] itself is a palindrome
or dp[i] = 1+dp[j], if substring(i, j) is a palindrome ( i<j<n )

Here comes the interesting part, how do we know if substring(i, j) is a palindrome or not ?
If we check it directly by going though characters from i to j-1, it reveals a overall runtime of O(n^3) of this algorithm. The trick is using another DP:

substring(i, j) is palindrome if charAt(i) == charAt(j-1) && subtring(i+1,j-1) is palindrome.

So, this is a interesting mixed DP problem and can be solved in O(n^2).
Note that this post made a good point that it can be regarded as a variant of Matrix Chain Multiplication Problem. They gave a neat code for this problem using 2D dynamic programming algorithm which is very similar with the one to solve Matrix Chain Multiplication Problem. It makes perfect sense, but the time complexity is O(n^3).

As I talked it before, this problem can be solved in runtime of O(n^2). So what's the difference between it and Matrix Chain Multiplication Problem?  Well, by using anoter DP, we can find out if substring(i, j) is palindrome in O(1) time while we can not directly get the min multiplications of sub matrices(i,j) if we applied the same strategy to it.

Another related problem is problem 17.14 in the book Cracking the Coding Interview:
Given a text without spaces, e.g."helloworldniceday",
Find the minimal words it can be divided into. e.g. "hello world nice day" -> 4 words.
It occurs to me that this is another variant of Matrix Chain Multiplication Problem with 2-D DP at first, but actually it's much easier than I thought. 1-D DP is enough, just like the Palindrome Partition problem I talked first.
```//-1 means this text can't be divided to valid words.
public static int getMinSeperateWords(String text){
int len = text.length();
int[] dp = new int[len];
for(int i=len-1;i>=0;i--){
if(Dict.isValid(text.substring(i))){
dp[i]=1;
continue;
}
int min=Integer.MAX_VALUE;
for(int j=i+1;j<len;j++){
if(dp[j]!=-1 && Dict.isValid(text.substring(i,j))){
if(dp[j]+1<min)
min=dp[j]+1;
}
}
dp[i]= min==Integer.MAX_VALUE ? -1:min;
}
return dp[0];
}
```
In conclusion, problems can be related while they can also be harder or easier than the similar ones. Practice makes perfect.

## Wednesday, February 20, 2013

### [leetcode] Implement strStr() | String matching algorithms

Problem:
Returns a pointer to the first occurrence of needle in haystack, or null if needle is not part of haystack.
e.g. haystack=>"mississippi", needle=>"issip".  return "issippi"

This is not a difficult problem. A first thought of brute force algorithm is easy to come up with. However, it gets very tricky if we want to improve the efficiency.

Knuth-Morris-Pratt(KMP) Algorithm
See details at wiki and here
```public class StrStr {

public String strKMP(String haystack, String needle){
if(needle.length()==0)
return haystack;
//get next array
int[] next = generateNext(needle.toCharArray());
int len = 0;     // Number of characters matched

for( int i=0; i<haystack.length(); i++ ){
while( len>0 && needle.charAt(len)!= haystack.charAt(i) )
len = next[len];
if( needle.charAt(len) == haystack.charAt(i) )
len++;
if( len == needle.length() )
return haystack.substring(i-needle.length()+1);
}

return null;
}
/**
* KeyPoint. Get next array.
* @param needle
* @return
*/
private int[] generateNext(char[] needle){
//next array. next[i] means when match length = i, the common prefix and real suffix length = next[i]
int[] next = new int[needle.length+1];
//the longest common length of  prefix and real suffix.
int presufix = 0;
//next[1] = 0, starting from 2. i = match length.
for(int i=2;i<next.length;i++){
while(presufix>0 && needle[presufix]!=needle[i-1])//trickiest part
presufix=next[presufix];
if(needle[presufix]==needle[i-1])
presufix++;
next[i]=presufix;
}
return next;
}

}
```

Rolling Hash Algorithm
See details at wiki. An important application of rolling hash is Rabin-Karp Algorithm. While Rabin-Karp algorithm is a good choice for multiple pattern search , I borrowed the idea to solve this problem. Although it is slower than KMP algorithm in worst case in terms of time complexity, it is much easier to understand than KMP.
The hash function H is:

Where a is constant and c1..ck are input chars.
Here's a simple example of rolling hash.
Input: char array A[1...10]. Length k=5.
H[1..5] = A[1]*a^4 + A[2]*a^3 + A[3]*a^2 + A[4]*a^1 + A[5]*a^0
Then how can we compute H[2..6] ?
Instead of computing it from scratch, we can do as following based on H[1..5]
1. Remove first item of H[1..5] :  H[2..6] = H[1..5] - A[1]*a^4
2. Times constant a                  :  H[2..6] *= a
3. Add  last item A[6]              :  H[2..6] += A[6]

Based on this idea, I coded a very naive implementation to solve the problem. Note that to avoid manipulate huge H values, all maths are done by mod in reality.
```public class StrStrRollingHash {
public static final int A = 3;

public String strStr(String haystack, String needle) {
int n = haystack.length();
int m = needle.length();
if(m==0)
return haystack;
if(n<m)
return null;

char[] input = haystack.toCharArray();
int needleHash = getHash(needle.toCharArray(),0,m-1);
int haystackHash = getHash(input,0,m-1);

for(int i=0;i<n-m;i++){
if(haystackHash==needleHash){
if(haystack.substring(i, i+m).equals(needle))
return haystack.substring(i);
}
//rolling hash
haystackHash-=(int)(input[i]*Math.pow(A, m-1));//remove i th item
haystackHash*=A;//times A
}

if(haystackHash==needleHash){
if(haystack.substring(n-m, n).equals(needle))
return haystack.substring(n-m);
}

return null;

}

private static int getHash(char[] array,int start,int end){
int k=end-start;
int sum=0;
for(int i=start;i<=end;i++){
int item = (int)(array[i]*Math.pow(A,k--));
sum+=item;
}
return sum;
}
}

```

You can try this problem on leetcode.
There's another related Boyer-Moore Algorithm, I will study it when get free.

## Monday, February 18, 2013

### 交换使两个序列差最小| Two Equal-sized Subsets Partition Problem

1. 穷举法。

2. DP 算法。

```public class PartitionArray {

/**
* @param array:
*        assuming that the input array is the combination of two n-sized array.
* @return int :
*        minimal difference of the two n-sized subsets.
*/
public static int minimalSum (int[] array){
int total = sum(array);
int sum = total/2;
int n=array.length/2;

//dp[i][v] = true means that there's i elements added up to value v.
boolean[][] dp = new boolean[n+1][sum+1];
dp[0][0]=true;

for(int k=1;k<=2*n;k++){
int val = array[k-1];
for(int i=1;i<=k&&i<=n;i++){
for(int v=1;v<=sum;v++){
//try to take k as i th element
if(v>=val && dp[i-1][v-val])
dp[i][v]=true;
}
}
}
//find the answer. we need n-sized subset, so just check dp[n][*].
for(int i=sum;i>0;i--){
if(dp[n][i])
/**
* we find a subset with size=n and sum=i,another subset will have sum=total-i
* so the difference will be (total-i)-i=total-2*i
*/
}
return -1;
}

private static int sum(int[] array){
int sum=0;
for(int i:array)
sum+=i;
return sum;
}
}
```

## Tuesday, February 5, 2013

#### Talk is cheap. Show me the code.

--Linus Torvalds in Linux-kernel mail list

## Monday, February 4, 2013

### Notes on LCS, LIS, and related problems

LCS (Longest Common Subsequence) is a well-known dynamic programming(DP) problem. I learned it several times, but sadly I forgot it several times as well. These kinds of problems are not that straightforward for someone first encounters it. While I'm learning(to be more specific, re-learning) these related problems recently, I still haven't found any short path to master DP algorithms. Personally, I think completely understanding more and more problems will enlarge our knowledge to solve other similar problems in the future. And of course, coding is inevitable and essential.

Circus Tower Sorting Problem
There are many interesting problems related to LCS problem. For example:
A circus is designing a tower routine consisting of people standing atop one another’s shoulders. For practical and aesthetic reasons, each person must be both shorter and lighter than the person below him or her. Given the heights and weights of each person in the circus, write a method to compute the largest possible number of people in such a tower.
EXAMPLE:
Input (ht, wt): (65, 100) (70, 150) (56, 90) (75, 190) (60, 95) (68, 110)
Output: The longest tower is length 6 and includes from top to bottom: (56, 90) (60,95) (65,100) (68,110) (70,150) (75,190)

This problem can be easily converted to LCS problem and therefore solution is quite simple:
1. Sort the data according to height. Denote it as D1. That is O(nlogn).
2. Sort the data according to weight. Denote it as D2. That is O(nlogn).
3. Find the length of the LCS of D1 and D2. Using dynamic programming, it’s O(n^2).
Given a good understanding of LCS problem, this problem can be solved quickly, though it's tricky.
1. Sort the Sequence in increasing order. Denote it as S'. That is O(nlogn).
2. Find the length of the LCS of original sequence S and S'. Using dynamic programming, it’s O(n^2).
This reveals the overall O(n^2) time complexity solution. Another  O(n^2) DP solution is shown here. There's exists O(nlogn) solution to solve it and that's not hard either. They can be found on wiki and a good example is shown on a Chinese blog.

The boxes in this particular problem can be rotated and multiple instance of same box is allowed. This problem can be solved based on the above LIS problem.

There's another box stacking problem which is slightly different from the above one in Cracking the Coding Interview 5th Edition Chapter 9.10 :

You have a stack of n boxes, with width w(i), depth d(i), and height h(i). The boxes cannot be rotated and can only be stacked on top of one another if each box in the stack is strictly larger than the box above it in width, height, and depth. Implement a method to build the tallest stack possible, where the height of a stack is the sum of the heights of each box.

My initial thought is to adapt a similar approach as the Circus Tower problem. However, sorting the boxes in width, depth and height separately and find their LCS is kind of 3-Dimension DP, which is much more complicated. Actually I don't if it will truly work.

While answer in the book is taking the similar idea of solving LIS problem, I don't understand it well, and I don't like the code either. So I just put my idea here:

1. Sort the boxes on increasing order of height.
2. Let:
H(i) = Maximum possible Stack Height with box i at bottom of stack
H(i) = { Max ( H(j) ) + height(i) }
where j is in [1..i-1] and width(j) < width(i) and depth(j) < depth(i) and height(j) < height(i)
If there is no such j then H(i) = height(i).
3. Scan H(1) to H(n). Find the max as the answer.

I may put the code later.

## Tuesday, January 29, 2013

### [leetcode] Generate Parentheses

Problem:
Given n pairs of parentheses, write a function to generate all combinations of well-formed parentheses.
For example:
n=1: "()"
n=2: "()()","(())"
n=3: "((()))","()(())","()()()","(()())","(())()"
...
While this problem seems familiar to me, I didn't have good solutions at first glance. Then my first step was to observe more examples and tried to find some patterns.

Let [n] indicates all well-formed combinations of number n. Then it was not difficult to find out that we had the following patters:
[n] = { "(" + [n-1] + ")"  , [1][n-1], [2][n-2], ...., [n-2][2], [n-1][1] }
where [m][n] indicates the cartesian product of two sets.
e.g. [2][2] =  { "()()","(())" } x { "()()","(())" }
= { "()()()()","()()(())","(())()()","()()()()" }
As we can see, there will be duplicates of the product, so I use the Set container in java to avoid duplicates.

Then the main idea is to iterate from 1 to n. When calculate k, we can use the previous [1..k-1] results to generate the results of k.
```public class GenerateParentheses {
public ArrayList<String> generateParenthesis(int n) {

ArrayList<Set<String>> list = new ArrayList<Set<String>>();

int number=1;

while (number<=n){
Set<String> parentheses = new HashSet();
//generate combinations of pattern "("+[number-1]+")"
generateParentheses(number,list,parentheses);
//generate combinations of pattern [1][number-1],[2][number-2],...[number-1][1]
for(int i=1;i<number;i++){
int leftIndex=i-1;
int rightIndex=number-i-1;
for(String left:list.get(leftIndex)){
for(String right:list.get(rightIndex)){
}
}
}
//add results of this number and go on.
number++;
}
//add the elements to list required by OJ.
ArrayList<String> result = new ArrayList<String>();
Iterator<String> it = list.get(list.size()-1).iterator();
while(it.hasNext())
return result;
}

private void generateParentheses(int number,ArrayList<Set<String>> list,Set<String> parentheses){
if(number==1){
return;
}
StringBuilder sb = new StringBuilder();
for(String s:list.get(number-2)){
sb.append("(");
sb.append(s);
sb.append(")");
sb.setLength(0);
}
}
}```

I know...The code looks messy and kind of crazy with 4-nested loops.
After surfing online and discussed with others, I have a much better solution with the idea that is much easier to illustrate.

When the parameter is n, we are going to have n left parenthesis and n right parenthesis. We can add either "(" or ")" to current constructed string (initially empty string), but with some constrains:
1. No. of left parenthesis "(" much be less or equal to n.
2. Certainly, right parenthesis ")" need to be <= n, but also need to be <= no. of  "(".
Based this point, we can easily get a recursive approach :
```public class GenerateParentheis {
ArrayList<String> list;

public ArrayList<String> generateParenthesis(int n) {
list = new ArrayList<String>();
generateParenthesis("",0,0,n);
return list;
}

private void generateParenthesis(String s,int left,int right,int n){
if(left==n && right==n){
return;
}
if(left<n)
generateParenthesis(s+"(",left+1,right,n);
if(right<left)
generateParenthesis(s+")",left,right+1,n);
}
}
```

Even More
There're many problems that looks totally different at first glance, but are the same in nature.
The key in called Catalan Number. The n th Catalan number is 1/(n+1) * C(2n, n).
So when n=3, we can get Catalan number = 5, which is the total number of well-formed parenthesis combinations when n=3.

## Saturday, January 26, 2013

### Some Powerful Shell Commands

Original article at coolshell

• `!\$!\$是一个特殊的环境变量，它代表了上一个命令的最后一个字符串。如：你可能会这样：\$mkdir mydir\$mv mydir yourdir\$cd yourdir可以改成：\$mkdir mydir\$mv !\$ yourdir\$cd !\$`
• `sudo !!`
以root的身份执行上一条命令 。
场景举例：比如Ubuntu里用`apt-get`安装软件包的时候是需要root身份的，我们经常会忘记在`apt-get`前加`sudo`。每次不得不加上`sudo`再重新键入这行命令，这时可以很方便的用`sudo !!`完事。
（注：在shell下，有时候你会输入很长的命令，你可以使用!xxx来重复最近的一次命令，比如，你以前输入过，vi /where/the/file/is, 下次你可以使用 !vi 重得上次最近一次的vi命令。）
• `cd –`
回到上一次的目录 。
场景举例：当前目录为`/home/a`，用`cd ../b`切换到`/home/b`。这时可以通过反复执行`cd –`命令在`/home/a``/home/b`之间来回方便的切换。
• ‘ALT+.’ or ‘<ESC> .’
热建alt+. 或 esc+. 可以把上次命令行的参数给重复出来。
• `^old^new`
替换前一条命令里的部分字符串。
场景：`echo "wanderful"`，其实是想输出`echo "wonderful"`。只需要`^a^o`就行了，对很长的命令的错误拼写有很大的帮助。（也可以使用 !!:gs/old/new
• du -s * | sort -n | tail
列出当前目录里最大的10个文件。
• :w !sudo tee %
在vi中保存一个只有root可以写的文件
• date -d@1234567890
时间截转时间
• > file.txt
创建一个空文件，比touch短。
• mtr coolshell.cn
mtr命令比traceroute要好。
• 在命令行前加空格，该命令不会进入history里。
• echo “ls -l” | at midnight
在某个时间运行某个命令。
• curl -u user:pass -d status=”Tweeting from the shell” http://twitter.com/statuses/update.xml
• curl -u username –silent “https://mail.google.com/mail/feed/atom” | perl -ne ‘print “\t” if /<name>/; print “\$2\n” if /<(title|name)>(.*)<\/\1>/;’
检查你的gmail未读邮件
• ps aux | sort -nk +4 | tail
列出头十个最耗内存的进程
• `man ascii`
显示ascii码表。
场景：忘记ascii码表的时候还需要google么?尤其在天朝网络如此“顺畅”的情况下，就更麻烦在GWF多应用一次规则了，直接用本地的`man ascii`吧。
• `ctrl-x e`
快速启动你的默认编辑器（由变量\$EDITOR设置）。
• `netstat –tlnp`
列出本机进程监听的端口号。（netstat -anop 可以显示侦听在这个端口号的进程）
• `tail -f /path/to/file.log | sed '/^Finished: SUCCESS\$/ q'`
当file.log里出现Finished: SUCCESS时候就退出tail，这个命令用于实时监控并过滤log是否出现了某条记录。
• `ssh user@server bash < /path/to/local/script.sh`
在远程机器上运行一段脚本。这条命令最大的好处就是不用把脚本拷到远程机器上。
• ssh user@host cat /path/to/remotefile | diff /path/to/localfile -
比较一个远程文件和一个本地文件
远程关闭一台Windows的机器
• `screen -d -m -S some_name ping my_router`
后台运行一段不终止的程序，并可以随时查看它的状态。`-d -m`参数启动“分离”模式，`-S`指定了一个session的标识。可以通过`-R`命令来重新“挂载”一个标识的session。更多细节请参考screen用法 `man screen`
• `wget --random-wait -r -p -e robots=off -U mozilla http://www.example.com`
下载整个www.example.com网站。（注：别太过分，大部分网站都有防爬功能了：））
• `curl ifconfig.me`
当你的机器在内网的时候，可以通过这个命令查看外网的IP。
• convert input.png -gravity NorthWest -background transparent -extent 720×200  output.png
改一下图片的大小尺寸
• `lsof –i`
实时查看本机网络服务的活动状态。
• `python -m SimpleHTTPServer`
一句话实现一个HTTP服务，把当前目录设为HTTP服务目录，可以通过`http://localhost:8000`访问 这也许是这个星球上最简单的HTTP服务器的实现了。
• `history | awk '{CMD[\$2]++;count++;} END { for (a in CMD )print CMD[a] " " CMD[a]/count*100 "% " a }' | grep -v "./" | column -c3 -s " " -t | sort -nr | nl | head -n10`