New NP complete problems are always interesting because they broaden our conception of what is difficult to compute. Now we have a new result that unshuffling square strings is NP-Hard.
The idea of a shuffle is very simple. Take two strings u and v then w is a shuffle of u and v if there are a set of strings xi and yi such that
u =x1 x2 · · · xk
v = y1 y2 · · · yk
w = x1 y1 x2 y2 · · · xk yk
In other words, w is a sequential interleaving of substrings of v and w.
A string w is called a square if it is the shuffle of a string u with itself. That is, w can be created by shuffling two copies of u.
Creating squares is easy; just take a string and shuffle it with itself. Also notice that one string can be shuffled to in many different ways depending on how you split it up into substrings.
It might be easy to create a square, but it is much more difficult to solve the inverse problem of determining if a string is square. The problem of working out if a given string w can be expressed as a shuffle of a given u and v can be solved in polynomial time.
You can even solve the more general problem of whether w is the shuffle of k different strings in polynomial time - but only if k is fixed. If k is unspecified then the problem becomes NP-complete. Later this proof was extended to the case where the k strings are identical but until recently there was no clear answer for square strings i.e. for the case k = 2.
That is, can you find a polynomial algorithm for deciding in a string w can be constructed from the shuffle of an unspecified string u with itself?
The square problem seems to be easier than the general problem with k allowed to vary, but in fact a recent paper presents the result that it too is NP-complete, even if the alphabet used for the strings is finite but not too small. In fact the paper proves that the task is NP-complete for alphabets as small as 7 characters. It suggests that shuffles with as few as 3 symbols might be NP-complete, but we still need a proof of this.