The Halting Problem is a terrible example of NP-Harder
It's a justifiable copout, but it's still a copout.
Short one this time because I have a lot going on this week.
In computation complexity, NP is the class of all decision problems (yes/no) where a potential proof (or "witness") for "yes" can be verified in polynomial time. For example, "does this set of numbers have a subset that sums to zero" is in NP. If the answer is "yes", you can prove it by presenting a set of numbers. We would then verify the witness by 1) checking that all the numbers are present in the set (~linear time) and 2) adding up all the numbers (also linear).
NP-complete is the class of "hardest possible" NP problems. Subset sum is NP-complete. NP-hard is the set all problems at least as hard as NP-complete. Notably, NP-hard is not a subset of NP, as it contains problems that are harder than NP-complete. A natural question to ask is "like what?" And the canonical example of "NP-harder" is the halting problem (HALT): does program P halt on input C? As the argument goes, it's undecidable, so obviously not in NP.
I think this is a bad example for two reasons:
All NP requires is that witnesses for "yes" can be verified in polynomial time. It does not require anything for the "no" case! And even though HP is undecidable, there is a decidable way to verify a "yes": let the witness be "it halts in N steps", then run the program for that many steps and see if it halted by then. To prove HALT is not in NP, you have to show that this verification process grows faster than polynomially. It does (as busy beaver is uncomputable), but this all makes the example needlessly confusing.1
"What's bigger than a dog? THE MOON"
Really (2) bothers me a lot more than (1) because it's just so inelegant. It suggests that NP-complete is the upper bound of "solvable" problems, and after that you're in full-on undecidability. I'd rather show intuitive problems that are harder than NP but not that much harder.
But in looking for a "slightly harder" problem, I ran into an, ah, problem. It seems like the next-hardest class would be EXPTIME, except we don't know for sure that NP != EXPTIME. We know for sure that NP != NEXPTIME, but NEXPTIME doesn't have any intuitive, easily explainable problems. Most "definitely harder than NP" problems require a nontrivial background in theoretical computer science or mathematics to understand.
There is one problem, though, that I find easily explainable. Place a token at the bottom left corner of a grid that extends infinitely up and right, call that point (0, 0). You're given list of valid displacement moves for the token, like (+1, +0)
, (-20, +13)
, (-5, -6)
, etc, and a target point like (700, 1)
. You may make any sequence of moves in any order, as long as no move ever puts the token off the grid. Does any sequence of moves bring you to the target?
This is PSPACE-complete, I think, which still isn't proven to be harder than NP-complete (though it's widely believed). But what if you increase the number of dimensions of the grid? Past a certain number of dimensions the problem jumps to being EXPSPACE-complete, and then TOWER-complete (grows tetrationally), and then it keeps going. Some point might recognize this as looking a lot like the Ackermann function, and in fact this problem is ACKERMANN-complete on the number of available dimensions.
A friend wrote a Quanta article about the whole mess, you should read it.
This problem is ludicrously bigger than NP ("Chicago" instead of "The Moon"), but at least it's clearly decidable, easily explainable, and definitely not in NP.
It's less confusing if you're taught the alternate (and original!) definition of NP, "the class of problems solvable in polynomial time by a nondeterministic Turing machine". Then HALT can't be in NP because otherwise runtime would be bounded by an exponential function. ↩
If you're reading this on the web, you can subscribe here. Updates are once a week. My main website is here.
My new book, Logic for Programmers, is now in early access! Get it here.
I'm maybe dense but I think I miss something about the example? You have a list of pairs - surely you can verify in polynomial time that: a. this is a subset of original set: O(n log n) (sort pairs, then linear scan) b. It reaches target point: O(n) -(just add the coordinates and check they never get negative along the way)
what am I missing? Why is this not NP?
You're allowed to reuse moves: So for goal
(0, 4)
and the displacement vectors{a = (1, 0), b = (-3, 1)}
, the solution isaaabaaabaaabaaab
— 16 steps total. The analogous problem for three dimensions would have a 64 step witness, for four dimensions would have a 128 step witness, etc.