A failure of peer review

LW

Fifteen minutes ago I realised that one of my published results is in error: I classified a computational problem into the wrong complexity class because I mishandled an exponent.

And that was in a conference paper that was first read by two professors and then anonymously reviewed by three more reviewers.

It was published in September 2001 and has been cited by two further articles (discounting a self-cite), and no one has noticed the error before, or at least no one has told me about it.

And now I'll have to spend the rest of the evening (it's 10 pm already) purging the result from my thesis draft.

Peer review doesn't necessarily prevent errors.
 
Crossbow said:
Since the paper in question was published two years ago, what caused you to notice the error now?

I'm in the process of fixing the last details of my intermediate thesis (between the Master's and the Doctoral one), which should be print-ready this Friday. One of these details is fleshing out the proof of that particular theorem, which was only sketched in the original paper. And it turned out first that the sketch didn't work as it should, which got me worried, and then, after trying to rescue the proof, I realised that it was wrong, wrong, and completely wrong.

(I had two different exponential constructions and it seemed that combining them would lead to a doubly exponential situation, but in reality it stays exponential.)

Luckily, the two other theorems that needed more flesh were correct.
 
LW said:


I'm in the process of fixing the last details of my intermediate thesis (between the Master's and the Doctoral one), which should be print-ready this Friday. One of these details is fleshing out the proof of that particular theorem, which was only sketched in the original paper. And it turned out first that the sketch didn't work as it should, which got me worried, and then, after trying to rescue the proof, I realised that it was wrong, wrong, and completely wrong.

(I had two different exponential constructions and it seemed that combining them would lead to a doubly exponential situation, but in reality it stays exponential.)

Luckily, the two other theorems that needed more flesh were correct.

Well, methinks that if you only sketched a proof, then the referee could at best be expected to judge whether the theorem "sounded reasonable" and whether the sketched proof "smelled right".

Before publication I often send proofs to colleagues for checking -- in a level of detail that'll never make it into the paper but that makes it easier on them. The hidden sort of peer review. I have fairly low expectations of journal referees (including myself)...
 
phildonnia said:
Out of curiosity; what was the problem, and what was its complexity?

Actually, right now I'm not certain of the complexity anymore, except that it ends up somewhere in the exponential hierarchy. The original proof was incorrect, but on second and third examination it is starting to seem that the result might be correct after all; it turns out I had forgotten one important step in a translation when I went through the proof this evening.

But anyway, the problem is placing several different classes of logic programs within the complexity hierarchy.

In practice I'm doing the hardness direction of the proofs with direct Turing machine translations: I construct a program that simulates a given Turing machine with some time bound. The actual simulation is easy; the difficult part is encoding the tape cells of the machine. In effect, I have to find ways to construct binary counters (to index the tape cells and time steps) using different language features.

For example, in one case an n-bit integer is encoded as an n-tuple <b_1, ..., b_n> of bits, each of which is then stored in one variable. In another case the integers are built by composing the function symbols t and f (for true and false). And now there is one complex case where my encoding didn't work, and I've spent the last three hours trying to find a new one.
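
To give a rough idea of the first encoding (a toy sketch only, not the construction from the paper), a 3-bit counter can be written as a successor relation over bit tuples:
Code:
% Toy sketch, not the paper's construction: a 3-bit counter as a
% successor relation over bit tuples <X3, X2, X1>, X1 being the
% least significant bit.
bit(0). bit(1).
succ3(X3, X2, 0, X3, X2, 1) :- bit(X3), bit(X2).
succ3(X3, 0, 1, X3, 1, 0) :- bit(X3).
succ3(0, 1, 1, 1, 0, 0).
% For example, succ3(0, 1, 1, 1, 0, 0) says that 011 + 1 = 100;
% the all-ones tuple 111 has no successor here.
In the second style the same counter values would instead show up as nested terms, roughly t(f(f(c))) for 100 over some base constant c.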

But as the last bus home leaves in 15 minutes, I think I'll let my brain rest until tomorrow.

[Edited to add: the nasty thing about the exponential hierarchy is that you play around with functions like 2^(2^(n^k)), and to tell the truth, my intuition doesn't work very well with figures that big.]
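
(For a sense of scale: already with n = 3 and k = 2 that is 2^(2^9) = 2^512, a number with more than 150 decimal digits.)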
 
LW said:
Peer review doesn't necessarily prevent errors.

It still beats the hell outta the old system.

Man - I think it was 'Blessed are the cheese-makers'.
Jewwife - Ah. What's so special about the cheese-makers?
Jew - Well obviously it's not meant to be taken literally, it refers to any manufacturers of dairy products.
 
LW said:
Peer review doesn't necessarily prevent errors.

Agreed. But I would have said it doesn't necessarily "catch" errors (i.e., you were the source of the error, not the reviewers). And always remember that the level of interest and/or controversy any particular paper generates results in a proportionate increase in the level of scrutiny. Thus, we can conclude that your paper was either boring or irrelevant. :D

(Just kidding. ;) And good job with the self-review and subsequent admission of error. Integrity is the backbone of bona fide research.)

-TT
 
Is it feasible to test this stuff by running it on an actual computer?

I am reminded of a quote from Knuth (at the bottom of his FAQ): "Beware of bugs in the above code; I have only proved it correct, not tried it."
 
LW said:


I'm in the process of fixing the last details of my intermediate thesis (between the Master's and the Doctoral one), which should be print-ready this Friday. One of these details is fleshing out the proof of that particular theorem, which was only sketched in the original paper. And it turned out first that the sketch didn't work as it should, which got me worried, and then, after trying to rescue the proof, I realised that it was wrong, wrong, and completely wrong.

(I had two different exponential constructions and it seemed that combining them would lead to a doubly exponential situation, but in reality it stays exponential.)

Luckily, the two other theorems that needed more flesh were correct.

Well, then I guess I fail to see what the real problem is.

The way I see it: you made a math error, and no one else noticed it either, but now you have noticed it and will make the appropriate corrections.

This sounds like good, sound science in practice.

It would have been nice if someone had noticed the error before publication, but alas, no one is perfect.
 
69dodge said:
Is it feasible to test this stuff by running it on an actual computer?

I am reminded of a quote from Knuth (at the bottom of his FAQ): "Beware of bugs in the above code; I have only proved it correct, not tried it."

Well, the problem in this particular case is that it lies so high in the complexity hierarchy that almost any example with the worst-case behaviour will cause the computer to run out of memory (or run for a loooong time if memory use is restricted).

For example, I have a test program that consists of six logic program rules. The total length of the program is 180 ASCII characters. It would fit into three standard-width lines on a computer screen.

When I instantiate the rules (that is, replace the variables in the rules with all (relevant) possible combinations of values), the resulting variable-free program has 4115 rules and consists of a little more than 400,000 characters.

The non-ground program has only three variables. If I add a fourth one, it adds only 18 ASCII characters to the original program, but then there will be approximately 32^4 = 1,048,576 ground instances, and adding a fifth will increase that to 64^5 = 1,073,741,824.
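
To make the blow-up concrete, here is a hypothetical example (not my actual test program): a single rule with four variables over a domain of 32 constants already grounds to 32^4 instances.
Code:
% Hypothetical example, not the actual test program:
% a domain d/1 of 32 constants ...
d(1). d(2).  % ... up to d(32).
% ... and one rule with four variables over that domain.
p(X1, X2, X3, X4) :- d(X1), d(X2), d(X3), d(X4).
% Grounding replaces X1..X4 with every combination of the 32
% constants, giving 32^4 = 1,048,576 variable-free copies of
% this single rule.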

Luckily, it turns out that these extreme cases happen only when you really want to create an insanely big ground instantiation of the program, and in practical applications the size of the instantiation is much, much smaller.

Of course, someone may now wonder why I am instantiating programs at all. The reason is that there exist very good algorithms for handling logic programs that don't have any variables in them, while those that work on programs with variables are much slower. So it is computationally more efficient to first remove all the variables from a program before trying to find its models.
 
Isn't instantiating programs with different values just transferring the variables to the operating system shell? Or do I have no idea what you're talking about?

It reminds me of a "file compression" method that could always decrease the file size. It did so by encoding information in the file name.
 
teddygrahams said:
Isn't instantiating programs with different values just transferring the variables to the operating system shell? Or do I have no idea what you're talking about?

OK, a little example. A logic program looks something like this:
Code:
a(0). a(1).
b(0). b(1).
c(X, Y) :- a(X), b(Y).
In effect, the first two lines say that the atoms 'a(0)' and 'a(1)', as well as 'b(0)' and 'b(1)', are true. The third line is a rule stating that 'c(X, Y)' is true whenever 'a(X)' and 'b(Y)' are. (It corresponds to a classical implication.)

The meaning of this program is the smallest set of ground atoms that makes all the rules true; that is, if the body of a rule (the 'a(X), b(Y)' part) is true, then the head ('c(X, Y)') has to be true as well.

One way to handle programs like these is to remove all variables from them by replacing the rules that have variables with their ground instances. So, the instantiated version of the above program would be:
Code:
a(0). a(1).
b(0). b(1).
c(0, 0) :- a(0), b(0).
c(0, 1) :- a(0), b(1).
c(1, 0) :- a(1), b(0).
c(1, 1) :- a(1), b(1).
Here, the meaning of the program would be the set { a(0), a(1), b(0), b(1), c(0, 0), c(0, 1), c(1, 0), c(1, 1) }.

This is only a trivial example to show the intuition behind instantiation.

The main reason to do instantiation instead of handling programs with variables directly is that for many useful logic programming semantics (stable model, well-founded, disjunctive stable model, perfect model, ...) there exist efficient algorithms for handling variable-free programs, while algorithms for programs with variables have terrible performance. So it is (in practice) cost-effective to remove the variables before trying to find models of the program.
 
