Largest ever miscarriage of justice?

So the notion of a bug free system is widely understood to be nonsense.


There are mathematical ways of proving a programme to be without errors, but these are time-consuming and expensive, so I doubt if they have ever been used outside universities. And they would still be subject to the kind of problems that Smartcooky outlines.
 
Is a bug a bit of programming that was faulty, causing an unexpected result? Basically, human error causes computer error.

Not always human error. Complex interactions within and between opaque systems can trigger bugs that are not apparent or able to be anticipated by the programming team. Some interactions cannot effectively be tested except under live conditions - no human error involved, just the impossibility of predicting and accounting for every possible interaction.

Sometimes they're not even bugs at all, just conditions nobody could have anticipated causing complex interactions between simple rules.

Humans have long since reached the point where we can collectively develop systems more complex than any one of us can fully comprehend or hold in our mind at one time. Sometimes it takes a whole team of experts days or weeks of investigation to figure out why this system intermittently doesn't talk to that system. I've been on such teams. The detective work is fun, the opacity of the problem is frustrating, and the pressure to figure out what's going on and fix it is stressful. And the outcome of the investigation is never that a programmer made a human error that they should have found and fixed before they committed their code or released it to production.

Sometimes there's no-one to blame. Sometimes imperfections arise simply from the fact that we're doing really complicated things at really large scales, to satisfy our human desires. Not human error, humanity error.
 
So the notion of a bug free system is widely understood to be nonsense.

With respect to smartcooky, I'd say no, it's not widely understood to be nonsense.

I might say it's narrowly understood to be nonsense, within the software industry. But I think it's more accurate to say it's not a notion that is entertained at all within the industry. Nobody on my side of the fence even considers the possibility of a bug-free (complex) system long enough to dismiss it as nonsense.

And I suppose outside the industry, among lay persons, it's probably widely assumed to be a sensible expectation for complex computer systems.
 
There are mathematical ways of proving a programme to be without errors, but these are time-consuming and expensive, so I doubt if they have ever been used outside universities. And they would still be subject to the kind of problems that Smartcooky outlines.

I've used program proofs in safety-critical software. The trouble is that the proof often takes longer to write than the code, it's tedious in the extreme, and people often get the proofs wrong.

When I was involved, systems aimed at automating programme proofs were just beginning to show up. Of course these systems were complex software and not necessarily bug free.
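For anyone who hasn't met them: a program proof attaches assertions to the code and shows that each statement turns its precondition into its postcondition. A toy example of my own (not from any real safety case), written as a Hoare triple:

```latex
% Toy Hoare triple: one assignment that clamps a seconds field.
% Precondition: n is a non-negative integer.
% Postcondition: s is a valid seconds value.
\{\, n \ge 0 \,\} \quad s := n \bmod 60 \quad \{\, 0 \le s < 60 \,\}
% The proof obligation is the implication
%   n >= 0  =>  0 <= (n mod 60) < 60
```

Multiply that by every statement and every procedure in a safety-critical system and it's easy to see why the proof takes longer than the code, and why automating it was so attractive.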
 
The stupid was in evidence on the stand yesterday; the evil remains to be proven. The buck-passing strongly suggests it.

This was a great line by Nick Wallis:

"Parker’s motivations for taking the job of Post Office chair were not properly explored. He told the inquiry he knew he was walking into a business “in deep crisis” and he showed some caution before agreeing to do so – requesting a good look at the Post Office’s figures. But Parker never took a salary (or, more accurately, he donated his salary to charity) which begs the question – why would a busy, important, incredibly rich man without a knighthood agree to take on this government-owned basket case for no money?"

:D
 
Is a bug a bit of programming that was faulty, causing an unexpected result? Basically, human error causes computer error.

There's a quip: "Computers never make mistakes. They correctly execute all of yours."

Except as others have noted, it can be really hard at times figuring out just what's going wrong. On occasion I've run a program and had it fail to produce the expected result. After a lot of troubleshooting and being unable to find the cause, I've re-run it and this time it worked as expected, even though I changed nothing in the program.

There are also things called "heisenbugs" (named after the Heisenberg uncertainty principle), which are problems that manifest themselves in normal operation but aren't there when the program is run under controlled conditions in an attempt to track them down.
 
There's a quip: "Computers never make mistakes. They correctly execute all of yours."

Except as others have noted, it can be really hard at times figuring out just what's going wrong. On occasion I've run a program and had it fail to produce the expected result. After a lot of troubleshooting and being unable to find the cause, I've re-run it and this time it worked as expected, even though I changed nothing in the program.

Something has always changed if you run a program twice and get different results. I once found a bug that occurred because a timestamp changed. If the seconds part was 8 or 9 the program would crash. I'll leave it as an exercise for the reader to figure out the bug.
There are also things called "heisenbugs" (named after the Heisenberg uncertainty principle), which are problems that manifest themselves in normal operation but aren't there when the program is run under controlled conditions in an attempt to track them down.

My favourite one of those was a Visual C++ program that crashed in production but not when debugged. The eventual cause of this behaviour was found to be an uninitialised boolean variable. When run in the debugger, Visual Studio filled all uninitialised storage with a non-zero fill pattern, so the variable looked as if it had been initialised to true. When run in production, the variable just had whatever value was in the storage used by the stack frame, usually 0, which is false. Once we figured that out, we could reliably reproduce the bug in debug mode and fix it.
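For illustration, here's a minimal sketch of that class of bug (my reconstruction, not the actual program): a flag that is read before anything ever assigns it. Reading it is undefined behaviour, which is exactly why debug and release builds can disagree.

```cpp
#include <cstdio>

// Minimal sketch of the bug class described above (a reconstruction,
// not the original code): a flag that is read before it is assigned.
struct Request {
    bool validated;            // BUG: never initialised
    // bool validated = false; // the one-line fix
};

bool process(const Request& r)
{
    // Reading an indeterminate value is undefined behaviour. A debug
    // runtime that fills fresh stack memory with a non-zero pattern makes
    // this look "true"; a release build sees whatever bytes were already
    // on the stack, often zero, i.e. "false".
    return r.validated;
}

int main()
{
    Request r;                 // 'validated' is never set
    std::printf("processed: %d\n", process(r) ? 1 : 0);
    return 0;
}
```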

I would say that, probably, the biggest cause of bugs in large systems is concurrency, i.e. trying to do two or more things at the same time. It makes it almost impossible to reason about the code because you cannot be sure when two parallel events occur relative to each other. Distributed systems like Horizon are the ultimate in concurrency.
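A minimal sketch of why that's so hard to reason about (nothing to do with Horizon's actual code, just the textbook failure): two threads posting to the same running total. The plain counter silently loses updates because read-modify-write isn't atomic; the atomic one doesn't.

```cpp
#include <atomic>
#include <cstdio>
#include <thread>

long unsafe_balance = 0;            // plain read-modify-write: data race
std::atomic<long> safe_balance{0};  // atomic increments: no lost updates

void post_transactions(int n)
{
    for (int i = 0; i < n; ++i) {
        unsafe_balance += 1;        // undefined behaviour across threads
        safe_balance.fetch_add(1, std::memory_order_relaxed);
    }
}

int main()
{
    std::thread a(post_transactions, 1000000);
    std::thread b(post_transactions, 1000000);
    a.join();
    b.join();
    // 'unsafe_balance' will usually come out short of 2000000;
    // 'safe_balance' always reaches it.
    std::printf("unsafe: %ld  safe: %ld\n", unsafe_balance, safe_balance.load());
    return 0;
}
```

And that's the easy case, two threads in one process. Spread the updates across machines and flaky networks, as Horizon did, and the ordering questions get far worse.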
 
But even when Horizon was being built, the kind of concurrency issues Horizon would face were well understood. It's not like they were dealing with marshaling errors or race conditions; these were simple transactions that had been well understood for years. They simply were not up to understanding the problems or designing a robust solution.
 
"Robust" being key. Their solution was to have a secret way to fix things and then to blame the postmasters when it went wrong
 
My favourite one of those was a Visual C++ program that crashed in production but not when debugged. The eventual cause of this behaviour was found to be an uninitialised boolean variable. When run in the debugger, Visual Studio filled all uninitialised storage with a non-zero fill pattern, so the variable looked as if it had been initialised to true. When run in production, the variable just had whatever value was in the storage used by the stack frame, usually 0, which is false. Once we figured that out, we could reliably reproduce the bug in debug mode and fix it.

We just had that, but in reverse. The dinosaur who created the initial code used displays to prove his code (head/desk). When the new guy came to change it and used the debugger, all of the uninitialised working storage contained crap.
 
We just had that, but in reverse. The dinosaur who created the initial code used displays to prove his code (head/desk). When the new guy came to change it and used the debugger, all of the uninitialised working storage contained crap.

I've had to deal with a lot of programmers who did not have the faintest clue about the environments their programs ran in. The usual inverse relationship between confidence and knowledge applied heavily.
 
The older guy has a lot of knowledge about the run-time environment but is wary of "modern" tools; the new guy doesn't understand run-time environments at all and couldn't work out why his loop of "UNTIL > 100" wasn't being performed even once.
 
<sweat runs down face as I struggle not to derail thread any further >

I too struggle to distinguish between ‘interesting and relevant’ and ‘derail of no interest except to specialists’. Over the years I’ve abandoned thousands of posts.
 
Something has always changed if you run a program twice and get different results. I once found a bug that occurred because a timestamp changed. If the seconds part was 8 or 9 the program would crash. I'll leave it as an exercise for the reader to figure out the bug.

My first suspicion would be an issue with the most significant bit of a nibble. It would be "0" for numbers 0-7, and "1" for 8 and 9
 
One question that seems to be ignored is why was the administration so intent on actively suppressing any hint that the software could be faulty? Were there any kickbacks to head honchos?
 
One question that seems to be ignored is why was the administration so intent on actively suppressing any hint that the software could be faulty? Were there any kickbacks to head honchos?

In their eyes, they were first defending spending $1 billion on Horizon and rolling it out by 2000, so it had to be "robust" even though a month beforehand it had been noted as not ready. Then they were tasked with making the Post Office profitable, so ditching Horizon for a whole new system was inconceivable.
 
Or, did Horizon/Fujitsu grease a few wheels...?

Doesn't seem to be any evidence for that.

The faults here are the result of the buggy ethical and behavioural routines humans run and the emergent behaviours when those routines are run in the real world.
 
You can get soft errors from cosmic rays, or from alpha particles emitted by radioactive isotopes in materials such as lead, that flip values in memory or otherwise affect the actual logic functions.

I do see similar effects in my work
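Purely as an illustration of the scale of the damage (the flip is simulated with an XOR here, obviously, not a real cosmic ray): one flipped bit in a stored balance.

```cpp
#include <cstdint>
#include <cstdio>

int main()
{
    // One flipped bit, simulated with an XOR (a real soft error would
    // flip it in DRAM or a register without any code doing so).
    int64_t balance_pence = 12345;                            // £123.45
    int64_t corrupted = balance_pence ^ (int64_t{1} << 40);   // flip bit 40
    std::printf("before: %lld  after one bit flip: %lld\n",
                static_cast<long long>(balance_pence),
                static_cast<long long>(corrupted));
    return 0;
}
```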
 
You can get soft errors from cosmic rays, or from alpha particles emitted by radioactive isotopes in materials such as lead, that flip values in memory or otherwise affect the actual logic functions.

I do see similar effects in my work

I will not discuss Microchannel Architecture.
I will not discuss Microchannel Architecture.
I will not discuss Microchannel Architecture.
I will not discuss Microchannel Architecture.
 
You can get soft errors from cosmic rays, or from alpha particles emitted by radioactive isotopes in materials such as lead, that flip values in memory or otherwise affect the actual logic functions.

I do see similar effects in my work

I vaguely remember warnings from DEC about problems with some of their VAX chips being assembled on dodgy substrates with a higher than usual level of some radioactive element. IIRC they would replace them free of charge if you had problems.
 
Doesn't seem to be any evidence for that.

The faults here are the result of the buggy ethical and behavioural routines humans run and the emergent behaviours when those routines are run in the real world.

I agree. At its core, this isn't a software problem. It's a problem of lapsed ethics, gross incompetence, and a reward system that incentivized asset recovery and prosecution over justice.
 
Pretty much. A bug can sit in a piece of software for a long time, and only show up when a set of rare and unusual circumstances occurs.
And if it is found in testing it's often easier to add "Don't do this" to the documentation rather than fix the bug.
 
I will not discuss Microchannel Architecture.
I will not discuss Microchannel Architecture.
I will not discuss Microchannel Architecture.
I will not discuss Microchannel Architecture.
How about SNA on Token Ring?
 
One question that seems to be ignored is why was the administration so intent on actively suppressing any hint that the software could be faulty? Were there any kickbacks to head honchos?
Fujitsu had bought ICL, the UK's answer to IBM, and hence got UKGov ICT contracts by default. If they'd made a major mistake, then the civil servants and politicians who awarded the contract would have been in the frame. Obviously this couldn't have happened, so there were no faults and the subs were self-evidently guilty.

Remember:
It is better that some innocent men remain in jail than that the integrity of the English judicial system be impugned.
 
Fujitsu had bought ICL, the UK's answer to IBM, and hence got UKGov ICT contracts by default. If they'd made a major mistake, then the civil servants and politicians who awarded the contract would have been in the frame. Obviously this couldn't have happened, so there were no faults and the subs were self-evidently guilty.

Remember:

The irony being that if the Post Office had just come clean and figured out a way to fix the system or replace it, there would have been a flurry of publicity and awkward PM Question Time moments and then nothing. Instead, the Post Office is now synonymous with evil stupidity and scandal. They are the ruler against which all other scandals will now be judged in the UK. We add "gate" to scandals in the US after Watergate. In the UK, it will now be "The next Post Office" or "Post Office-like". Well done Jarnail Singh, Paula Vennells, and the rest of the intellectually subpar cartoon villains. You have set the standard by which ethical failings and incompetence will now be judged.
 
And if it is found in testing it's often easier to add "Don't do this" to the documentation rather than fix the bug.

The best example of a "Don't do that" solution to a reported bug I ever encountered was when I was working at Plessey Controls back in the 80s, and a customer complained that a system we'd provided was crashing and rebooting every afternoon. Wanted to know if we had some kind of timer running that was making it do that. Mystified, we eventually sent an engineer to investigate. He was sitting monitoring the equipment, debug tools set up and waiting, when to his utter astonishment the guy who worked in the office next door came in, unplugged the rack of equipment, plugged in his kettle, made himself a cup of tea, and then plugged the equipment back in and departed.

Yeah. Don't do that.
 
The best example of a "Don't do that" solution to a reported bug I ever encountered was when I was working at Plessey Controls back in the 80s, and a customer complained that a system we'd provided was crashing and rebooting every afternoon. Wanted to know if we had some kind of timer running that was making it do that. Mystified, we eventually sent an engineer to investigate. He was sitting monitoring the equipment, debug tools set up and waiting, when to his utter astonishment the guy who worked in the office next door came in, unplugged the rack of equipment, plugged in his kettle, made himself a cup of tea, and then plugged the equipment back in and departed.

Yeah. Don't do that.

...and don't open your microwave door while it's still cooking... especially if you work at a Radio Telescope facility...

https://www.theguardian.com/science...-signal-plaguing-radio-telescope-for-17-years
 
The irony being that if the Post Office had just come clean and figured out a way to fix the system or replace it, there would have been a flurry of publicity and awkward PM Question Time moments and then nothing. Instead, the Post Office is now synonymous with evil stupidity and scandal. They are the ruler against which all other scandals will now be judged in the UK. We add "gate" to scandals in the US after Watergate. In the UK, it will now be "The next Post Office" or "Post Office-like". Well done Jarnail Singh, Paula Vennells, and the rest of the intellectually subpar cartoon villains. You have set the standard by which ethical failings and incompetence will now be judged.

Indeed, and "Horizon" becomes the new catchword for "bloated bug-riddled IT monolith".

We don't want another Horizon.
 
The irony being that if the Post Office had just come clean and figured out a way to fix the system or replace it, there would have been a flurry of publicity and awkward PM Question Time moments and then nothing. Instead, the Post Office is now synonymous with evil stupidity and scandal. They are the ruler against which all other scandals will now be judged in the UK. We add "gate" to scandals in the US after Watergate. In the UK, it will now be "The next Post Office" or "Post Office-like". Well done Jarnail Singh, Paula Vennells, and the rest of the intellectually subpar cartoon villains. You have set the standard by which ethical failings and incompetence will now be judged.

:thumbsup:
 
My first suspicion would be an issue with the most significant bit of a nibble. It would be "0" for numbers 0-7, and "1" for 8 and 9
Nope. And to stop the possibility of derailing the thread with further guess attempts: the problem was caused by some code that extracted the seconds part of the timestamp and converted it to an integer using a particular library function. The function treated numbers that started with a leading zero as octal, e.g. "0644" was converted to 420. "08" and "09" are not valid octal.
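For anyone who'd rather see it than work it out, here's a reconstruction of that class of bug (my own sketch, not the original code): a field converted with a "work out the base from the prefix" call, so a leading zero flips the parse into octal and "08"/"09" fail.

```cpp
#include <cstdio>
#include <cstdlib>

// Sketch of the bug class: parsing a two-digit seconds field with base 0
// ("detect the base from the prefix"), so "07" is read as octal 7 but
// "08" stops at the '8' because 8 is not an octal digit.
static bool parse_seconds(const char* text, long* out)
{
    char* end = nullptr;
    *out = std::strtol(text, &end, 0);     // base 0: leading '0' means octal
    return end != text && *end == '\0';    // require the whole field to convert
}

int main()
{
    const char* samples[] = {"07", "08", "09", "10"};
    for (const char* s : samples) {
        long value = 0;
        bool ok = parse_seconds(s, &value);
        std::printf("%s -> %s (value %ld)\n", s, ok ? "ok" : "REJECTED", value);
    }
    // "08" and "09" are rejected even though they are perfectly good
    // decimal seconds -- hence a failure only when the clock reads :08 or :09.
    return 0;
}
```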
 
From Nick Wallis in The Times:

11. The sub-postmasters’ union was in bed with Post Office bosses
George Thomson, the former general secretary of the National Federation of Sub-postmasters, told a 2015 parliamentary select committee that the Post Office had “done nothing wrong” on Horizon. At the inquiry, Thomson seemed determined to cleave to the wrong side of history, repeating his “belief” that there had not been any need for his union to get involved because “the small percentage of claims proves beyond any doubt that the system was robust”. When asked by Alan Bates in 2012 if he would tell his members that they could participate in Second Sight’s independent investigation, Thomson forwarded Bates’s email to Vennells with the message: “I have just received this rubbish … obviously I’ll tell him that Horizon is secure and robust and to go away.”

12. The incompetent administration of compensation has led to long delays
The Post Office says it has paid nearly £222 million to the 2,800 sub-postmasters who were sacked, prosecuted or forced to hand over money to make good holes in their accounts, but there are still hundreds of vindicated sub-postmasters going through the process. Sir Wyn Williams KC, the inquiry chairman, is due to address the issue of compensation in the final phase of the inquiry, which starts in September. His most recent statement on the matter concludes: “The criticisms which I make … of the delays remain justified.” Those delays are likely to continue into next year.

[sorry, reading it on Twitter which only gives me this link:
https://t.co/UKUfcJISuT
]
 
https://www.ft.com/content/edcd12fc-6452-4cac-ae12-e8d220793e58

"The Post Office repeatedly wrote to branches about reported problems with the Horizon system in the first five years of its rollout, while beginning to prosecute hundreds of sub-postmasters for offences linked to the faulty IT software.
The early warnings were made in internal operational manuals, seen by the Financial Times, which were sent to branches between 1999 and 2004."

They knew about the issues from the start and encouraged their reporting, and still they prosecuted!
 
https://www.ft.com/content/edcd12fc-6452-4cac-ae12-e8d220793e58

"The Post Office repeatedly wrote to branches about reported problems with the Horizon system in the first five years of its rollout, while beginning to prosecute hundreds of sub-postmasters for offences linked to the faulty IT software.
The early warnings were made in internal operational manuals, seen by the Financial Times, which were sent to branches between 1999 and 2004."

They knew about the issues from the start and encouraged their reporting, and still they prosecuted!

Paywalled

Here's a clear link...

https://www.removepaywall.com/searc.../content/edcd12fc-6452-4cac-ae12-e8d220793e58
 
