In 1962, when I was at Kidsgrove, Arthur Bailey said that we couldn't use the computer that day because they were running a 3-hour program. That left me figuratively open-mouthed with surprise.
Most of the UK DEUCE monthly serviceability returns showed reliability above 85%. Those in the high 90s meant that they lost as little as a few minutes a month in down time (including lost time owing to machine failure).
It was then, and subsequently when I was on site for some weeks at Warton, where they had 2 DEUCEs, that I realised that DEUCE ran for days and weeks without error. And, by implication, that there was something seriously wrong with UTECOM.
(JB) On your return to Sydney, did you find out exactly what was seriously wrong with UTECOM?
(RV) Indeed, and it took more than a year to fix it. (Actually, not all the problems got fixed, but most of them did.)
All of the problems related to unauthorised circuit changes, made to compensate for being unable to find and remedy the real causes of faults.
Arthur Bailey made pointed comments a number of times during the course. "Don't change an AOT*," he said. "They are factory set to 1%. Changing an AOT is putting on a fault to clear a fault, and then there are two faults to fix." And that's exactly what I found.
(No doubt English Electric was aware of the state of our machine because they would have received a copy of the engineers' log books.)
* AOT - Adjust On Test resistor
Delay lines 7 and 8 had been giving considerable trouble with digit pickup. The output from the mercury was very low. So I had the mercury cleaned, and the units were re-assembled and the crystals tuned (aimed). The output from the Receivers was found to be huge.
Only then did I find that one or more components had been changed to massively increase the Receiver gains. This alone made the Receivers (in their former state) highly susceptible to every little fluctuation in voltage (and noise and jitter from the mercury were of course amplified to become bits). Digit pickup was real bad in those delay lines. Every time the heater in the mushroom switched on, digit pickup occurred. But that was only one of the causes of digit pickup. The other cause was an earth wiring change for the mushroom that hadn't been done.
But in the 15 months at UTECOM before I went to Kidsgrove, I thought that UTECOM was normal; that was the way it was.
There were regular breakdowns of the card punch. Even the rotary counters to punch the card numbers were sometimes inoperative, and we went for long periods with only 3 digits functioning -- of course that didn't contribute to unreliability, but it was just a symptom of the malaise.
The breakdowns of the card punch were both electrical and mechanical. Before I went to Kidsgrove, the crew bent the rods for the punch magnets whenever a column wouldn't punch. The electrical wires to the punch magnets regularly broke off from vibration (they had not been wired up properly).
One little problem bugged me, but I kept at it. Neither the card reader nor the card punch met spec for decall followed by a recall. A 9-24 on any row followed by a 12-24 after 2Mc was supposed to cause the reader to continue without stopping. Ours stopped and missed a cycle. I couldn't locate the cause from studying the circuits.
One day, perhaps a year after I returned from Kidsgrove (by which time things generally had improved), I watched the Reader relays when the reader was reading cards with a program executing 9-24 and 12-24 at the appropriate times. After studying this for a while, I gently applied pressure to the relay armatures of various relays that were dropping in and out. One relay cured the fault. When I examined the relay contact "arms" or "fingers", I found that they had been bent. When I straightened the arms, the fault was fixed.
As the punch had a similar problem, I took the same approach to it. Same cause (relay contact arms bent). When straightened, the fault disappeared. And it took only a few minutes to fix! What they had been doing was to bend the relay arms to make small changes to the timing specs, instead of adjusting the cams.
The reader problem was evident with a DEUCE program to read eigenvalues. There were about 3 binary values punched to a card. The program did 9-24 after reading the last number on each card, and after 2Mc a 12-24 to read the next card.
As it was a production (paying) program, it was annoying to have the reader halt after each card (effectively running at 100 cpm instead of 200 cpm and doubling the time for the job).
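The doubling follows directly from the two card rates. A minimal sketch of the arithmetic, in which the function name and the 600-card deck are illustrative assumptions and only the 200 cpm and 100 cpm rates come from the account above:

```python
def read_time_minutes(cards: int, cards_per_minute: float) -> float:
    """Time to read a deck at a steady card rate."""
    return cards / cards_per_minute


DECK = 600  # hypothetical deck size, for illustration only

full_speed = read_time_minutes(DECK, 200)  # reader running continuously
halting = read_time_minutes(DECK, 100)     # reader missing a cycle per card
ratio = halting / full_speed               # how much longer the job takes
```

Whatever the deck size, the ratio depends only on the two rates, so a reader that misses a cycle on every card doubles the reading time.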
The punch problem appeared on programs such as post mortem (stopping after punching each triad), STAC ditto, and ZP43, ditto.
A major problem with the machine turned out to be Unit A (the Master Oscillator). As you might recall, this had to be kept tuned. One weekend I set out to do just that. It was badly off: instead of 78 volt sine waves to make clock pulses, they were only 36 volts. (After clipping, the clock pulses were trapezoidal.) When I had peaked the waveforms, I tried to run a program.
Absolutely nothing worked! Nothing at all! At no setting of the Master Phase Control Knob would the machine work. I carefully inspected the unit, looking for signs of changed components. (Original joints had red paint on them, making re-soldered joints easy to see.) One turned out to be an AOT for the phase control. I soldered in the correct value, and lo and behold, it worked!
But prior to 1963, in MULT/DIV units K, L, M and in the adder, tappings had been altered on the delays (DN networks). I concluded that they had been trying to compensate for the bad waveforms coming out of Unit A. When we had unidentifiable trouble with MULT/DIV, I removed the cover of one of the DNs. The coils had been blackened by the soldering iron where tappings had been soldered and resoldered in a frenzy of experiment to get something to go.
The fact was that prior to 1963, the machine had been operating with trapezoidal clock pulses instead of nice square ones, and that consequently, every signal was a fraction late when it was clocked into or out of DLs etc.
When Arthur Bailey delivered the gospel about AOTs, he stated that one DEUCE had been so badly maintained that it had to be sent back to the factory to be completely rewired. I can believe that. But he didn't name the site and I didn't ask, and anyway, names wouldn't have meant anything to me then.
In my opinion, UTECOM was fit to be sent back to the factory. In the week before I returned from Kidsgrove, serviceability was 47.6%. 48 hours were consumed by unscheduled maintenance and lost time. The week before that, it was 55%, with 36 hours consumed in unscheduled maintenance and lost time.
The week before that, it was 11%, with only 2 hours 53 minutes of good time. Unscheduled maintenance and scheduled engineering (sic) took 49 hours. Accompanying those good 2 hours 53 minutes were 2 hours and 51 minutes of lost time due to machine failure. In October, one whole week went by with only 1 hour of good time!
We had a battle with MULT/DIV, and for a prolonged period, the machine wouldn't run on High. As we had a good range on the Low side, for a while we had to run the machine on Lo to have any margin of safety at all. I think that this period was late in the first year, before Jack Richardson dropped in on a whirlwind visit. But we overcame that.
The punch problems described above may seem trivial, but they were just part of a particular problem we had. The main problem was that the punch repeatedly broke the tapered pin in the main drive gear wheel. This occurred because of repeated starting and stopping of the punch. Those 9-24 and 10-24 problems were just one part of the suite of programs that did that.
As well as the breaking pin, the clutch repeatedly worked its way loose. Nothing would stop it from coming loose, no matter how tightly the screws were tightened. When it worked its way loose, the clutch dog bounced and the drive wheel ran backwards, and in doing so drove the trailing wire brushes backwards, so that they bent and shorted to the frame each time a subsequent card was punched (symptom - program dropped out).
So the remedy was to minimise stopping, and to have some crystallography programs re-written to punch cards continuously and in batches instead of just one at a time. We also got a teleprinter to relieve some of the load from the punch, which was heavily used.
Apart from the mechanical problems, the punch was unreliable in punching columns. Sometimes a column would be punched where required, but would punch also in the following row or rows when not required. As well, some columns would intermittently fail to punch. Some columns did not punch when output was sent up to 6ms after the row was in position. To deal with these problems, the punch magnet wiring was redone, some magnet wires straightened, and the cams re-timed. To ensure that the punch worked reliably, I wrote a program to read and punch binary, varying the time from 0ms to 6ms after the row was in position. Each day, as part of the proving of the machine, this program was run. Every three cards were sum-checked, and a sizeable lot of cards was read and punched.
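The idea of sum-checking every three cards can be sketched as follows. This is not the DEUCE program, only an illustration of the technique. It assumes that each card is a list of 32-bit words and that the check is a plain sum modulo 2**32. All names are hypothetical, and the 0ms to 6ms timing variation is not modelled.

```python
WORD_MASK = 0xFFFFFFFF  # assumption: 32-bit words, sum taken modulo 2**32


def sum_check(cards):
    """Sum of every word on the given cards, modulo 2**32."""
    total = 0
    for card in cards:
        for word in card:
            total = (total + word) & WORD_MASK
    return total


def check_triads(sent, received):
    """Return the indices of three-card groups whose sum checks disagree."""
    bad = []
    for i in range(0, len(sent), 3):
        if sum_check(sent[i:i + 3]) != sum_check(received[i:i + 3]):
            bad.append(i // 3)
    return bad


# Hypothetical data: six cards of three words each, as sent to the punch.
sent = [[1, 2, 3], [4, 5, 6], [7, 8, 9],
        [10, 11, 12], [13, 14, 15], [16, 17, 18]]

# Simulate one spurious hole on the fifth card when the deck is read back.
received = [list(card) for card in sent]
received[4][1] ^= 0x10

faulty_groups = check_triads(sent, received)
```

A simple sum cannot say which column misbehaved, and two errors in one group can cancel each other out, but it does show which group of three cards to examine.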
The DEUCE was superbly engineered by EE. It was reliable and ran for long periods, days and weeks at a time, without failure.
We did have some tough times, but they were mostly made by our site (as described above).
There was one thing that EE mis-advised, and that was the site conditions. Full-flow blown air through the machine was completely inadequate in Australia, and Jack Richardson conceded it.
On one stinking hot day in December 1963 or January 1964, when the room temperature was around 100+ degrees F (38+ C), it was some 55 or 60 degrees C in the DEUCE roof where the transformers were housed. I don't recall now what the temperature in the drum was, but it was sufficiently high to concern me, and when I checked in the roof, I discovered that it was hotter than transformers normally operate at.
I switched off the machine, but it was already too late. All heads of our new drum had crashed, and badly scored the surface.
On another January day, it was raining long and hard as only Sydney
can rain. It was extremely humid in the machine room (because it was
As we cleared each fault, a new one appeared. By 5pm we were still clearing faults and had not been able to hand over the machine for any use. I gave it away, and switched off the machine, and hoped for better weather the next day.
It was. And UTECOM behaved.
Moist air blown from the bottom of the bays caused undue failures of 1M resistors. One failure of like kind was a design fault. In unit AIM (the bottom unit near the front panel) a valve grid was wired to a tag next to a +300 volt or -300 volt supply tag. (I think that the potential difference was around 400 volts but cannot be sure.)
We had been experiencing intermittent failure of AIM, and you can guess the cause -- leakage between the tags, enough to change the grid voltage erratically by up to about 5 volts. It had been operating OK for a few years, but moisture finally put an end to it. I simply unhitched all the wires from the grid tag.
Occasionally test programs would not reveal any machine fault. Two programs in particular gave trouble when the machine seemed to be otherwise OK. These were GIP (which included matrix multiplication) and a crystallography program. So from an early time we used users' programs as "test programs", and we used them to prove the machine. These programs would be run each day on high margins and low margins prior to handing over the machine for general use.
© Robin Vowels - 19 November 2004