DEC PDP-9 Restoration

 
12/22/12
We removed the 709 power supply and checked for physical damage. Everything looked OK.
We reformed all of the capacitors in the power supply because it had not been powered on in a long time.
This is a process where a power supply, voltage limited to a little less than the rated voltage of the capacitors
and current limited to about 20mA, is connected to the capacitors. The voltage on the capacitors will build to
the set voltage over several minutes. Once this is completed the oxide insulating layer is rebuilt in the capacitors
and they will work OK. If you just turn on the power supply you will likely damage the capacitors.
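As a rough sketch of why the current limit sets the pace: an ideal capacitor charged from a constant-current source reaches the set voltage in t = C * V / I. The capacitance and voltage in the example are hypothetical values, not measurements from the 709 supply. A capacitor that needs reforming leaks current through the degraded oxide, so the real time is longer than this lower bound.

```python
def ideal_charge_time_s(capacitance_f, target_v, limit_a):
    """Lower bound on charge time for a constant-current source: t = C * V / I."""
    if limit_a <= 0:
        raise ValueError("current limit must be positive")
    return capacitance_f * target_v / limit_a


# Hypothetical example: a 40,000 uF capacitor brought to 30 V at the 20 mA limit.
seconds = ideal_charge_time_s(capacitance_f=0.040, target_v=30.0, limit_a=0.020)
print(f"ideal charge time: {seconds:.0f} s")
```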
We also reformed the electrolytic capacitors on the boards in the core memory system.
 
Click on the image for a larger view.
We used a power supply current limited to 20mA to reform the capacitors.
 
We attached a resistive load to the power supply and measured the output voltages. All looked OK.
The fan was sticky so we sprayed some WD40 in the end bearing to free it up.
It is noisy, but it works OK a few seconds after it is powered on.
Eventually we will need to disassemble the fans and lubricate them with oil.
 
We reinstalled the power supply and connected the AC wires.
We connected 110VAC to the power cord of the 841A power controller.
We could control the power state with the power switch on the console.
We connected the remaining red/white AC wires to the 709 power supply and found that all of the chassis fans work OK.
Some of the fans are noisy, so we may have to disassemble all of the fans to lubricate them with machine oil.
1/19/13 update: I was able to buy two NOS Howard fans on eBay. They have the same motor, but different fan blades.
We can probably transfer the blades from a worn out PDP-9 fan to one of the new motors.
 
Click on the image for a larger view.
It looks like the distance between the mounting holes is 4-1/8".
 
Click on the image for a larger view.
A close-up of the fan motor.
 
The second 709 power supply is missing the fan. Let us know if you have a source for replacements. 
 
The hour meter runs, and we added a few tenths of an hour to the 40,163 hours already on the system.
That is roughly 20 years of run time at 8 hours per working day.
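The 20 year figure can be checked with a little arithmetic. The sketch assumes about 250 working days per year, which is our assumption and not something the hour meter tells us; running every calendar day would give a smaller number.

```python
HOURS_ON_METER = 40_163
HOURS_PER_DAY = 8


def years_of_run_time(total_hours, hours_per_day, days_per_year):
    """Convert accumulated hours into years for a given daily usage pattern."""
    return total_hours / (hours_per_day * days_per_year)


# Assumed: about 250 working days per year.
print(round(years_of_run_time(HOURS_ON_METER, HOURS_PER_DAY, 250), 1))
# For comparison: running 8 hours on every calendar day.
print(round(years_of_run_time(HOURS_ON_METER, HOURS_PER_DAY, 365), 1))
```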
 
We connected the DC wires to the 709 power supply.
It looks like the system was designed to have two of the 709 power supplies.
The connections to the other non-installed power supply are jumpered.
We turned the power supply on for a few seconds at a time and measured the voltages on the chassis test points.
The voltages looked OK and some lights on the console turned on.
We didn't smell anything burning and left the power on.
Warren measured the temperature of the modules in the system and found just a few above the ambient temperature.
 
Click on the image for a larger view.
 
We tried the basic Examine/Deposit functions, but did not get the expected response.
We can turn the PRGM STOP light on and off with the I/O RESET and START switches.
With the REGISTER DISPLAY switch in the API position the REGISTER lights flicker.
The rate of the flicker can be controlled with the REPEAT SPEED switch.
 
We connected the AC power to the paper tape reader/punch.
The punch motor didn't turn on with the AC power to the system.
We need to check the power switch on the front of the punch.
It looks like the reader/punch has been modified and gets some of its power through
additional wires, and some through the signal ribbon cable.
We will need to investigate this further.
 
Later this week we will do some basic debugging to see if any of the processor is working as expected.
 
12/28/12
We explored the Margin Power Supply control panel today because the voltage gauge was not working.
We found that the contacts in the wafer switch were really dirty.
We disassembled the switch, cleaned the contacts, and reassembled everything.
Of course our efforts didn't fix the voltage gauge.
 
Click on the image for a larger view.
The rear of the disassembled Maintenance Panel.
 
Click on the image for a larger view.
One of the maintenance switch wafers before cleaning.
 
We traced the connections from the voltage gauge, through the wafer switch, and through the wiring harness.
The margin voltage sense wires terminated in a 2x3 contact connector mounted to a metal panel.
It turns out that the margin voltage sense wires get daisy chained from the CPU, to the TC59, to the TU20.
There was a cable to interconnect the margin bus from the TC59 to the TU20.
The "out" connector in the TU20 had a "loopback" plug plugged in.
We can't find the margin sense harness that goes from the CPU to the TC59. We should have it somewhere.
We plugged the loopback plug into the CPU margin connector and now the margin voltage gauge works.
 
At the same time we looked for the I/O bus cable that goes from the CPU to the TC59.
We should have it, but it is missing somewhere in the warehouse right now.
 
We repaired two of the partially delaminated flexprint cables for the front panel.
We need to replace one flexprint with some ribbon cable so the rest of the lights on the front panel will work.
 
We connected the I/O cables and DC power cables to the paper tape reader/punch.
The punch motor runs when you push the feed switch.
The capstan for the punch does not advance because it is binding.
We need to disassemble the punch to clean and lubricate it.
The reader does not advance when you push the feed switch.
That will need some debugging because the controller actually controls the stepper motor.
 
12/29/12
We decided to look at why the Program Stop switch does not turn on the PRGM STOP light.
See page D-BS-KC09-A-10 Clock, Run, and Display (Sheet 1).
 
The PWR OK\ signal on pin L, and the CLK signal on pin D of the R409 in slot J22 look OK. (section D7 on the schematic)
The PWR OK\ goes active 150mS before the -15V is in regulation.
The CLK POS signal from pin F of the S603 in slot J23 is active when the RUN flip-flop in slot J31 is on. (section D6 on the schematic)
The KSP (Key Stop) signal going into the D pin of the R111 in slot J27 is active when the STOP key is pressed. (section C3 on the schematic)
The buffered KSP (Key Stop) signal going into the S pin of the R111 in slot J28 is active (-4V) when the STOP key is pressed. (section C3 on the schematic)
The R pin of the R111 in slot J28 is 0V. (section C3 on the schematic)
The L pin of the R111 in slot J28 is 0V.
The DONE(1) signal on the K pin of the R111 in slot J28 is 0V. This indicates that the core memory cycle is not done.
The RUN(0) signal on the N pin of the R111 in slot J28 is 0V when the processor is not running.
It turns out that the instruction didn't finish execution, so DONE(1) is not active, so the Program Stop switch is not enabled.
 
The I/O RESTART signal on pin D of the R002 in slot F34 is always low.
The CM CLOCK signal on pin U of the B105 in slot H22 is active.
The CLK POS signal on pins D & F of the R002 in slot J34 is active when the processor is running. (1MHz)(section C5 on the schematic)
The CM CLK signal on pin N of the B602 in slot H33 is active when the processor is running. (1MHz)(section C5 on the schematic)
 
See page D-BS-KC09-A-16 CM Timing. 
The CM CLK signal on pin R of the R111 in slot E22 is active. (section C5 on the schematic)
The SM(1) signal on pin S of the R111 in slot E22 is high, so the CM CLK signal will not get to pin U.
The AM SYNC BUS(0) signal on pin P of the R002 in slot E21 is low (-4V).
 
12/30/12
We looked at the signals that select addresses in the Control Memory.
Since the processor is not running we set the Maintenance Panel in Deposit and turned repeat on.
Speed = 4 = 150 uSec repeat rate.
Maintenance = Examine.
Repeat = On.
 
See page D-BS-KC09-A-16 CM Timing.
The signals IN CLR and CLR are active (high) when the Deposit, Deposit Next, Examine, Examine Next, and I/O Reset keys are pressed.
The signals CM STROBE A, B, C, & D are all active. (section D3 & D2 on the schematic)
We found that the CM CURRENT signal was two 50ns pulses where it should have been a single 80ns +25ns/-0ns pulse. (section C2 on the schematic)
The pulses coming from the B105 module in slot F28 (section B3 on the schematic) looked OK.
The pulse from the B105 module is wire-ORed with the 25ns delayed pulse from the B310 module in slot EF29.
So, the B310 delay module is about 12ns slow and the two pulses are not merged.
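The merge-or-split behavior of a pulse wire-ORed with a delayed copy of itself can be sketched with a simple interval model. This is an idealized illustration with square edges; the widths and delays in the example are hypothetical and are not the values measured on the 'scope.

```python
def wire_or(pulse_width_ns, delay_ns):
    """Return the (start, end) intervals seen when a pulse is ORed with a delayed copy.

    If the delayed copy starts before the original ends, the two overlap and
    the output is one stretched pulse. Otherwise there is a gap and two pulses.
    """
    first = (0.0, float(pulse_width_ns))
    second = (float(delay_ns), float(delay_ns + pulse_width_ns))
    if second[0] <= first[1]:
        return [(first[0], second[1])]
    return [first, second]


# Hypothetical numbers: a delay shorter than the pulse stretches it,
# a delay longer than the pulse leaves two separate pulses.
print(wire_or(pulse_width_ns=50, delay_ns=25))
print(wire_or(pulse_width_ns=50, delay_ns=62))
```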
 
Click on the image for a larger view.
The upper trace is the CM CURRENT signal (broken). The lower trace is the EAE STROBE DLYD signal.
We replaced the B310 with a spare module and now the CM CURRENT pulse is about 85ns. This is OK.
These signals are inverted.
 
Click on the image for a larger view.
The upper trace is the CM CURRENT signal (working). The lower trace is the EAE STROBE DLYD signal.
These signals are inverted.
 
See page D-BS-KC09-A-17 CM Addressing.
We looked at the CMP 0-3 & CMG 0-3 signals. (section D5 & D6 on the schematic)
There is what looks like a voltage as a result of the CM current pulse on all of these signals.
We thought that maybe there were shorted diodes on the G210 that would enable all of the outputs.
All of the diodes on the G210 tested OK.
The input and output signals on pins D & E of the B105 in slot H21 were complementary.
 
Next Saturday we need to determine if only one CM line is being activated.
We also need to see if the CM address starts at 0, then goes to 1, then to the rest of the Deposit cycle.

01/05/13
See page D-BS-KC09-A-16 CM Timing
We looked at the CM CURRENT signal in section C2.
The signal goes low when the Deposit, Deposit Next, Examine, Examine Next, and I/O Reset keys are pressed.

See page D-BS-KC09-A-19 CM Sense Flip-Flops (Sheet 1)
We looked at the 0->CMA signal in section C7.
It goes low at power up or when the Deposit, Deposit Next, Examine, Examine Next, and I/O Reset keys are pressed.
The PK CLR signal in section C5 has the same behavior.

We looked at the 0 and 1 outputs of the CMA 0 Flip-Flop in section C6.
Both outputs were at ground level.
There was no -15V on the B pin of the module.
We eventually traced the problem to a dirty 04 margin switch.
When the -15 was restored to the Flip-Flop it worked correctly.

With the CM sense Flip-Flops working we started seeing some signs of life in the system.
The address set in the switches gets copied to the ADR register when the EXAMINE switch is pressed.
A lot of the system has to be functional for this to happen.

When the system is powered on all of the CM Sense Flip-Flops are cleared by the 0->CMA and PK CLR pulses.
When the EXAMINE switch is pressed the KIOA signals are set to 3.
When the CM CURRENT signal is activated the CM data at location 01 is read.
The CM STROBE * signals latch this data into the CM Sense Flip-Flops.
The ADSO, MBI, and SM bits in the CM data are set so the Address Switch contents are gated to the I/O Bus (B).
Then the Address Switch contents are gated to the O Bus, and then to the AR Register and the MB register.
The next CM address from the contents of 01 is 25.
The CM content turns on MBO, ARI, and KEY.
This copies the contents of the MB register to the AR register.
At this point the MC hangs waiting for the Memory Read Strobe.

The behavior of the MC is not consistent and is affected by the Speed switch setting.
Next week we will connect a logic analyzer to the CM Flip-Flops to make sure that everything is working correctly.
We also need to determine why the CM is not getting a response from the Core Memory Controller.

01/12/13
The Mode Info flexprint cable #26 that goes from the front panel to slot CP H36 had peeled apart.
I replaced the flexprint with modern ribbon cable because the signals in the cable are static.
The SING STEP, SING INST, and REPT lights work now.

We ran the maintenance test described in section 3.7.7.3 of the PDP-9 maintenance manual.
To start the diag you press the I/O RESET switch, turn the console REPT switch on, set the maintenance switch in the MAINT position, and latch the START switch up.
The contents of the AR register are copied to all of the other registers. You can use the REGISTER DISPLAY switch to see the contents of the registers.
The contents of the AR go onto the A bus, are run through the ADR to increment the value, and then onto the O bus, and into the MB.
The contents of the MB go onto the B bus, the B bus goes through the ADR, and onto the O bus.
The O bus is loaded into the AC, AR, and PC registers. It would also go into the MQ register if this system had the EAE.
The contents of the address switch register can also be inclusive-ORed with the ADR.
All of this behavior looks OK, so that means that large parts of this system are functional, especially the Control Memory.
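The register loop that the maintenance test exercises can be sketched as a tiny simulation. This is only an illustration of the data flow described above: the 18-bit wrap-around on increment and the point where the address switches are ORed in are assumptions, and the EAE's MQ register is left out because this system does not have it.

```python
MASK18 = 0o777777  # the PDP-9 uses 18-bit words


def maintenance_step(regs, address_switches=0):
    """One pass of the loop: AR -> A bus -> ADR (+1) -> O bus -> MB -> B bus -> ADR -> O bus."""
    a_bus = regs["AR"]
    o_bus = (a_bus + 1) & MASK18          # ADR increments the value (wrap is assumed)
    regs["MB"] = o_bus
    b_bus = regs["MB"]
    # Assumed: the address switches are inclusive-ORed in on this pass.
    o_bus = (b_bus | address_switches) & MASK18
    for name in ("AC", "AR", "PC"):
        regs[name] = o_bus
    return regs


regs = {"AC": 0, "AR": 0, "PC": 0, "MB": 0}
for _ in range(3):
    maintenance_step(regs)
print({name: oct(value) for name, value in regs.items()})
```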

YouTube Video


See page D-BS-KC09-A-10 CM Clock, Run, and Display (Sheet 1).
We looked at the A, B, and C flip-flops in section D2 & D3 because they control much of the CP timing.
The REPT CLK period is 8 uS, 28 uS, 200 uS, 2.6 mS, and 80 mS for switch settings of 5-1.
This is close to the values in the table on page D-BS-KC09-A-10 CM Clock, Run, and Display Timing.
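When reading a 'scope trace it is handy to go the other way, from a measured REPT CLK period back to the switch position. A minimal sketch using the periods measured above; the nearest-match rule on a log scale is our own choice, made because the settings are spread over several decades.

```python
import math

# Measured REPT CLK period in microseconds for each REPT SPEED setting.
REPT_CLK_PERIOD_US = {5: 8, 4: 28, 3: 200, 2: 2600, 1: 80000}


def nearest_speed_setting(measured_us):
    """Return the switch setting whose period is closest on a log scale."""
    if measured_us <= 0:
        raise ValueError("period must be positive")
    return min(
        REPT_CLK_PERIOD_US,
        key=lambda setting: abs(math.log(measured_us / REPT_CLK_PERIOD_US[setting])),
    )


print(nearest_speed_setting(200))   # a 200 uS period matches setting 3
```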

Click on the image for a larger view.
The top trace is the REPT CLK. The bottom trace is the flip-flop C(0) output on pin J.
The period of the REPT CLK is 200 uS so the REPT SPEED switch was set to position 3.
This looks OK, and was also OK at other speed settings.
There was no output on pin J when the processor was running. This is OK.

Click on the image for a larger view.
The top trace is the flip-flop C(0) output on pin N of flip-flop B.
The bottom trace is the flip-flop B(0) output on pin P.
The period of the C(0) is about 18 uS so the REPT SPEED switch was set to position 5.
The output of the B flip-flop has glitches when the input signals change.
We swapped the S206 modules in slots J29 & J30. There was no change in the behavior.
We need to fix this or the CP timing will be really confused.

The Hours Meter showed 40181.9 when we finished.

01/19/13
EXAMINE and DEPOSIT are not working.
We need to determine if this is a processor or memory problem.
We spent some more time looking at the system's behavior when it is doing an EXAMINE.
We are actually getting three CM CURRENT pulses, not just two as we originally thought.
One was 5uS before the other two when we were running at a slow speed, so we didn't see it on the 'scope.
The first CM word has the SM bit turned on so it will wait to synchronize with the core memory controller timing.
The second and third words run asynchronously with the core memory controller.
 
Click on the image for a larger view.
The top trace is RUN.
The bottom trace is CM CURRENT.
The first CM CURRENT pulse should be reading CM word 01.
The second pulse should be reading CM word 25.
The third pulse should be reading CM word 26.
 
We looked at all of the CP/Memory Interface signals on page D-BS-KC09-A-24 CP/Memory Interface.
All of the signals looked reasonable, so the core memory controller is alive.
We didn't see any data from the sense amps.
It is possible that we have written all ones to all core locations with our experiments.
 
Click on the image for a larger view.
We looked at all of the signals on page D-BS-KC09-A-29 System Timing.
Everything here looks OK.
The IO RESTART signal was static so it was not included.
 
We spent some time looking at the schematics trying to determine the difference between the Examine and Deposit behavior.
From the schematics it is not obvious.

01/26/13
We looked at the D-BS-KC09-A-19 CM Sense Flip-Flops (Sheet 3) to see if the SAO flip-flop is working correctly.
We were thinking that if the SAO signal was always off the processor would always do a DEPOSIT cycle and never read core.

Click on the image for a larger view.
The top trace is RUN.
The bottom trace is SAO(1) for an EXAMINE cycle.

Click on the image for a larger view.
The top trace is RUN.
The bottom trace is SAO(1) for a DEPOSIT cycle.

So, it looks like the SAO signal is active and is different for an EXAMINE and DEPOSIT cycle.
Now that we know that the signal is active we need to verify that the timing is OK.

We decided to look again at the behavior of the A/B/C flip-flops on page D-BS-KC09-A-10 CM Clock, Run, and Display Timing.
We also looked at the timing diagram on page D-BS-KC09-A-11 CM Clock, Run, and Display Timing.

Click on the image for a larger view.
The top trace is RUN.
The bottom trace is C(1) for a repeated EXAMINE cycle.

We noted on 01/12/13 that the B flip-flop had glitches on the output and was not doing a divide-by-two.

Click on the image for a larger view.
The top trace is RUN.
The bottom trace is B(1) for a repeated EXAMINE cycle.

Since RUN(1) is tied to the preset input of the B flip-flop the B(1) output gets forced back high when it tries to flip low.
So this is normal behavior after all.

02/02/13
We decided to look at the operation of the core memory controller today.

With the system in maintenance mode performing repeated deposits we can see the address switches moving to the MB and then to the AR.
We can see the data switches moving to the MB.
A large part of the system needs to be functional for just that to work.

See page D-BS-MC70-B-11, Memory Test Connections.
We looked at all of the core memory signals available on the test connectors.
All of the signals are active, have reasonable wave forms, and reasonable timing.

Click on the image for a larger view.
These traces show the read and write current going through the current limiting resistors.

See page D-BS-MC70-B-4, Digit Drive Bits 0-17 (Sheet 1).
All of the address and data signals used for bit 0 look OK in their high and low states.
The Digit Write Sink (1) signal is 800 ns after the CLK signal. That looks reasonable.
The +V Digit 00 Res and -V Digit 00 Res signals look OK for all 16 possible memory address combinations, and for read and write.
That means that all of the steering diodes, drive transistors, and core wiring is OK.
The MBS00 through MBS17 signals all look OK in their high and low states.
That means that the data switches are getting to the core.

See page D-BS-MC70-B-5, Word Selection.
All of the address signals look OK in their high and low states.
The WR ^ MA5(1) and the WR ^ MA5(0) signals look OK and occur 800 ns after the CLK.

The WW ^ MA5(0) and the WW ^ MA5(1) signals are just high. There is no WORD READ strobe.
This would prevent the core from being read.

Click on the image for a larger view.
The top traces are 100 ns/division.
The bottom two traces are 200 ns/division.

See page D-BS-MC70-B-1, Memory Control (Sheet 1).
The MA5(0) and MA5(1) signals look OK.
The WORD WRITE(1) signal looks OK.
The WORD READ(1) signal is just high.
So the problem is not with these inverters.

See page D-BS-MC70-B-1, Memory Control (Sheet 2).
The READS OFF and WORD READ ON signals look OK and the timing looks reasonable.
The WORD READ(0) signal looks OK, but the WORD READ(1) signal is just high.
We replaced the B213 flip-flop module in slot H33 with a spare.
Now we can read the original contents of core memory, but it is not rewritten.
We cannot deposit to core.
This is some real progress!

We looked at the outputs of the rest of the flip-flops on this schematic page.
All look OK except that DIGIT WRITE SINK(1) only goes down to -3V and the rest go to -4V.
We replaced the B213 in slot B16 with a spare, but it did not change the signal shape.
This signal goes to all 18 G219 Memory Selector modules.
We pulled all of the G219 modules, replaced 2, and looked at the DIGIT WRITE SINK(1) signal.
The signal looked OK until we got to the G219 in slot AB09.
We replaced the G219 with a spare module and the DIGIT WRITE SINK(1) still looked OK.

We had hoped that the core would read and write now, but found that the data switches are not transferred to the MB.
Oh well, now we know what to look at next week.

The Hours Meter showed 40193.5 when we finished.

02/09/13
No work due to blizzard.

02/16/13
Just to make sure that the registers, microcode, I/O buses, etc. were working OK we ran the built-in maintenance test.
All the register contents looked OK so we continued debugging the core memory write problem.

On 2/2/13 we noticed that we could read core, but the rewrite or deposit did not work.
Warren scanned core addresses and found a location that has bit-7 on.
We looked at the B169 module in slot C31. See page D-BS-MC70-B-3, Memory Input Multiplexer.
Pin J was high indicating that the MB07(0) signal was present.
Pin F Mode(0) was low indicating that the processor was granted access to the core memory.
Pin D was low indicating that the MBS07 was present.

We looked at the MBS07 signal on the G219 modules in slots AB09 and EF09.
The signal was present on both. See page D-BS-MC70-B-4, Digit Drive Bits 0-17 (Sheet 4)
So at least we know that the bits that were read from core made it back to the Digit Drive modules.

We wanted to make sure that the data switches from the console made it to the Digit Drive modules and changed the current waveform.
See page D-BS-MC70-B-4, Digit Drive Bits 0-17 (Sheet 1).
We looked at the -V DIGIT RES 00 signal on pin FF of the G219 in slot EF07.
We could see a 20V pulse when the bit-0 data switch was on.

So now we have verified that the Digit Drive signals look OK.
That only provides 1/2 of the current needed to write core, so it was time to look at the Word Selectors.
Click on the image for a larger view.
The top trace is CLK.
The bottom trace is -V WORD RES.

See page D-BS-MC70-B-5, Word Selection.
We looked at the -V WORD RES signal on pin JF of the G219 module in slot HJ27.
The signal was a steady -30V. This should have a pulse on it that was similar to the -V DIGIT RES 00 signal.
Without this pulse the write current to the core will be 1/2 of the required current and it will not work.
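The half-current argument can be written down as a toy model. This is an idealized threshold sketch of coincident-current selection, not a model of the actual MC70 drive levels: each drive is assumed to contribute exactly half of the current needed to switch a core.

```python
def core_switches(digit_drive_on, word_drive_on, full_select=1.0):
    """A core switches only when the two half currents add up to the full select current."""
    half = full_select / 2
    total = (half if digit_drive_on else 0.0) + (half if word_drive_on else 0.0)
    return total >= full_select


# With the -V WORD RES pulse missing, only the digit half current flows.
print(core_switches(digit_drive_on=True, word_drive_on=False))  # False
print(core_switches(digit_drive_on=True, word_drive_on=True))   # True
```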

We looked at the +V WORD RES signal on pin HH of the G219 module in slot HJ27. 
We saw a -30V pulse similar to the +V DIGIT RES 00 signal.

Since the +V WORD RES signal looked OK we knew that all of the MA signals were OK.
We previously had a problem with the WR(1)^MA5(1) v WW(1)^MA05(0) signal that was caused by a faulty G219 module.

Click on the image for a larger view.
The top trace is CLK.
The bottom trace is WR(1)^MA5(0) v WW(1)^MA05(1).

We looked at the WR(1)^MA5(0) v WW(1)^MA05(1) signal on pin HV of the G219 in slot HJ27.
The pulse should have gone to -4V, but only went to -2V.

Since this signal only goes to the Word Drive G219 modules we pulled all eight of the modules associated with the Word Drive circuitry.
Without the G219s installed the WR(1)^MA5(0) v WW(1)^MA05(1) signal went to -4V so we knew that we had another defective G219 module.
After replacing the G219 modules one at a time we found that the one in slot HJ24 was the defective one.
After replacing the G219 with a spare the WR(1)^MA5(0) v WW(1)^MA05(1) signal went to -4V.

A quick test showed that we could examine, rewrite, and deposit to core.
This was a major milestone and meant that we could see if the processor would do anything.

We did some instruction testing to see what works.
The RAL, RAR, RTL, RTR, CLL, STL, CML, CLC, CLA, and CMA instructions work OK.
If the ISZ instruction is not doing a skip it works OK. If the ISZ does a skip it sometimes goes to location 0.
The JMP instruction sometimes stores the contents of the PC where the JMP instruction was.

Running two JMP instructions



Well, the processor is alive, and lots of the instructions seem to work OK.
We will work on the flakiness in the JMP and ISZ instructions next week.
Maybe connecting a logic analyzer to the ROPE memory sense amps would show if the microcode is working correctly.

02/23/13
Last week we found that the JMP and ISZ instructions were unreliable.
They worked OK when single stepping, or when running at a very low speed.
The processor would go off into the weeds when running at full speed.

This week we connected Warren's Logic Analyzer to the Control Memory address lines so we could see the micro-instruction flow while it was executing a JMP instruction.
It took a few tries to determine which signals and which polarity of the signals we needed to watch, but we eventually figured it out.

Click on the image for a larger view.
Warren's USB logic analyzer is the little silver box on the 'scope cart.
In conjunction with a digital 'scope we can really see what the processor is doing.
I can't imagine how a DEC field service person would have fixed the processor issues without a logic analyzer.

See schematic page D-FD-KC08-A-6 Key Flow and schematic page D-FD-KC08-A-18 CM Wiring Matrix and Program (Sheet 1)
If you push the IO RESET key the CM address lines go to 01.
This confirms that the processor will just sit on the KEY NOP CM word.

When you press the EXAMINE key the processor goes to:
CM word 01 which copies the address switches to the MB, and starts a memory cycle.
CM word 25 which copies the address in the MB to the AR and is displayed on the console.
When the memory cycle finishes the contents of the core location will be in the MB and is displayed on the console.
CM word 26 which copies the PC to the MB in preparation for an EXAMINE NEXT.
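The EXAMINE sequence above can be written as a small table of CM words, each with its action and the next CM address. This is only a sketch of the flow as described in this log; where the microcode goes after CM word 26 is not covered here, so the table simply ends there.

```python
# CM word -> (action, next CM word). Word numbers are kept as strings
# exactly as they are written in the log.
EXAMINE_FLOW = {
    "01": ("copy the address switches to the MB and start a memory cycle", "25"),
    "25": ("copy the address in the MB to the AR", "26"),
    "26": ("copy the PC to the MB in preparation for EXAMINE NEXT", None),
}


def walk(flow, start):
    """Follow the next-address links from a starting CM word."""
    steps = []
    word = start
    while word is not None:
        action, next_word = flow[word]
        steps.append((word, action))
        word = next_word
    return steps


for word, action in walk(EXAMINE_FLOW, "01"):
    print(f"CM word {word}: {action}")
```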

We entered a JMP 000200 at address 000200 and ran it.
The processor would execute a few JMP instructions correctly and then execute the wrong micro-instructions.
When it failed the CM CURRENT strobe was about 1/2 of the expected duration.

Click on the image for a larger view.
The top trace is CM CURRENT.
The traces below are the CM address lines A1, A2, A3, A4, A5, and A6.
The address bits on the logic analyzer were grouped into a bus to make it easier to interpret. Unfortunately we got the bits in reverse order.
At the left the CM was left in the Fetch state, CM word 21, by the EXAMINE we had performed earlier.
The CM CURRENT strobe read CM word 6, the START key.
The next CM address in the cycle is 21, the FETCH state.
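With the bus bits grouped in reverse order, each value shown on the logic analyzer has to be bit-reversed to get the real CM address. A small helper for that, assuming the six address lines form a 6-bit value and that CM word numbers are octal:

```python
def reverse_bits(value, width=6):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


# Example: a displayed value of octal 42 corresponds to CM word 21 (octal).
print(oct(reverse_bits(0o42)))
```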

Click on the image for a larger view.
At the top left the CM CURRENT strobe finally happens after a long delay.
This is the FETCH state, CM word 21.
Then the CM word 12 is processed to decode the instruction.
The next CM address in the sequence is 24, but it is never used.
When the instruction is decoded it modifies the next address to CM word 74, the JMP.

Click on the image for a larger view.
Then we go to CM word 10, BGN.
Then we go to CM word 21 to fetch the next instruction.
Unfortunately the CM CURRENT strobe for the fetch is only 25ns long when it should be 70ns.
The next CM word is 14 instead of 12, then CM word 37, and the processor stops.
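Checking a captured trace against the expected micro-instruction flow is easy to automate. The sketch below uses the JMP loop as described above (21 FETCH, 12 decode, 74 JMP, 10 BGN) and reports the first CM word that does not match. The function name and the list format are our own illustration, not something exported by the logic analyzer.

```python
from itertools import cycle

# Expected CM word sequence for a JMP to itself, repeated forever.
EXPECTED_JMP_LOOP = ["21", "12", "74", "10"]


def first_divergence(observed, expected_loop=EXPECTED_JMP_LOOP):
    """Return (index, seen, wanted) for the first mismatch, or None if the trace is clean."""
    for index, (seen, wanted) in enumerate(zip(observed, cycle(expected_loop))):
        if seen != wanted:
            return index, seen, wanted
    return None


# The failing trace from the capture: one good JMP, then 14 instead of 12.
observed = ["21", "12", "74", "10", "21", "14", "37"]
print(first_divergence(observed))
```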

See schematic page D-FD-KC08-A-16 CM Timing.
The Control Memory Timing Circuit takes the output from pin D of the B602 Pulse Amplifier in slot F30 and runs it through a B105 inverter in slot F28.
The output from the inverter goes to the B310 25ns delay line in slot EF29.
The output from the B105 inverter is wire-ORed with the output from the B310 delay line to stretch the 25ns signal into a 50ns signal.
When the processor fails the CM CURRENT strobe is only 25 ns long.
The short CM CURRENT signal does not put enough energy into the ROPE memory so it does not work correctly.
We think that the B310 delay line in slot EF29 that we replaced at the beginning of our debug efforts is defective.
The system behavior is better when it is cold, so maybe one of the transistors on the B310 is partially bad.
The two spare B310 delay lines that we tried were also broken.
We will repair two B310 modules this week and continue our work next week.

03/02/13
We bought some DEC3639 (2N3639) transistors from Circuit Specialists to repair the two defective B310 delay lines that are spares.
The D-664 diodes measured OK and we replaced all four transistors.
We also found lots of broken solder joints at the delay lines that probably were the cause of the problem in these spare boards.

See schematic page D-FD-KC08-A-16 CM Timing.
This week we added a lot more connections to the Logic Analyzer so we could see all of the signals that generate the CM CURRENT pulse.
The pulse that generates the CM CURRENT pulse for a fetch cycle starts with the CM CLK and SM(1) signals going to the R111 in slot E22.
This pulse is shorter than the pulse for the other types of processor cycles.
The schematic says that the CM CURRENT pulse needs to be 80 ns +25 ns / - 0ns.
The pulse from the fetch cycle is sometimes as short as 25 ns and sometimes there are two pulses.
The pulse that generates the CM CURRENT pulse for the other processor cycles comes from the B104 in slot F31.
These CM CURRENT pulses look OK.
We found that the B310 delay line in slot EF29 had been changed from the factory setting to reduce the delay by 12.5 ns.
The replacement wire-wrap was the wrong wire size, was not installed with the correct tool, and was loose and a poor connection.
 
Click on the image for a larger view.
The new wire from pins FN to FR was loose on the backplane pins so we replaced it with the factory setting from FN to FP.
If you look to the right of the 'scope probe you can see another timing jumper change from the factory setting.
Next week we will go through the process of measuring and adjusting the Control Memory circuit delays.
The processor is running reliably if we set the speed to 5 and lock the CONT switch on.
This runs the processor about 50 times slower than normal.
Hopefully adjusting the timing will let it run at the full 1 MHz speed.
Not bad for a 45 year old machine.
 
We tested a few more instructions and found that the LAC (Load Accumulator) and TAD (Two's Complement Add) instructions work OK.
The DAC (Deposit Accumulator) instruction puts the contents of the AC back in memory, but then goes to the wrong address to execute the next instruction.
It looks like the contents of the AC is getting transferred to the PC.
That should not be too difficult to debug.
 
We also found that the ISZ (Increment and Skip if Zero) instruction does not increment the contents of a memory location.
 
03/10/13
Some of the delay settings in the CM timing circuit have been changed from the factory settings.
One changed wire was wire-wrapped correctly, so it may have been done at the factory or by DEC field service.
Two of the delay changes were a hack, and looked like the wires were just twisted around the wire-wrap post with pliers.
We removed these two wires and put them back to the locations shown in the schematic.
 
Click on the image for a larger view.
The system will now run at full speed!
 
We tried the LAW (Load AC with "n") instruction and it works OK. Only bits 5-17 can be used for the constant.
 
The ISZ is still broken. Fixing that should be our project for next week.
 
These are the original and the current settings for the B310 delay lines for the Control Memory.
 
Slot  Factory Connection  Factory Delay  Customer Connection  Customer Delay  Measured Delay  Status
EF29  EE-EF               50ns           EE-EF                50ns            57ns            OK
EF29  EN-EP               50ns           EN-EP                50ns            57ns            OK
EF29  FE-FJ               25ns           FE-FF                50ns            60ns            Put back to FE-FJ
EF29  FN-FP               50ns           FN-FR                37.5ns          55ns            Put back to FN-FP
EF33  EE-EH               37.5ns         EE-EH                37.5ns          50ns            Changed to EE-EF
EF33  EN-EP               50ns           EN-EP                50ns            52ns            OK
EF33  FE-FK               12.5ns         FE-FJ                25ns            15ns            Looked OK so we left it as it was
EF33  FN-FR               37.5ns         FN-FR                37.5ns          50ns            OK

See schematic page D-FD-KC08-A-16 CM Timing.
There is a chart at the top of this schematic page that lists a sequence for measuring and setting the Control Memory delays.
The first check is the delay from the CLK at C01H to the CM CURRENT at F26U. The delay should be 26ns +/- 10ns, and measured 50ns. This is a "check only" so we will look at this again later.
The second check is the delay from CLK at C01H to CM STROBE D at E27N. The delay should be 100ns +/- 10ns, and measured 96ns. We will leave this alone.
The third check is the width of CM CURRENT at F26U. The width should be 80ns + 25ns - 0ns, and measured 60ns. We will try to make this longer to get more current into the CM cores.
We tried replacing the R111 module at F26. There was no change to the width of CM CURRENT.
We tried replacing the R111 module at E24. There was no change to the width of CM CURRENT.
We thought that a bad diode on one of the G210 CM Driver modules could load the CM CURRENT signal.
We pulled both G210 modules, but without a load we couldn't see the CM CURRENT signal.
All of the diodes on the G210 measured OK.
We thought that a bad diode or resistor on the W005 at F27 might not pull down the CM CURRENT signal enough.
We tried two different W005 modules, but there was no change to the CM CURRENT signal.
The system runs at full speed now, so we will focus our efforts on getting all of the instructions to work correctly.
We will revisit the delay adjustments later.
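The three timing checks above boil down to comparing a measurement against an asymmetric tolerance window. A minimal sketch of that comparison, using only the nominal values and measurements quoted above (the names `CHECKS`, `in_tolerance`, and `report` are invented for the example):

```python
# (check name, nominal ns, allowed minus ns, allowed plus ns, measured ns)
CHECKS = [
    ("CLK to CM CURRENT delay", 26, 10, 10, 50),
    ("CLK to CM STROBE D delay", 100, 10, 10, 96),
    ("CM CURRENT width", 80, 0, 25, 60),
]


def in_tolerance(nominal, minus, plus, measured):
    """True when the measurement lies inside [nominal - minus, nominal + plus]."""
    return nominal - minus <= measured <= nominal + plus


def report(checks=CHECKS):
    """Return (name, passed) for every check in the list."""
    return [(name, in_tolerance(nom, lo, hi, meas))
            for name, nom, lo, hi, meas in checks]
```

With the numbers from this session, only the CM STROBE D delay falls inside its window. The CM CURRENT delay is long and the CM CURRENT width is short, which matches the notes above.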
 
The Hours Meter showed 40210.1 when we finished.
 
03/16/13

This is the MC09 (G920) ROP (Read Only Program) memory from the PDP-9. It is the 45 year old equivalent to a ROM and stores the microcode for the processor.
There are 64 wires that go through the 36 transformers. The wires go through either the "1" side or the "0" side of the transformer.
When a pulse is sent down one of the 64 wires it creates either a "1" bit or a "0" bit in each of the 36 transformers.
The B213 JAM Flip-Flop detects and latches the bits that then control the behavior of the processor and indicate the next microcode step.
SLAC found that 80% of their PDP-9 CPU failures were in this module, so they designed a more reliable replacement.
We hope that our PDP-9 doesn't have the level of failures that SLAC observed.
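To make the description of the ROP memory concrete, here is a toy software model of it. It is only a sketch of the idea described above (64 word wires, 36 transformers, each wire threaded through the "1" or "0" side). The class and method names are hypothetical, and the real module's electrical behavior is not modeled.

```python
NUM_WORDS = 64   # word wires
NUM_BITS = 36    # transformers, one per microcode bit


class RopeMemory:
    def __init__(self):
        # threading[w][t] records which side (1 or 0) of transformer t
        # word wire w passes through.
        self.threading = [[0] * NUM_BITS for _ in range(NUM_WORDS)]

    def thread(self, word, bits):
        """Record how one word wire is threaded through all 36 transformers."""
        if len(bits) != NUM_BITS:
            raise ValueError("need exactly 36 bits")
        self.threading[word] = list(bits)

    def pulse(self, word):
        """Pulse one word wire and return the 36 bits the sense side would see."""
        return tuple(self.threading[word])
```

The contents are fixed by the physical path of each wire, which is why a bad solder joint on one transformer affects the same bit in every microcode word.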

We worked on the broken ISZ instruction. We found that the CJIT bit (pins D & E on B213 in slot D28) from the Control Memory was inactive.
The signal was about -2V and should have been either -4V or 0V.
The problem could only be in the B213 JAM Flip-Flop or the ROP memory.
We replaced the B213 JAM Flip-Flop in slot D28 with a spare. The voltage level was OK now, but the CJIT signal was still inactive.
We looked at the CM SL11 signal (pin K on B213 in slot D28) and found it inactive.
If a transformer connection is OK we should measure a very low resistance across the sense coil in the transformer.
If the connection is bad we will either measure an open, or the 470Ω +/-10% terminator resistor across the coil.
We removed the ROP memory and measured the resistance across all of the transformers.
We found three transformers (AR0, CJIT, CMA1) that measured ~500Ω when they should measure about 0.5Ω.
Warren resoldered the connections on the PCB and the resistance went to 0.5Ω.
We hope that this is not an indication of the legendary unreliability of the Control Memory in the PDP-9.
Lichen Wang at SLAC found that 80% of the failures in their PDP-9 were in the MC09 module.
The ISZ instruction now works. This instruction is always used in programs so it is vital that it work correctly.
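The resistance test described above can be written down as a simple classification. This is a sketch only. The 470Ω +/-10% terminator value and the roughly 0.5Ω good reading come from the notes above, while the 5Ω cut-off for "good" and the function name are assumptions chosen for the example.

```python
TERMINATOR_OHMS = 470.0
TERMINATOR_TOLERANCE = 0.10
GOOD_COIL_MAX_OHMS = 5.0   # assumed threshold; good coils measured about 0.5 ohm


def classify(ohms):
    """Classify one transformer from the resistance measured across its sense coil."""
    if ohms <= GOOD_COIL_MAX_OHMS:
        return "ok"
    low = TERMINATOR_OHMS * (1 - TERMINATOR_TOLERANCE)
    high = TERMINATOR_OHMS * (1 + TERMINATOR_TOLERANCE)
    if low <= ohms <= high:
        return "coil connection open, reading terminator only"
    return "open or unexpected reading"
```

A reading of about 500Ω, as seen on AR0, CJIT, and CMA1, lands in the terminator window, which points at the coil connection and not at the resistor.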
 
The top trace is CM CURRENT and the bottom trace is the CJIT signal in the MC09 ROP Memory.
You can see a "1" and a "0" bit and crosstalk from the other wires.

We tried some other instructions and found that the NOP, HLT, LAS, ION, IOF, CLON, and CLOF all work.
The DAC and DZM instructions work, but the processor stops executing instructions after either one.
The DAC and DZM instructions will be our project for next week.
 
The Hours Meter showed 40214.4 when we finished.
 
03/23/13
The top trace is CM CURRENT and the bottom trace is the STROBE DLYD in the CM Timing circuit.

This is a NOP, NOP, DZM, JMP -3 loop. The first 11 CM CURRENT signals are OK.
The 12th CM CURRENT signal is early and was not synchronized with the Core Memory controller.
 
We moved many of the logic analyzer probes from the Control Memory signals to the microcode flip-flops so we could see if the microcode contents were reading correctly.
When we ran the DAC or DZM instructions we observed a CM CURRENT signal that was early, because it did not wait to synchronize with the Core Memory controller.
We eventually traced the problem to a CONT(0) CM signal that was a little early and triggered an extra CM CURRENT pulse.
We measured the Control Memory timing again and slowed down the B310 delay line in slot EF33 that controls the CM LOOP timing.
We moved the wire-wrap wire from pins EE-EH to pins EE-EF.
Adding 12.5ns to the delay increased the LOOP timing enough to stop the false triggering of the CM CURRENT signal.
The CM Timing circuit then waited for the CM CLOCK ^ SM(1) to synchronize with the Core Memory before generating a CM CURRENT pulse.
The slower Control Memory timing eliminated the extra STROBE DLYD signal and now DAC and DZM instructions work OK.
The Hours Meter showed 40218.3 when we finished.
 
03/30/13
The PDP-9 was reluctant to run when we first powered it on Saturday. After about 10 minutes it ran fine.
There are many things that can go wrong with a 45 year old computer, but we will eventually need to find the cause of that behavior.

We worked on the paper tape reader so we can load instruction diagnostics.
See schematic page D-BS-KD08-A-9 Reader Control (Sheet 1).
Pushing the FEED button on the paper tape reader resulted in just one jog of the sprocket wheel.
The RDR A and RDR B flip-flops should generate the four signals for the stepper motor as long as the FEED button is pressed, but we saw just one step.
We saw just one RDR INDEX and just one RDR CLK pulse when the FEED button was pressed.
We had several spares for R401 Clock flip-chip in slot E03 that controls the acceleration, speed, and deceleration of the paper tape reader, so we tried that.
We measured the resistance of the trimpot on the broken R401 and set the trimpot to the same value on the replacement R401.
The resulting frequency from the R401 was a nearly perfect 600 Hz to make the paper tape reader run at 300 CPS.
Now the paper tape reader fed as long as the FEED button was pressed.

We tried reading a paper tape that came with the system, but it would only read three characters and stop.
Warren quickly pointed out my poor choice of tape: it was in BIN format, not HRI format.
Using a short diag tape, this time in HRI format, yielded better results.
We compared the bits on the tape with what was in memory and found that bit 10 was always a zero.
See schematic page D-BS-KD08-A-9 Reader Control (Sheet 2).
The RD HOLE (2,8) signal from the paper tape reader looked OK, but the S205 flip-flop in slot D07 never changed.
We didn't have a spare S205 Dual Flip-Flop flip-chip, so we substituted an R205. The R205 is the same module but with weaker pull-down resistors.
We will repair the S205 and put it back in the system.
Now we can load binary paper tapes, almost always correctly, into core.
We need to perform the Tape Reader Adjustments in the PDP-9 Maintenance Manual to improve the reliability.

We looked through the TC59 boot loader paper tape. Hopefully we will be able to locate the 1/2" software tapes, or recreate the tapes.

The PDP-9 paper tape reader controller is much more complicated than the one in the PDP-8 because the PDP-9 can load binary paper tapes into core without running a loader program.
You enter the load address for the program in the console switches and press the "Read In" button.
The controller hardware/microcode reads three characters from the paper tape, assembles them into an 18-bit word, stores it in core, increments the target address, and continues until it sees hole 7 punched.
When it sees hole 7 it executes the last instruction loaded from the paper tape.
The instruction is usually a HLT to stop the processor, or a JMP to run the program that was just loaded.
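The read-in sequence described above can be sketched in software. This is not the controller's logic, just a model of the steps. It assumes that holes 1-6 carry six data bits with the first frame most significant, that hole 8 is punched on every data frame, that hole 7 appears on the third frame of the final word, and that the final word is both stored and executed. All names are invented for the example.

```python
HOLE_8 = 0o200     # assumed: punched on every binary data frame
HOLE_7 = 0o100     # marks the final frame on the tape
DATA_MASK = 0o077  # holes 1-6 carry six data bits


def read_in(frames, load_address):
    """Assemble 18-bit words from tape frames; return (memory, last_word)."""
    memory = {}
    address = load_address
    word = 0
    count = 0
    for frame in frames:
        if not frame & HOLE_8:
            continue  # assumed: leader and blank frames are skipped
        word = (word << 6) | (frame & DATA_MASK)
        count += 1
        if count == 3:
            memory[address] = word
            if frame & HOLE_7:
                return memory, word  # this word would be executed next
            address += 1
            word = 0
            count = 0
    return memory, None


def hole_for_bit(bit):
    """Map a DEC bit number (0 = MSB .. 17) to (frame index 0-2, hole 1-6)."""
    frame_index, position = divmod(bit, 6)
    return frame_index, 6 - position
```

Under this assumed mapping, bit 10 of the word comes from hole 2 of the second frame, which is at least consistent with the stuck bit 10 being traced through the RD HOLE (2,8) signal.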

We also looked at the Teletype interface.
See schematic page D-BS-KD08-A-11 Teletype Control (Sheet 2).
The TTO LOAD signal that causes the bits to propagate through the flip-flops and out the serial port is not active, so we didn't get very far.
We need to look at the S603 Pulse Amplifier in slot C39 and the R450 Variable Clock in slot C40.
Since we didn't have the W070 TTY cable connected maybe the Teletype input was busy and inhibited the output?

04/06/13
Use the PDP-8/S to make binary images of the PDP-9 diagnostic paper tapes, and to make duplicates that we can use.
The original paper tapes are 40+ years old and are fragile.
 
Install the W070 in slot A33 in the I/O chassis and debug the console serial port.
 
Tune the paper tape reader so it always works correctly.
 
Load instruction diagnostics through the serial port or the paper tape reader and test all of the instruction combinations and interrupts.

04/13/13
We connected the W070 TTY cable to a RS-232-to-Current-Loop converter and then to a PC.
We have standardized on a current-loop cable connector that mates with a VT52 or VT420 on all systems.
The TTY controller actually worked on the first try.
Good thing, because there are about 30 Flip-chips in the TTY interface, so debugging would have been a challenge.


The diagnostic paper tapes for the PDP-9 are dried out and cracking.
We tried to use the PDP-8/S in the museum to make tape images and duplicates.
Unfortunately the PDP-8/S was reluctant to run today.
We finally found an oxidized edge connector on one of the Flip-chips that caused an instruction decoding problem.
We made binary images of two tapes, and will make the rest next weekend.

We are scanning the PDP-9 diagnostic manuals and will post the manuals and paper tape images on Bitsavers.

I am writing a PDP-9 HRI tape disassembler so that we can make MACRO source code from the binary tapes.
We can also use the program to verify the binary images of the tapes against the source listings in the manuals.
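For readers curious what the core of such a disassembler might look like, here is a minimal sketch. It is not the program mentioned above. It assumes the commonly documented PDP-9 word layout (bits 0-3 operation code, bit 4 indirect, bits 5-17 address) and the usual mnemonic table. The `*` suffix for indirect addressing is an assumption about the assembler syntax, and EAE, IOT, and operate words are simply printed in octal.

```python
MNEMONICS = ["CAL", "DAC", "JMS", "DZM", "LAC", "XOR", "ADD", "TAD",
             "XCT", "ISZ", "AND", "SAD", "JMP", "EAE", "IOT", "OPR"]

INDIRECT_BIT = 0o020000   # bit 4 in DEC numbering
ADDRESS_MASK = 0o017777   # bits 5-17


def disassemble(word):
    """Turn one 18-bit word into a rough one-line listing."""
    opcode = (word >> 14) & 0o17
    name = MNEMONICS[opcode]
    if opcode >= 13:
        # EAE, IOT and operate words are not decoded further in this sketch.
        return f"{name} {word:06o}"
    indirect = "*" if word & INDIRECT_BIT else ""
    return f"{name}{indirect} {word & ADDRESS_MASK:o}"
```

For example, `disassemble(0o200100)` gives `LAC 100` and `disassemble(0o460020)` gives `ISZ* 20`.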

04/20/13
We started working on the broken PDP-9 paper tape punch while the processor was running the MAINDEC-9A-D0CA Memory Address Test.
The punches were a little rusty after being unused for a very long time. That was easy to fix with cleaning and oiling.
The solenoid that enables the tape feed would not activate.
Replacing the W040 solenoid driver flip chip didn't help.
It looks like the sensor that detects the position of the punch drive shaft is not working.
It may be a challenge to find a replacement or to fix the one that we have.


We used the PDP-8/S to make images of most of the diagnostic paper tapes.
We will get them posted to Bitsavers when they are ready.
The scans of the diagnostic documentation are already on Bitsavers.
The Hours Meter showed 402226.2 when we finished.

04/27/13
MAINDEC-9A-D0BA ISZ Test showed just one error during a 20 minute run.
That is probably a memory problem too.

MAINDEC-9A-D0CA Memory Address Test, MAINDEC-9A-D0DB JMP Self Test, MAINDEC-9A-D0EA JMP-Y Interrupt Test, and MAINDEC-9A-D0FA JMS-Y Interrupt Test all ran OK.

MAINDEC-9A-D1AA Basic Memory Checkerboard ran for about 20 minutes with a single error.

MAINDEC-9A-D1BA Extended Memory Checkerboard ran fine for about 15 minutes, and then showed some errors.
We have not run through the whole system adjustment procedure.
Hopefully adjusting the memory voltages and timing will fix these errors.

MAINDEC-9A-D1FA Extended Memory Address Test ran without errors.

MAINDEC-9A-D01A Instruction Test Part 1 ran for about 45 minutes, and then halted at E587, address 006206.
Now the processor is acting strange, so something in the Control Memory is broken again.
That will be the project for next week.

05/4/13
We tried the basic functions of the system to see what still worked.
We found that examine/deposit and the built-in maintenance test worked OK.
That means that the Core Memory, the Control Memory, some of the CPU, and some of the I/O are working.
At least we have enough working to start diagnosing what is broken.

We tried a few simple instructions like a JMP and LAC and found that they worked OK.
Memory reference instructions like an ISZ or a DAC fail and end up with the PC = 0 instead of the correct value.
When the ISZ instruction is executing the microcode goes through addresses 06, 21, 12, 24, and 32.
The microcode should have gone to address 30 instead of 32, so there is a problem with the Control Memory address promotion circuitry.
We narrowed down the problem to the R111 module in slot F23.
We replaced the module with a spare, and the ISZ worked.
We put the original R111 back and the ISZ instruction still worked.
So, it looks like we have a backplane connector reliability problem or a bad solder joint on the module. 

We also found that the LAC and DAC instructions worked OK, but the DZM instruction did not work.
We found that the microcode was going to the correct 63 address, but then it went to address 0 instead of address 10.
Address 0 is a NOP and will hang the processor.
We wiggled and reseated all of the modules in the control memory circuit and then the DZM instruction worked OK.
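When the microcode lands on the wrong address, comparing the expected and observed addresses shows which address bit is at fault. A small sketch of that comparison, assuming the microcode addresses above are octal and 6 bits wide (64 words). Bits are numbered here from the least significant end, not in DEC order, and the function name is made up.

```python
def differing_bits(expected, observed, width=6):
    """Return the bit positions (0 = least significant) where two addresses differ."""
    diff = expected ^ observed
    return [bit for bit in range(width) if diff & (1 << bit)]
```

For the ISZ failure, `differing_bits(0o30, 0o32)` returns `[1]`, a single bit stuck on. For the DZM failure, `differing_bits(0o10, 0o00)` returns `[3]`, a single bit that failed to set. A single-bit error in each case fits a fault in one address promotion signal path.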

We ran the MAINDEC-9A-D1AA Basic Memory Checkerboard for 90 minutes without an error.
That means that large parts of the Core Memory, CPU, and I/O controller are working OK. 

We tried the MAINDEC-9A-D1BA Extended Memory Checkerboard.
This diagnostic does a more exhaustive Core Memory test than the previous basic test and showed errors last week.
The diagnostic failed and we found that the DAC instruction used by the diagnostic was not working correctly.
We replaced the R111 in slot H23 and then the diagnostic ran OK.

To test our theory about backplane connector reliability problems or bad solder joints I banged on the back of the chassis while the diagnostic was running.
The CPU stopped immediately and would not run until we reseated the Control Memory modules again.

So, it looks like we need to clean the contacts for the Control Memory modules and inspect the modules for bad solder joints.
This will be our project for next week.

05/11/13
We worked on the vibration sensitivity problem in the PDP-9.
The PDP-9 Maintenance Manual describes "Aggravation Tests" for intermittent faults.
Warren ran an ISZ, JMP, JMP program and watched the console lights while I aggravated the modules in the Control Memory circuitry by tapping on the handles.
We quickly found that the B310 delay line in slot EF29 was a problem.
We had previously tagged this module as suspect so we replaced the module with a repaired spare.
The vibration sensitivity was gone, but the ISZ instruction didn't work correctly.
Warren reflowed the solder joints on the original B310 module and put it back in the system.
The ISZ instruction worked correctly and the vibration sensitivity was gone.



We continued aggravating the modules and found that the MC09 Control Memory was vibration sensitive.
Warren reflowed all of the solder joints on the MC09.
The processor worked OK sometimes, but it worked consistently only when we pushed up on the right side of the MC09.
I pushed a Sharpie below the MC09 and everything worked OK.
Warren gave me lots of grief about my diagnostic techniques and insisted that I post an image of the "repair".

We ran a series of diagnostics and during the tests the bit-6 light in the register display stopped working.
It didn't take long to determine that the I/O signal that turns on the light was OK.
Wiggling the module that carries the bit-6 signal to the back of the front panel fixed the light.
That was much easier than disassembling the front panel and replacing a bulb.

All of the diagnostic programs now run OK except for the Instruction Test II.
This fails during XCT instruction testing, so we still have debug work to do.
The processor halted at 00417. This corresponds with E1432 in the source code.
Memory location 20 contained 00616. Constant K21 contained 000021.
These are compared during the diagnostic, so this was a failure.

The processor is nearly completely functional now, so hopefully we have just a little more debugging to do before we can look at the TC59 1/2" magnetic tape controller and the TU20 tape drive.

The Hours Meter showed 40240.7 when we finished.

05/18/13
We took one step forward and one step backwards on the PDP-9 today.
We removed and replaced the solder on all of the connections on the MC09 Control Memory board during the week.
The system worked OK when the board was reinstalled today and ran the MAINDEC-9A-D1AA Basic Memory Checkerboard without errors.

The MAINDEC-9A-D0BA ISZ Test worked OK for a few minutes and then started reporting errors.
Now none of the memory reference instructions work. 

There is a problem in the Control Memory circuitry where the MBI signal is turning on when it shouldn't and makes any instruction that fetches memory get the wrong data.
The result is that the instruction is stored in the memory location where the data belongs.
We pulled the B602 Pulse Amplifier in slot E23 to disable the 1->MBI signal.
After that the memory reference instructions worked OK.
Now all we need to do is find out why the 1->MBI signal is getting activated when it shouldn't.
Fixing that will be the project for next week.

At least we don't need the Sharpie next to the MC09 to keep the system running now.

We removed the inductive position sensor from the paper tape punch.
The wire in the coil is broken internally, so it will need to be rewound.
We will measure the diameter of the wire to see what gauge it is.
Then we will unwind the coil, counting the turns as we go.
Then we can rewind the coil with new wire.
Hopefully the punch will work with a repaired sensor.

05/24/13
We made good progress on the PDP-9 restoration yesterday.
Unlike last Saturday the system worked OK when we powered it on.
We loaded the MAINDEC-9A-D1AA Basic Memory Checkerboard which ran fine for a few seconds and then stopped.
The built-in Maintenance Test would not even run, so something is really broken in the Control Memory.
We connected the logic analyzer and 'scope and started searching for the fault.
Of course once we had all of the tools connected the system decided to run OK again.

We successfully ran several of the basic diagnostics, and tried the MAINDEC-9A-D2BA TTY Test for the first time.
We don't have a working Teletype yet, so we connected a VT220 terminal.
During some character test patterns we saw errors on the display.
We measured the transmit speed and found that it was running at about 102 baud instead of the required 110 baud.
We adjusted the trimpot on the R450 Variable Clock module in slot C40.
There are four trimpots on the R450. Three were still sealed with paint.
The one that had the paint removed was the one that we needed to adjust.
The TTY diagnostic works OK now.
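A rough calculation shows why 102 baud against a 110 baud terminal causes character errors. The sketch assumes an 11-bit frame (start bit, 8 data bits, 2 stop bits), a receiver that samples in the middle of each bit and resynchronizes on every start bit, and a terminal running at exactly 110 baud. Real receivers differ, so treat the result as an estimate. The function names are invented for the example.

```python
def sample_slip(tx_baud, rx_baud, bit_index):
    """Slip of the receiver's sample point from bit center, in transmitter bit times.

    bit_index counts from 0 for the start bit.
    """
    return (bit_index + 0.5) * abs(1 - tx_baud / rx_baud)


def first_misread_bit(tx_baud, rx_baud, frame_bits=11):
    """Return the first bit whose sample point has slipped more than half a bit."""
    for bit_index in range(frame_bits):
        if sample_slip(tx_baud, rx_baud, bit_index) > 0.5:
            return bit_index
    return None
```

With a transmitter at 102 baud the rate is about 7.3% slow, and under these assumptions the sample point slips past half a bit at bit index 7, inside the data bits. That is consistent with errors showing up only on some character patterns.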

DEC PDP-9 Running Instruction Test II


We were able to run the MAINDEC-9A-D02A Instruction Test Part 2 for the first time.
That ran for 20+ minutes without a fault.
While it was running we bumped into the system and it stopped.
A power cycle was required to get it running again. 

So, when the system is running everything in the processor, core memory, and I/O is working OK.
There is some loose connection or defective module in the system that makes it vibration sensitive.
We need to find the loose connection or it will cause problems forever. 

Soon it will be time to power up the TC59 tape controller and TU20 tape drive and see how much of that works.
It looks like the 834 AC power controller in the TC59 cabinet supplies the TU20 cabinet.
The DC power supply in the TU20 cabinet feeds the TC59 controller.
The 728 DC power supply that was in the TC59 cabinet was removed.
The power supply in the TU20 cabinet should have more than enough capacity to power the TC59.
We have a spare power supply from the PDP-8/S that we could put in the TC59 cabinet.

The Hours Meter showed 40250.7 when we finished.

06/1/13
The intermittent problem in the PDP-9 was permanent today so debugging was a lot easier.
Even a simple function like Examine didn't execute the correct microcode sequence.
We looked at the Control Memory signals with the logic analyzer and a 'scope.
The logic analyzer trace showed double strobe signals to the Control Memory flip-flops.
The first strobe latched the correct signals, the second did not.
That caused the processor to execute the wrong microcode and do strange things.

After looking at Control Memory Timing circuitry we found that yet another B310 delay line in the Control Memory Timing circuitry was misbehaving.
This one generates the Control Memory flip-flop strobes and is a B310 that we had previously repaired.
The two repaired spare modules also misbehaved.
Each of the two spare B310 modules had one leaky transistor out of its four.


We connected a spare backplane connector to -15V through a 100 Ohm resistor for a load, to Ground, and the CLK signal from the processor so we could see how the remaining delays worked.
We found that there was severe ringing on two of the three remaining delays.
This might be an artifact from the 12" wires between the module and the power/ground on the backplane.
Next week we will make a better test setup and replace the bad transistors.

06/8/13
Last week we couldn't find the 2N3639 transistors that we bought from Circuit Specialists so I ordered more transistors.
Today we borrowed a transistor from one broken B310 delay line to fix another.
The repaired B310 fixed the problem that we saw last week with the Control Memory strobe.
Of course, after we borrowed a transistor from a broken B310 we found the package of 100 new transistors.

The B310 repair fixed the Control Memory so it would cycle through the built-in diagnostics.
Unfortunately the Memory Buffer contents would not display on the console, so we chased that issue for quite a while.
We finally found that the 1->MBI signal was active when it should not be so the MB contents got clobbered. See D-FD-KC09-A-19 CM Sense Flip-Flops (Sheet 2).
That signal is generated by the IN CLR signal which was active when it should not be.
The IN CLR signal comes from the B310 on D-FD-KC09-A-16 CM Timing.
It looks like the signal should be inhibited by the KEY (1) signal, but was not.
The scope showed the same signals on pins E & D of the B104 module in slot F31.
We pulled the module and found that the transistor used in this circuit had different resistance readings compared to the other three on the module.
We replaced the module with a spare B104 module after we measured the transistors in a spare.
We now found the processor misbehaving even more.
It was still really broken when we put the original B104 module back. 
The KEY (1) signal is now inactive. That signal comes from a Control Memory flip-flop.

Oh well, at least we know what circuit to chase next week.

06/15/13
Of course the KEY(1) signal worked fine this week, so we looked at why EXAMINE does not work.

We misunderstood the microcode sequence for Examine and Deposit, so we spent quite a bit of time looking at address promotion logic.
It turns out that there is hardware logic that controls the different behaviors, not microcode.

We could not get the microcode sequences to run correctly and consistently.
Without microcode working this processor will do nothing.
We repaired the broken B310 delay line, and put it back in slot EF29, but that didn't help.
The Control Memory Current signal was a little short so we replaced the B602 pulse amplifier in slot E32 with a spare.
That didn't help either.

At this point it will not execute the microcode sequences correctly, so it really won't do anything.
The documentation is more than a little confusing, so we have some studying to do. 

06/22/13
We then found that the DEI microcode signal was always active and was enabling a circuit that modifies the next microcode address in the sequence.
We traced that problem to a bad sense amplifier in slot D20 in the Control Memory circuit.
When that module was replaced with a spare the microcode sequence looked much better, but not correct.
Sometimes the next microcode address was correct, sometimes not.
We found that one of the delay line modules was creating a double pulse instead of a single pulse and making reading the microcode contents unreliable.
We replaced the B310 delay line in slot EF29 with a repaired module.
When that module was replaced the processor started behaving correctly again.
After lots of frustrating debugging it was nice to see the system back to somewhat normal behavior.

When we executed the built-in maintenance test we found that there is a problem in the adder, registers,
or one of the buses that is preventing the adder from carrying a 1 from bit 7 to bit 6.
We were surprised to find that the carry signal from one adder to the next is not the normal negative logic.
The adders are an unusual design to make them fast enough to add in a single microcycle.
We swapped B131 and B133 modules for bit 6 and bit 7 of the adder, registers, and bus multiplexers, but haven't found the problem yet.
Next week we will try replacing the B133 in slots A16 & A20 that are connected to the adder outputs for bits 6 & 7.
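A quick way to see which operand exercises the carry from bit 7 to bit 6 is to work out the bit weights. The sketch below uses DEC bit numbering (bit 0 is the most significant bit of the 18-bit word). It is only an arithmetic illustration with made-up function names, not the built-in maintenance test.

```python
WORD_BITS = 18


def bit_weight(bit):
    """Weight of a DEC-numbered bit: bit 17 is 1, bit 0 is 2**17."""
    return 1 << (WORD_BITS - 1 - bit)


def carry_test_operand(from_bit):
    """Value with bits from_bit..17 all set, so adding 1 carries out of from_bit."""
    return (bit_weight(from_bit) << 1) - 1


def increment(value):
    """Add 1 with 18-bit wraparound."""
    return (value + 1) & ((1 << WORD_BITS) - 1)
```

For the fault above, `carry_test_operand(7)` is 003777 octal, and incrementing it should give 004000, which is bit 6 alone. A broken carry between those bits would leave bit 6 clear.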

06/29/13
We continued working on the bit-6 problem that we found last week.
The incrementing between bits 7 & 6 did not work when running the built in diagnostic.
We swapped the B133 modules between slots A20 & A24, and between slots A16 & A24.
See D-BS-KC09-A-21 Mb and Adder (Sheet 2)
At this point we have swapped all of the components for bits 6 & 7 of the adder with no change in the behavior.

We decided to take a different path and tried to examine the highest possible core address.
Bit 6 from the switches did not get to the address lights on the console.
We traced the light driver signals through the I/O controller and compared the signals for bits 6, 7 & 8.
The IO BUS 06 signal was different from IO BUS 07 and IO BUS 08.
See D-BS-KD09-A-7 Input Mixer (Sheet 1)

The IO BUS 06 (B) signal actually comes from the processor chassis.
We continued comparing the signals for bits 6, 7 & 8 and eventually found that the AR register had the wrong value for bit-6.
We replaced the B213 JAM Flip-Flop in slot C18 with a spare and now the built in maintenance program worked correctly.

We ran the D1AA, D01A, D02A, D0BA, D0CA, D0DB, D0EA, and D0FA diagnostics for a total of 5.5 hours without a failure.
I guess that the processor is back to a running state again.

07/5/13
We tried to demonstrate the system to the visitors from MARCH, but it would not run.
Examine/Deposit are not working.
The built-in maintenance program runs OK.
Hardware read-in of a paper tape works OK, but without Examine it is difficult to tell if it put anything in core.
We will fix this tomorrow.

07/6/13
Yesterday we found that we could not Examine or Deposit Core memory.
We have seen this behavior several times before, and the problem was always in the Control Memory.
We spent several hours verifying that the Control Memory was in fact working OK.

See page D-BS-MC70-B-2 Memory Timing Chain
We verified that the system CLK signal was present, and POST CLK and SYNC CLK looked OK.

See page D-BS-MC70-B-1 Memory Timing Chain (Sheet 1)
We verified that the MA JAM WORD, MA JAM PAR, and MA JAM DIGIT signals looked OK.
We verified that we could see the address switches loaded into the MA register during an Examine or Deposit.

See page D-BS-MC70-B-2 Memory Timing Chain (Sheet 2)
We verified that the READS OFF, DIGIT READ ON, WRITES OFF, and WRITES OFF signals looked OK and their timing was OK.
We verified that the DIGIT READ DRIVE, and DIGIT READ SYNC signals looked OK and their timing was OK.
We found that the WORD READ signal was inactive. That would stop core from working.

The READS OFF signal on pin F of the B213 in slot H33 looked OK.
The WORD READ ON signal at pin H was inactive.

See page D-BS-MC70-B-2 Memory Timing Chain
There was no signal on pin D of the Pulse Amplifier in slot E33.
The input signal on pin F was inactive.

The signal on pin H of the B360 in slot D33 was present, but there was no output on pin L.
The signal on pin H of the B360 in slot D30 was present, and there was a signal on output pin L.
We replaced the B360 in slot D33 with a spare. Now we see the WORD READ ON signal on pin L.

We adjusted the timing of the WORD READ signal.
The leading edge is adjusted with the B360 in slot D33 that we just replaced.
The trailing edge is adjusted with the B360 in slot F33. This adjustment was OK.

Now the core memory Examine/Deposit works again.
We ran the D0CA Memory Address Test, and the D1FA Extended Memory Address Test.
Both ran perfectly. I guess that the processor is back to a running state again.

If it runs next Saturday we will power up the TC59 and see if the PDP-9 can talk to it.

07/13/13
The PDP-9 actually ran OK when we powered it up today.
It ran the Extended Memory Address Test and the Extended Memory Checkerboard diagnostics for about two hours, and then we found a problem with address bit 16.
We started by looking at the AR bits 16 & 17 to see if they change when the switches are changed and Examine is pressed.
Bit 17 changed and bit 16 did not, so the problem is not in the AR register. See page D-BS-KS09-AC, AR, MQ, PC (Sheet 3).
The address & data switches are actually an I/O device, so we looked at the IO BUS 16 (B) and IO BUS 17 (B) signals.
The IO BUS 17 (B) signal changed with the switches, and the IO BUS 16 (B) signal did not.
These signals come from the I/O controller on page D-BS-KD09-A-7 Input Mixer (Sheet 2).
We compared the outputs of the B141 diode gates for bits 16 & 17.
The signal on pin D of the B141 modules in slots B17 & B18 both changed with the switch settings.
That only left the R123 Diode Gate module in slot D15 as the culprit.
We changed the module, and everything worked again.
The 2N3639 transistor on the module that drives pin P measured about 50 Ohms resistance between the emitter and collector.
This is the same transistor part number used on the many B modules that have failed, so we have lots of spares.
We replaced the defective module with a spare and tagged the module for repair.

We tried to run the D2CD High Speed Reader Test.
It failed because it read the processor status word and found that there was no tape in the paper tape punch.
We either need to modify the program to ignore the tape-out bit or fix the paper tape punch.

We ran the D7AD Basic Exerciser today for the first time.
We had to inhibit the paper tape punch test because the punch is not fixed yet.
It seems to run OK and the terminal beeps every 5 minutes or so.
Periodically it stops with the PC = 0 so we still have some repair work to do.

We powered up the TC59 tape controller from a laboratory power supply.
No smoke, and some lights came on.

We need to make some AC power cords for next week so we can power the TU20, TC59, and the PDP-9 at the same time.
It looks like we need to find some obsolete, non-NEMA, Hubbell 3333C receptacles to make the power cords.
Next week we will connect the I/O cables between the PDP-9 processor and the TC59 tape controller to see what works.

07/20/13
We wired new AC outlets for the PDP-9 and the TU20 tape drive.
We borrowed the pigtail from the PDP-9 that has a Hubbell 3333C receptacle on one end and a NEMA 5-20P on the other end.
We powered up the tape drive and it is actually mostly functional.
After we learned the right sequence of button pushes we actually mounted a tape.
The lower capstan actuator was a little sticky and caused the tape to creep.
After fiddling with it for a while it works OK.
When loading, the transport doesn't stop at the load point.
The light is on, so we will need to check the operation of the photo-diode.
The transport doesn't respond to the FWD and REV buttons, but that is likely because the transport is not "ready".
We manually operated the capstan pinch rollers and were able to verify that the reel motors, brakes, tape position sensors, and vacuum pump are mostly functional.
Most of the indicator lights are out.

The PDP-9 processor worked normally when we powered it up today.

07/27/13
The PDP-9 worked fine this weekend. That is three weeks of running OK. Cross your fingers that it stays reliable for a while! 

We took apart the status display for the TU20 tape drive.
It is not exactly designed for easy service. Almost all of the bulbs were burned out.
Fortunately Chicago Miniature still makes them and Mouser has them in stock.

Click on the image for a larger view.
The bulb for the Unit Number display had been repaired by soldering a grain-of-wheat bulb to the original bulb contacts.
We will try replacing it with an LED car dome light bulb that will be bright, but will not make much heat.
The one we are getting has 8x LEDs in it so we can remove some of the LEDs if it is too bright.

The vacuum pump, reel motors, and reel brakes in the TU20 are working OK.
We need to debug the BOT/EOT sensors so we can get it to go ready. 

Click on the image for a larger view.
We fixed a spare power supply from a PDP-8/S and installed it in the TC59 tape controller cabinet.
Apparently there are two versions of the power supply, one with 19" mounting designed to attach directly to the cabinet, and one that is narrower that is designed to mount to the swinging door.
Ours was the larger version so we couldn't mount it to the swinging door.
It works fine, but we may try modifying the power supply base so we can attach it to the swinging door. 

We did more work on the AC power wiring.
That was a bit of a challenge because the obsolete twist-lock plugs used in this system don't follow the current ground/line/neutral connections.
We checked the wiring about five times before we powered it up. Works OK.

07/30/13
John W. Rogers donated two B310s, four B213s and a pair of B169 modules to our restoration effort.
We have had plenty of problems with the B310 Delay Lines, and the B213 JAM Flip-Flops.
There are lots of B169 Inverters in the system, but none have failed yet.
These spares will be really helpful.

08/3/13
The PDP-9 worked fine this weekend. That is four weeks of running OK. Cross your fingers that it stays reliable for a while! 

We finished wiring the new AC outlets for the PDP-9 and TU20 tape drive.
We had the PDP-9 processor, the TC59 tape controller, and the TU20 tape drive all powered at the same time while the PDP-9 ran diagnostics.
The AC controller in the TC59 is connected to a switched outlet on the PDP-9.
When you power on the PDP-9 it tells the TC59 to power on too.

Click on the image for a larger view.
We replaced all of the indicator lights in the TU20 with the CM7370 bulbs that we got from Mouser.
During testing we found a burned trace on the display panel that prevented the POWER light from working.
Warren soldered a wire where the trace was.
The LED car dome light bulb that we got on eBay will work nicely.
It is not too bright and won't make enough heat to melt the film with the unit numbers.
We need to get some thin double-sided adhesive tape so we can put the unit number display back together.
The REV function of the tape drive decided to work this week, but the FWD doesn't.
The BOT sensor that was always on last week is now off.
Maybe we should just leave the tape drive powered on for a while to see what else starts to work.

We connected the I/O bus from the PDP-9 to the TC59.
We don't have the two BC09 quad-cables that are used for the PDP-9, PDP-10, and PDP-15 I/O so we used 8x cables from a PDP-8.
We can actually write bits from the Accumulator to the Command Register in the TC59.
That means that lots of the TC59 are functional.

We need to reconstruct the TC59 diagnostic tapes for the PDP-9 so we can start testing the controller.
We have tape images of the PDP-15 versions of the TC59 diagnostic tapes and source listings for the PDP-9 versions.
I wrote a PDP-9/PDP-15 disassembler in C#. It even seems to work OK.
We could disassemble the PDP-15 tapes into MACRO source, edit the source to match the PDP-9 listings, then assemble the source using the SIMH PDP-9 simulator, and punch a tape with the PDP-8/S.
That will be a bit of work, but should create a good tape.
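
To give an idea of what the disassembly step involves, here is a minimal Python sketch (not the C# disassembler mentioned above). It assumes the standard 18-bit word layout: a 4-bit opcode in bits 0-3, the indirect bit in bit 4, and a 13-bit address in bits 5-17. The function name and the output format are just for illustration, and EAE, IOT, and OPR words are left as raw octal.

```python
# Memory-reference opcodes, written as the top two octal digits of the word.
OPCODES = {
    0o00: "CAL", 0o04: "DAC", 0o10: "JMS", 0o14: "DZM",
    0o20: "LAC", 0o24: "XOR", 0o30: "ADD", 0o34: "TAD",
    0o40: "XCT", 0o44: "ISZ", 0o50: "AND", 0o54: "SAD",
    0o60: "JMP",
}

def disassemble(word: int) -> str:
    """Decode one 18-bit word into a mnemonic, or raw octal if not decoded."""
    word &= 0o777777
    opcode = (word >> 12) & 0o74      # bits 0-3, kept in two-digit octal form
    indirect = bool(word & 0o020000)  # bit 4
    address = word & 0o017777         # bits 5-17
    name = OPCODES.get(opcode)
    if name is None:
        # EAE (64), IOT (70) and OPR (74) need their own decoders.
        return f"{word:06o}"
    return f"{name} I {address:05o}" if indirect else f"{name} {address:05o}"

if __name__ == "__main__":
    for w in (0o600021, 0o620021, 0o040100, 0o740010):
        print(f"{w:06o}  {disassemble(w)}")
```

A real disassembler also has to understand the paper tape loader format and generate labels, which is where most of the work is.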

08/10/13
We got the unit number and status display on the TU20 tape drive fixed. The car dome light LED bulb that we tried last week in the unit number display was a little too big. (I also installed it backwards so it didn't work.) We bought a smaller one, and it works nicely. It has a reasonable brightness, but it is very white compared to the incandescent bulbs. Maybe we can put a yellow sleeve over it to make it look more like the bulbs. The unit numbers are what looks like photographic film stuck to the drum with double sided adhesive. The original adhesive was dried out and the film was just sitting near the display. Carpet tape was a good substitute for the original adhesive. We were a little concerned about substituting an incandescent bulb from a car dome light for the original bulb because it might get hot. The LED "bulb" is bright but it won't get hot and melt the film with the drive numbers.

We tried the mag tape boot loader paper tape that came with the PDP-9 just to see what it would do. It ran for a few instructions, and then executed a CAL instruction and got stuck in a loop at addresses 21/22. We tried the ISZ diag but it failed. Even an ISZ/JMP/JMP loop failed. The built-in maintenance test ran OK, so at least the adder, registers, buses, and some of the Control Memory are working OK. We started debugging the odd behavior in the Control Memory. It looked like sometimes the Address Promotion logic that decodes the next Control Memory address worked incorrectly. After about 30 minutes of work the processor decided to work OK again. We let the processor run the instruction test 1 diagnostic while we worked on the tape drive. After several hours it stopped running again. We will work on it Monday.

The BOT sensor on the TU20 drive would not go on. That fix was just an easy adjustment on the Photosense Amplifier. (I bought a manual for the HP 7975A tape drive on eBay, scanned it, and put it on Bitsavers.) The EOT sensor is always on. We found a bad 2N1304 transistor in the EOT circuit on the Photosense Amplifier. Warren replaced it with a 2N2222 that he had. The EOT circuit in the transport works now, but there is still an EOT logic problem in the tape logic that we now need to fix. Fortunately the nice people at the Living Computer Museum had a DEC TU20 manual and scanned it.

The Hours Meter showed 40287 when we finished. We have added 124 hours since the restoration started.

08/11/13
I spent some time on the TU20 today.
The W501 Schmitt Trigger in slot D09 that buffers the EOT signal was bad.
I replaced it with a spare and now both the BOT and EOT sensors seem to work.
It is interesting that one of the Germanium transistors in the EOT circuit in the tape transport and the W501 that buffers the signal were both bad.

The FWD button did not work.
The switch was OK, but the W501 in slot C10 that buffers the switch was bad.
I replaced it with a NOS spare and that signal works OK now.
FWD still did not work.
I traced the FWD signal from the Schmitt Trigger to a R602 Pulse Amplifier in slot B13.
I replaced it with a spare and now FWD works sometimes.

The power supply in the TU20 tape drive died during testing.
The AC capacitor in the ferroresonant circuit was about to overheat and leak.
We had a spare because this is a common failure in the DEC power supplies.
It only took about 10 minutes to fix it.

There is still something flaky with the tape drive.
I don't think that the power on reset circuit or the switch RESET circuit is working correctly.
We will spend more time on it tomorrow.

08/12/13
It was a holiday in RI today so we had some more debug time.

Click on the image for a larger view.
We found a R602 Pulse Amplifier in slot B16 that was bad.
One of the leads on a transistor was rusted through and a diode was cracked.
Replacing it with a spare fixed the RESET button.

We found an R401 Clock module in slot A15 that was bad.
Replacing it with a spare fixed the power on initialization issues.
The tape motion seems to be completely working in local mode now.

We connected the TU20 tape drive to the TC59 tape controller that is connected to the PDP-9 processor.
We ran the little paper tape bootloader that came with the PDP-9.
It tested to see if the tape controller was ready (it was),
    then selected the TU20 tape drive and tested to see if it was ready (it was),
    then rewound the tape (that actually worked and it stopped at the BOT marker),
    then it read lots of tape and tried to execute the garbage on the tape that we used.
Now we know that most of the time the tape controller and tape drive motion control are working correctly.
Since we don't have a bootable magnetic tape for the PDP-9 there is not much we can do further.
We need to reconstruct a bootable magnetic tape.

YouTube Video



We don't have paper tape images of the PDP-9 mag tape diagnostics.
We do have the source, but it would be quite a project to reconstruct the paper tape.
Warren looked at the code for the PDP-15 magnetic tape diagnostics.
Even though the format of the tape is very different from the PDP-9 version, the diagnostic is compatible with the PDP-9.
We powered on the PDP-8/S for the first time in a few months, relearned how the PC->paper tape punch process works, and punched a TC59 diagnostic paper tape.
The resulting paper tape actually read correctly and ran on the PDP-9.
The results were not pretty because it failed every test that we tried.
We have the failure results and will create a debug plan for next Saturday.

08/17/13
Click on the image for a larger view.

We reran the MAINDEC-15-D4AC-PB TC59 Magnetic Tape Control Instruction Test.
This diagnostic behaves the same as the MAINDEC-9A-D4AF-D TC-59 Instruction Test even though the code is different.
We have the documentation for the PDP-9 version, but not the PDP-15 version.
The diagnostic failed immediately on test 0 where it tests the IOT instructions for the TC59.
The first test writes all bits on to the command register, reads the command register, clears the command register, and then reads the command register again.
The second time it reads the command register all of the bits should be zeros, but they were ones.
If we looped the diagnostic, the status lights would sometimes stay full on, sometimes they would be off, and sometimes they would flicker.
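
The write/read/clear/read check described above can be sketched in Python against a toy register. This is only an illustration of the test logic, not the MAINDEC code: the class, its method names, the 18-bit width, and the `clear_works` switch are all assumptions made for the example.

```python
class CommandRegister:
    """Toy stand-in for the command register; not a model of the real TC59 logic."""
    MASK = 0o777777  # assumed width, for illustration only

    def __init__(self, clear_works: bool = True):
        self.value = 0
        self.clear_works = clear_works  # False imitates a missing CLEAR pulse

    def load(self, ac: int) -> None:
        self.value = ac & self.MASK

    def read(self) -> int:
        return self.value

    def clear(self) -> None:
        if self.clear_works:
            self.value = 0

def command_register_test(reg: CommandRegister) -> bool:
    """Write all ones, read back, clear, read back again."""
    reg.load(CommandRegister.MASK)
    first = reg.read()
    reg.clear()
    second = reg.read()
    return first == CommandRegister.MASK and second == 0

if __name__ == "__main__":
    print(command_register_test(CommandRegister()))                   # healthy register
    print(command_register_test(CommandRegister(clear_works=False)))  # the failure we saw
```

With the clear disabled the second read still returns all ones, which matches the symptom we saw on the real hardware.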

We entered a short program to do just what is described above and watched controller signals with a 'scope.
The decoded IOT and data signals looked fine, so the PDP-9 processor was OK.
We traced the issue to an intermittent CLEAR ALL signal for the command register, see schematic page TC-50-0-2, section C5.
The signals going to the R602 Pulse Amplifier in slot A21 were solid, but the output on pin K was intermittent.
We used both of our spare R602 modules in the TU20 tape drive so we need to fix one.
Warren merged the components of the two broken R602 modules from the TU20 to replace this broken one, but it didn't fix the problem.

08/17/13
It was an abbreviated work weekend due to other priorities.
We adjusted the supply tape hub so it would grip the reel better. Last week it let a tape reel go flying.
The Hours Meter showed 40294 when we finished. 

08/31/13
We continued debugging the TC59 magnetic tape controller on the PDP-9.
Without Warren telling me where to put the 'scope probes the debugging is a little slower. (He is in South Dakota, hopefully only for a short time.)
I swapped the suspect R602 pulse amplifier with one from another area in the tape controller, but the misbehavior was the same.

I thought that the problem might actually be one of the boards connected to the output signal, so I pulled all of those modules a few at a time.
Again, no change in the behavior.

The schematics are a little confusing because they use diamonds and arrows to indicate level and edge signals.
They also use white diamonds and arrows to indicate active high and rising edge signals, and black diamonds and arrows to indicate active low and falling edge signals.
It gets really confusing when they use a white diamond (0) and a black diamond (1) from a flip-flop. These are actually the same signal.

I revisited the input signals to the pulse amplifier and found that the POWER CLEAR was always at ground potential.
Since this signal has a white arrow it should normally be at -4V and pulse high when it is active.
See page 18 in the TC59 tape controller schematic PDF file.
I chased the signal through an R107 inverter (page 10) in the tape controller, up the I/O cable, and into the I/O controller in the PDP-9.
I found a defective B213 Jam Flip-Flop in slot E20 of the I/O controller. (See page 115 in the processor schematics)
Fortunately John Rogers recently donated some PDP-9 spares, including B213 modules.
Replacing the B213 fixed the misbehaving signal in the tape controller.

I toggled in a little program that executes tape controller IOTs to exercise the newly fixed signal, and it works fine now.
When you hit the I/O reset key you get CLEAR ALL pulses for about 2 seconds, and then the signal goes low, exactly what we want to see.

I tried to load the MAINDEC-15-D4AC-PB  TC59 MAGNETIC TAPE CONTROL INSTRUCTION TEST from the PDP-15.
The loader at the beginning of the tape loaded and ran, but didn't load the rest of the diagnostic tape.
This diag has loaded before and shown the problem with the CLEAR ALL signal.
Something must be broken in the PDP-9 processor again.
The processor runs the ISZ test, Memory Checkerboard, Memory Address test, and a small loop of instructions to write to the TTY just fine.
Maybe fixing the I/O controller in the PDP-9 broke something else in the processor or I/O controller.

The B213 Jam Flip-Flop in slot E20 of the I/O controller that I replaced handles the IO PWR CLR signal and the IOT 4 signal.
Maybe the half of the B213 for the IO PWR CLR signal is working and the half for the IOT 4 signal is not.
That will be the project for next time.

09/28/13
It looks like it was an operator error problem when the D4AF magnetic tape controller diagnostic failed to load last time.
We have the documentation for the PDP-9 version of the diagnostic, and the paper tape from the PDP-15 version.
The PDP-9 version loads at 17720 and the PDP-15 version loads at 17700.
I must have set the address switches to the PDP-9 settings last time.
This time I used the PDP-15 address and the diag loaded and ran for a while.


It passed the IOT Test Part 1,
the Command Register and Bit Data test,
the Data Buffer Bit and Data test,
the Data Channel Transfer Direction test,
but failed on test sequence #4 the IOT Test Part 2.
So the tape controller is partially functional.

The output from the diagnostic shows the address of the beginning of the failed test routine as 001644.
We need to disassemble the PDP-15 diagnostic tape so we can see if the failing test is the same as the PDP-9 source listing, and then see if we can determine what it was doing at the time.
That should give us an idea of where to look for the problem.
The Hours Meter showed 40298 when we finished. 

10/05/13
This week the PDP-9 would not load the mag tape diagnostic, even with the correct switch settings.
After 30 minutes of fiddling with reading and writing core memory, trying instructions, and single stepping the diag loader program it decided to work OK and run some diagnostics.
The D0BA ISZ Test ran OK.

The D1AA Basic Memory Checkerboard halted at address 233.
The address of the error was 17766, right where the diag loader goes.
The core contents should have been a 777777, but was 777767, so, we dropped bit-14.
This failure was very repeatable.
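
Working out which bit was dropped is easy to get wrong because DEC numbers the bits from 0 at the most significant end to 17 at the least significant end. A small Python sketch of the arithmetic (the function name is just for illustration):

```python
WORD_BITS = 18  # PDP-9 word length

def differing_bits(expected: int, actual: int) -> list[int]:
    """Return the DEC bit numbers (0 = MSB, 17 = LSB) where two words differ."""
    diff = (expected ^ actual) & 0o777777
    return [bit for bit in range(WORD_BITS)
            if diff & (1 << (WORD_BITS - 1 - bit))]

if __name__ == "__main__":
    # Expected core contents versus what the checkerboard test read back.
    print(differing_bits(0o777777, 0o777767))
```

For the failure above the only differing position is octal 10, which is bit 14 in DEC numbering.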

The D01A Instruction Test 1 halted at 3124, or E284.
At this point it was testing the Link and AC register.
The AC contained 000020, should have been 000000, so we dropped bit-14 again.

I swapped the (normally troublesome) B213 modules for bits 14/15 and 16/17 in the AC register and the fault moved to bit 16.
I replaced the B213 module in slot C35 with one from our recent Canadian donation and everything worked OK.
It successfully ran the Instruction Test 1 for over an hour during our lunch break. 

I tried to reload the mag tape diagnostic.
The loader ran, but did not read the rest of the paper tape.
Single stepping the loader showed that a bit in one of the instructions was dropped making a DAC instruction go to the wrong address.
This changed a JMP I instruction and made the loader go into the weeds.

I reloaded the D1AA Basic Memory Checkerboard.
It failed with a dropped bit-14.
I looked through the schematics and found LOTS of flip-chip modules related to bit-14.
I swapped the B169 modules in slots B32 & B36, no change.
I swapped the B169 modules in slots B33 & B37, no change.
I swapped the B169 modules in slots B34 & B38, no change.
I swapped the B169 modules in slots B35 & B39, no change.

There are lots more modules to try.

Oh well, more debugging work to do next week.

10/14/13
We continued debugging the PDP-9 processor problem.

I ran the D1AA Basic Memory Checkerboard which quickly halted.
The diagnostic information said that core location 7 was bad and should have contained 777767.
The only possible core contents are 000000 or 777777, so the diagnostic program was not OK.

Warren suggested running some small instruction loops to see what actually works.
An ISZ, JMP, JMP loop worked OK.

A LAC, CMA, JMP loop failed with bit-14 in the AC = 0.
After a little more debugging I found that a LAC instruction that loaded 777777 from a core location would put 777767 in the AC.
That problem could have been in the core memory subsystem or in many of the processor buses.

I tried a LAC, RAL, JMP loop and found that shifting a 1 from bit-14 to either bit-14 or bit-15 would fail.
I swapped the B213 modules between A33 and A35, but there was no change.
I swapped the B213 modules between A32 and A36, but there was no change.
I swapped the B213 modules between C35 and C39, and the error moved to bit-16.
I replaced the B213 module that came from C35 with a spare from the Canadian donation and it fixed the rotate problem.
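
For reference, here is a small Python sketch of what a working RAL should do, assuming the usual behavior of rotating the 19-bit Link plus AC left by one place. The function names are made up for the example. Walking a single 1 through every bit position is the software equivalent of the little loops I toggled in.

```python
MASK18 = 0o777777
MASK19 = 0o1777777  # Link plus 18-bit AC

def ral(link: int, ac: int) -> tuple[int, int]:
    """Rotate Link:AC left one place; AC bit 0 goes to Link, Link goes to AC bit 17."""
    combined = ((link & 1) << 18) | (ac & MASK18)
    rotated = ((combined << 1) | (combined >> 18)) & MASK19
    return rotated >> 18, rotated & MASK18

def walking_one_failures(rotate) -> list[int]:
    """Return the DEC bit numbers where a single 1 does not rotate correctly."""
    failures = []
    for bit in range(1, 18):  # bit 0 rotates into the Link, so start at bit 1
        start = 1 << (17 - bit)
        link, ac = rotate(0, start)
        if (link, ac) != (0, start << 1):
            failures.append(bit)
    return failures

if __name__ == "__main__":
    print(walking_one_failures(ral))  # an empty list means every position rotates
```

On the broken machine a test like this would have flagged the positions around bit 14.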

I tried to run the D1AA Basic Memory Checkerboard which quickly halted again.
I ran the D01A Instruction Test Part 1 but it halted at address 505.
That meant that the link was not a 1 after a rotate.
I thought that the rotate instructions were working correctly so I looked at what the diagnostic was doing.
The instruction at core location 502 should have been a 740010, but it was a 740000.
I deposited the correct instruction in that core location, but bit-14 always came back as a 0.

I deposited 777777 into core locations 0 through 77 and found that any core location where the address bit-16 was a 1 contained a 0.
I swapped the G219 Memory Selector modules between slots AB01 and CD01, and the problem moved to core bit-17.
I replaced the G219 module that came from slot AB01 with a spare, and now the core memory works OK.

I ran the D1AA Basic Memory Checkerboard for a few minutes without a problem.
So, it looks like the processor is working OK again and I can return to debugging the TC59 magnetic tape controller.

I was able to load and run the TC59 mag tape controller diagnostic again.
It still fails IOT Test #2, so that will be the project for next week.
I also noticed that some of the indicator lights for the Memory Buffer were not working.
I swapped bulbs, but that did not fix the problem.
It could be a blown MPS6534 transistor on the Indicator Bracket.
We have a few spares so we should be able to fix the lights.

Just for fun I tried the magnetic tape boot program that came with this system.
It loaded and ran, but hung in a loop waiting for the tape transport to become ready.
This has worked in the past, so something else is broken.

Next week I will look into the ready bit from the tape transport not getting to the tape controller, and the IOT diagnostic problem.

10/19/13
The processor is working fine this week and still contained the TC-59 tape controller diagnostic.

Last week I found that the READY signal from the tape drive was not active.
The PDP-9 source code for the TC-59 diagnostic says that tape drive 0 must be online and ready.
It does not have this requirement in the procedure in the front of the diagnostic manual.

I did some signal tracing in the TU20 logic and experimenting with the tape loading procedure.
I found that if you push the Forward button to get to the BOT marker when loading a tape the READY signal does not go active.
If you push the FORWARD button again, and then REWIND to the BOT the READY signal goes on.
This behavior is probably not normal, but at least now I know how to get the tape drive to go ready.

Last week three of the DATA BUFFER lights on the TC-59 did not go on during steps #2 and #3 of the TC-59 tape diagnostic.
I thought that it would be easy to transfer the contents of the Accumulator to the DATA BUFFER using the undocumented LDB IOT instruction to turn all of the lights on.
I had to dig through the diagnostic source to find that you need to set the TC-59 COMMAND register to WRITE before you can write to the DATA BUFFER with the LDB instruction.
I guess that makes sense, and I did get all of the indicator lights enabled.
By swapping indicator lights I found that one was dimmer than the others so I could not see it flicker, one was burned out, and one MPS6534 transistor on the Indicator Bracket is bad.
Circuit Specialists has the obsolete MPS6534 transistors in stock for $0.20 each.
Onlinecomponents has the DIALCO 507-3917-1475-600 indicator lights in stock.

With the tape drive online the TC-59 diagnostic got further in step #4, but still halted at 01657.
The ILLEGAL and ERROR lights on the TC-59 were on.

I noticed the WORD COUNT register contained a 2.
This register is set to the 2's complement of the number of words plus 1 to transfer to/from core and is incremented during each transfer.
When the value reaches zero the transfers stop. So, it is a little unusual for it to contain a 2.
That will be the project for next week.
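
A quick Python sketch shows why a 2 in the WORD COUNT register looks odd. It assumes a plain 18-bit register that counts up to zero and ignores the exact "plus 1" offset convention, which I have not verified; the function names are just for illustration.

```python
MASK18 = 0o777777

def word_count_for(n_words: int) -> int:
    """18-bit two's complement of the number of words to transfer."""
    return (-n_words) & MASK18

def transfers_remaining(word_count: int) -> int:
    """How many increments it takes for the register to wrap around to zero."""
    return (-word_count) & MASK18

if __name__ == "__main__":
    wc = word_count_for(3)
    print(f"{wc:06o}", transfers_remaining(wc))  # a normal small transfer
    print(transfers_remaining(2))                # what a leftover 2 would imply
```

A register set up for a short transfer should be close to 777777, so a value of 2 means it was either never loaded as expected or has already counted past zero.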

The Hours Meter showed 40310 when we finished. 

10/26/13
When I was debugging the TC59 mag tape controller on the PDP-9 last week I found that sometimes the READY signal from the TU20 would not go active.
Today the intermittent problem became permanent.
No matter what I did with the TU20 controls the drive would not go ready.

The ~MOTION signal goes inactive when the tape is moving and goes active when the tape stops.
That is supposed to cause the SETTLING DOWN signal to go active for 5ms, and then go inactive.
The SETTLING DOWN signal is always active, so READY never goes active.

I measured the voltage drop on all of the transistors and diodes on the R303 Integrating One-Shot in slot A21 that drives the SETTLING DOWN signal.
All of the measurements look OK.
I am not really qualified to debug an R module to the component level, so a replacement module would be really helpful.

The next step is to get a replacement R303 module, or get some help debugging the broken one.

The PDP-9 processor worked fine this week.

11/2/13
We received lots of comments, suggestions, and offers for help with the R303 module problem.
I did some more checking on the R303 module in slot A21 based on Vince's and Warren's suggestions and found that the R9 trim-pot was open.
Jack Rubin sent us a replacement R303 module. It had several bad diodes so it became a source for a 20k Ohm trimpot. We will repair it later.
With the replacement trim-pot and the R303 adjusted for a 5ms delay it works!
Now the TU20 tape transport goes ready, and we continued with the TC59 tape controller diagnostics.
The TU20 periodically goes into Forward mode when it is online. We will need to debug that too.

When we last ran the Tape Controller Instruction Test it was failing test #4.
The first four tests check some of the IOT instructions, then read/write the Command Register, then read/write the Data Buffer, then do some Data-Break activity.
All of this worked OK so major parts of the tape controller are working.
I was a little surprised to find that the Data-Break is working because it is very complicated, and both the circuitry in the processor and tape controller for it have not been tested. 

Tape Controller Instruction Test #4 does more IOT instruction testing.
It clears the tape controller registers, then tells the tape controller to execute the configured command with a MTGO IOT instruction.
This instantly causes an Illegal Command error because the TC59 can't execute a NOP command.
Then it reads the TC59 Status Register looking for the Error Flag and Illegal Command bits.
The Status Register was not getting loaded into the Accumulator, so the test failed when it did not find the error bits. 

Note the burned resistor in the middle of the W640 Pulse Amplifier.

I entered a short instruction loop of MTRC, MTRS, and a JMP.
I then looked at the RCM and RST signals from the IOT decoder. See page TC59-0-3 in the schematic.
Both decoded IOT signals looked OK.
Both signals get buffered by a W640 Pulse Amplifier in slot F22.
The buffered CM->I/O BUS signal looked OK.
The buffered STATUS->I/O BUS signal was always at ground.
I pulled the W640 Pulse Amplifier and found a burned 100 Ohm current limiting resistor for the transformer.
I borrowed a 100 Ohm resistor from another module, but the STATUS->I/O BUS signal was still at ground.
Further tracing found two 2N3639 transistors in the W640 that had failed.
The failed transistors turned the driver transistor on all of the time, which burned the current limiting resistor.
With the bad transistors replaced the Pulse Amplifier now amplifies.
With the STATUS->I/O BUS signal working the STATUS REGISTER gets loaded into the Accumulator.
It also passes Tape Controller Instruction Test #4 now. 

The next test in the sequence, Tape Controller Instruction Test #5, Partial Command Decoding, immediately failed.
PC=3247, ADRS=7517, 001000, AC=0, WC=2, CA=007700, CMD=001000, STAT=540000, CADATA=777777
This test tries to rewind the tape when it is at the BOT.
It should cause an INSTRUCTION FAULT.
That will be next week's project. 
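
The halt printout is all octal, so a small helper makes it easier to compare runs. This Python sketch parses the NAME=value fields from a line like the one above and lists which bits are set in DEC numbering (bit 0 is the most significant). The function names are made up, and I have deliberately not assigned names to the status bits because that mapping needs to be checked against the TC59 documentation.

```python
import re

def parse_halt_line(line: str) -> dict[str, int]:
    """Collect NAME=octal fields from a diagnostic printout; bare values are skipped."""
    return {name: int(value, 8)
            for name, value in re.findall(r"([A-Z]+)=([0-7]+)", line)}

def set_bits(value: int, width: int = 18) -> list[int]:
    """DEC bit numbers (0 = MSB) that are 1 in an 18-bit word."""
    return [bit for bit in range(width) if value & (1 << (width - 1 - bit))]

if __name__ == "__main__":
    line = ("PC=3247, ADRS=7517, 001000, AC=0, WC=2, CA=007700, "
            "CMD=001000, STAT=540000, CADATA=777777")
    fields = parse_halt_line(line)
    print(set_bits(fields["STAT"]))  # which status bits were on at the halt
```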

I also tried Tape Controller Instruction Test #6, Initial Tape Motion.
As soon as the tape controller sent a command to the tape transport, the tape transport went offline.
Yet another issue to debug.

11/10/13
We had a lot of visitors at the RICM this Saturday, partially because of the book sale, so PDP-9 debugging time was limited.
I ran the TC59 mag tape controller diagnostic and it again failed, as expected, on test 5.
PC=7247, ADRS=7517, AC=0,WC=2, CA=7701, COMD=001000, STAT=540000, CADATA=777777
When I hit the CONTINUE switch the PC=2004.
At least this means that the processor is still working OK.
The first thing that this diagnostic does is try to rewind the tape when it is at the BOT (Beginning Of Tape).
This should fail as an illegal instruction. It also causes the tape drive to go offline, which it shouldn't.

At this point I started fiddling with the TU20 tape drive, and found that it would only go online for a moment.
I checked the function of all of the control panel switches and their associated W501 Schmitt Triggers.
These are static levels and are all OK.

The OFF LINE and ONLINE signals from the W501s in slots C08 and C09 drive the R202 Flip-Flop in slot A20.
I looked at the LOCAL and REMOTE signals from the Flip-Flop, and found that REMOTE (the signal that makes the tape drive go online) would only go active for about 200ns.
I swapped the R202 Flip-Flop, but there was no change in the behavior.
I tried pulling or swapping the Flipchips that are connected to the LOCAL and REMOTE signals.
I swapped the Pulse Amplifiers in slots B12, B13, and B14, but saw no change in the behavior.
I removed the W051 in slot D12, but saw no change in the behavior.
I swapped the W501s from slots C08 and C09 to C12 and C13, but saw no change in the behavior.
I removed the R113 in slot A14, but saw no change in the behavior.

Monday I will look at the POWER CLEAR signal that presets the Flip-Flop to see if that is causing the problem.
I need to look at the R107 in slot A17.

Once I can get the tape drive to stay online I can continue debugging the TC59 tape controller.

11/11/13
The problem with the TU20 tape drive not staying ONLINE was easy to fix, but I am not sure what was wrong.
I looked at the POWER CLEAR signal and found that the signal was always active.
This was causing the LOCAL/REMOTE flip-flop to go back to LOCAL when it was set to ONLINE.
I replaced the R107 inverter that enables the R401 clock that makes the POWER CLEAR signal.
Everything worked OK after the flipchip replacement.
Just as a check I put the original R107 inverter back, and everything still worked OK.
Oh well, at least it is working again.

I tried the tape diagnostic and it failed at the same point in test 5, the second IOT test.
I single-stepped the program and it looks like it is trying to rewind a tape that is at the BOT, looks for the expected error, and does it again.
After a lot of loops it reports an error. Not sure why.

I tried the next test in the diag that tests tape motion.
It failed too and left the drive in the FWD mode.
After fiddling with the drive again I found that FWD didn't work well, but REV did.
Rewind was also sluggish.
I cleaned the tape path and found lots of contamination.
That didn't help.
I tried adjusting the FWD capstan.
That made a slight improvement, but not enough to make it work perfectly.
The coil voltages to the FWD and REV capstan electromagnets are the same, so the problem is probably not electronic.
I noticed that the tape between the vacuum columns was difficult to move.
Maybe the tape was sticky?
I tried a reel of CDC mag tape and the sticky tape problem went away.

We don't have any tapes that were written on a 7-track drive and are working under the assumption that the diag will write before it reads.
That may not be the case.
I can periodically see data being decoded on the tape controller when the tape is moving, but that might be EOFs or something.
Maybe I need to write a small program to write a bunch of records and two EOFs.
Maybe then the diag will work correctly.

11/16/13
The magnetic tape diagnostic failed in the same way as it did last week.
That means that the tape transport and the processor are still behaving the same as last week.

To simplify the tape controller debug process I toggled in a short program that just writes an EOF to the tape.
The data for the EOF record comes from the tape controller so it does not need data-break (DMA) for this function.

It looks like the simple program runs OK, but the tape just keeps on spinning instead of stopping after the record is written.
All of the timing in the tape controller is performed by counters.
One counter runs off a clock that matches the tape density.
When it counts down from 32 it reads acceleration and deceleration values from the tape drive and puts them in another counter.
When that counts down the tape is up to speed and the EOF record and an EOR are written.
It looks like this happens, but something should stop the activity after the records are written.

Next week I will look into the circuitry that stops the writing process.

11/24/13
I did a lot of reading and research this week on how the TC59 tape controller and tape transports interact.
DEC developed a tape transport data bus for the TC58 and TC59 that allows up to eight TU10, TU20, and TU79, 7 track or 9 track tape transports to be connected to a single TC59 controller.
Even though the tape transports were manufactured by different companies the DEC interconnect was always the same.
Since the mechanics of the tape transport were different, DEC invented a way for the tape controller to ask the tape transport for the acceleration and deceleration time delays for each type of tape command.

The tape controller starts by receiving the MTGO IOT instruction from the processor.
It then waits a little for the tape controller electronics to decode the instruction and determine if everything is OK.
It then tells the tape transport to send the acceleration/deceleration time value, starts the transport moving, waits the appropriate acceleration time, checks to see that the data has been written to the tape, waits the appropriate deceleration time, and then signals that the command has been completed.

I wrote a little instruction sequence to just write an EOF to the tape and then checked each signal in the write sequence. See page TC50-0-11 of the schematics.
Everything looked OK until I got to receiving the delay value from the transport.
The R 1/7(1) signal from the tape transport was always at ground. See connector A01 on schematic TC50-0-3 sheet 1/2.
I swapped the R123 Diode Gates in slots B17 and B18 of the TU20. No change. See schematic TU20-0-1 sheet 1/2.
I tried replacing the R107 Inverter in slot D01 of the TC59. No change. See schematic TC50-0-3 sheet 1/2.
I removed the R107 Inverter in slot D01 of the TC59 and ran the test program. The R 1/7(1) signal went to a logic low.
I pulled the cable from the TU20 that plugged into connector A01 in the TC59. The R 1/7(1) signal went to a logic low. See schematic TC50-0-3 sheet 1/2.
That meant that the problem was probably not in the TC59.
I pulled the R123 Diode Gate in slot B17 of the TU20. The R 1/7(1) signal went to a logic low. See schematic TU20-0-1 sheet 1/2.
The RD BUF 6(1) signal to pin S on the R123 in slot B17 was stuck low.
I replaced the R203 Triple Flip-Flop in slot B27 of the TU20 and now the RD BUF 6(1) signal looks OK. See schematic TU20-0-3 sheet 1/2.
I also found that the R 4/5 signal from the tape transport is weak, but the data is still received OK by the tape controller. We need to investigate further.
I continued checking the signal sequence from TC50-0-11 of the schematics.

The RECORD DATA signal from the R602 pulse amplifier in slot A08.
The upper trace is pin K. It has a nice sharp rising edge, but takes almost 2 ms to turn off.
The lower trace is pin D, the collector of the first transistor in the pulse amplifier.

All of the subsequent signals seem to work OK, but we will need to look at this in the future.
Once the defective flip-flop was replaced everything on the write sequence worked.

The tape transport reads the data after it is written to the tape.
I started looking at the read side of the circuitry and found that the READ SKEW OVER signal is inactive.
I don't know if the data was not written to the tape or that it was not read.
The tape transport is not sending the read data to the tape controller, so the controller does not know that the write is complete and the tape just keeps spinning.
That will be the debugging project for next week.

12/1/13
The PDP-9 processor is still running OK, and has been for several weeks.
After last week's repair the data from the TC59 tape controller is getting to the TU20 tape transport OK.
Each data bit from the tape controller goes through an R302 delay module in the tape transport to compensate for skew in the tape head.
The data then goes to R205 Flip-Flops, to the G287 Write Drivers, and to the write head.
Since the data is NRZI encoded, a 1 bit in the data will cause the flip-flop to change state and write data to the tape.
A 0 bit does not write anything to the tape.
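
The encoding rule described above (a 1 toggles the flip-flop, a 0 writes nothing) can be sketched in a few lines of Python. This is only an illustration of the rule for a single track over successive characters, not a model of the R205 circuitry; the function names and the starting level of 0 are assumptions.

```python
def nrzi_encode(bits, level=0):
    """Return the flip-flop level after each bit time on one track.

    A 1 bit toggles the level (a flux transition on the tape);
    a 0 bit leaves the level unchanged (nothing is written).
    """
    levels = []
    for bit in bits:
        if bit:
            level ^= 1  # toggle on every 1 bit
        levels.append(level)
    return levels


def count_transitions(bits):
    """Count flux transitions: exactly one per 1 bit."""
    return sum(1 for bit in bits if bit)


print(nrzi_encode([1, 0, 1, 1, 0]))
print(count_transitions([0, 0, 0]))
```

A track that only ever sees 0 bits never toggles, so nothing is recorded on it.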

The data from the TC59 for tracks 1-7 is 1100000.
The data output from the R302 delay modules looks OK. See schematic TU20-0-3 sheet 1/1.
The state of the R205 flip-flop in slot B02 looks OK, but the flip-flops in slots B03 and B04 have both outputs low. That is not right.
I swapped the flip-flop modules in slots B03 and B04 for spares but saw no change.
I swapped the flip-flop module in slot B02 with the one in slot B03 but saw no change.
I removed the write driver in slot A03 but saw no change.
I swapped the write driver in slot A02 with the one in slot A03 but saw no change.
This will be the debugging project for next week.

12/7/13
This week I started by looking at the TU20 tape drive power on circuitry.
The R205 flip-flops that create the NRZI data that is written to the tape get a POWER CLEAR preset signal.
I found that four of the flip-flops didn't preset correctly.
I swapped two of the flip-flop modules in slots B04 and B05 for spares, but found that four of the spares were also bad.
Fortunately we have lots of the R205 modules, but eventually we will need to repair some of the broken ones.

Each bit of the record data from the tape controller goes through an R302 adjustable delay to compensate for the mechanical delay in the tape head.
One delay was always driving the signal to the flip-flop.
Replacing the R302 in slot B09 fixed that problem.
I measured the resistance of the two trimpots on the original R302 that control the delay time and set the replacement R302 to the same values.
Hopefully the tape data skew will be close enough to work OK.

I ran the little tape diag program that just writes an EOF to tape.
Now I can see the delay register, write buffer, and read buffer indicators light up on the tape controller.
This is real progress!
The write buffer lights indicate 170, an EOF character.
The read buffer indicates 540, so there is more work to do on the read side of the tape transport or tape controller.

Each time the WRITE ENABLE signal is received by the tape transport it presets the NRZI flip-flops so the data on the tape is predictable.
The R603 pulse amplifier in slot A09 that generates the preset was bad.
Replacing it fixed that issue.
The G287 Write Amplifiers that drive the tape head now all have the correct differential voltage for writing the EOF character to the tape.

At this point the PDP-9 processor died.
I ran the built-in maintenance function and found that the registers incremented, but did not always carry from bit 9 to 10.
This could be a problem with the Address Register, Adder, Memory Buffer, multiplexer, etc.
That should not be difficult to fix.

12/15/13
This week I had planned to debug the PDP-9 processor after the failure last week.
The symptoms were similar to previous failures of the B213 Jam Flip-Flops in the registers.
The built-in Maintenance Test makes this debug task fairly easy.
We have some spares and I knew just where to look for problems. 

It looks like some Christmas elves (maybe Warren?) visited the warehouse during the week.
When I powered on the system everything looked OK.
I toggled in the little tape controller debug program and it actually worked normally.
I am sure we will see this processor failure mode again, but for now I went back to debugging the TC59 tape controller and TU20 tape drive.

The tape controller debug program just writes an EOF record to the tape.
This doesn't use data-break (DMA) so it doesn't use much of the tape controller logic.
Previously I had traced the signal flow from the processor to the tape controller, and from the tape controller to the tape drive.
I was convinced that everything was working OK.
Last week I traced the signal flow in the tape drive and found quite a few problems in the NRZI write logic.

This week I looked at the output of the G287 Write Driver modules that drive the write head.
Wherever there was a 1 bit in the cable from the tape controller there was a +/- 15V pulse on the signals to the tape head.
That looks promising, and is probably causing something to be written onto the tape.
They used to make a fluid called Magnasee that you could brush on the magnetic tape and see the magnetic field.
If we had some of that fluid we could check to see if the tape was actually written.

Working under the assumption that the tape is being written I started looking at the read logic.
All I could see on the signals from the tape head was noise.
This tape drive reads what it writes to determine when it has completed a function.
I need to do some more research on signal timing so I know exactly when to look for the read data on the tape.

Just for an experiment I tried sending a Write command instead of a Write EOF to the tape controller.
I had to set the Word Count (2s complement) and Current Address values in memory for data-break.
It looks like this works OK and the data-break terminates when the Word Count gets to zero.
The tape still keeps spinning because the controller doesn't detect the read data.
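
As a rough sketch of the Word Count arithmetic, assuming the count is stored as the 18-bit 2s complement of the number of words and is counted up until it reaches zero (the helper names below are made up for illustration):

```python
WORD_BITS = 18
WORD_MASK = (1 << WORD_BITS) - 1  # 0o777777


def encode_word_count(words):
    """Return the 18-bit 2s complement of a word count."""
    return (-words) & WORD_MASK


def decode_word_count(register):
    """Return how many words an 18-bit 2s complement count represents."""
    return (-register) & WORD_MASK


# A 2-word transfer: the value to deposit in the Word Count location.
print(oct(encode_word_count(2)))
```

Decoding a leftover Word Count the same way gives the number of words that were not transferred.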

I also noticed that three of the Read Buffer indicator lights were on when writing to the tape.
This is probably not correct and needs investigation.

12/16/13
The processor misbehaved when it was first turned on, but worked OK after about 5 minutes. Someday we will need to debug this issue.

Yesterday I noticed that three of the Read Buffer indicator lights were always on when the tape controller was busy.
The lights for tracks 4 & 8 were bright, and the light for track B was dim and then bright.
I followed the signals from the inverters that drive the lamps, through the I/O cable to the tape transport, through the Diode Gates, and to the READ BUFFER flip-flops.
The data on the cable showed the same pattern as the indicator lights, but not the same as the contents of the READ BUFFER flip-flops.
Some of the signals from the R123 Diode Gate that drives the cable looked really strange.
When I pulled the module to swap it I found a R203 flip-flop in slot B17 where a R123 Diode Gate should be.
Swapping the module for the correct one did not fix the problem.

The upper trace is the enable signal from pin N of the R107 in slot A23 that enables the Diode Gates that drive the Read Data signals to the tape controller.
The lower trace is the signal from pin K of the R113 Diode Gate in slot B20. This explains the indicator light behavior.

I continued looking at the other modules that drive the signals on the cable and found a defective R113 Diode Gate in slot B20.
Replacing that module fixed the issue with the Read Buffer indicator lights.

The mag tape system still does not work so more debugging is needed.

It looks like the tape drive is not reading data.
It does a read after write so it might not be writing on the tape.
I have contacted the Computer History Museum's 1401 restoration team about getting a 7-track tape that I can use for testing.

12/20/13
At this point I certainly know a lot more about what the TU20/TC59 should do, but it is still broken.
I have checked everything in the tape controller and the tape drive all the way to the write head.
Everything looks like it is working correctly when it writes to tape.

I wrote a little 17 instruction program that lets me set the tape controller instruction, density, parity, etc on the console switches, execute the command, and keep the resulting controller status.
There are lots of blinky lights on the TC59 tape controller when data-break is working.
I was able to set bits in core memory and then watch the bits get sent to the tape head.
It is interesting to see the controller split an 18-bit word into three 7-track tape writes.
The parity circuit is working and changes the contents of the "C" track when I set odd or even parity.

The system reads what it writes to tape so that it can make sure that it wrote OK.
It looks like it is writing OK, but it is not reading anything after a write.
I really don't know if it is not writing to the tape, or if it is not reading the tape.

I swapped some email with the IBM 1401 team at the Computer History Museum.
They are one of a very few places on the planet that can still write a 7-track tape.
Getting a 7-track tape from them will let me determine if the drive is not writing or not reading, and get me pointed in the right direction for more debugging.

Observations:
  • The RECORD DATA signal strobes data to the tape head.
  • The period is 28 μs @ 800 BPI, 40 μs @ 556 BPI, and 200 μs @ 200 BPI.
  • (I found in the TE16 manual that these values should be 28 μs @ 800 BPI, 40 μs @ 556 BPI, and 112 μs @ 200 BPI, so I need to check the  200 BPI timing again.)
  • The signal to the write head is +/-14V for all tracks except "B". That track is +/-11V. (That is about the same for the TE16.)
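
The periods listed above can be cross-checked against each other. A minimal Python sketch, assuming one character is strobed per RECORD DATA period so that tape speed = 1 / (density x period); the function name is made up for illustration.

```python
def implied_tape_speed_ips(density_bpi, period_us):
    """Tape speed in inches/second implied by one character per period."""
    return 1_000_000 / (density_bpi * period_us)


# (density in BPI, RECORD DATA period in microseconds)
measurements = [(800, 28), (556, 40), (200, 200), (200, 112)]
for density, period in measurements:
    speed = implied_tape_speed_ips(density, period)
    print(f"{density} BPI, {period} us -> {speed:.1f} in/s")
```

Under that assumption the 28 μs, 40 μs, and 112 μs figures all imply roughly 45 in/s, while 200 μs at 200 BPI implies 25 in/s, which supports rechecking the 200 BPI measurement.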
Things to check:
  • The Error Flag is set after a tape rewind. This should not happen.
  • Necessary voltage to the write head to write on a tape.
  • Expected voltage on the read head when it sees data.
The TE16 manual says that this is how the tracks are written in a 7-track magnetic tape.

The TU10 manual says that the signal from the tape head should be about +/- 14mV.
I imagine that our TU20 should be about the same.

An EOF (End of File) is written at the end of blocks of data to indicate that this is the end of the file.

12/21/13
It looks like the TU20 electronics are a little different from the documentation that we have.
  • There is a G709 in slot D1 instead of the W028 shown in the documentation.
  • The three R303 modules for the skew delay are not in our TU20. This looks like an ECO.
  • The left bit in each 6-bit character in a word gets written to track "B".
  • The 6-bit character in bits 0-5 is the first written to the tape.
This is the RECORD DATA signal at the top and the output from the G287 Write Driver for the tape head at the bottom.
The TC59 was programmed for a 2 word data-break that results in 6 character writes to the tape.
The data was all ones, so there is a magnetic field transition at each write.

12/27/13
I did some more Tape Drive debugging on the PDP-9 this afternoon.

Some of the issue with the tape drive not working may be due to "Operator Error".
I have been threading the tape through the guides, winding it on the take-up reel and pushing the LOAD button at the top of the tape drive.
This procedure doesn't close the "flux gate" that goes on the outside of the tape next to the tape head.
If you push the START button on the tape transport it closes the flux gate and starts the vacuum pump.
I am not sure exactly what the flux gate does.

I tried reading the 9-track NRZI tape that I made on the PDP-10. It didn't see any data, but it did find the EORs.
With the flux gate closed I can write a record to the tape, and the transport will stop at the end of the record.
This is a big improvement.
The LATERAL and LONG PARITY lights are lit when I record to tape.
I configured my test program to only write two words (6 characters) to tape and watched the parity bit get generated.
I set the memory locations to a variety of bit patterns to test the parity generation.
Some combinations like all zeros and even parity don't generate any flux transitions on the magnetic tape and cause the tape transport to keep spinning after the record is written.
I need to look into this.

The IBM 1401 team at the Computer History Museum will create a 7-track tape for me to read.
They also explained in detail about a situation where the IBM tape controller will substitute a character that gets written to tape to avoid the situation described above where you don't get any flux transitions.

I noticed last week that the ERROR light on the TC59 tape controller goes on when you rewind a tape.
I went through the schematics and found that EOT or BOT will turn on the error status.
That is not documented in the TC59 manual.
I will have to send DEC a note about that. ;-)

More debugging tomorrow.

12/28/13
I made a 9-track, 800 BPI, NRZI tape on a HP SCSI tape drive connected to a Sun server and tried to read it on the PDP-9.
It didn't see any bits at all.
It did see some bits from the 9-track tape written on the TU45 on the PDP-10.
After I repaired the tape drive in the HP SCSI tape drive it would no longer write tapes that could be read on the PDP-10.
Maybe not seeing any bits on the PDP-9 confirms that the problem is in the HP tape drive.
I will increase the write amplitude on the HP drive and try it again.

I looked at the read data in the TC59 tape controller when writing single bits to tape. All patterns work OK except when writing track B.

Writing all zeros with even parity causes the tape drive to keep spinning after the write.
That means I can't write a NULL character to tape in BCD mode with even parity.

Writing all zeros with odd parity works OK.
It can write all bit patterns to tape in binary mode with odd parity.
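
The difference between the two parity modes can be sketched in Python. This assumes the usual definition of lateral parity (the parity track makes the total number of 1s across the seven tracks even or odd) and that a character with no 1s on any track produces no flux transition; the function names are hypothetical.

```python
def parity_bit(char, odd):
    """Parity-track bit for a 6-bit character."""
    ones = bin(char & 0o77).count("1")
    if odd:
        return 1 - (ones % 2)  # make the 7-track total odd
    return ones % 2            # make the 7-track total even


def has_flux_transition(char, odd):
    """True if at least one of the 7 tracks carries a 1 for this character."""
    return (char & 0o77) != 0 or parity_bit(char, odd) == 1


print(has_flux_transition(0, odd=False))  # all zeros, even parity
print(has_flux_transition(0, odd=True))   # all zeros, odd parity
```

With odd parity every 6-bit value puts at least one 1 on the tape, which is consistent with binary mode working for all bit patterns.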

The chart below shows the individual bits in the Accumulator and what tracks they correspond to.
The track labeling for the 7-track drive is according to the IBM standard.
Since the TU20 is a 7-track drive it can write 6 bits of data (plus parity) at a time.
The left most 6 bits are written first, then the middle 6 bits, then the right 6 bits.
I turned on individual bits (out of a set of 6) to see if the parity generator circuit would work correctly. (it does)
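
The word-to-character mapping described above can be sketched in Python. It assumes DEC bit numbering (bit 0 is the leftmost bit of the 18-bit word) and the B, A, 8, 4, 2, 1 track order given in the text; the function names are made up for illustration.

```python
TRACKS = ["B", "A", "8", "4", "2", "1"]  # leftmost bit of a character first


def split_word(word):
    """Split an 18-bit word into three 6-bit characters, leftmost first."""
    word &= 0o777777
    return [(word >> shift) & 0o77 for shift in (12, 6, 0)]


def tracks_for(char):
    """Names of the data tracks that carry a 1 for a 6-bit character."""
    return [name for i, name in enumerate(TRACKS) if char & (0o40 >> i)]


for char in split_word(0o400012):
    print(oct(char), tracks_for(char))
```

Setting only accumulator bit 0, for example, puts a single 1 on track B in the first character written.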

Some notes from debugging...

 Accumulator Bit   0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17
 Works Correctly?  N  Y  Y  Y  Y  Y  N  Y  Y  Y  Y  Y  N  Y  Y  Y  Y  Y
 Tape Track        B  A  8  4  2  1  B  A  8  4  2  1  B  A  8  4  2  1
 Odd Parity        G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G  G
 Even Parity       L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L  L
TC59 Tape Controller, Write Data

 Tape Track                  B    A    8    4    2    1    P
 C03 Output Connector Pin    D    E    H    K    M    P    S
 # of pulses observed        1    6    1    1    6    6    1
 G084 pin V pulses           0    0    6    6    6    6    0
 G084 pin S pulses           0    0    6    6    6    6    0
 R302 delay output pulses    0    6    6    6    6    6    0
 R203 flip-flop output 0/1   0    6/6  6/6  6/6  6/6  6/6  0
TU20 Tape Drive Read Data behavior when writing 3 words (6 characters) to tape.
The inconsistent observations may be due to differences in the TU20 wiring and the TU20 documentation.

I can see read data about 8ms after it is written to the tape.

All of the tracks will now write and read except the B and the Parity tracks.
The part of the G287 Write Driver in slot A02 that drives write head channel "B" has low output.
The track B head signal is about +/-10V, the other tracks are about +/-15V.
I will replace the two 2N3500 transistors, Q7 & Q8, for the B channel and see if the voltage increases.

I also found that the schematics that we have for the TU20 tape drive are newer and slightly different than our drive.
That makes debugging a little more challenging.

12/30/13
Warren gave me a bunch of pointers on how the G287 Write Driver works.
I measured the voltage drop across all of the diodes on the G287 from slot A02 that drives tracks B and A; all were OK.
All of the diode drops in the transistors looked OK except for one that drives the B track.

In the process of replacing the 2N3500 transistor I found that one of the leads on the original transistor was rusted (probably from mouse pee) and broke off.
I inspected the transistors at the top side of all of the G287 modules and found them all rusty, so I replaced all 10 transistors at the top of the G287 modules.
These transistors drive tracks B, 8, 2, and Parity.
Some of the diodes on these modules have small cracks and should eventually be replaced.

The B track will now write data, but not all of the tracks will write.
I measured the write head drive voltage and found that some of the G287s still misbehaved.
The typical voltage to the head is about 10V.
Track 8 was 2.5V for both directions.
Track 4 was 10V for one direction and 2.5V for the other direction.
Since the electronics was designed to run either a 7-track or a 9-track transport I was able to swap G287s with channels that are not used for the 7-track transport and get all of the tracks to have a good looking head write signal.
I swapped A05->A03, A03->A06, A06->A05.
This is a really bad idea because this could cause big debugging problems in the future.
We should completely fix all of the G287 modules in the future.
There is some jitter on some of the write signals, so some more debugging is still needed. 

On the read side, I can see data on all of the tracks except for the parity track.
I swapped the G084 modules in slots C30 and C32.
The module in slot C30 for track 1 still works and the Parity signal still is broken.
There is read data on the connector pins at slot C04 where the cable to the TC59 goes.
I don't see the READ BUFFER lights on the TC59 go on.
This might be due to the small percentage of time that the data signals are present.

Of course the tape diagnostics die immediately when they see the parity error.

As of now I can write and read the 7 data tracks, but not read the parity track.
This is some really big progress!

I need to look at the write data signal for the parity track, and the signals from the read head that go to the G084 modules.

1/5/14
Lots of people suggested that I measure the resistance of all of the tracks in the read/write and erase heads.
This is a good way to find out if the tape head has any chance of working.

 Track Write Resistance Read Resistance
 B 1.2Ω 6.1/6.2Ω
 A 1.2Ω 5.4/5.0Ω
 8 1.3Ω 5.3/4.9Ω
 4 1.2Ω 5.1/5.6Ω
 2 1.3Ω 5.2/5.3Ω
 1 1.3Ω 5.0/5.3Ω
 P 1.3Ω 5.0/5.0Ω
 0/1 - -
 0/0 - -
 Erase 22.3Ω -
This all looks OK so our time restoring the TU20 will not be wasted.

Today I took the head guard off so I could clearly see the face of the write/read/erase heads, tape guides, and tape cleaner.
With binocular magnifiers on I could see a big chunk of oxide stuck to the head where the parity track is.
I couldn't see or clean the oxide with the tape guard in place.
It took quite a bit of effort with alcohol and a Q-tip to really clean everything.
The HP tape transport manual doesn't have a procedure for reinstalling the head guard.
I found a procedure in Pertec and Kennedy tape drive manuals that I could use.
They said to fold a piece of mag tape and use three thicknesses as the clearance between the head and the guard.
That worked OK.

After really cleaning the head I could write/read the parity track, but the tape controller still gets parity errors.
I adjusted the read amplifier gains; they were all low.
I could see data coming out of the G084 parity track read amplifier, but found that the R302 adjustable delay for the parity track that compensates for read skew was bad.
I measured the trimpot resistance on the broken R302 and set the replacement one to the same values.
Hopefully the read skew adjustments will not be too far off.
After replacing the delay module I can read/write all tracks.

The TC59 tape controller reports Lateral Parity errors, so something is broken in the parity circuit in the tape controller.
If I have just 1 bit on in each character in the word, and parity set to even, no parity errors are reported.
Odd parity reports errors with this data.
If I have 2 bits on in each character in the word, and parity set to even, parity errors are reported.
Odd parity reports no errors with this data.
I will work on that next week.

Several times during debugging the processor would stop running.
I put it in maintenance mode and could see that the bits were not getting transferred from 6->5 reliably.
I left it in maintenance mode for about 10 minutes and it then ran OK.
After I get the tape drive and controller working I will look at the processor again.

The Hours Meter showed 40369 when I finished, so we have put 206 hours on this system.

1/11/14
We had quite a few visitors today, and soldering to do on Arduino Shields for our STEM robotics classes, so I didn't have much time for debugging.

The processor has been failing intermittently.
If I run the processor in maintenance mode the bits in the MB and the registers are not always incremented correctly.
If I leave the processor running in maintenance mode, after about 5 minutes it will work correctly.
After it works correctly, it will stay working correctly for several hours.

The tape drive will correctly read and write data.
If I write a block of all ones with even parity (the parity bit is a 0) no errors are reported.
If I write a block of all ones with odd parity (the parity bit is a 1) a Lateral Parity error is reported.

The parity data from the TU20 tape transport is present on the signals in the TC59 tape controller, so the problem is in the tape controller.
I looked at the relationship of the READ STROBE and the READ PARITY ODD signals.
The READ STROBE signal goes active about 500ns before the READ PARITY ODD signals change.
This relationship looks OK, and the waveforms of both signals look OK.

I compared the READ PARITY ODD (open diamond) signal from pin E on the B130 3 Bit Parity Circuit in slot D11 with the RBP(1) (solid diamond) signal from pin T of the R107 Inverter in slot D01.
The parity signal from the tape transport and the calculated parity from the tape controller were the same if the number of data bits was odd and the parity was even.
So, for some data patterns the parity circuitry is working correctly.
I need to test quite a few more data patterns and the calculated parity to see if the parity circuitry or the logic that reports the parity errors is broken.

At this point in the testing the processor died again.
In maintenance mode the MB is correctly incremented, but none of the other registers are incremented.
This means that large parts of the processor timing, microcode, and data bus circuitry are working.
This problem should not be too difficult to debug because the difficult part, the microcode, seems to be working.

1/18/14
I spent about 5 hours today chasing signals all over the PDP-9 processor and I/O controller trying to find out why the processor died.
All of the microcode circuitry looked OK as did the contents of the register flip-flops.
I traced the register contents to the A-bus, the Adder, the I/O Bus, the Buffered I/O Bus, and eventually to signals that turn on the transistors that turn on the light bulbs for the register indicator on the control panel.
It didn't take too long to see that the -15VDC power for the indicator lights was missing.
I thought that the problem was going to be the Margin Check switches, but it turned out that two of the 75 fuses (yup, there are 75 fuses in the processor) had blown.
The processor was working fine, but we could not see the results on the console so it looked broken.
So, the processor is working fine again and we are back to debugging the tape controller and tape drive.

This week only the B, A, 8, and 4 tracks are reading data. The 2, 1, and parity tracks are not.
I traced the write data bits from the tape controller, to the delays, the NRZI flip-flops, the differential amplifiers, and to the write head signals.
All looked OK.

The upper trace is one of the track bits where it comes out of the delay line.
Each bit will trigger the flip-flop and make NRZI data to drive the differential write amplifier.

I ran out of time before I checked the read data for all of the tracks, so that will need to wait until next week.

1/25/14
Debugging the TU20 tape drive on the PDP-9 has been challenging.
The tape transport and controller do a read-after-write to verify that the tape was written correctly, to detect the end of the record, and then stop the tape motion.
Since we don't trust either the read or write circuitry it is difficult to localize the problem when data doesn't show up where it should.
We don't have any Magnasee yet so we can't see if and how well the tape is being written.

Since the TU20 is a rare 7-track drive we don't have another drive that can write a test tape.
The team that restored their two IBM 1401 systems at the Computer History Museum kindly wrote a 7-track tape on one of their 729 tape drives for us to try.

I wrote a little 22-instruction program that reads a tape command from the front panel switches, executes the command, and halts on completion with the status in the accumulator.
I unplugged the cable to the Write and Erase heads so misbehaving electronics would not damage the test tape.
I set the command for a read with even parity and 556 BPI density.
I was rewarded with good data, a lateral parity error, and a record size error.

Reading a "foreign" tape on the PDP-9 is a little challenging.
The Word Count for the TC59 is set to the number of 18-bit Words to transfer from the tape controller.
Each word corresponds to three characters from the tape drive.
I set the Word Count (three characters to a word) to a large number so the tape controller would just read to the end of the record.
The remaining Word Count shows the number of words (x3 for characters) that were not read.
I fiddled with the Word Count values and found that 777365 (this is in 2s complement to make it more challenging) read all of the record, but still resulted in a lateral parity error, and a record size error.
The last word read expects three characters to be read from the tape.
Unfortunately the tape records look like they are 1238 characters long so the tape controller thinks that the last character is missing.
DEC must have had special code in the operating system to handle this situation because it is likely that any foreign tape record would not have a record length that is a multiple of 3.
The test tape records all contain an octal 12 and lots of octal 20 characters.
I need to find out which EBCDIC, BCD, or other characters these correspond to.
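
The record-length problem described above comes down to simple division. A minimal Python sketch, using the 1238-character record length reported above and a hypothetical helper name:

```python
CHARS_PER_WORD = 3  # three 6-bit characters per 18-bit word


def words_for_record(record_chars):
    """Return (words needed, characters missing from the last word)."""
    full_words, leftover = divmod(record_chars, CHARS_PER_WORD)
    if leftover == 0:
        return full_words, 0
    return full_words + 1, CHARS_PER_WORD - leftover


print(words_for_record(1238))
```

A 1238-character record fills 412 whole words and leaves 2 characters for the last word, so the controller comes up 1 character short.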

Now that I had a good tape to read I ignored the write logic for a while and concentrated on the Lateral Parity Error problem.
Since the Longitudinal Parity was OK I assumed that all of the tracks were being read correctly.
The test tape only had data on tracks A, 8, 2, and P so this is not a thorough test.
The parity checking circuitry is made up of 3x B130 parity flip-chips connected to a fourth B130. See page TC50-0-3 2/2 for the circuitry.
Determining how the parity circuit worked from the TC59 and the B130 flip-chip schematics was not obvious.
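
As a rough mental model only, the arrangement of three B130s feeding a fourth can be thought of as a two-level parity tree. The Python sketch below assumes each B130 behaves as a 3-input parity (XOR) stage and that there are nine inputs in total; the actual pin assignments and signal polarities on TC50-0-3 are not modeled.

```python
def parity3(a, b, c):
    """Parity (XOR) of three bits: 1 when an odd number of inputs are 1."""
    return a ^ b ^ c


def parity_tree(bits):
    """Combine nine bits through three 3-input stages and a fourth stage."""
    if len(bits) != 9:
        raise ValueError("expected 9 bits")
    first_level = [parity3(*bits[i:i + 3]) for i in (0, 3, 6)]
    return parity3(*first_level)


seven_track = [1, 0, 1, 0, 0, 0, 1]          # example data on 7 tracks
print(parity_tree(seven_track + [0, 0]))      # unused inputs held at 0
print(parity_tree(seven_track + [1, 0]))      # one unused input stuck at 1
```

In a tree like this, an unused input that is stuck at 1 inverts the result for every character.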

I checked all of the inputs to the B130 modules.
Using the READ SKEW OVER signal from the TU20 I could see the data from each track on the tape transport.
Tracks 8 & 2 had a 1 bit at the first character, and all subsequent characters had a 1 bit on tracks A & P.
I found that the signal R 0/1(1) Solid Diamond was always active (low).
The R 0/1(1) Open Diamond signal from the tape drive was always inactive so the R107 inverter flip-chip in slot A09 was bad.
This signal is for a 9-track tape drive.
We don't have a 9-track tape transport, but the circuitry is in the tape controller to support it anyway.
Replacing the R107 module fixed the Lateral Parity Error problem.

Next week I can test the write circuitry now that I know that the read and parity circuitry is working correctly.

2/1/14
The long standing intermittent issue in the processor decided to be a lot less intermittent today.
When the system was first powered on it behaved OK and read the tape from the IBM 1401.
I put a scratch tape on the TU20 and tried to load the MAINDEC-15-D4AC-PB MAGNETIC TAPE CONTROL INSTRUCTION TEST.
The HRI loader loaded almost all of the tape, and then stopped.
The processor was still in the RUN state, but was not doing anything.
I tried some smaller diags, a memory test.
It loaded and ran for a little bit, but stopped with the processor in the RUN state and the register display blank.
I tried the built-in Maintenance Test, but the registers would not increment past bit-7 reliably.

The failure was intermittent, so I could see signals behaving correctly, and incorrectly.
This is sometimes easier to debug than just broken.
I looked at the CM CURRENT signal on pin U of the R111 in slot F26.
This signal is the core of the microcode and gives a good indication about what the processor is doing.
When running the Maintenance Diag we should see a repeating pattern of two pulses and a long gap.
Sometimes the second pulse was missing, sometimes it was OK.
The first pulse reads microcode word 07, and causes the contents of the Address Register to go to the adder, get incremented, and go to the Memory Buffer.
The second pulse reads microcode word 22, and causes the contents of the Memory Buffer to get transferred to all of the registers.

I looked at the microcode bits to see if this sequence was happening, and found that the CONT (Continue) bit was intermittent.
The CMSL30 signal from the microcode looked OK, and the other inputs were inactive.
The CM STROBE D signal that latches the CMSL30 signal into the CONT flip-flop was intermittent, so the CONT microcode bit was intermittent, so the processor was not reliable.
The CM STROBE D signal comes from pin D of the B602 10Mc Pulse Amplifier in slot E27. (Mc got changed to MHz long after this machine was designed)
The input on pin F was always OK, but the output on pin D was usually only about -2V and only sometimes reached -4V.
I swapped the B602 modules in slots E26 and E27 and found that the fault followed the module.

We don't have any working spares for this flip-chip, so it is time to fix the two broken ones that we have.
The B602 flip-chip contains two transformer coupled pulse amplifier circuits.
I checked the diode drop in the diodes and transistors, and compared the working side to the intermittent side, and everything looked OK.
The transistors, according to the schematic, are DEC-6A and DEC-6B, and are marked DEC2894-3A and DEC2894-2B.
Our crack team of consulting experts said that DEC bought large numbers of transistors and selected them for certain characteristics.
The documentation for the selection criteria is not available.
I bought some NOS Motorola 2N2894 transistors on eBay that will be delivered this week.
Hopefully they will be OK for a substitution.

I also looked at the signal from pins L & U of the B310 delay in slot EF29.
The signal was a short positive pulse that went to about 50% of the original voltage for about 400 ns.
That did not look right, so I swapped the module for a spare.
The replacement module had a nice clean 50ns pulse.
We have had a lot of bad solder joint problems with these modules.
I don't know if this would cause a problem, but it is not the correct behavior.

2/8/14
First thing this morning the processor ran OK, but misbehaved after running for about 10 minutes.
We got back to looking at the microcode logic and why CM STROBE D is intermittent.
There are two Pulse Amplifier circuits on the B602 Pulse Amplifier, one drives CM STROBE C and one drives CM STROBE D.
We put the B602 on an extender board so we could compare the behavior of the two circuits.
There was some evidence that both circuits were misbehaving, but as a quick try I replaced the 2N2894 transistor that drives the output transformer on the B602.
That didn't fix the intermittent processor problem.

The upper signal is CM STROBE A and the lower signal is PWR CLR POS from the S603 in slot J23.

Further investigation showed that the PWR CLR POS signal, which sends pulses to the microcode logic during power up, is sitting at 2.0V, sometimes with an additional 500mV of noise.
This signal is diode ORed with the signal that creates CM STROBE C.
That level is outside of the maximum 3.2V for a logic low level, and could cause problems in the B602.
The S603 Pulse Amplifier that drives the PWR CLR POS signal has three circuits on it.
We compared the transistors on the three circuits and found that the 2N2894 transistor that drives the output pin M had conduction in both directions between the base and collector.
Replacing the transistor did not fix the PWR CLR POS signal level.
There is probably another module connected to the PWR CLR POS signal that is pulling the signal up towards ground.

Next time:
Pull all of the modules that are connected to the PWR CLR POS signal to see if there is an improvement.
Swap the B602 modules in slots E26 and E27 again, and this time carefully look at the signals on E27 pins D & N to see if they are better.
If not, then the B602 modules are probably OK and the problem is elsewhere.

2/10/14
I did some more checking on the S603 flip-chip that drives the PWR CLR POS signal.
Last Saturday we replaced the 2N2894 transistor that drives the output pin M because it had conduction in both directions.
The problem was not the transistor; it was D42, a DEC D-664 or 1N3606 diode.
The 1N3606 diodes are almost impossible to find.
Fairchild recommends a 1N4153 as a replacement, but that is obsolete too.
I installed a spare S603 module in slot J23 until we can get replacement diodes.
The PWR CLR POS signal is now sitting at -4V when inactive.

I ran MAINDEC-9A-D02A INSTRUCTION TEST PART 2 for 30 minutes without a failure.
Hopefully this is the fix to the intermittent problem that we have been chasing for months.

I tried MAINDEC-15-D4AC TC59 MAGNETIC TAPE CONTROL INSTRUCTION TEST again.
It failed at test #3 when it tried to test the data-break facility. 
This test has run fine before, so something new is misbehaving.

Next time:
Look through the source for MAINDEC-15-D4AC TC59 MAGNETIC TAPE CONTROL INSTRUCTION TEST.
Write a small program that forces a data-break from the TC59 and find out why it doesn't work.

3/15/14
We had a request to supply images of paper tapes that contain 27-bit Floating Point software source code for the PDP-8.
I wrote a little program on the PDP-9 to read a paper tape and write it to the console serial port.
I used Warren's 20mA-to-RS-232 converter to connect a laptop to the PDP-9 console port and captured the paper tape image from the PDP-9 console.
The PDP-9 console only runs at 110 baud, so reading a paper tape at 10 characters a second took about 25 minutes.

Reading the tape on the PDP-9 was so slow that I was motivated to fix the power supply problem in the GNT paper tape reader/punch.
The GNT ran at 120 characters a second and made images of all three paper tapes in just a few minutes.
Now that the GNT is working again, I will write a Windows program to control the GNT reader/punch so we can easily read/punch paper tapes for the PDP-9 and PDP-8 systems.
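
The speed difference between the two readers can be worked out with a little arithmetic. The sketch below is only an estimate: it assumes the usual 11 bits per character at 110 baud (1 start, 8 data, 2 stop), and it estimates the tape length from the 25 minute read time rather than from a measured character count.

```python
def chars_per_second(baud, bits_per_char=11):
    """Serial throughput; 11 bits per character is an assumed framing."""
    return baud / bits_per_char

def read_time_seconds(n_chars, cps):
    """Time to read n_chars at a given characters-per-second rate."""
    return n_chars / cps

console_cps = chars_per_second(110)               # console rate
tape_chars = int(25 * 60 * console_cps)           # estimated tape length
gnt_seconds = read_time_seconds(tape_chars, 120)  # same tape on the GNT

print(console_cps, tape_chars, gnt_seconds)
```

Under those assumptions a tape that takes 25 minutes through the console takes about two minutes on the GNT.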

8/3/14
Several times over the last month we have tried to run MAINDEC-9A-D1AA Basic Memory Checkerboard.
It loads OK, but gets stuck in a loop.

The system runs the built-in diags OK, so the microcode is working, and the core of the processor is working.

I toggled in a JMP-JMP instruction sequence and that worked OK.
So, at least the basic parts of the processor are OK.

I toggled in an ISZ loop and it failed.
The ISZ instruction doesn't increment the memory location.
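
For reference, here is a small Python sketch of what an ISZ loop should do on a working machine. The three-word program, its addresses, and the starting count are hypothetical and are not the sequence that was actually toggled in. The opcode values follow the PDP-9 instruction set as I understand it, and indirect addressing is ignored.

```python
WORD_MASK = 0o777777   # PDP-9 words are 18 bits
OP_MASK = 0o740000     # the top four bits select the instruction
ADDR_MASK = 0o017777   # 13-bit direct address (indirect bit ignored here)
ISZ, JMP, HLT = 0o440000, 0o600000, 0o740040

def run(memory, pc, max_steps=1000):
    """Execute until HLT; return (pc, instructions executed before HLT)."""
    for step in range(max_steps):
        word = memory[pc]
        pc = (pc + 1) & ADDR_MASK
        if word == HLT:
            return pc, step
        op, addr = word & OP_MASK, word & ADDR_MASK
        if op == ISZ:
            # Increment the memory location; skip the next word if it hits zero.
            memory[addr] = (memory[addr] + 1) & WORD_MASK
            if memory[addr] == 0:
                pc = (pc + 1) & ADDR_MASK
        elif op == JMP:
            pc = addr
        else:
            raise ValueError(f"unsupported instruction {word:06o}")
    raise RuntimeError("program did not halt")

# Hypothetical test program, written in octal as it would be toggled in.
memory = [0] * 8192
memory[0o100] = ISZ | 0o200   # increment the counter, skip when it reaches 0
memory[0o101] = JMP | 0o100   # otherwise loop back
memory[0o102] = HLT
memory[0o200] = 0o777775      # -3 in two's complement

final_pc, steps = run(memory, 0o100)
print(oct(final_pc), steps, oct(memory[0o200]))
```

If the memory location never increments, as seen on the real machine, a loop like this never reaches the HLT.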

The JMP, CLA, and DAC instructions work.

3/12/16
We were working in the warehouse today so I gave the PDP-9 a try.
It loaded MAINDEC-9A-D1AA Basic Memory Checkerboard from paper tape with the HRI, but nothing was stored in core memory.
I tried to write to core using the front panel, but it didn't work.
The built-in Maintenance tests ran OK, so at least some of the microcode is running OK.
After running the built-in Maintenance tests for about 15 minutes the core memory started working.
I reloaded the Memory Checkerboard, and it ran for about 90 minutes before it halted.
So, we have an intermittent connection that is probably temperature sensitive.
I suspect the delay lines because they have caused similar problems in the past.

To-Do:
The system was disassembled for shipment and needs to be reassembled. (Started)

Find the two I/O cables to connect the TC59 to the PDP-9.
If we don't have the cables we might be able to use seven cables from a PDP-8 or borrow some from another PDP-9/10/15 collector.
 
There is some unconfirmed information that when this system was in its last days of service they had problems with the ROPE memory for the microcode.
There is a rubber sheet that compresses the "E" cores together. We will need to replace it.
We have several spare ROPE memory boards. We have no idea if they are good, or what microcode is programmed.
 
We have a spare 8k core stack if we find problems with the core in the system.
 
We were also told that when someone was trying to fix the system they pulled modules while the power was still on.
That may make it challenging to revive this system.
This system uses some of the same transistor-only R series Flip-chips as the PDP-8/S, so we have some spares for the modules.
It also uses quite a bit of the faster B series modules. We have just a few spare B modules.
 
The rough plan:
Reform the capacitors in the 709 power supply for the processor and test the power supply. (Done)
Reconnect the I/O cables for the paper tape reader/punch. (Done)
Find and connect the Teletype interface cable. This is actually on the PDP-11/23 that was connected to this system.
Power up the system and see what works. (Done)
There was some discussion that many of the light bulbs in the front panel were burned out.
    (All of the Register, Memory Buffer, and Interrupt lights work.) 
Reform the capacitors in the TU20 power supply and test the power supply. (Done)
Power up the TU20 and see what works. (Done)
The tape drive uses vacuum columns so it may be a significant challenge to get it working. 
Reform the capacitors in the TC59 power supply and test the power supply. (Done)
Connect the TC59 tape controller to the I/O section of the PDP-9 and to the TU20. (Done)
Debug the TC59 and the TU20. (In process)

The instructions tested so far are:
CLA    Works OK
CLC    Works OK
CLL    Works OK
CLOF    Turns the CLK light off.
CLON    Turns the CLK light on.
CMA    Works OK
CML    Works OK
DAC    Works OK
DZM    Works OK. Fixed on 3/23/13
HLT    Works OK. Fixed on 3/23/13
IOF    Turns the PIE light off.
ION    Turns the PIE light on.
ISZ    Works OK. Fixed on 3/16/13
JMP    Works OK. Fixed on 3/10/13
LAC    Works OK
LAS    Works OK
NOP    Works OK
RAL    Works OK
RAR    Works OK
RTL    Works OK
RTR    Works OK
STL    Works OK

The boards replaced in the PDP-9 processor so far are:
 
B310 Delay Line in slot EF29 of the processor with a spare, and again with a repaired module 
B213 JAM Flip-Flop in slot C18 of the processor with a spare
B213 JAM Flip-Flop in slot C35 of the processor with a spare, and again with a spare
B213 JAM Flip-Flop in slot D20 of the processor with a spare
B213 JAM Flip-Flop in slot D27 of the processor with a spare
B213 JAM Flip-Flop in slot D28 of the processor with a spare
B213 JAM Flip-Flop in slot H33 of the processor with a spare
B213 JAM Flip-Flop in slot E20 of the I/O controller with a spare
B360 Adjustable Delay Line in slot D33 of the Core Memory with a spare
G219 Memory Selector in slot AB09 of the Core Memory with a spare
G219 Memory Selector in slot HJ24 of the Core Memory with a spare
KC09 (G920) repaired, and repaired again.
R111 Diode Gate in slot H23 of the processor with a spare
R123 Diode Gate in slot D15 in the I/O controller.
R401 Clock Flip-Flop module in slot KD09-E03 of the I/O controller with a spare.
S205 Dual Flip-Flop module in slot KD09-D07 of the I/O controller with a lower drive R205 spare. We need to repair the S205 and put it back in the system.
S603 Triple Pulse Amplifier in slot J23 with a spare. Diode D42 on the original conducted in both directions.

The boards replaced in the TU20 Tape Drive so far are:
 
2N1304 transistor in the EOT circuit on the Photosense Amplifier in the tape transport.
G287 Write Driver in slots A02-A06, replaced 2x 2N3500 transistors for tracks B, 8, 2, and Parity. Some of the diodes on these modules have small cracks.
R113 Diode Gate in slot B20 with a spare.
R123 Diode Gate in slot B17 has poor drive to pin P. Working OK, but should be checked further. The R123 Diode Gate in slot B17 was actually an R203 flip-flop. It was replaced with the correct spare.
R203 Triple Flip-Flop in slot B27 with a spare.
R205 Dual Flip-Flop in slot B04 with a spare.
R205 Dual Flip-Flop in slot B05 with a spare.
R302 Dual Delay in slot B09 with a spare. Set trimpots to the same values as on the original.
R302 Dual Delay in slot D29 with a spare. Set trimpots to the same values as on the original.
R303 Integrating One-Shot in slot A21, replaced the open Trimpot.
R401 Clock module in slot A15 with a spare
R602 Pulse Amplifier in slot B13 with a spare.
R602 Pulse Amplifier in slot B16 with a spare.
R603 Pulse Amplifier in slot A09 with a spare.
W501 Schmitt Trigger in slot C10 with a spare.
W501 Schmitt Trigger in slot D09 with a spare.

The boards replaced in the TC59 Tape Controller so far are:
 
R602 Pulse Amplifier in slot A21 with a repaired module.
W640 Pulse Amplifier in slot F22, replaced R17, Q8, and Q9.

