Hello all,

Marko, I received your VC10. I'm working on it.

Andre, did you receive the SRCs of the PRGs for the 8x96?

Here is another idea of mine: combining the 65SC816 and the 74LS612 in a PET. So this doc also contains a lot of tech stuff. I also produced a GIF containing the SCH, but because of the problems we had when I sent the contents of the ROMs of my German and US CBMs, I only sent it as a personal file to Andre, Marko and Frank. So if you're interested as well, please notify me. (Or better, ask one of the others, as I will be 'off the air' for at least two days :-( )

One general question remains: what happens if the processor of a PET or 1541 runs out of phase with the onboard clock, or even at a (slightly) different speed? In the case of the PET I am thinking of the video, and in the case of the 1541 of the I/O for reading from and writing to the disk.

===========================================================================

BIG-PET

What is it?

BIG-PET is a project that enables you to expand your PET/CBM with the 65816 and a lot of RAM, ROM and I/O. Up to 16 MB if you want to! :-)

Use:
- Emulation of new ROMs
- Testing new hardware
- Attaching a complete PC board so you can use PC cards as well

Products: a card with hardware.

The idea.

Part one:
The 65816 has been available since 1985 (?). It is an upgrade of the 6502 with internal 16-bit registers, capable of addressing up to 16 MB. The 20 MHz version is used in the SUPER-CPU from CMD, a module to be attached to your C64 or C128. AFAIK the 65816 is also used in the SNES from Nintendo and the Apple IIGS. It would be nice if we could use its power for the C64/128 and their older brothers as well.

The 65816 also has a little brother, the 65802. This CPU is internally 16-bit and is pin compatible with the 6502. But because of this pin compatibility it lacks certain features the 65816 has. The most important missing feature is the ability to address up to 16 MB of RAM, ROM or I/O.
<B>The pinouts</B>
<PRE>
  65816                                            65816
                 +---------------------+
    /VP     GND -+  1               40 +- /RESET
                 |                     |
           /RDY -+  2               39 +- CLK2     VDA
                 |                     |
 /ABORT    CLK1 -+  3               38 +- SO       M/X
                 |                     |
           /IRQ -+  4               37 +- CLK0
                 |                     |
    /ML      NC -+  5               36 +- NC       /BE
                 |                     |
           /NMI -+  6               35 +- NC       E
                 |                     |
    VPA    SYNC -+  7               34 +- R/W
                 |                     |
            +5V -+  8               33 +- D0
                 |                     |
             A0 -+  9               32 +- D1
                 |                     |
             A1 -+ 10               31 +- D2
                 |        6502         |
             A2 -+ 11     65C02     30 +- D3
                 |       65SC02        |
             A3 -+ 12               29 +- D4
                 |                     |
             A4 -+ 13               28 +- D5
                 |                     |
             A5 -+ 14               27 +- D6
                 |                     |
             A6 -+ 15               26 +- D7
                 |                     |
             A7 -+ 16               25 +- A15
                 |                     |
             A8 -+ 17               24 +- A14
                 |                     |
             A9 -+ 18               23 +- A13
                 |                     |
            A10 -+ 19               22 +- A12
                 |                     |
            A11 -+ 20               21 +- GND
                 |                     |
                 +---------------------+

6502:
A0..A15 = Address bus
CLK0    = Input, clock for the processor, also known as PHI0
CLK1    = Output, clock of the processor, also known as PHI1.
          Is inverted CLK0, delayed by about 3 ns
CLK2    = Output, clock of the processor, also known as PHI2.
          Is CLK0, delayed by about 6 ns
D0..D7  = Data bus, address bus A16..A23 (65816 only)
IRQ     = Input, maskable interrupt
NC      = Not connected
NMI     = Input, non-maskable interrupt
R/W     = Read/Write line
RDY     = Input, ReaDY, causes the CPU to wait until released.
          Does NOT work during a write cycle on the original 6502.
          Does work on the 65SC816, 65SC802 and 65SC02 of GTE and CMD.
RES     = Reset line
SO      = Input, signal to set the Overflow flag of the status register
SYNC    = Output, signals the fetch of an opcode, active (H)

65816:
ABORT   = Input, prevents modification of the internal registers,
          causes the 658xx to call the vector at $00FFF8/9
BE      = Input, Bus Enable, puts the 65816 in tristate
E       = Output, reflects the state of the Emulation bit
M/X     = Output, reflects whether the registers are used 8 or 16 bits wide
ML      = Output, Memory Lock, signals the outside world that a
          read-modify-write instruction is being executed
VP      = Output, Vector Pull
VDA     = Output, Valid Data Address
VPA     = Output, Valid Program Address
</PRE>
As you can see, the 65816 is not completely pin compatible, but the incompatibility is minor. The most important difference is the lack of CLK1 and CLK2.
After some experiments I found out that I had to generate these signals using two 74LS14 gates. I modified a card produced by Elektuur/Elektor in such a way that I only had to replace the original 6502 of any system with this card to let it run on a 65816. This worked fine for the Acorn Atom, CBM8032SK, PET8032 and PET3032, but NOT for the VIC20. So far I haven't figured out the reason for this behaviour. (Is the delay too long, so that the VIC chip messes up the bus at the end of the generated CLK2?)

Part two:
Andre Fachat had the ingenious idea of using a 74LS610 Memory Management Unit in his <A HREF=''>CS/A65</A>, enabling him to expand his system up to 1 MB. This does not mean that the processor is suddenly capable of addressing this 1 MB directly. The 1 MB is divided into 256 blocks of 4 KB, of which only 16 are accessible by the 6502 at any one time. For more details on how the 74LS610 functions, see my document about its brother, the <A HREF=''>74LS612</A>.

The special feature I am interested in is the ability to shuffle those 4 KB blocks into any order you want. In C64 terms: you are able to move the I/O area from $D000/$DFFF to $6000/$6FFF. I want to use this feature in BIG-PET as well, because it enables you to use your CBM or PET as a big 6502 emulator for other systems.

Part three:
The next trick is to combine the 65816's capability of addressing 16 MB with the shuffle feature of the 612. Why not use only the 612 for addressing the complete range? Using the 65816's own capability to address 16 MB enables us to read from and write to areas not covered by the MMU at that moment. It also enables us to mirror the MMU in another 64 KB segment, meaning we can let it disappear from the original first 64 KB (which could be needed when emulating another system) while still being able to reprogram it when needed.

Part four:
The 65816 is available as fast as 12 MHz (20 MHz?) and it would be a waste not to use this speed.
As I already mentioned, BIG-PET is going to be used in a CBM or PET, and these run at 1 MHz as standard. The onboard chipset does not allow us to run it at a higher clock speed. But that does not have to stop us from using faster chips in other 64 KB segments, so that those can run at higher clock speeds.

I obtained a handful of 45 ns 32*8 KB SRAMs, former cache RAMs from obsolete 80386 motherboards. These can be accessed at 4, and even 8 MHz. It should be possible to copy the contents of the ROMs and of the RAM in area $0000/$7FFF to this fast RAM and then use it instead of the original ROM/RAM. (Like shadow RAM on PCs.)

Realisation

Addressing 16 MB by the 65816 AND the 74LS612 MMU:
The address lines A16..23 of the 65816 are multiplexed with the data bus and are not available in the normal way like the other address lines. The normal procedure to generate these lines out of the data bus is to latch them using a 74F573. CLK1 can be used to perform this latch. CLK1 is generated from CLK0 using a 74F04 (U3a).

The MMU has its own lines to address all of the 16 MB. As we cannot have two devices driving the address bus at the same time, we have to choose between the 65816 and the MMU. My idea is that the MMU only drives the address bus when the first segment, $000000/$00FFFF, is selected. This can be achieved by 'ORing' A16..23. For this I use a 540 (U1) to invert all of A16..23 and a 133 (U8) to AND the inverted signals. As the 133 actually is a NAND gate, we first have to invert its result with an inverter (U3b).

The result is used to select either the 541 (U7) buffering the 573 (U6), or the MMU (U4). (We cannot tristate the 573 itself, because we would lose the information causing the tristate.)

Because the MMU also takes care of A12..15, we have to bypass the MMU for these lines by means of another 541 (U2).

Because we want to be able to disable the whole configuration, we take advantage of the fact that the 541s have two enable lines.
One of them we use for the above-mentioned selection, the other for disabling the whole bus. The MMU lacks this feature, so we have to use an OR gate (U5a) to combine the 'disable' signals. The signal to disable the whole bus originates from a 04 (U3e). Its input originates from /BE of the 65816 (which is an input).

Addressing the MMU:
To be able to program the behaviour of the MMU, it has to fit somewhere into the memory map of the 65816. As mentioned before, it will in any case be mapped into a segment other than the $00 segment. In this way we are always able to (re)program it using the 65816's extra capabilities.

The CBM/PET has 2*4 KB of unmapped memory meant for adding extra ROMs with additional software. One of these areas can (partly) be used to map the MMU. Should we need this specific area for testing a new ROM, then we physically map this ROM in another segment and remap it into the needed 4 KB area by means of the MMU. From that moment on, reprogramming the MMU is only possible by addressing it through the other segment.

In the future I also want to make a C64 version. The C64 has only 512 bytes of free space. Mapping the MMU into this area has the disadvantage that we won't be able to attach regular cartridges using this area. (Unless I find a way to remap only 512 bytes.)

The advantage of mapping the MMU into the existing memory map is that you can program it in BASIC. (BASIC is not able to address beyond the first segment.)

I decided to map the MMU in segments 0 and 1. This is done by using a 74LS677 (U12), a 16-bit comparator. This IC checks if A17..23, A7, A9..10, and A12..15 are (L), and if A8 and A11 are (H). Notice that I did not mention A16. By discarding this line I created the needed mirror in segment 1. The output of the 677 becomes active (L) in the range $9000/$9FFF and is used to enable a 138 (U9), which in turn enables the MMU.
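
The address path described so far can be summarised in a small software model. This is only a sketch under stated assumptions, not the actual hardware logic: the function names are made up for illustration, the MMU register window is taken to be the stated $9000/$9FFF range (decoded directly rather than by modelling the individual 677 inputs), and with map mode off segment 0 is assumed to pass through unchanged.

```python
BLOCK_BITS = 12                      # 4 KB blocks: the low 12 address bits pass through
OFFSET_MASK = (1 << BLOCK_BITS) - 1


def mmu_registers_selected(addr: int) -> bool:
    """True when a 24-bit CPU address hits the MMU register window.

    Assumption: the window is the stated $9000-$9FFF range. A16 is left
    out of the comparison, so the window also appears in segment 1.
    """
    return (addr & 0xFEF000) == 0x009000


def translate(addr: int, mmu_regs: list[int], map_mode: bool = True) -> int:
    """Return the 24-bit address that ends up on the expanded address bus."""
    segment = addr >> 16
    if segment != 0:
        # A16..A23 not all low: the latched 65816 bank drives the bus directly.
        return addr & 0xFFFFFF
    if not map_mode:
        # Assumption: with map mode off, segment 0 is passed through unchanged.
        return addr
    block = (addr >> BLOCK_BITS) & 0xF        # A12..A15 pick one of 16 registers
    page = mmu_regs[block] & 0xFFF            # 12-bit register content
    return (page << BLOCK_BITS) | (addr & OFFSET_MASK)


# Identity mapping, then make the block at $D000 show up at $6000.
regs = list(range(16))
regs[0x6] = 0x00D
print(hex(translate(0x006010, regs)))   # CPU address $6010 in segment 0
print(hex(translate(0x123456, regs)))   # outside segment 0: MMU not involved
```

The second call illustrates why the 65816's own bank lines are kept: addresses outside segment 0 never pass through the MMU, so the register window mirrored in segment 1 stays reachable even when the block at $9000 in segment 0 has been remapped.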

Programming the MMU:
If we want to use the MMU's complete ability to address 16 MB, then we must be able to program all 12 bits of all the registers of the MMU. As the 65816 only has an 8-bit data bus, we have to create the additional 4 bits ourselves. My solution is to use either a 6522 or a 6526 to deliver these 4 bits. It looks a bit like using a riot gun to kill a mouse, but the 6522/6 will be used for more tasks.

Using an external latch has one disadvantage: we won't be able to read back the contents of those four bits without extra hardware. The extra hardware needed in this case is a 04 and a 573. The 573 latches the data written to/read from the extra four bits (D'0..D'3). The data is latched whenever the /CS input of the MMU is enabled. The outputs of the 573 are connected to the same four I/O lines of the 6522/6 as well. Another line of the 6522/6, D'5, takes care of enabling the output of the 573 towards the 6522/6.

Caution:
- The moment we want to output the data of the 573 towards the 6522/6, the I/O lines used must be programmed as inputs.
- The moment we want the 65816 to read data from the MMU, the I/O lines and the 573 must be disabled.
- The moment we want the 65816 to write to the MMU, the 573 must be disabled. The I/O lines must have been programmed as outputs and have been filled with valid data.

One line, D'4, we'll use to put the MMU in map mode or not. We have to take the MMU out of map mode the moment we want to (re)program it. After a RESET all the I/O lines have been switched to input and a resistor takes care of pulling the MM input (H), disabling map mode in this way.

The extra port of the 6522/6
So far we have only needed 6 of the 16 available I/O lines. As I said, I want to develop a "Big-PET" version for the C64/128 as well.
In that case we need something to emulate the onboard port of the 6510/8502, and by 'pure coincidence' we have a complete port available for this purpose. Keep in mind that only bits 0..5/6 are used on the 6510/8502, but we cannot use bit 7 for other purposes as we have no idea how existing software treats this bit!

Ideas for using the Big-PET

1) Testing new ROMs
We map RAM to the area normally used for ROM, after having filled it with our own program.
One remark regarding the C64/128: remapping any of the ROM areas also remaps the underlying RAM. This means that software writing to this RAM will destroy the existing data in the remapped RAM!

2) Emulating other 6502 systems
We replace the original 6502 of the system to be emulated, for example a 1541 disk drive, with an interface which is nothing more than a 40-pin connector, a piece of flat cable and some buffers on a card. The main idea is to let the 65816 do the job the original 6502 normally did. But instead of running the PRG in the external system's ROM, we run our own PRG in RAM remapped for this purpose.

!!!!!!!!! !!!!!!!!!
One problem may occur, and that is which clock we should use: the one of the PET or the one of the external system? I think that is system dependent, and I have no idea what will happen if the 65816 runs out of phase with the onboard 16 MHz (which is used to drive the video) or out of phase with the onboard clock of the external system.
!!!!!!!!! !!!!!!!!!

3) Using PC cards
One idea is to attach a complete PC board to our Big-PET. In this case I'm thinking of XT boards fitted with the 8088 or NEC V20, i.e. the CPUs with an 8-bit external bus. Connecting boards normally fitted with an 80x86 or V30 implies we have to find a mechanism to read from/write to the 16/32-bit data bus.

A problem is that the 8088 uses a multiplexed bus. But in this case the data bus and the address lines A0..7 have been multiplexed.
There are two solutions for this problem:
- An XT uses several ICs to generate all the signals we know: 8284, 8288, 74LS245, 74LS573. Take these ICs out of the board and supply our own signals where needed. Problem: this only works with boards not fitted with those big square custom ICs which are meant to replace the above-mentioned ICs.
- We mix the address and data lines on the interface card and generate all other signals to emulate an 8088.

Whatever solution we use, we still have (at least) one problem to solve. The XT is capable of addressing 1 MB of memory but is also capable of addressing $400 bytes of I/O. My solution is either to use an extra segment for addressing this I/O or to use a part of the fast I/O segment described below for this purpose as well.

One remark: I do have the complete disassembly listing of the original IBM XT ROM. In this way we only have to translate the 8086 code to 6502 code to program the onboard ICs.

Higher clockspeeds:
I have some 5 MHz 65SC816s, so I would be interested in running them at 4 MHz at least, as this frequency is available in all CBMs and PETs. The main idea is to extend the positive half of the clock towards the CPU as long as needed:
<PRE>
-+   +---+   +---+   +---+   +---+   +---+   +---+   +---+   +--   4 MHz.
 |   |   |   |   |   |   |   |   |   |   |   |   |   |   |   |
 +---+   +---+   +---+   +---+   +---+   +---+   +---+   +---+
(A)
-+   +---------------------------+   +--------------------------   1 MHz.
 |   |                           |   |                             65816
 +---+                           +---+
-+               +---------------+               +--------------   1 MHz.
 |               |               |               |                 system
 +---------------+               +---------------+
</PRE>
The 'crippled' 1 MHz signal for the 65816 can be generated by ORing the 4, 2 and 1 MHz signals. The actual ORing must become active as soon as the hardware has detected the right area at point (A).

As mentioned before, the normal procedure to generate the address lines A16..23 out of the data bus is to latch them using a 74F573. But the state of these address lines is already stable 'long' before the rising edge of CLK0.
At least long enough for the MMU to have stabilised its outputs at this point as well. In case this does not work out well, the system should at least run at the 'crippled' 2 MHz.

Which part of the 16 MB range should run at 1 MHz and which part should not? With the exception of the very first segment, the choice is up to you. I'm thinking of using a 74LS138 as follows: the first physical 2 MB run at 1 MHz, the rest at full speed.

At least one segment of the 'slow' and one of the 'fast' area I will reserve purely for I/O purposes. The fast I/O segment can be used for ICs like the 16550, which are able to run at 4 MHz or higher. At least one segment of the slow area will be reserved for emulation purposes (see above).

At least one segment of the 'slow' and one of the 'fast' area I will fill only with RAM. Why this differentiation? Using only fast RAM as shadow RAM can have its drawbacks, because there are programs that use program loops for timing purposes and these loops depend on the internal clock. So running such a program in fast RAM could have unforeseen effects.

Remark regarding attaching an XT board:
In an XT the I/O and ROMs are normally addressed at a lower speed than the RAMs. But we don't have to worry about that, because the onboard logic takes care of this by slowing down the CPU. We only have to make sure that we connect the 'slow down' signal used to the 65816 as well.

============================================================================

Regards, Ruud