
Deleted member 5148

User requested account closure
Member
Oct 25, 2017
2,108
So I'm replaying Sonic for the first time in forever and I just noticed that Labyrinth Zone has alpha values when I get in the water.

At first I thought it was a simple color palette swap, but lo and behold, the water level rises and falls while keeping the transparent effect on Sonic's body.

Looks amazing, game still looks sharp, colorful and full of life.

Not to mention that sick beat that plays during this stage.

If you're after those chaos stones, the game isn't about going fast but about surviving with 50 rings until you reach the end of the stage.

Great replay value when aiming to get the real ending.
 

deltatheta

Member
Oct 27, 2017
19
I think it just switches the palette while the screen is being drawn when it gets to a certain line in the screen.

Since the split stacks the two regions vertically (above-water on top, below-water underneath), it just has to draw all of the above-water portion, then switch the palette and draw the below-water portion. If the screen were split side by side, so that a single scanline crossed both regions, this wouldn't be possible. Just one of the many "tricks" that were possible in the 16-bit days.
 

Tizoc

Member
Oct 25, 2017
23,792
Oman
I did a quick search and found this ye olde thread that discussed it
http://forums.sonicretro.org/index.php?showtopic=19663

Sonic 1 IMO still holds up as it has that 8/16-bit era Platformer vibe going for it. Of the original 3...OK 4 if we include CD, my ranking would be
Sonic 3 (& Knuckles)>=Sonic 2>Sonic 1

I'd need to replay Sonic CD someday, but yeah Sonic 1 holds up if you look at it as more of a 16-bit era platformer than what the successors improved on.
 

Chojin

Member
Oct 26, 2017
2,624
Dithering.

(No not really)

When I think of the Genesis and now Mario Odyssey I always think of dithering.
 

Ramune

Member
Feb 3, 2018
87
Not to mention the nice ripple effect added to the Japanese version (among other things). IIRC, doesn't the screen briefly flash whenever Sonic goes in and out of the water in Sonic 3? I want to say that game wasn't as seamless in that regard. I'd check myself, but I'm about to head to work :(

Dithering.

(No not really)

When I think of the Genesis and now Mario Odyssey I always think of dithering.

I also remember the "dithering" in the water areas of the first two Game Gear games really affecting Sonic's sprite a bit. The exception is Aqua Lake Act 2 in Sonic 2, but I remember it doing something before that level begins, I guess to palette swap.
 

ReyVGM

Author - NES Endings Compendium
Verified
Oct 26, 2017
5,434
Talking about transparencies with no screenshots. Tsk tsk.
 

stan423321

Member
Oct 25, 2017
8,676
It is a mid-screen palette switch.

Old consoles generate the picture of each frame as they send it to the TV, row by row. The Genny sprite-and-tile chip allows changing palette values in the h-blanks, the short intervals between rows, without further consequences. There isn't enough time to change the whole palette in one h-blank, so Sonic 1 displays white waves to mask the multiple transition lines somewhat; Sonic 3 does not bother, as few could see them anyway.
 

hibikase

User requested ban
Banned
Oct 26, 2017
6,820
It's all about changing the color palette mid-refresh.

This is the same idea as false perspective effects. On traditional video game hardware you can change the graphics while the screen is refreshing. By changing the scrolling offset of a layer mid-refresh, you can create some sort of 3D perspective. This is also used for games with a static full-width HUD on the top or bottom of the screen like SMB or SMB3 (but not SMW, it used sprites instead).
 

Blackpuppy

Member
Oct 28, 2017
4,193
In layman's terms: an old CRT "draws" a scanline, starting at the top and going left to right. When it finishes drawing a scanline, the gun stops and realigns for the next scanline. It does this about 480 times.

What the game is doing is: once it reaches a certain point in the frame and the raster stops and gets ready to draw the next scanline, the game program says "hey! Switch the palettes". So when the next scanline starts drawing, the sprites are a different color.
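
For anyone who thinks better in code, here is a minimal Python sketch of the same idea. It is a toy model, not how the game is written: the two palettes, the colour names and `draw_frame` are all made up for illustration. The point is only that the art (the palette indices) never changes; the lookup table does, once, between two scanlines.

```python
# Toy model of a mid-frame palette switch. Every name and value here is illustrative.
ABOVE_WATER = {0: "sky blue", 1: "sonic blue", 2: "sand"}
BELOW_WATER = {0: "deep teal", 1: "murky blue", 2: "dark sand"}


def draw_frame(tile_rows, water_line):
    """Draw rows top to bottom, swapping the palette when the water line is reached.

    tile_rows: list of rows, each a list of palette indices (the art itself).
    water_line: first row that should be drawn with the underwater palette.
    """
    palette = ABOVE_WATER
    frame = []
    for y, row in enumerate(tile_rows):
        if y == water_line:
            # Between two scanlines: "hey! Switch the palettes".
            palette = BELOW_WATER
        frame.append([palette[index] for index in row])
    return frame


rows = [[0, 1, 2]] * 4  # four identical scanlines of palette indices
frame = draw_frame(rows, water_line=2)
for y, line in enumerate(frame):
    print(y, line)
```

Rows 0 and 1 come out in the above-water colours and rows 2 and 3 in the underwater ones, even though all four rows hold the same indices.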
 
OP

Deleted member 5148

User requested account closure
Member
Oct 25, 2017
2,108
It's all about changing the palette mid-refresh.

This is the same trick as false perspective effects. On traditional video game hardware you can change the graphics while the screen is refreshing. By changing the scrolling offset of a layer mid-refresh, you can create some sort of 3D perspective.


Wow and what about the stage tiles warping under water, was that animated or done with code?

Also, any other games use this trick to create underwater effects?


Bonus question: bit offtopic
Could the neo geo do this?
 

hibikase

User requested ban
Banned
Oct 26, 2017
6,820
Wow and what about the stage tiles warping under water, was that animated or done with code?

Also, any other games use this trick to create underwater effects?


Bonus question: bit offtopic
Could the neo geo do this?

Warping effects are also done by changing the scrolling offset of layers line by line while drawing the screen.

Changing those properties mid-refresh is a big part of how console graphics worked.
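
A rough Python sketch of that line-by-line scrolling, for illustration only. `line_offsets`, `draw_line` and the sine-wave parameters are my own made-up names and numbers, not anything taken from a real game; the idea is just one scroll offset per scanline, with the offsets below the water line wobbling over time.

```python
import math


def line_offsets(num_lines, water_line, frame, amplitude=2, wavelength=16):
    """Return one horizontal scroll offset per scanline.

    Lines above water_line are not shifted. Lines at or below it follow a
    sine wave whose phase advances with the frame counter, so the wobble moves.
    """
    offsets = []
    for y in range(num_lines):
        if y < water_line:
            offsets.append(0)
        else:
            phase = 2 * math.pi * (y + frame) / wavelength
            offsets.append(round(amplitude * math.sin(phase)))
    return offsets


def draw_line(pixels, offset):
    """Shift one row of pixels sideways, wrapping around like a scrolling layer."""
    offset %= len(pixels)
    return pixels[offset:] + pixels[:offset]


layer = ["ABCDEFGH"] * 8  # eight identical rows so the shifts are easy to see
offsets = line_offsets(num_lines=8, water_line=4, frame=0, amplitude=2, wavelength=8)
for row, off in zip(layer, offsets):
    print(draw_line(row, off))
```

Calling `line_offsets` with a different `frame` value each refresh is what makes the warp animate.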

The Neo-Geo actually had a weird-ass graphics chip that didn't use layers and just drew everything as a sprite, which made many of those effects very difficult.
 
OP

Deleted member 5148

User requested account closure
Member
Oct 25, 2017
2,108
Warping effects are also done by changing the scrolling offset of layers line by line while drawing the screen.

Changing those properties mid-refresh is a big part of how console graphics worked.

The Neo-Geo actually had a weird-ass graphics chip that didn't use layers and just drew everything as a sprite, which made many of those effects very difficult.


If you play Garou: Mark of the Wolves you can see Rock Howard's hands filled with transparencies when the purple flames come out.

I've also seen some fog effects (one using dithering) in backgrounds.
 

hibikase

User requested ban
Banned
Oct 26, 2017
6,820
If you play garou mark of the wolves you can see rock howards hands filled with transparencies when the purple flames come out.

I've also seen some fog effects ( one dithering ) in backgrounds.

That's just dithering, i.e. alternating between displaying and hiding a sprite on every frame in order to give the illusion of a translucent sprite. The Neo-Geo hardware does not support true alpha blending; the SNES is the only notable 16-bit era hardware that supports it. For everything else, dithering was the go-to technique for making things look translucent.
 

Deleted member 11018

User requested account closure
Banned
Oct 27, 2017
2,419
That's just dithering, ie. alternating between displaying and hiding a sprite on every frame, in order to give the illusion of a translucent sprite. The Neo-Geo hardware does not support true alpha blending, the SNES is the only notable 16-bit era hardware that supports it. For everything else, dithering was the go-to technique for making things look translucent.

Well, on consoles without alpha blending there's flickering (display/hide/display/hide), there are intensity flags (shadow/highlight on the Genesis), there's the pre-rendering cheat, and there is the mesh effect where you alternate
XOXOXOXOXO
OXOXOXOXOX
XOXOXOXOXO
with
OXOXOXOXOX
XOXOXOXOXO
OXOXOXOXOX
which gives good transparency with not-too-good CRTs :)
(and yeah, palette switching on the fly)
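
The mesh pattern above is easy to generate. Here is a small Python sketch, with `mesh` and `mesh_for_frame` as made-up helper names: 'X' marks a sprite pixel that is drawn and 'O' one that is skipped, and the two phases are exact inverses of each other.

```python
def mesh(width, height, phase):
    """Checkerboard mask: 'X' where the pixel is drawn, 'O' where it is skipped.

    phase 0 and phase 1 produce inverted copies of the same pattern.
    """
    return [
        "".join("X" if (x + y + phase) % 2 == 0 else "O" for x in range(width))
        for y in range(height)
    ]


def mesh_for_frame(width, height, frame):
    """Alternate between the two phases on successive frames."""
    return mesh(width, height, frame % 2)


for frame in range(2):
    print(f"frame {frame}")
    for row in mesh_for_frame(10, 3, frame):
        print(row)
```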

As far as Sonic in Labyrinth goes, it's been well explained.
Edit: I removed the example of VBlank handling; after all, that's game code... you can find disassemblies easily :)
 

hibikase

User requested ban
Banned
Oct 26, 2017
6,820
Oh yeah, forgot about mesh. Sonic games do use that in some places.

Also I miscalled "dithering", I really meant flickering
 

Baladium

Banned
Apr 18, 2018
5,410
Sleep Deprivation Zone
If you're after those chaos stones, the game isn't about going fast but about surviving with 50 rings until you reach the end of the stage.

chuckles-you-mean-the-chaos-emeralds-3655217.png
 

Deleted member 12790

User requested account closure
Banned
Oct 27, 2017
24,537
It is a mid-screen palette switch.

Old consoles generate the picture of each frame as they send it to the TV, row by row. The Genny sprite-and-tile chip allows changing palette values in the h-blanks, the short intervals between rows, without further consequences. There isn't enough time to change the whole palette in one h-blank, so Sonic 1 displays white waves to mask the multiple transition lines somewhat; Sonic 3 does not bother, as few could see them anyway.

Not quite. The SNES only allows DMA during H-Blank or VBlank, but the Genesis is unique in that it allows DMA anywhere, any time (with one slight caveat). Just to make sure everybody is on the same page, let's talk about how old TVs drew their images to the screen. This gif I found online is a good starting point for this discussion:

aa-raster-1.gif


Old televisions had what was called an electron gun that shoots a beam of electrons at the screen of your old CRT television. The inside of the glass is coated with phosphor, and when the beam hits that coating it causes a phosphorescent glow, and that glow is what forms the basis of the image. Nothing physically moves here: if you were to open up your TV and could shrink down and look inside, you'd see the gun sitting still while magnetic coils steer its beam left and right and up and down. It "scans" the image out line by line, going from the left side of the screen to the right, and from the top of the screen to the bottom. The gun has two states - on and off. When the beam is sweeping left to right, the gun is on and lighting up the phosphor. Because the beam has to move back to the left (or top) of the screen to begin drawing the next line, the gun turns off temporarily while it moves right to left (or bottom to top). When it's back in position, it turns back on and begins drawing again. While the gun is off, nothing is being drawn to the screen. This period where nothing is being drawn to the screen is called a blank - Hblank = horizontal blank (aka when the gun is moving right to left), Vblank = vertical blank (aka when the gun is moving bottom to top).

Video game hardware is meticulously timed so that the chips inside sync up to the amount of time it takes for the television to draw lines, and to when blanks occur. This is because TVs have a fixed update interval, so if these chips are synced to the amount of time it takes for the television to draw, they can just send data continuously and the TV will draw correctly. These old consoles do not work like a modern progressive scan display. When you tell the console to draw a pixel at X, Y, you need to actually wait for the gun to be in that position to draw. If you map out the time it takes to complete HBlank and VBlank, the actual drawing area, conceptually, looks more like this:

l6JxIZH.png


Active display is when the gun is actually drawing visible parts on screen. The yellow portion is the time, per scanline, that it takes for the gun to move back to the left, which is invisible, but still takes time (aka hblank). The blue portion at the bottom is the time it takes for the gun to move back to the top of the screen from the bottom (aka vblank). You can see that VBlank is much longer than HBlank, and unlike HBlank, it occurs all at once without interruption (whereas each slice of hblank is separated from the next by a line of active display). When doing retro programming, these blanking periods are critical for doing intense calculations and polling, like figuring out if collision detection happened, or checking the controller to see what buttons were pressed. Because the system is so tightly timed to the update cycle of the television, this means you only have a certain amount of time to do these calculations, so you need your program to be speedy or else it'll slow down the game -- if you, for example, take too long to finish your VBlank calculations before the gun moves to the top of the screen, the Genesis will not get the signal from the programmer that they are done doing vblank calculations in time, and it'll actually halt drawing for an entire frame, retaining the original image. This full frame of delay is what causes slowdown. Slowdown happens in whole-frame steps, so the frame rate drops to the refresh rate divided by a whole number. I.e. if your TV refreshes 60 times per second, and your VBlank calculations are causing one frame of delay every time, then you're only displaying 30 frames per second.
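
To put numbers on the slowdown part, here is a minimal Python sketch. It assumes that an update which overruns its deadline simply costs whole extra frames; `displayed_fps` and its parameters are names made up for illustration and have nothing to do with actual Genesis code.

```python
import math


def displayed_fps(refresh_hz, work_time_in_frames):
    """Frame rate you actually see when each update takes a given amount of work.

    work_time_in_frames: how long one update's calculations take, measured in
    TV refresh periods (1.0 means exactly one refresh). An update that misses
    its deadline waits for the next one, so the cost is rounded up to whole frames.
    """
    frames_per_update = max(1, math.ceil(work_time_in_frames))
    return refresh_hz / frames_per_update


for work in (0.8, 1.2, 2.5):
    print(work, displayed_fps(60, work))
```

With a 60 Hz TV, work that fits in one refresh gives 60, work that spills into a second refresh gives 30, and work that spills into a third gives 20.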

DMA is Direct Memory Access. It describes the way in which the CPU of the console interacts with memory. The Genesis (and SNES) work in that they are a bunch of processors on a large board working in sync with each other. Each processor can be thought of as, essentially, its own tiny computer, with RAM and an instruction set. They work largely independently of each other, doing their own thing blindly. The Genesis essentially works like this:

lVTq5MU.png


The chips we are concerned with discussing in this example are the main CPU of the Genesis, the 68000 processor, and the graphics chip of the Genesis, the Video Display Processor. This example is being really simplified to make things easier to understand (i.e. no talk of registers). The VDP has a bit of RAM called VRAM while the Genesis has a much larger amount of system RAM. Typically, the 68000 has a bus running to the main system RAM and can't access the VRAM. To allow independent chips to communicate with each other, independent chips will have a tiny piece of shared RAM that they can read from. This is called a control port. Typically, if you wanted to change the way the VDP worked, the 68000 CPU would request some bit of binary game code from the game cartridge ROM, which the 68000 has a bus running to and thus can read. It'd put that instruction to change the VDP into RAM, then move it to the piece of shared memory. The VDP can't read system RAM, but it can read that little piece of shared RAM. After moving the instruction to the control port, the 68000 goes back to doing whatever work it needs to do. Then, when the VDP is ready, it'll read from the control port, interpret the command, and change the values in its VRAM accordingly. In a traditional game loop, these things occur during blanking. This is because the VDP has a unique quirk -- the bus it uses to read VRAM from is also the exact same bus it uses to send data to the television's gun. Thus, if you do changes to the VDP during vblank, it ensures there is no conflict between reading a value to the screen and writing a value to the color portion of VRAM.

BMecM81.png


DMA is not a typical mode. Direct Memory Access allows the 68000 CPU to write directly to the VDP's VRAM, bypassing the control port and without having the VDP interpret the instruction set. In the demoscene, the process of bypassing the established communication interface (i.e. the control port) to communicate directly with the hardware is called blasting the hardware. The SNES can only do DMA during blanks, but the Genesis has the really cool ability to do DMA any time (again, with a caveat I'll get to later). If you look at the above picture, you should realize a problem with doing this, however. If you do DMA during active display, the VDP is using the bus to draw pixels onto the screen. To do DMA, you use that same bus to move data from the 68000 CPU to the VRAM. The end result is something called CRAM dots (color RAM dots). Most emulators do not display CRAM dots (and those that do sometimes do so with errors), so they can usually only be seen correctly on original hardware. What happens is, if you DMA during active display, the value you move to CRAM gets drawn on screen at the position the gun is currently at. After some math (which I'll get into later), depending on the Genesis display mode, this means the Genesis can change between 161 and 182 color palette entries per active display horizontal line.

Thus, getting back to the original comment and topic at hand, Labyrinth Zone (and many of the water levels in Sonic) use DMA to change the color palette of the game midway through the screen to simulate transparency. The Genesis has a clock that can be used to roughly figure out how far vertically the raster gun is on screen, so it can check a value vs that clock to tell when on screen to change the color values. Sonic 1, 2, and Sonic CD just blast the VDP chip during active display without regard for the phosphor gun. On real hardware, this causes visual problems from the CRAM dots:

GtoXA37.png


Look at the water line of Labyrinth Zone in this picture, which has the sprite layer removed. You can see repeated vertical bands of CRAM dots. These remain fixed on screen and are really ugly to look at. To solve this, Sega added flashing wave sprites drawn at the water line. The flickering nature of these sprites meant that people generally couldn't see the CRAM dots, so it effectively hid the problem. This created a secondary problem, however -- the Genesis can only display a fixed number of sprites on each horizontal row. Drawing these flickering water sprites ate into the number of sprites it could draw on those rows. Theoretically, sprites may flicker as they are multiplexed at the water line, but in practice this is really hard to see unless you know what you're looking for.
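
Here is a small Python sketch of why a per-row sprite limit bites at the water line. The limit of 20, the scene and the helper name `sprites_on_line` are assumptions chosen for illustration; the logic is simply "sprites are considered in priority order and anything past the limit on a given row is not drawn".

```python
SPRITES_PER_LINE = 20  # assumed per-row limit, for illustration only


def sprites_on_line(sprites, line, limit=SPRITES_PER_LINE):
    """Split the sprites touching one scanline into (drawn, dropped).

    sprites: list of (name, top, height) tuples in priority order.
    line: the scanline being drawn.
    """
    on_line = [name for name, top, height in sprites if top <= line < top + height]
    return on_line[:limit], on_line[limit:]


# Hypothetical scene: 18 wave sprites on the water line, then Sonic and 3 enemies.
water_line = 100
scene = [(f"wave{i}", water_line - 4, 8) for i in range(18)]
scene += [("sonic", 90, 32), ("enemy0", 96, 16), ("enemy1", 98, 16), ("enemy2", 60, 16)]

drawn, dropped = sprites_on_line(scene, water_line)
print(len(drawn), dropped)
```

In this made-up scene the waves use up 18 of the 20 slots on that row, so one of the enemies crossing the water line loses out.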

However, unhappy with this flaw, Yuji Naka revised his solution in Sonic 3:

JfYp4Gc.png


Sonic 3, in many stages, does not use flickering sprites to hide CRAM dots, because it doesn't produce CRAM dots at all. To avoid CRAM dots, what Sonic 3 and Knuckles does is wait until Hblank to perform the DMA. This limits the number of colors that can be changed per line, however, as only performing DMA during HBlank means you have very little time to do so, and not enough time to change all the colors. Look at the water line in that Sonic 3 picture and you'll see how they carefully mapped the color changes over several lines. Notice how the tree trunk on the right side of the screen changes colors several lines below the water line, and immediately to its left is a bit of background scenery that is bright red that also gets changed a line later than the water line. Look at Tails, his colors don't get changed at all because he stops drawing before they'd be affected. Look at Sonic, his color changes happen about 1 row of pixels below the water line.
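
A minimal Python sketch of spreading a palette change over several H-blanks. The budget of 6 entries per H-blank and the name `schedule_palette_writes` are assumptions for illustration, not measured values; the sketch only shows how a fixed per-line budget turns one palette swap into a staircase of changes below the water line.

```python
def schedule_palette_writes(entries, water_line, per_hblank):
    """Map each palette entry to the first scanline drawn with its new colour.

    entries: palette indices in the order they will be rewritten.
    water_line: scanline where the first batch takes effect.
    per_hblank: how many entries fit in one H-blank (an assumed budget).
    """
    schedule = {}
    for position, entry in enumerate(entries):
        schedule[entry] = water_line + position // per_hblank
    return schedule


# Hypothetical 16-entry palette, rewritten 6 entries at a time.
plan = schedule_palette_writes(list(range(16)), water_line=100, per_hblank=6)
for entry, line in plan.items():
    print(f"entry {entry:2d} changes on line {line}")
```

Reordering `entries` decides which colours change right at the water line and which ones lag a line or two behind, which is the kind of careful mapping visible in the screenshot.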

Now, the caveat and math I spoke about earlier in this post. The astute among you may realize that "Blasting" the hardware sounds familiar. This, in fact, is where the term "Blast processing" comes from. Yes, blast processing is a real thing, and no, it doesn't just refer to blasting the hardware. Blast processing is a real effect, that is directly tied to the way the VDP uses a single port to draw to the screen and access CRAM. By very carefully timing DMA to VRAM over, and over, and over again, for every single pixel visible on screen, you can fill the entire screen with CRAM dots. What's neat about CRAM dots is that they display whatever color is pushed to VRAM, regardless of palette. Normally, the VDP can only draw one of 16 colors from a selected 61 colors out of 512 onto screen. But CRAM dots can be any single one of those 512 colors all at once. So if you fill the entire screen with CRAM dots, you can display the entire genesis palette on screen all at once. Blast processing is a very specific technique to do just that, allowing the Genesis to display a huge palette onscreen at once. No commercial games ever used this because it wasn't until the last few years that a method was found to sync the chips in the genesis well enough to allow this technique. If you are interested, I'll quote a post of mine from earlier this year going into the nitty gritty about the timing:

Blast processing is amazing. I mean the actual, real deal Blast Processing, the technique described to the Sega marketing guy that he didn't understand at all. It is an extremely technical bug, that people knew about for decades, but was considered flawed until 2012 when primarily 3 coders - Oerg88, Nemesis, and ChillyWilly, developed a method to perfect the technique. I've been trying to write a blast processing demo myself without looking at their source codes for several weeks now and I still can't replicate their steps without cheating. But the process is so insane it's worth talking about.

So, firstly, what is blast processing? Most people assume the term refers to the general speed of the Sega Genesis CPU and how that speed allows the Genesis to do more runtime (read: CPU bound) special effects than the Super NES, thinking "blast" as a term for quickness. Blasting, in fact, is a demoscene term that means sending data to a piece of hardware without an interface. In this specific case, "blast processing" referred to a way the programmers at Sega could "blast" data at the Video Display Processor, the chip that outputs the final image to the television, ignoring hardware data structures and interfaces the chip has built in intended to let people control it, using a method called "Direct Memory Access" to produce strange, even fantastic results way, way beyond what the Sega Genesis hardware could supposedly do.

There is a very good GameHut (head and lead programmer from Traveller's Tales) episode where he discusses stumbling upon the technique back in the 90's and why he, and Sega, abandoned it:



That video gives a pretty good broad overview of the concept, but skips a lot of the technical nitty gritty that makes the entire process a lot more amazing. And, more importantly, it skips the (relatively) recent discoveries which now make the effect viable. In short, to recap what the video above talks about, and go into a bit more detail, the Sega Genesis VDP exposes a few hardware ports to the Sega Genesis CPU and DMA controller which allow software to modify the video hardware. This is intended, for example, to let the CPU change palette colors in the VDP's Color RAM. The problem is that these exposed ports are single ports rather than dual ports. Meaning there is only one port, for example, to both read and write to Color RAM in the VDP. When you write to the Color RAM, you are simultaneously using the same port the VDP uses to display color on the screen. This causes an error known as CRAM dots when you change a color -- the current position where the VDP is drawing to the screen will display a pure RGB value that equates to the 15-bit value being written to CRAM.

Now, the Genesis displayable screen is actually much larger than the normal television display, extending far to the left and right (and up and down). As such, most programmers can hide palette changes so they occur when the VDP is drawing offscreen pixels, hiding the artifact. But some games can't get the timing down and CRAM dots are visible on a real TV screen. What Sega, and Traveller's Tales, and modern demoscene coders realized is that, if every time they changed a palette color, they could produce a 15-bit color pixel on the screen, then they could write loops that would do this over and over again and cover the entire screen in 15-bit color pixels, producing direct color images with a color palette of over 4,000! Way, way, WAY beyond the master palette of 512 the Genesis can produce, and the 61 it normally is only able to display. More importantly, these images can be treated as direct color framebuffers -- where each pixel can be directly accessed and drawn. Old video game systems, prior to around the time things like the 32X and 3DO and Jaguar started appearing, used tiles and cel structures to quantize screen space and data, making it possible to fill entire screens with images with only a little bit of memory, at the expense of losing fine pixel granularity. Which is a long way of saying that the Sega Genesis was never designed to let you select and draw single pixels on screen. Instead, the smallest element you are supposed to be able to draw with is an 8x8 tile, which either has to be aligned to a grid (and thus can only move left, right, up, and down in 8 pixel steps), or be one of 20 hardware sprites that can be drawn in any location, but don't provide nearly enough to fill an entire screen. Being able to directly plot pixels is what allows systems to do 3D plotting, for example.

So this blast processing ability was super, super useful. It gave the Genesis a drawing palette that was 63 times bigger than normal, and let it manipulate the screen directly in enviable ways that wouldn't become the norm until an entire console generation later. Sega knew about it, and knew it was actually a potentially killer technique, but it was never used in a single commercial game. Why? Because the entire method requires extremely strict timing, and the Genesis provides no method of syncing the CPU to the VDP. You have to understand that the CPU is a microprocessor, and the VDP is also a microprocessor, and they run entirely independently of each other. It's almost like they are two separate computers, sitting on a single motherboard, each with its own RAM and timing. The ports I described earlier are areas of shared RAM between the CPU and VDP that let them communicate, but they are asynchronous. The CPU sends data through the port, and then when the VDP is ready, it reads it, and vice versa. It's normally up to the programmer to make sure the reads and writes don't collide, and that is done by scheduling these commands liberally, with expectations of large gaps between when one chip writes and the other reads. As such, any method of synchronization between the two chips is vague. This is compounded because there are very slight timing variations between different models of Sega Genesis. And, unlike other retro systems, the Genesis didn't provide any syncing commands for horizontal video signals. By contrast, machines like the C64 and Atari 8-bit computers could provide commands that would make a CPU wait until it got a signal from the television to begin drawing a horizontal line.

These variations in timing mean that Traveller's Tales, for example, began experimenting with manual calibration -- letting users use the controller d-pad to set values that slightly altered timing until the picture came in correctly. But they decided that was too cumbersome, and it was essentially impossible to write an application that could discern for itself where to start drawing, to be able to sync up the DMA routines. If your timing is off, what happens is that the image jitters left and right per scanline, and these jitters can be huge depending on how off the timing is. Sometimes like 20 pixels off. That ruins the picture.

Well, in 2012, as mentioned earlier, 3 long time demoscene developers finally figured out a method to sync CPU instructions to VDP instructions based off of VBlank. The entire process of figuring out the precise steps to sync any Megadrive this way is fascinating. Firstly, understand how the Megadrive itself keeps timing of its chips. There is a master clock that cycles at 53.693175 MHz on every megadrive. This value was chosen because it times with the other chips in whole values:

53.693175 / 15 = 3.57 MHz, the speed of the Z80

53.693175 / 7 = 7.67 MHz, the speed of the 68000 cpu

53.693175 / 10 = 5.36 MHz

This last figure is the speed of a clock inside the VDP called the Pixel Clock. This Pixel Clock controls the master timing for the entire VDP chip. Every tick of the Pixel Clock, a second counter called the Serial Counter is ticked twice. Thus, for every pixel that is processed by the VDP, the serial clock ticks twice. This was discovered by Nemesis using a logic sniffer and lots and lots of notes:
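
The divisions are easy to check. A minimal Python sketch, using only the master clock and the divisors quoted above; the function name `derived_clocks` and the dictionary keys are made up for illustration, and the serial clock is taken to be twice the pixel clock as described.

```python
MASTER_CLOCK_MHZ = 53.693175  # master clock quoted in the post


def derived_clocks(master_mhz=MASTER_CLOCK_MHZ):
    """Derive the chip clocks from the master clock using whole-number divisors."""
    pixel = master_mhz / 10
    return {
        "z80": master_mhz / 15,
        "m68000": master_mhz / 7,
        "pixel": pixel,
        "serial": 2 * pixel,  # two serial ticks per pixel clock tick
    }


for name, mhz in derived_clocks().items():
    print(f"{name:7s} {mhz:.6f} MHz")
```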

vram%20logic%20analyser.jpg


These are his timing notes from sniffing the bus to the VDP over a full frame, noting every VDP operation takes 4 serial clock ticks. The actual logic is very complex, because VDP operations are nested, and actually take 7 serial clock ticks to execute, but for the purposes of timing, only the start of each operation is important, and the first tick of a new operation always happens on the 4th tick of the last operation, making 2 operations work in succession for the final 3 ticks of each cycle. This was discovered by Nemesis mapping the timing access patterns like so:

VDP%20VRAM%20timing%20H32%20-%20small.png



VDP%20VRAM%20timing%20H40%20-%20small.png
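
The overlap is easier to see laid out tick by tick. This is only a sketch of my reading of the description above: a new operation starts every 4 serial clock ticks, each one stays busy for 7 ticks, so neighbouring operations share 3 ticks. The function names are made up, and ticks are counted from 0.

```python
def operation_windows(count, start_interval=4, duration=7):
    """Return (start_tick, end_tick) for each operation, end tick exclusive."""
    return [(i * start_interval, i * start_interval + duration) for i in range(count)]


def active_operations(tick, windows):
    """List the operations that are still in flight on a given serial clock tick."""
    return [i for i, (start, end) in enumerate(windows) if start <= tick < end]


windows = operation_windows(3)
for tick in range(12):
    print(tick, active_operations(tick, windows))
```

Ticks 0 to 3 belong to operation 0 alone, ticks 4 to 6 are shared by operations 0 and 1, and the pattern repeats from there.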


With a hard reference of how long operations took in the VDP relative to the master clock, a formula of specific commands done in a precise order was discovered that could narrow down execution to precise synchronization. The first step in the process is to give the command to the Genesis CPU to wait for vblank from the television. Despite not having fine sync controls, the Genesis CPU can tell when Vblank occurs on the television. However, it does so vaguely, and gives an error range of 12 or so pixels. Meaning the Genesis can think VBlank begins on any pixel clock count from 0-12. That's a large range that can induce jitter.

The next step in the process is to fill up a number of First-In, First-Out memory queue slots the VDP has. To speed up operations and make scheduling easier, the VDP doesn't just let you write one command to it at a time, but rather you can buffer 13 commands in a field of memory that is FIFO, and the VDP will execute them in order sequentially, popping them off the queue. The timing of this operation, with the timing of the VDP vs the timing of the master clock, just so happens to evoke an effect similar to the Collatz Conjecture. The Collatz Conjecture is about a simple rule (halve an even number, triple an odd number and add one) that, when repeated starting from any positive integer N, is conjectured to always end up in the repeating pattern 4, 2, 1. The exact ratio of the VDP timing does not produce a finite pattern, but it does eventually devolve into a 2, 1, 2, 1, 2, 1, 2, 1 alignment pattern over several operations (and the ratio eventually desyncs after too many iterations). Thus, if you fill the FIFO memory with enough commands, it'll devolve into that alignment pattern, and the amount of memory the FIFO can hold is too low to desync. So after these two steps, we are now within a 2 pixel clock cycle alignment of our goal.
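
For anyone who hasn't met the Collatz rule, here is what it looks like in Python. This only illustrates the analogy; it says nothing about the actual VDP timing, and `collatz` is just a helper name.

```python
def collatz(n):
    """Return the Collatz sequence starting at n and stopping at the first 1.

    Rule: halve even numbers, turn odd numbers into 3n + 1.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    sequence = [n]
    while n != 1:
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        sequence.append(n)
    return sequence


print(collatz(6))   # [6, 3, 10, 5, 16, 8, 4, 2, 1]
print(collatz(27)[-3:])
```

Whatever the starting number, the tail of the sequence settles into the same short pattern, which is the behaviour the FIFO alignment is being compared to.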

The final step of the puzzle relates to refresh cycle slots. If you refer to this image:

VDP%20VRAM%20timing%20H32%20-%20small.png


You see that on Serial clock count 156, it performs a "refresh cycle." This is because the Sega Genesis uses dynamic RAM for its color RAM rather than static RAM. Dynamic RAM is cheaper than static RAM, but it requires a jolt of electricity every few moments to keep the data inside of it alive, vs static RAM which keeps the data alive until power is cut entirely. These "refresh cycle" slots are moments when the Genesis sends some electricity to the DRAM. This is an extremely high priority process, taking priority over every other command. If you perform a DMA within this refresh cycle, it'll wait until a 2 serial clock boundary is hit to perform it because it needs to refresh DRAM, which will eliminate the potential erroneous 2 pixel clock cycle offset (if it exists) that is caused by filling the FIFO. This is because the math works out so that the DMA command is able to be either 1 or 2 serial clock tick bound given our error, meaning those ticks can happen over any 2 serial clock tick moment. But the refresh cycle takes a full pixel clock tick, which itself works out to 2 serial clock ticks. Regardless of what our DMA command's alignment is - 1 serial clock tick or 2 - it won't execute until definitely 2 serial clock ticks go by because the refresh cycle needs to occur. You have to carefully time your DMA command from your CPU after you fill the FIFO queue to wait a specific amount of time to hit this refresh cycle reliably, however, and it's been worked out that a specific number of NOP commands (no operation) called by the CPU will align perfectly to the refresh cycle. This is currently where I'm stuck, I've been trying to decipher the math to figure out how many NOP commands align to the refresh cycle. Chilly Willy, Nemesis, and Oerg88 know the number, of course, but part of the fun is working out the math from the timing on your own. I know the number of NOP commands is cyclical, recurring every X ticks.
So if the number of NOP commands is, say, 5, and the cycle is 20, then it'd occur at 5, 25, 45, 65, etc. But again, I haven't worked out the cycle.

All this works out, however, to perfectly timing the VDP to the CPU to the television screen, without any external input from the player. Once that's done, the effect is unlocked. The Mega Drive doing blast processing is really awesome, but sort of useless, because it halts the 68000 CPU: you can't do anything else while it's drawing to the screen, since the CPU is being used to blast the pixels to the VDP. So it's mainly good for static images on the Genesis.

BUT with a Sega CD attached, the proposition becomes entirely different. Since the Sega CD has its own 68000 CPU that operates independently, it can use the Genesis 68000 CPU as a blast processor to draw a direct color 4096 color image to the screen, while its own CPU handles the game logic, music, and everything else. This is how things like the Sega CD port of Wolfenstein 3D are done.

There are drawbacks, however. Because there is a DRAM refresh cycle every several "cels", this manifests on screen as a vertical column of pixels being twice as fat. This happens at 5 places along each line, so there are 5 columns of "stretched" pixels across the screen. This can be hidden by carefully managing your image output, or by using noise generation to mask the artifact. Another drawback is that blast processing halves the horizontal resolution, so you're working at a 160x200 resolution. Actually, the effect draws off screen, and if you have an HDTV which does not hide overscan, you can output a widescreen image from the Genesis this way at 180x200. Despite these drawbacks, Blast Processing is a super useful video mode, especially when it can be enabled and disabled at will. It's especially great for large, screen-filling colorful images.

Blast processing is absolutely amazing stuff. It's a weird instance where Sega knew about it, advertised the hell out of the feature, but totally didn't get it and never once used it. But it WASN'T marketing fluff, and it really could let the Genesis (and especially Sega CD) do some honest to god amazing stuff for the hardware.

This effect was originally thought by Nemesis to be impossible to sync. It was Oerg866 who stumbled upon the repeating pattern that the VDP's FIFO timing devolves into after repeated operations, and Chilly Willy together with Oerg866 who figured out the exact number of NOPs needed to sync everything. Really, really great cooperative teamwork there.

You can see this trick all over Overdrive 2 by Titan:



And a quick pic for those who don't want to watch a video:

tSzt6zK.png
[/QUOTE]


Now, the caveat to both Blast Processing and DMA in general: while the 68000 CPU can DMA to VRAM at any cycle, the Sega Genesis uses DRAM for all of the RAM in the system. DRAM is dynamic random access memory. It is the opposite of SRAM, static random access memory. The names describe the way in which the RAM retains its values. Deep down inside the electronics, all these "bits" of information are actually electrical charges. A DRAM cell traps a charge in a tiny capacitor behind a small transistor gate that can be opened or closed, and holding that charge is what "stores" the bit. An SRAM cell instead latches the bit in a small loop of transistors. Both need power to keep their contents, but an SRAM cell holds its value on a tiny trickle of current with no further attention. This makes SRAM, for example, good for saving games using small watch batteries, as the SRAM will draw electricity from that battery extremely slowly, enough to last for years. DRAM, by contrast, leaks: its capacitors drain, so every cell has to be read and recharged ("refreshed") on a strict schedule, many times per second, to keep the information in it alive. The simplicity of the DRAM cell is what makes DRAM much cheaper than SRAM, which is why consoles use DRAM for their memory. So, to make sure its RAM does not die, there is a clock on the Genesis that is timed to constantly refresh DRAM. The VDP itself has DRAM that also needs to be refreshed. When the VDP DRAM is refreshed, it halts all access to the VDP, including DMA. When this occurs, for a single pixel, DMA is disabled and you get a slight horizontal stretching only at that pixel. The areas between these refresh periods are known as DMA access slots, as those are the moments when DMA is usable.
This is really, really technical stuff and not really necessary to understand how DMA works, but it's the caveat I mentioned before about being able to "DMA anywhere." You can DMA anywhere there is a DMA access slot, which is everywhere except the single pixels where DRAM is refreshed.
 
Nov 1, 2017
3,067
Not quite. The SNES only allows DMA during HBlank or VBlank, but the Genesis is unique in that it allows DMA anywhere, any time (with one slight caveat). Just to make sure everybody is on the same page, let's talk about how old TVs drew their images to the screen. This gif I found online is a good starting point for this discussion:

aa-raster-1.gif


Old televisions had what's called an electron gun that shoots a beam of electrons at the screen of your old CRT television. The inside of the glass is coated in phosphor, and when the beam hits it, the phosphor glows, and that glow is what forms the basis of the image. The beam is actually steered by magnets around the neck of the tube, so nothing physically moves, but it's easiest to picture it as a gun sweeping left and right and up and down. It "scans" the image out line by line, going from the left side of the screen to the right, and from the top of the screen to the bottom. The gun has two states: on and off. When it's moving left to right, the gun is on and lighting up the phosphor. Because it has to move back to the left (or top) of the screen to begin drawing the next line, it turns off temporarily while moving right to left (or bottom to top). When it's back in position, it turns back on and begins drawing again. While the gun is off, nothing is being drawn to the screen. This period where nothing is being drawn is called a blank: HBlank = horizontal blank (when the gun is moving right to left), VBlank = vertical blank (when the gun is moving bottom to top).

Video game hardware is meticulously timed so that the chips inside sync up to the amount of time it takes for the television to draw lines, and to when blanks occur. This is because TVs update at a fixed interval, so if these chips are synced to the amount of time it takes for the television to draw, they can just send data continuously and the TV will draw correctly. These old consoles do not work like a modern progressive scan display. When you tell the console to draw a pixel at X, Y, you need to actually wait for the gun to be in that position to draw. If you map out the time it takes to complete HBlank and VBlank, the actual drawing area, conceptually, looks more like this:

l6JxIZH.png


Active display is when the gun is actually drawing visible parts on screen. The yellow portion is the time, per scanline, that it takes for the gun to move back to the left, which is invisible but still takes time (aka HBlank). The blue portion at the bottom is the time it takes for the gun to move back to the top of the screen from the bottom (aka VBlank). You can see that VBlank is much longer than HBlank, and unlike HBlank, it occurs all at once without interruption (whereas each scanline's HBlank is separated from the next by a line of active display). When doing retro programming, these blanking periods are critical for doing intense calculations and polling, like figuring out if collision detection happened, or checking the controller to see what buttons were pressed. Because the system is so tightly timed to the update cycle of the television, you only have a certain amount of time to do these calculations, so you need your program to be speedy or else it'll slow down the game. If you, for example, take too long to finish your VBlank calculations before the gun moves to the top of the screen, the Genesis will not get the signal from the program in time that it is done with its VBlank calculations, and it'll actually halt drawing for an entire frame, retaining the original image. This full frame of delay is what causes slowdown. Slowdown happens in whole-number divisions of the refresh rate: if your TV refreshes 60 times per second and your VBlank calculations cause one frame of delay, you're only displaying 30 frames per second; two frames of delay gives you 20.
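
To put numbers on the slowdown point, here is a minimal Python sketch. The function name `effective_fps` and the timings fed to it are made up for illustration; the only assumption is that a frame's work has to finish on a refresh boundary, so any overrun costs a whole extra refresh.

```python
import math

def effective_fps(refresh_hz, work_time_s):
    """Frame rate seen on screen when each game frame needs work_time_s seconds."""
    frame_period = 1.0 / refresh_hz
    # A new image can only appear on a refresh boundary, so round the work
    # up to a whole number of refreshes (never fewer than one).
    refreshes_per_frame = max(1, math.ceil(work_time_s / frame_period))
    return refresh_hz / refreshes_per_frame

# Hypothetical timings on a 60 Hz display (one refresh is about 16.7 ms).
on_time = effective_fps(60, 0.010)    # fits inside one refresh
one_late = effective_fps(60, 0.020)   # spills into a second refresh
two_late = effective_fps(60, 0.040)   # spills into a third refresh
```

Under that assumption the rate never drops smoothly; it steps from the full refresh rate to a half, a third, and so on.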

DMA is Direct Memory Access. It describes the way in which the CPU of the console interacts with memory. The Genesis (and SNES) are built as a bunch of processors on a large board working in sync with each other. Each processor can be thought of as, essentially, its own tiny computer, with RAM and an instruction set. They work largely independently of each other, doing their own thing blindly. The Genesis essentially works like this:

lVTq5MU.png


The chips we are concerned with in this example are the main CPU of the Genesis, the 68000 processor, and the graphics chip of the Genesis, the Video Display Processor. This example is being really simplified to make things easier to understand (i.e. no talk of registers). The VDP has a bit of RAM called VRAM, while the Genesis has a much larger amount of system RAM. Typically, the 68000 has a bus running to the main system RAM and can't access the VRAM. To allow independent chips to communicate with each other, they will have a tiny piece of shared RAM that they can both read from. This is called a control port. Typically, if you wanted to change the way the VDP worked, the 68000 CPU would request some bit of binary game code from the game cartridge ROM, which the 68000 has a bus running to and thus can read. It'd put that instruction to change the VDP into RAM, then move it to the piece of shared memory. The VDP can't read system RAM, but it can read that little piece of shared RAM. After moving the instruction to the control port, the 68000 goes back to doing whatever work it needs to do. Then, when the VDP is ready, it'll read from the control port, interpret the command, and change the values in its VRAM accordingly. In a traditional game loop, these things occur during blanking. This is because the VDP has a unique quirk: the bus it uses to read from VRAM is the exact same bus it uses to send data out to the television. Thus, if you make changes to the VDP during VBlank, it ensures there is no conflict between reading a value out to the screen and writing a value to the color portion of VRAM.

BMecM81.png


DMA is not a typical mode. Direct Memory Access allows the 68000 CPU to write directly to the VDP's VRAM, bypassing the control port and the VDP's interpretation of commands. In the demoscene, the process of bypassing the established communication interface (i.e. the control port) to communicate directly with the hardware is called blasting the hardware. The SNES can only do DMA during blanks, but the Genesis has the really cool ability to do DMA any time (again, with a caveat I'll get to later). If you look at the above picture, you should realize a problem with doing this, however. If you do DMA during active display, the VDP is using the bus to draw pixels onto the screen. To do DMA, you use that same bus to move data from the 68000 CPU to the VRAM. The end result is something called CRAM dots (color RAM dots). Most emulators do not display CRAM dots (and those that do sometimes do so with errors), so they can usually only be seen correctly on original hardware. What happens is, if you DMA during active display, the value you move to CRAM gets drawn on screen at the position the beam is currently at. After some math (which I'll get into later), depending on the Genesis display mode, this means the Genesis can change between 161 and 182 color palette entries per active display horizontal line.

Thus, getting back to the original comment and topic at hand, Labyrinth Zone (and many of the water levels in Sonic) use DMA to change the color palette of the game midway through the screen to simulate transparency. The Genesis has a counter that can be used to roughly figure out how far down the screen the beam is, so the game can check a value against that counter to tell when to change the color values. Sonic 1, 2, and Sonic CD just blast the VDP chip during active display without regard for where the beam is. On real hardware, this causes visual problems from the CRAM dots:

GtoXA37.png


Look at the water line of Labyrinth Zone in this picture, which has the sprite layer removed. You can see repeated vertical bands of CRAM dots. These remain fixed on screen and are really ugly to look at. To solve this, Sega added flashing wave sprites drawn at the water line. The flickering nature of these sprites meant that people generally couldn't see the CRAM dots, which effectively hid the problem. This created a secondary problem, however: the Genesis can only display a fixed number of sprites on each horizontal row. Drawing these flickering water sprites ate into the number of sprites it could draw on those rows. Theoretically, sprites may flicker as they are multiplexed at the water line, but in practice this is really hard to see unless you know what you're looking for.
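
For anyone who thinks better in code, here is a toy Python model of the water-line trick described above. It is not how the Genesis is actually programmed; the function name, the color names, and the 4-line "screen" are all invented for illustration. The point is only that the picture data never changes, while the palette used to resolve it is swapped once the scan reaches the water line.

```python
def render_with_waterline(indices, dry_palette, wet_palette, water_line):
    """Resolve palette indices to colors, swapping palettes at water_line."""
    frame = []
    for y, row in enumerate(indices):
        # Same tile data on every line; only the palette in effect differs.
        palette = wet_palette if y >= water_line else dry_palette
        frame.append([palette[i] for i in row])
    return frame

# Hypothetical 2-pixel-wide, 4-line screen using palette entries 0 and 1.
indices = [[0, 1], [0, 1], [0, 1], [0, 1]]
dry = ["sky", "sonic_blue"]
wet = ["teal", "dark_blue"]
frame = render_with_waterline(indices, dry, wet, water_line=2)
```

Raising or lowering `water_line` each frame gives the rising tide effect without touching any graphics data, which is why it reads as transparency.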

However, unhappy with this flaw, Yuji Naka revised his solution in Sonic 3:

JfYp4Gc.png


Sonic 3, in many stages, does not use flickering sprites to hide CRAM dots, because it doesn't produce CRAM dots at all. To avoid CRAM dots, what Sonic 3 and Knuckles does is wait until HBlank to perform the DMA. This limits the number of colors that can be changed per line, however, as HBlank only gives you a small amount of time, not enough to change all the colors. Look at the water line in that Sonic 3 picture and you'll see how they carefully mapped the color changes over several lines. Notice how the tree trunk on the right side of the screen changes colors several scanlines below the water line, and immediately to its left is a bit of bright red background scenery that also gets changed a line later than the water line. Look at Tails: his colors don't get changed at all because he stops drawing before they'd be affected. Look at Sonic: his color changes happen about 1 row of pixels below the water line.
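
Here is a rough Python sketch of the scheduling idea behind that HBlank-only approach. The per-HBlank budget of 2 entries and the function name are pure assumptions for illustration (the real budget depends on the display mode and what else the game does in HBlank); the sketch just shows how a list of palette entries ends up spread over consecutive scanlines.

```python
def schedule_palette_writes(entries, per_hblank, first_line):
    """Map each palette entry to the scanline whose HBlank rewrites it."""
    schedule = {}
    for n, entry in enumerate(entries):
        # Only per_hblank entries fit in one HBlank; the rest wait a line.
        schedule[entry] = first_line + n // per_hblank
    return schedule

# Hypothetical: 6 palette entries, 2 writes per HBlank, water line at 100.
plan = schedule_palette_writes([0, 1, 2, 3, 4, 5], per_hblank=2, first_line=100)
```

The order of `entries` is the design decision: colors that are most visible right at the water line go first, and colors used by things further down (like that tree trunk) can afford to change a few lines late.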

Now, the caveat and math I spoke about earlier in this post. The astute among you may realize that "blasting" the hardware sounds familiar. This, in fact, is where the term "Blast Processing" comes from. Yes, blast processing is a real thing, and no, it doesn't just refer to blasting the hardware. Blast processing is a real effect that is directly tied to the way the VDP uses a single port to draw to the screen and access CRAM. By very carefully timing DMA to CRAM over, and over, and over again, for every single pixel visible on screen, you can fill the entire screen with CRAM dots. What's neat about CRAM dots is that they display whatever color is pushed to CRAM, regardless of palette. Normally, the VDP can only draw one of 16 colors from a selected 61 colors out of 512 onto the screen. But CRAM dots can be any single one of those 512 colors. So if you fill the entire screen with CRAM dots, you can display the entire Genesis palette on screen all at once. Blast processing is a very specific technique to do just that, allowing the Genesis to display a huge palette on screen at once. No commercial games ever used this, because it wasn't until the last few years that a method was found to sync the chips in the Genesis well enough to allow this technique. If you are interested, I'll quote a post of mine from earlier this year going into the nitty gritty about the timing:

Blast processing is amazing. I mean the actual, real deal Blast Processing, the technique described to the Sega marketing guy that he didn't understand at all. It is an extremely technical bug that people knew about for decades but considered unusable until 2012, when primarily 3 coders, Oerg866, Nemesis, and Chilly Willy, developed a method to perfect the technique. I've been trying to write a blast processing demo myself without looking at their source code for several weeks now, and I still can't replicate their steps without cheating. But the process is so insane it's worth talking about.

So, firstly, what is blast processing? Most people assume the term refers to the general speed of the Sega Genesis CPU and how that speed allows the Genesis to do more runtime (read: CPU bound) special effects than the Super NES, thinking "blast" as a term for quickness. Blasting, in fact, is a demoscene term that means sending data to a piece of hardware without an interface. In this specific case, "blast processing" referred to a way the programmers at Sega could "blast" data at the Video Display Processor, the chip that outputs the final image to the television, ignoring hardware data structures and interfaces the chip has built in intended to let people control it, using a method called "Direct Memory Access" to produce strange, even fantastic results way, way beyond what the Sega Genesis hardware could supposedly do.

There is a very good GameHut episode (the channel of the founder and lead programmer of Traveller's Tales) where he discusses stumbling upon the technique back in the 90s and why he, and Sega, abandoned it:



That video gives a pretty good broad overview of the concept, but skips a lot of the technical nitty gritty that makes the entire process a lot more amazing. And, more importantly, it skips the (relatively) recent discoveries which now make the effect viable. In short, to recap what the video above talks about and go into a bit more detail: the Sega Genesis VDP exposes a few hardware ports to the Genesis CPU and DMA controller which allow software to modify the video hardware. This is intended, for example, to let the CPU change palette colors in the VDP's Color RAM. The problem is that each of these is a single shared port rather than a dual port, meaning there is only one port, for example, to both read and write Color RAM in the VDP. When you write to Color RAM, you are using the same port the VDP uses to fetch colors for the screen. This causes an artifact known as CRAM dots when you change a color: the current position where the VDP is drawing to the screen will display a pure RGB value equal to the 15-bit value being written to CRAM.

Now, the Genesis displayable screen is actually much larger than the normal television display, extending far to the left and right (and up and down). As such, most programmers can hide palette changes so they occur when the VDP is drawing offscreen pixels, hiding the artifact. But some games can't get the timing down, and CRAM dots are visible on a real TV screen. What Sega, and Traveller's Tales, and modern demoscene coders realized is that, if every palette color change produced a 15-bit color pixel on the screen, then they could write loops that would do this over and over again and cover the entire screen in 15-bit color pixels, producing direct color images with a color palette of over 4,000! Way, way, WAY beyond the master palette of 512 the Genesis can produce, and the 61 it is normally able to display. More importantly, these images can be treated as direct color framebuffers, where each pixel can be directly accessed and drawn. Old video game systems, prior to around the time things like the 32X and 3DO and Jaguar started appearing, used tiles and cel structures to quantize screen space and data, making it possible to fill entire screens with images using only a little bit of memory, at the expense of losing fine pixel granularity. Which is a long way of saying that the Sega Genesis was never designed to let you select and draw single pixels on screen. Instead, the smallest element you are supposed to be able to draw with is an 8x8 tile, which is either aligned to a grid (and thus can only move left, right, up, and down in 8 pixel steps), or drawn as one of 20 hardware sprites that can be placed at any location but don't provide nearly enough coverage to fill an entire screen. Being able to directly plot pixels is what allows systems to do 3D plotting, for example.

So this blast processing ability was super, super useful. It gave the Genesis a drawing palette that was 63 times bigger than normal, and let it manipulate the screen directly in enviable ways that wouldn't become the norm until an entire console generation later. Sega knew about it, and knew it was actually a potentially killer technique, but it was never used in a single commercial game. Why? Because the entire method requires extremely strict timing, and the Genesis provides no method of syncing the CPU to the VDP. You have to understand that the CPU is a microprocessor, and the VDP is also a microprocessor, and they run entirely independently of each other. It's almost like they are two separate computers sitting on a single motherboard, each with its own RAM and timing. The ports I described earlier are areas of shared RAM between the CPU and VDP that let them communicate, but they are asynchronous. The CPU sends data through the port, and then when the VDP is ready, it reads it, and vice versa. It's normally up to the programmer to make sure the reads and writes don't collide, and that is done by scheduling these commands liberally, with the expectation of large gaps between when one chip writes and the other reads. As such, any method of synchronization between the two chips is vague. This is compounded because there are very slight timing variations between different models of Sega Genesis. And, unlike other retro systems, the Genesis didn't provide any fine-grained sync commands for the horizontal video signal. By contrast, machines like the C64 and the Atari 8-bit computers gave the programmer ways to make the CPU wait until the video hardware began drawing a horizontal line.

These variations in timing meant that Traveller's Tales, for example, began experimenting with manual calibration: letting users use the controller d-pad to set values that slightly altered timing until the picture came in correctly. But they decided that was too cumbersome, and it was essentially impossible to write an application that could discern for itself where to start drawing in order to sync up the DMA routines. If your timing is off, what happens is that the image jitters left and right per scanline, and these jitters can be huge depending on how far off the timing is, sometimes as much as 20 pixels. That ruins the picture.

Well, in 2012, as mentioned earlier, 3 long time demoscene developers finally figured out a method to sync CPU instructions to VDP instructions based off of VBlank. The entire process of figuring out the precise steps to sync any Mega Drive this way is fascinating. Firstly, understand how the Mega Drive itself keeps timing of its chips. There is a master clock that cycles at 53.693175 MHz on every (NTSC) Mega Drive. This value was chosen because it divides down to the clocks of the other chips by whole numbers:

53.693175 MHz / 15 = 3.58 MHz, the speed of the Z80

53.693175 MHz / 7 = 7.67 MHz, the speed of the 68000 CPU

53.693175 MHz / 10 = 5.37 MHz

This last figure is the speed of a clock inside the VDP called the Pixel Clock. This Pixel Clock controls the master timing for the entire VDP chip. For every tick of the Pixel Clock, a second clock called the Serial Clock ticks twice. Thus, for every pixel that is processed by the VDP, the serial clock ticks twice. This was discovered by Nemesis using a logic sniffer and lots and lots of notes:

vram%20logic%20analyser.jpg
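
The divider arithmetic above is easy to check for yourself. Here is a small Python sketch; the master clock value and the dividers come straight from the post, while the dictionary keys and function name are just labels I picked. The serial clock figure follows from the "ticks twice per pixel clock tick" statement.

```python
MASTER_CLOCK_HZ = 53_693_175  # master clock quoted in the post

# Whole-number dividers quoted in the post.
DIVIDERS = {"Z80": 15, "68000": 7, "VDP pixel clock": 10}

def derived_clocks_mhz(master_hz=MASTER_CLOCK_HZ, dividers=DIVIDERS):
    """Return each derived clock in MHz."""
    return {name: master_hz / div / 1e6 for name, div in dividers.items()}

clocks = derived_clocks_mhz()
# The serial clock ticks twice per pixel clock tick.
serial_clock_mhz = clocks["VDP pixel clock"] * 2

for name, mhz in clocks.items():
    print(f"{name}: {mhz:.2f} MHz")
print(f"VDP serial clock: {serial_clock_mhz:.2f} MHz")
```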


These are his timing notes from sniffing the bus to the VDP over a full frame, noting every VDP operation takes 4 serial clock ticks. The actual logic is very complex, because VDP operations are nested, and actually take 7 serial clock ticks to execute, but for the purposes of timing, only the start of each operation is important, and the first tick of a new operation always happens on the 4th tick of the last operation, making 2 operations work in succession for the final 3 ticks of each cycle. This was discovered by Nemesis mapping the timing access patterns like so:

VDP%20VRAM%20timing%20H32%20-%20small.png



VDP%20VRAM%20timing%20H40%20-%20small.png


With a hard reference of how long operations took in the VDP relative to the master clock, a formula of specific commands done in a precise order was discovered that could narrow down execution to precise synchronization. The first step in the process is to give the command to the Genesis CPU to wait for vblank from the television. Despite not having fine sync controls, the Genesis CPU can tell when Vblank occurs on the television. However, it does so vaguely, and gives an error range of 12 or so pixels. Meaning the Genesis can think VBlank begins on any pixel clock count from 0-12. That's a large range that can induce jitter.

The next step in the process is to fill up a number of First-In, First-Out memory queue slots the VDP has. To speed up operations and make scheduling easier, the VDP doesn't just let you write one command to it at a time; rather, you can buffer 13 commands in a FIFO, and the VDP will execute them in order, popping them off the queue. The timing of this operation, with the timing of the VDP vs the timing of the master clock, just so happens to evoke an effect similar to the Collatz Conjecture. The Collatz Conjecture concerns a simple rule (halve N if it's even, otherwise triple it and add 1) that, when repeated starting from any positive integer N, appears to always fall into the repeating pattern 4, 2, 1. The exact ratio of the VDP timing does not lock into a loop like that forever, but it does devolve into a 2, 1, 2, 1, 2, 1, 2, 1 alignment pattern over several operations (and the ratio eventually desyncs after too many iterations). Thus, if you fill the FIFO memory with enough commands, it'll devolve into that alignment pattern, and the amount of memory the FIFO can hold is too low to desync. So after these two steps, we are now within a 2 pixel clock cycle alignment of our goal.
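
For reference, this is the Collatz rule itself in a few lines of Python. To be clear, this only illustrates the conjecture being used as an analogy; it is not a model of the VDP's FIFO timing, whose actual ratio isn't given here. The function name is my own.

```python
def collatz_sequence(n):
    """Apply the Collatz rule from n until reaching 1, returning every step."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    seq = [n]
    while n != 1:
        # Halve even numbers; triple odd numbers and add one.
        n = n // 2 if n % 2 == 0 else 3 * n + 1
        seq.append(n)
    return seq

steps = collatz_sequence(6)
```

Continuing past 1 gives 4, 2, 1 again forever, which is the "devolves into a fixed pattern regardless of where you started" behaviour the FIFO alignment is being compared to.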

The final step of the puzzle relates to refresh cycle slots. If you refer to this image:

VDP%20VRAM%20timing%20H32%20-%20small.png


You see that on serial clock count 156, it performs a "refresh cycle" once. This is because the Sega Genesis VDP uses dynamic RAM rather than static RAM. Dynamic RAM is cheaper than static RAM, but it has to be recharged every few moments to keep the data inside it alive, whereas static RAM keeps its data for as long as it has power. These "refresh cycle" slots are the moments when the Genesis recharges the DRAM. This is an extremely high priority process, taking priority over every other command. If you perform a DMA within this refresh cycle, it'll wait until a 2 serial clock boundary is hit to perform it, because the DRAM refresh comes first, and that eliminates the potential erroneous offset (if it exists) left over from filling the FIFO. This is because, given our error, the DMA command can be aligned to either a 1 or a 2 serial clock tick boundary, meaning it can land anywhere within a 2 serial clock tick window. But the refresh cycle takes a full pixel clock tick, which itself works out to 2 serial clock ticks. Regardless of what our DMA command's alignment is, 1 serial clock tick or 2, it won't execute until 2 full serial clock ticks have gone by, because the refresh cycle needs to occur. You have to carefully time your DMA command from the CPU after you fill the FIFO queue, however, waiting a specific amount of time to hit this refresh cycle reliably, and it's been worked out that a specific number of NOP (no operation) instructions executed by the CPU will align perfectly to the refresh cycle. This is currently where I'm stuck: I've been trying to decipher the math to figure out how many NOPs align to the refresh cycle. Chilly Willy, Nemesis, and Oerg866 know the number, of course, but part of the fun is working out the math from the timing on your own. I know the number of NOPs is periodic, recurring every X ticks.
So if the number of NOP commands is, say, 5, and the cycle is 20, then it'd occur at 5, 25, 45, 65, etc. But again, I haven't worked out the cycle.
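
That periodic pattern can be written down in a couple of lines of Python. The 5 and 20 below are the placeholder numbers from the example, not the real values (which, as stated, haven't been worked out here), and the function names are made up.

```python
from itertools import count, islice

def aligned_nop_counts(first, period, how_many):
    """List the first how_many NOP counts that land on the refresh cycle."""
    return list(islice(count(first, period), how_many))

def is_aligned(nops, first, period):
    """True if this NOP count falls on one of the recurring alignment points."""
    return nops >= first and (nops - first) % period == 0

# Placeholder values from the example above, not measured hardware numbers.
hits = aligned_nop_counts(first=5, period=20, how_many=4)
```

Once the real offset and period are known, any of the recurring counts should work equally well, so you'd pick whichever leaves the CPU in the most convenient spot.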

All this works out, however, to perfectly timing the VDP to the CPU to the television screen, without any external input from the player. Once that's done, the effect is unlocked. The Mega Drive doing blast processing is really awesome, but sort of useless, because it halts the 68000 CPU: you can't do anything else while it's drawing to the screen, since the CPU is being used to blast the pixels to the VDP. So it's mainly good for static images on the Genesis.

BUT with a Sega CD attached, the proposition becomes entirely different. Since the Sega CD has its own 68000 CPU that operates independently, it can use the Genesis 68000 CPU as a blast processor to draw a direct color 4096 color image to the screen, while its own CPU handles the game logic, music, and everything else. This is how things like the Sega CD port of Wolfenstein 3D are done.

There are drawbacks, however. Because there is a DRAM refresh cycle every several "cels", this manifests on screen as a vertical column of pixels being twice as fat. This happens at 5 places along each line, so there are 5 columns of "stretched" pixels across the screen. This can be hidden by carefully managing your image output, or by using noise generation to mask the artifact. Another drawback is that blast processing halves the horizontal resolution, so you're working at a 160x200 resolution. Actually, the effect draws off screen, and if you have an HDTV which does not hide overscan, you can output a widescreen image from the Genesis this way at 180x200. Despite these drawbacks, Blast Processing is a super useful video mode, especially when it can be enabled and disabled at will. It's especially great for large, screen-filling colorful images.

Blast processing is absolutely amazing stuff. It's a weird instance where Sega knew about it, advertised the hell out of the feature, but totally didn't get it and never once used it. But it WASN'T marketing fluff, and it really could let the Genesis (and especially Sega CD) do some honest to god amazing stuff for the hardware.

This effect was originally thought by Nemesis to be impossible to sync. It was Oerg866 who stumbled upon the repeating pattern that the VDP's FIFO timing devolves into after repeated operations, and Chilly Willy together with Oerg866 who figured out the exact number of NOPs needed to sync everything. Really, really great cooperative teamwork there.

You can see this trick all over Overdrive 2 by Titan:



And a quick pic for those who don't want to watch a video:

tSzt6zK.png

Now, the caveat to both Blast Processing and DMA in general: while the 68000 CPU can DMA to VRAM at any cycle, the Sega Genesis uses DRAM for all of the RAM in the system. DRAM is dynamic random access memory. It is the opposite of SRAM, static random access memory. The names describe the way in which the RAM retains its values. Deep down inside the electronics, all these "bits" of information are actually electrical charges. A DRAM cell traps a charge in a tiny capacitor behind a small transistor gate that can be opened or closed, and holding that charge is what "stores" the bit. An SRAM cell instead latches the bit in a small loop of transistors. Both need power to keep their contents, but an SRAM cell holds its value on a tiny trickle of current with no further attention. This makes SRAM, for example, good for saving games using small watch batteries, as the SRAM will draw electricity from that battery extremely slowly, enough to last for years. DRAM, by contrast, leaks: its capacitors drain, so every cell has to be read and recharged ("refreshed") on a strict schedule, many times per second, to keep the information in it alive. The simplicity of the DRAM cell is what makes DRAM much cheaper than SRAM, which is why consoles use DRAM for their memory. So, to make sure its RAM does not die, there is a clock on the Genesis that is timed to constantly refresh DRAM. The VDP itself has DRAM that also needs to be refreshed. When the VDP DRAM is refreshed, it halts all access to the VDP, including DMA. When this occurs, for a single pixel, DMA is disabled and you get a slight horizontal stretching only at that pixel. The areas between these refresh periods are known as DMA access slots, as those are the moments when DMA is usable.
This is really, really technical stuff and not really necessary to understand how DMA works, but it's the caveat I mentioned before about being able to "DMA anywhere." You can DMA anywhere there is a DMA access slot, which is everywhere except the single pixels where DRAM is refreshed.[/QUOTE]
Is there a way to save a post? Cause good lord this is a good one.
 
Oct 25, 2017
6,877
In layman's terms: an old CRT "draws" a scanline, starting at the top and going left to right. When it finishes drawing a scanline, the gun stops and realigns for the next scanline. It does this about 480 times.

What the game is doing is: once it reaches a certain point in the frame and the raster stops and gets ready to draw the next scanline, the game program says "hey! Switch the palettes". So when the next scanline starts drawing, the sprites are a different color.
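
To make the idea above concrete, here is a toy sketch in Python (not Genesis code). The palette values, the 224-line screen height, and the names `ABOVE_WATER`, `UNDERWATER` and `render_column` are all made up for illustration. It only shows that swapping the palette at one scanline changes how the same color index looks on every line below it.

```python
# Toy model of a mid-frame palette swap. Nothing here comes from the actual
# game: the colors and the water line position are invented for illustration.
ABOVE_WATER = {"sonic_blue": (0, 0, 238), "skin": (238, 170, 136)}
UNDERWATER = {"sonic_blue": (0, 68, 136), "skin": (102, 136, 136)}


def render_column(color_name, screen_height, water_line):
    """Return the RGB value shown on each scanline for one palette entry."""
    palette = ABOVE_WATER  # palette in effect at the top of the frame
    shown = []
    for scanline in range(screen_height):
        if scanline == water_line:
            # The "hey! Switch the palettes" moment between two scanlines.
            palette = UNDERWATER
        shown.append(palette[color_name])
    return shown


column = render_column("sonic_blue", screen_height=224, water_line=100)
print(column[99], column[100])  # last dry line, first underwater line
```

The pixel data for Sonic never changes in this model; only the lookup table used to turn a color index into an actual color does.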

I don't know how hard this is to actually do when programming a game, but this was a fantastic explanation that was easy for me to understand. Thank you.
 

Deleted member 12790

User requested account closure
Banned
Oct 27, 2017
24,537
I don't know how hard this is to actually do when programming a game, but this was a fantastic explanation that was easy for me to understand. Thank you.

Well, the way most games do it is not quite how that guy's post works. Most games on the Genesis do not wait for HBlank; they just DMA whenever and let people deal with CRAM dots, or use flashing graphics to hide them. For most people, that's good enough and generally easy to do. However, actually doing this trick the way that poster described, constraining the entire trick within HBlank as Sonic 3 does, is incredibly, enormously complex from a programming perspective. This is because, as I mentioned, you use those HBlank periods to do all sorts of calculations. Take a look at this Labyrinth Zone pic again:

GtoXA37.png


Look very closely at the water line. You'll notice the CRAM dots are vertical bars of 3 pixels, which means the palette changes over 3 full scanlines. Even ignoring the HBlank period, changing all those colors still takes 3 full scanlines due to the amount of work the Genesis does during HBlank. Again, you use HBlank to do all sorts of calculations. If you want to do your palette swap entirely in HBlank, you have to do everything else fast enough to leave enough time for those palette-changing DMA commands. These HBlank periods are time specific: you only have X number of CPU cycles to work with, no more and no less. Compare the Labyrinth Zone pic to the Angel Island Zone pic:

JfYp4Gc.png


And you see it's still taking multiple lines to change all the colors, more lines than Labyrinth Zone even, but not a ton more like you'd maybe expect. What this means is that it's doing all the DMA palette swapping in HBlank, meaning the time every single other kind of calculation takes was reduced somehow between Sonic 1/2 and Sonic 3. This is verifiable: Sonic 3 is actually a massive, massive rewrite of the engine that dramatically optimized things, specifically so effects like this could be accomplished. Doing palette swaps entirely in HBlank actually takes a genius programmer who knows an insane amount about the hardware and, more generally, game design.
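
A rough way to picture the HBlank budget argument is the back-of-the-envelope sketch below, in Python. Every number in it is a placeholder, not a real Genesis timing, and the function names `writes_that_fit` and `scanlines_for_palette_swap` are hypothetical. The only point is the relationship: the less time other work eats out of each HBlank, the more palette writes fit per line, and the fewer scanlines the swap is smeared across.

```python
import math


def writes_that_fit(hblank_cycles, cycles_used_by_other_work, cycles_per_write):
    """How many palette writes fit in the HBlank time left over after other work."""
    spare = max(0, hblank_cycles - cycles_used_by_other_work)
    return spare // cycles_per_write


def scanlines_for_palette_swap(colors_to_write, writes_per_hblank):
    """Scanlines the swap spans if colors are only written during HBlank."""
    if writes_per_hblank <= 0:
        raise ValueError("no HBlank time left over for palette writes")
    return math.ceil(colors_to_write / writes_per_hblank)


# Placeholder budget: 100 cycles of HBlank, 4 cycles per color write.
lean_engine = writes_that_fit(100, cycles_used_by_other_work=40, cycles_per_write=4)
busy_engine = writes_that_fit(100, cycles_used_by_other_work=72, cycles_per_write=4)

print(scanlines_for_palette_swap(45, lean_engine))  # 3 scanlines
print(scanlines_for_palette_swap(45, busy_engine))  # 7 scanlines
```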

There are way, way too many optimizations in Sonic 3 to discuss; you could literally fill up an entire website talking about the changes. But I'll go over a couple of the most major changes and how they speed things up enough to make palette changing in HBlank possible. These speed optimizations are mainly related to reducing the number of things the game engine needs to check. In the end, the more objects you need to check, the longer it takes. Checking fewer objects means speedier checks. This, naturally, leads to a short conversation about what "objects" are in Sonic the Hedgehog, and how the games handle them.

Objects are separate entities from things that get drawn to the screen. The things being drawn to the screen are sprites, areas of memory that hold pixels that can be drawn to screen. The term "sprite" has been basically perverted over the years by laymen, and it's important to understand the literal definition of a sprite. Most people use the term "sprite" where the term "object" would be more apt. An "object" is a small area of memory reserved for variables that define everything about a thing in the game. For example, Sonic, as an object, is his X position (a number), his Y position (a number), his state (a number representing running, walking, ducking, etc.), the width of his hitbox (also a number), the height of his hitbox (a number), the current frame of animation he displays (a number), etc. When you render Sonic, the VDP's sprite registers take information from his object definition to determine which pixels should be contained in these hardware sprites. Again, hardware sprites are JUST pixels on the screen; they aren't objects, and they don't contain any real information. The difference between a sprite and a "non"-sprite in terms of video hardware is that individual sprite pixels can be drawn anywhere on screen, while background planes can only be drawn in tiles, 8x8 pixels at a time.
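
The object versus sprite distinction can be sketched in a few lines of Python. This is only an illustration: the field names, the `HardwareSprite` layout, and the `to_sprite` helper are assumptions made for this example, not the actual memory layout used by the Sonic games or the VDP.

```python
from dataclasses import dataclass


@dataclass
class GameObject:
    """An object: just a bundle of numbers describing a thing in the game."""
    x: int                # position in the level
    y: int
    state: int            # e.g. running, walking, ducking
    hitbox_width: int
    hitbox_height: int
    animation_frame: int


@dataclass
class HardwareSprite:
    """A sprite: only says which pixels go where on screen (hypothetical layout)."""
    screen_x: int
    screen_y: int
    tile_index: int


def to_sprite(obj, camera_x, camera_y, frame_tiles):
    """Derive what to draw from the object's numbers; the sprite knows nothing else."""
    return HardwareSprite(
        screen_x=obj.x - camera_x,
        screen_y=obj.y - camera_y,
        tile_index=frame_tiles[obj.animation_frame],
    )


sonic = GameObject(x=500, y=300, state=1, hitbox_width=18,
                   hitbox_height=38, animation_frame=2)
sprite = to_sprite(sonic, camera_x=400, camera_y=200, frame_tiles=[0x10, 0x20, 0x30])
print(sprite)
```

Note that an object with no sprite at all is perfectly valid in this model, which matters for the invisible trigger objects discussed further down.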

So Sonic 1 and Sonic 2 use fixed-size level layouts. The levels are 128x16 chunks big (a chunk is a type of memory compression object used to save space). In practice, this works out to 8192x1024 pixels for Sonic 2 (Sonic 1 uses slightly different dimensions). Every level in these games, no matter how small or large, is that big. They use fixed sizes for levels to make the math for handling objects a bit easier, at the expense of being a bit messier and requiring more space. The way the games work is that the game has an area of memory set aside to hold every object that is currently active in a defined area. The level itself is a map of positions, and every object in the level has a position. As the camera scrolls through the level from left to right, the game checks the level map, and every single object that is within range of the camera's X position (a band as wide as the camera) gets loaded into this area of memory for objects. That is to say, the game loads objects into memory in vertical strips, from the top of the level to the bottom of the level, all at once. Once you scroll left or right, if an object leaves the camera's range horizontally, it gets removed from object memory.
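
Here is a minimal Python sketch of the vertical strip idea, assuming a level is just a list of objects with X and Y positions. The names `LevelObject` and `load_vertical_strip` and all the coordinates are hypothetical. What it shows is that only the X position decides whether an object is loaded, so objects far above or below the screen come along for the ride.

```python
from typing import NamedTuple


class LevelObject(NamedTuple):
    name: str
    x: int
    y: int


def load_vertical_strip(level_objects, camera_x, camera_width):
    """Load every object whose X falls inside the camera's horizontal range.

    Y is deliberately ignored: the whole strip, top to bottom, gets loaded.
    """
    left, right = camera_x, camera_x + camera_width
    return [obj for obj in level_objects if left <= obj.x < right]


level = [
    LevelObject("behind_camera", 0, 50),
    LevelObject("on_screen", 100, 50),
    LevelObject("far_below_screen", 350, 900),
    LevelObject("also_on_screen", 400, 100),
]

active = load_vertical_strip(level, camera_x=90, camera_width=320)
print([obj.name for obj in active])
```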

This is useful for these games, because the games scroll infinitely up and down (as seen in levels like labyrinth zone) and thus, if you're at the top of the screen, what is "above" you in terms of screen space is actually the very bottom of the level, and thus objects need to be loaded entirely in vertical strips to support this. However, it has the drawback of having lots, and lots, and lots of unseen and untouchable objects loaded into memory.

When Sonic 1 and 2 do collision detection, which they do every single frame, they need to examine Sonic's X, Y, W, and H values against other objects' X, Y, W, and H values to see if they collide. Sonic 1 and 2 do this for every single object in memory, regardless of whether or not it can actually be collided with. So if there are 50 objects in memory, and only 20 are collidable, the engine still starts at object 1, checks if it collides, then goes to object 2, and so forth, until all 50 are checked.
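
As a sketch of what "check every object" means, here is a plain rectangle overlap test in Python run over the whole object list. The `Box` type, the `collidable` flag, and the exact order of the checks are assumptions for illustration; the real games do this in 68000 assembly with their own data layout. The example uses the same 50 objects with 20 collidable as above.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int
    y: int
    w: int
    h: int
    collidable: bool


def overlaps(a, b):
    """Axis-aligned rectangle overlap using X, Y, W and H."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


def check_all(player, objects):
    """Visit every object in memory, even the ones that can never be hit."""
    checks = 0
    hits = []
    for obj in objects:
        checks += 1
        if obj.collidable and overlaps(player, obj):
            hits.append(obj)
    return hits, checks


# 50 objects in memory, only the first 20 are collidable.
objects = [Box(x=i * 40, y=0, w=16, h=16, collidable=(i < 20)) for i in range(50)]
player = Box(x=45, y=0, w=16, h=16, collidable=True)

hits, checks = check_all(player, objects)
print(len(hits), checks)  # 1 hit found, but 50 objects visited
```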

This is slow!

Sonic 3 changes stuff up by firstly introducing variable-sized levels. Each level "segment" can be any width or height. This lets them design much more sprawling, bigger levels while simultaneously using less memory. At points in each level, invisible objects (i.e. objects that aren't drawn by sprites; see why knowing the difference between a sprite and an object is important?) trigger the game to change the level size. So, say, the first straightaway in Angel Island Zone is only (example, not actual value) 500 pixels wide by 300 pixels tall, and once you reach the opening, an invisible object changes the level dimensions to be much taller and wider. This means only small chunks of the level are loaded at once. This has the benefit of limiting the number of objects spawned into memory. In addition to there being variable level sizes, rather than spawning every object within range of the camera's horizontal position, the game also considers the camera's vertical position. This is a long way of saying that Sonic 3 doesn't spawn every object in the level into memory in vertical strips. Rather, there is a sort of sliding window in Sonic 3: if objects are within range of that window, they spawn, and when they exit the range of that window, they despawn. This keeps the object count down.
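
The sliding window can be sketched like this in Python. The window size, the `margin` parameter, and the names `in_window` and `spawn_in_window` are invented for the example; the point is only that both X and Y are tested, so an object far below the screen no longer gets spawned.

```python
from typing import NamedTuple


class LevelObject(NamedTuple):
    name: str
    x: int
    y: int


def in_window(obj, cam_x, cam_y, cam_w, cam_h, margin):
    """True if the object sits inside the camera rectangle plus a margin."""
    return (cam_x - margin <= obj.x < cam_x + cam_w + margin and
            cam_y - margin <= obj.y < cam_y + cam_h + margin)


def spawn_in_window(level_objects, cam_x, cam_y, cam_w, cam_h, margin=32):
    """Keep only objects inside the window; everything else stays despawned."""
    return [obj for obj in level_objects
            if in_window(obj, cam_x, cam_y, cam_w, cam_h, margin)]


level = [
    LevelObject("behind_camera", 0, 50),
    LevelObject("on_screen", 100, 50),
    LevelObject("far_below_screen", 350, 900),
    LevelObject("also_on_screen", 400, 100),
]

active = spawn_in_window(level, cam_x=90, cam_y=0, cam_w=320, cam_h=224)
print([obj.name for obj in active])
```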

To further reduce checks, Sonic 3 moves towards what would today be called a proto entity-component system. Rather than 1 large chunk of memory for every object in the game, Sonic 3 splits up memory into several smaller chunks, with each chunk being known as a system. These chunks of memory do more limited checks. Which system's memory bucket an object ends up in depends on the code that spawns it. This means objects, when they spawn, can tell the game where to put them in memory. This way, if only 20 objects are collidable, the game can put those 20 objects into a special memory list that is checked by collision detection, rather than having the game check every object regardless of whether or not it can be collided with. Doing, say, 20 checks per frame vs 50 is a huge speed increase.
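
A minimal Python sketch of the separate lists idea follows. The `ObjectLists` class and its method names are hypothetical and only model the bookkeeping: the spawn code decides which list an object goes into, and the collision pass only walks the collidable list.

```python
from typing import NamedTuple


class Box(NamedTuple):
    x: int
    y: int
    w: int
    h: int


def overlaps(a, b):
    """Axis-aligned rectangle overlap using X, Y, W and H."""
    return (a.x < b.x + b.w and b.x < a.x + a.w and
            a.y < b.y + b.h and b.y < a.y + a.h)


class ObjectLists:
    """Objects are sorted into lists at spawn time (illustrative only)."""

    def __init__(self):
        self.all_objects = []
        self.collidable = []

    def spawn(self, obj, collidable):
        # The spawning code tells the engine which list the object belongs in.
        self.all_objects.append(obj)
        if collidable:
            self.collidable.append(obj)

    def collision_pass(self, player):
        """Only walk the collidable list; return the hits and the check count."""
        hits = [obj for obj in self.collidable if overlaps(player, obj)]
        return hits, len(self.collidable)


lists = ObjectLists()
for i in range(50):
    lists.spawn(Box(x=i * 40, y=0, w=16, h=16), collidable=(i < 20))

hits, checks = lists.collision_pass(Box(x=45, y=0, w=16, h=16))
print(len(lists.all_objects), checks)  # 50 objects alive, only 20 checked
```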

Little changes like this all over the game resulted in a lot more time being left over in HBlank to do these DMA commands. It's very, very hard to do palette changes entirely in HBlank, because it means every single aspect of the rest of your game needs to be fast. What Sonic 3 does in this regard is extremely impressive for the hardware.
 
Oct 26, 2017
8,686
Just wanted to make a small addition/clarification to Krejlooc's fantastic posts, in regards to the physical operation of a CRT electron gun. The gun is essentially an emitter of electrons which are then focused into a beam, with the intensity of the beam varying at the source based on the video signal provided to the monitor.

The main thing I wanted to clarify is that it's not the gun itself that moves in order for the beam to hit different parts of the screen, as having such a rapid mechanical movement (going back and forth and up and down over several hundred scanlines every 1/60th or 1/50th of a second for the entire lifetime of the monitor) would be prone to failure.

Only the beam itself is being deflected back and forth and up and down via a varying electromagnetic field.
(Read up on the Lorentz force if you're not sure how this works: essentially any electric current - in this case the electron beam - behaves like a magnet. It not only creates its own magnetic field, but is influenced by magnetic fields in its surroundings).

There are two sets of coils - on the left and right, and the top and bottom - of the beam's path, and varying the currents in these coils determines if the magnetic field around the beam is stronger above, below, or on one of the sides of the path taken by the beam, thus causing the beam to bend in the required direction.

Here are some gifs that hopefully help clarify:

7SSZ.gif


crt.gif