Last active
January 3, 2024 10:24
Fancy VBlank handler skeleton for Game Boy, with detailed explanation
SECTION "VBlank vector", ROM0[$0040] ; (Address as required by the hardware.)
	push af ; Avoid overwriting registers; that would mess up the code being interrupted!
	; Always restore registers from their shadows, to reset parameters even in the presence
	; of raster effects. (This does imply that those registers should be written as late as
	; possible during a frame's processing.)
	ldh a, [hLCDC]
	ldh [rLCDC], a
	jr VBlankHandler
VBlankHandlerJump:

SECTION "VBlank handler", ROM0 ; See the end of this section for its placement.
VBlankHandler:
	ldh a, [hSCY]
	ldh [rSCY], a
	ldh a, [hSCX]
	ldh [rSCX], a
	; ...restore other registers from their shadow...

	; Avoid touching any incomplete buffers.
	ldh a, [hProcessingFrame]
	and a
	jr nz, .lagging
	; Note that from now on, we are sure that the caller is `WaitVBlank`; thus, we can trash
	; all registers freely (and consider those side-effects of `WaitVBlank` itself).

	; ...End-of-frame VBlank processing...
	; For example, OAM DMA:
	ldh a, [hOamAddrHigh]
	and a
	call nz, hOamDma
	xor a
	ldh [hOamAddrHigh], a

	; Flag the next VBlank as lagging, unless otherwise overridden.
	ld a, 1
	ldh [hProcessingFrame], a

	; The caller is `WaitVBlank`, which we now want to return from. This extra `pop af` discards
	; the AF saved by the vector, so the next `pop af` pops off this handler's return address,
	; and `reti` then returns from `WaitVBlank` itself.
	pop af
.lagging
	pop af
	reti

	; The left-hand side of the comparison is the address `VBlankHandler` would have if "here" (`@`) was at address $100.
	if $100 - (@ - VBlankHandler) < VBlankHandlerJump + 128
		align 16, $100 ; Ensure there is no gap between the above and the header.
	else
		; We have to have a gap so the `jr VBlankHandler` can reach its target; but make it as small as possible.
		align 16, VBlankHandlerJump + 127 + (@ - VBlankHandler)
		assert VBlankHandler == VBlankHandlerJump + 127 ; (This is what the line above aims to guarantee.)
	endc
	; (`align 16` effectively acts as an intra-section `org`.)
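
; A minimal sketch of the HRAM variables that the handler above reads and writes.
; Only the names come from the code above; the section name, the ordering, and the one-byte
; sizes are assumptions. (`hOamDma` is a routine rather than a variable: it has to be copied
; into HRAM at startup, which is not shown here.)
if 0 ; (Illustration only; adapt it to your own project before compiling.)
SECTION "VBlank shadow registers", HRAM
hLCDC:: db ; Shadow of rLCDC, copied by the VBlank vector.
hSCY:: db ; Shadow of rSCY.
hSCX:: db ; Shadow of rSCX.
hProcessingFrame:: db ; Zero while `WaitVBlank` is waiting; non-zero while a frame is being processed.
hOamAddrHigh:: db ; High byte of the OAM buffer to DMA at the next VBlank; zero means "skip the DMA".
endc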
SECTION "VBlank wait routine", ROM0[$08] ; (Any `rst` location is fine, or you can put this anywhere and use `call`.)
WaitVBlank::
	xor a
	ldh [hProcessingFrame], a
	; The VBlank handler will force this function to exit (via stack manipulation)
	; due to `hProcessingFrame` being zero; this is why the following loop is infinite,
	; yet this function eventually returns. (See below for the rationale for this trickery.)
.wait
	halt
	jr .wait

; BIG EXPLANATION AHEAD!! ** CW: LOTS OF BORING NERDY WORDS **
; READ AT YOUR OWN RISK: THE AUTHOR SHALL NOT BE HELD RESPONSIBLE FOR ANY RESULTING FEELINGS INCLUDING
; BUT NOT LIMITED TO "BORED AS F", "MY HEAD ASPLODE", OR "I FEEL SO ENLIGHTENED". ALSO, QUACK.
;
; You're probably wondering, why do the above song and dance instead of the ubiquitous:
if 0 ; (Ensure the snippet below is syntax-highlighted, but not compiled.)
WaitVBlank::
	xor a
	ldh [hVBlankOccurred], a
.wait
	halt
	ldh a, [hVBlankOccurred]
	and a
	jr z, .wait
	ret
endc
;
; Well, Brownie Points and Looking Kewl certainly play a role here (look ma!),
; but there's also a Very Serious reason.
; The above is actually bugged! Let's walk through a scenario where the bug is triggered.
;
; 0. Two interrupts are selected in rIE: VBlank and (say) timer.
;    (Of course, IME=1; otherwise, using `halt` at all is a non-starter.)
; 1. The timer interrupt is requested, waking the CPU from `halt`, and its handler is executed.
; 2. The timer handler returns just before VBlank; let's say 4 M-cycles before.
; 3. `ldh a, [hVBlankOccurred]` is executed (3 M-cycles), and it reads 0 still.
; 4. 1 cycle remains until VBlank, so `and a` is executed; it sets the Z flag, since a == 0.
; 5. VBlank occurs, so the VBlank handler is called. `hVBlankOccurred` gets set to 1 *now*.
; 6. After the handler returns, `jr z, .wait` is executed, and since the Z flag is set, the jump is taken.
; 7. `halt` is executed again, so the function will keep waiting until the next interrupt!
;    `WaitVBlank` has missed the VBlank!
;
; This kind of bug is called a "TOCTTOU" (Time-Of-Check-To-Time-Of-Use, ask Wikipedia if you're curious).
; The "check" is `ldh a, [hVBlankOccurred]`, the "use" is `halt`; the bug stems from interrupts being able to
; occur in-between those two.
; The strategy used to fix this is to make the VBlank handler itself responsible for breaking out of the
; `WaitVBlank` function. [^1]
;
; So why has the above "bugged" loop been used in production throughout the GB's entire life span
; and for lots and lots of homebrew? Well, the conditions needed to trigger that bug are *very* fringe;
; and most interrupt setups make it impossible.
; But, imagine you end up fulfilling all of those conditions by (un)luck, and you don't have all of the above
; in mind. I can guarantee the debugging session is *not* going to be short or sweet.
; Might as well nip the problem in the bud, yea?
;
; So... why use Fancy New Thing instead of Ol' Reliable? Well, I've explained above why Ol' is not so Reliable;
; so why use it instead of New Reliable?
; Moreover, the New function provides several *more* advantages:
; - The VBlank handler can detect lag, and avoid performing operations if the frame is not finished yet.
;   For example, the OAM DMA can be skipped, and thus your OBJs won't move before you have finished setting up
;   the new camera position (they'd appear to jitter a bit).
; - The function is smaller (Old: 10 bytes; New: 6 bytes), which means it can fit in an 8-byte `rst` slot
;   and thus be called faster and using fewer bytes.
; - It just looks fucking rad, and you can show off your wizardry. Isn't that the most important? ;D
;
; Footnotes:
; [^1]: You can argue that VBlanks may still be missed until the `ldh [hProcessingFrame], a` is executed,
;       but this is also a problem with the original function; and if you think about it hard enough,
;       you'll realise that it's essentially unavoidable.
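;
; Usage sketch: a main loop built on the above might look roughly like this.
; `MainLoop` and `wShadowOAM` are hypothetical names, and `wShadowOAM` is assumed to be a
; 256-byte-aligned OAM buffer whose address has a non-zero high byte; only `WaitVBlank` and
; `hOamAddrHigh` come from the code above. It also assumes `WaitVBlank` sits at a `rst` address.
if 0 ; (Illustration only, not compiled.)
MainLoop:
	; ...process one frame: game logic, writes to shadow registers, filling `wShadowOAM`...
	ld a, HIGH(wShadowOAM)
	ldh [hOamAddrHigh], a ; Request an OAM DMA at the next non-lagging VBlank.
	rst WaitVBlank ; Returns once the VBlank handler has done its end-of-frame processing.
	jr MainLoop
endc
; Writing `hOamAddrHigh` right before waiting follows the "write shadows as late as possible" advice:
; if the frame lags, the handler skips the DMA and the half-built buffer is never displayed.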