The Super Nintendo has limited resources.
This is the first thing you notice when programming a Super Nintendo, especially when creating a game engine that has a lot of sprites in it.
The SNES sprite hardware has access to the following:
- 128 Objects (sprites)
- 16 KiB of VRAM (512 8x8 tiles)
- 8 Palettes of 16 colours
It’s not a lot in today’s world. We need to be careful as to how we manage it.
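To put those numbers in perspective, here is a back-of-the-envelope sketch of the sprite VRAM budget. The constant names are mine, and it assumes 4bpp sprite tiles (16 colours per palette), which works out to 32 bytes per 8x8 tile.

```python
# Sprite tiles are 8x8 pixels at 4 bits per pixel (16 colours per palette).
BYTES_PER_TILE_8X8 = 8 * 8 * 4 // 8      # 32 bytes
SPRITE_VRAM_BYTES = 16 * 1024            # 16 KiB addressable by the sprite hardware

tiles_8x8 = SPRITE_VRAM_BYTES // BYTES_PER_TILE_8X8
tiles_16x16 = tiles_8x8 // 4             # a 16x16 tile is four 8x8 tiles

print(tiles_8x8, tiles_16x16)
```

That is room for 512 small tiles, or only 128 16x16 tiles shared between every sprite on screen.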
The easiest way of managing sprite resources is to use a fixed tileset for the entirety of a level. Each level would contain a list of tilesets to load into VRAM/CGRAM. Resources are now managed by the designer/level editor and not the game engine.
The main limitation of this method is that you can only upload tiles to VRAM when either the screen is off or all the sprites are offscreen.
Unfortunately 16 KiB of VRAM is not a lot when you consider that a level could have many different types of enemies in it.
Most Super Nintendo games ease this problem by reserving a small section of VRAM (usually four 16x16 tiles' worth) for the main character. As the player character usually uses a unique tileset arrangement and is only shown once, it would be a waste of resources to upload the entire player tileset to VRAM when only a small part of it is used at any one time.
All of this places a limit on the complexity of the sprite’s graphics. In order for a game to have larger sprites or more animation frames, you will need to reduce the number of enemies that can be in a level at any given time. This is why most games have a limited set of enemies per level and have a dedicated room/level for the bosses.
Some games take advantage of this limitation by having lots of small levels and reloading the VRAM tileset upon each level transition (e.g. Zelda: A Link to the Past, Super Metroid).
When programming a Super Nintendo, it is all about the compromises.
Some games use a completely different method. These games allocate a fixed amount of VRAM[1] to each character when it appears on screen. That VRAM is then released when the character moves outside the active window[2].
Every time the frame of an entity changes onscreen, the tiles used in that frame will be uploaded to VRAM during VBlank.
As with the fixed tileset, this one also has limitations.
The main limitation of this method is that we do not have enough CPU time to decompress graphics during gameplay. This means you need to have the entire tileset of all entities in play uncompressed either in RAM or ROM.
Secondly, there is also a limit to the number of enemies on the screen[3].
Thirdly, there is a limit to the number of tiles that can be uploaded to VRAM during VBlank. There is only enough time to upload ~5½ KiB of data to the PPU with DMA during VBlank on an NTSC SNES[4]. Games with dynamic tilesets need a way to limit their sprite tileset changes per frame.
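As a rough illustration of how tight that budget is, the sketch below works out how many characters could have their whole VRAM allocation replaced in a single VBlank. The function name is hypothetical, the ~5½ KiB figure is the approximate one quoted above, and the per-character sizes are the Donkey Kong Country and Secret of Mana allocations from the footnotes.

```python
# Approximate NTSC VBlank DMA budget quoted in the text (about 5.5 KiB).
VBLANK_DMA_BYTES = int(5.5 * 1024)

def full_uploads_per_vblank(bytes_per_character):
    """How many complete per-character uploads fit in one VBlank."""
    return VBLANK_DMA_BYTES // bytes_per_character

print(full_uploads_per_vblank(1024))   # 1 KiB per character
print(full_uploads_per_vblank(1536))   # 1536 bytes per character
print(VBLANK_DMA_BYTES // 32)          # 8x8 tiles (32 bytes each) per VBlank
```

In practice the real number is lower, because the same VBlank also has to carry OAM, palette and background updates.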
Overcoming the transfer limitation
The simplest method is to just refuse to change an entity’s frame if the DMA buffer is full. However that could lead to cases where the DMA buffer is always full and an entity will never have its frame changed.
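A minimal sketch of the "just refuse" method, assuming a DMA buffer that only tracks how many bytes are queued for the next VBlank. `DmaBuffer`, `try_queue` and `set_frame` are hypothetical names for illustration, not taken from any real engine.

```python
class DmaBuffer:
    """Tracks how many bytes are queued for upload in the next VBlank."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0

    def try_queue(self, size_bytes):
        # Refuse the transfer if it would overflow this frame's budget.
        if self.used_bytes + size_bytes > self.capacity_bytes:
            return False
        self.used_bytes += size_bytes
        return True

    def flush(self):
        # Called once the VBlank transfers are done.
        self.used_bytes = 0


def set_frame(entity, new_frame, frame_size_bytes, dma):
    """Change the entity's frame only if its tiles fit in the DMA buffer."""
    if dma.try_queue(frame_size_bytes):
        entity["frame"] = new_frame
        return True
    return False  # the entity keeps showing its old frame
```

Nothing here remembers which entities were refused, so an entity that is always processed last can be starved indefinitely.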
The next method is to stagger the entity’s animations in-between frames. For example:
- Frame 1: process the animations of entities 0-3
- Frame 2: process the animations of entities 4-7
- Frame 3: process the animations of entities 8-11
This does place a hard limit on the frame-rate of the animations (in the above example 20 fps for NTSC, 16⅔ fps for PAL); with the limited ROM space a SNES cart offers, this is not a real problem.
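The staggering scheme can be sketched as follows. The function names are hypothetical, the frame counter is zero-based (so the "Frame 1" above is counter 0), and the display rates are the nominal 60 Hz and 50 Hz.

```python
def entities_for_frame(frame_counter, entity_count, group_size):
    """Indices of the entities whose animations are processed this frame."""
    group_count = -(-entity_count // group_size)  # ceiling division
    start = (frame_counter % group_count) * group_size
    return range(start, min(start + group_size, entity_count))


def max_animation_fps(display_hz, entity_count, group_size):
    """Each entity gets one animation update per full cycle of groups."""
    group_count = -(-entity_count // group_size)
    return display_hz / group_count


for frame in range(4):
    print(frame, list(entities_for_frame(frame, 12, 4)))
print(max_animation_fps(60, 12, 4), max_animation_fps(50, 12, 4))
```

With 12 entities in groups of 4 the cycle repeats every three display frames, which is where the 20 fps and 16⅔ fps figures come from.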
Another method is to have a list of which entities failed their animations in the current display frame. Upon the start of the next display frame, they will be processed first and then the game loop resumes as normal.
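A sketch of the retry-list idea, assuming a simple per-frame byte budget. `run_animation_pass` and its parameters are hypothetical; the point is only that entities which failed last frame go to the front of the queue.

```python
def run_animation_pass(wanting_update, carried_over, budget_bytes, frame_bytes):
    """Process one display frame.

    wanting_update: entities that want a new frame, in normal game-loop order
    carried_over:   entities that failed to upload in the previous frame
    frame_bytes:    mapping of entity -> size of its next frame in bytes
    Returns (uploaded, failed); pass `failed` back in as next frame's carry-over.
    """
    # Entities that missed out last frame are handled before everyone else.
    order = list(carried_over)
    order += [e for e in wanting_update if e not in carried_over]

    uploaded, failed = [], []
    remaining = budget_bytes
    for entity in order:
        if frame_bytes[entity] <= remaining:
            remaining -= frame_bytes[entity]
            uploaded.append(entity)
        else:
            failed.append(entity)
    return uploaded, failed


sizes = {"a": 1024, "b": 1024, "c": 1024}
done, retry = run_animation_pass(["a", "b", "c"], [], 2048, sizes)
print(done, retry)
done, retry = run_animation_pass(["a", "b", "c"], retry, 2048, sizes)
print(done, retry)
```

Because the failures rotate to the front, no single entity is starved, though every entity may still see an occasional late frame.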
A Hybrid Approach
The fixed tileset approach enforces a strict limit on the entity’s tilesets and the dynamic approach enforces a limit on the number of entities on screen. My game engine, UnTech, is using a different approach.
In UnTech, entities can be tagged as having a fixed tileset, meaning that every frame and animation uses the same set of tiles. Because of this, the VRAM allocator can search for and eliminate duplicate tilesets.
When an entity’s metasprite is activated (on-screen, or about to go on-screen) and the tileset has not been seen by the allocator, a new slot is created and the tileset is queued in the DMA buffer for upload on the next VBlank. Because the DMA buffer may be full, the allocation may be delayed by a frame or two until the DMA buffer is free. This should not be noticeable, as most entities are activated off-screen and take half a dozen frames to walk on-screen.
When the entity is deactivated (moves outside the active window or dies), the slot is dereferenced and if there are no more entities using the tileset it is freed from VRAM.
This methodology allows the UnTech engine to dynamically load/free fixed tilesets as needed without needing to trigger a level transition. It has no animation lag because the entire tileset is loaded into VRAM, so there will be no delay waiting for the DMA buffer to free.
UnTech keeps a reference-counted allocation table to manage all of this. Actually, it is a doubly-linked-list, array-of-structures, reference-counted allocation table. Due to the limited number of registers in the 65816, this mouthful is approximately 2.25 scanlines faster at detecting duplicates and allocating VRAM than the naive approach.
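The behaviour described above can be sketched with plain dictionaries. This is not UnTech's actual table (that is 65816 code built around doubly linked lists); the class and method names are hypothetical and only show the reference counting and duplicate elimination.

```python
class FixedTilesetAllocator:
    def __init__(self, slot_count):
        self.free_slots = list(range(slot_count))
        self.slot_of = {}          # tileset id -> VRAM slot
        self.ref_count = {}        # tileset id -> number of active entities
        self.pending_uploads = []  # (tileset id, slot) waiting for VBlank DMA

    def activate(self, tileset_id):
        """An entity using this tileset became active; returns its slot."""
        if tileset_id in self.slot_of:
            # Duplicate tileset: share the slot already in VRAM.
            self.ref_count[tileset_id] += 1
            return self.slot_of[tileset_id]
        if not self.free_slots:
            return None  # out of VRAM; the caller decides what to do
        slot = self.free_slots.pop(0)
        self.slot_of[tileset_id] = slot
        self.ref_count[tileset_id] = 1
        self.pending_uploads.append((tileset_id, slot))
        return slot

    def deactivate(self, tileset_id):
        """An entity using this tileset left the active window or died."""
        self.ref_count[tileset_id] -= 1
        if self.ref_count[tileset_id] == 0:
            del self.ref_count[tileset_id]
            self.free_slots.append(self.slot_of.pop(tileset_id))
            # Drop any upload that had not happened yet.
            self.pending_uploads = [
                p for p in self.pending_uploads if p[0] != tileset_id
            ]
```

Two entities that share a tileset cost one slot and one upload; the slot only returns to the free list when the last of them is deactivated.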
Unfortunately, some entities will have very large tilesets (the player, bosses, special NPCs) and would use too much VRAM. Those entities are labelled as dynamic: their tilesets can change at any time.
The handling of dynamic tilesets is simple enough to explain. When a dynamic tileset entity is activated, the next free slot is removed from the reference table and is now owned by the entity. The entity’s animation engine is free to DMA tiles to that slot, knowing it is only owned by one entity.
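A minimal sketch of that ownership rule, with hypothetical names (`DynamicSlotPool`, `claim`, `may_upload`, `release`). It assumes the free slots are handed over as a simple list.

```python
class DynamicSlotPool:
    def __init__(self, free_slots):
        self.free_slots = list(free_slots)
        self.owner = {}  # slot -> id of the single entity that owns it

    def claim(self, entity_id):
        """Take the next free slot out of circulation for one entity."""
        if not self.free_slots:
            return None
        slot = self.free_slots.pop(0)
        self.owner[slot] = entity_id
        return slot

    def may_upload(self, entity_id, slot):
        # Only the owner is allowed to overwrite the tiles in this slot.
        return self.owner.get(slot) == entity_id

    def release(self, slot):
        """The owning entity was deactivated; the slot becomes free again."""
        del self.owner[slot]
        self.free_slots.append(slot)
```

Since a dynamic slot is never shared, no reference count is needed: it is either owned by exactly one entity or free.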
While I believe it to be the best of both worlds, it comes at a cost. As expected, the tilesets must be uncompressed in ROM[5], and in order to prevent memory fragmentation UnTech has a limited set of tileset sizes.
UnTech only supports tilesets that are:
- 1x 16x16px tile in size
- 1x 16px VRAM row in size (32 8x8 tiles or 8 16x16 sequential tiles)
- 2x 16x16px tiles in size (which will not be adjacent to each other)
- 2x 16px VRAM rows in size (which will not be adjacent to each other)
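To make the cost of those size classes concrete, here is a sketch that rounds a tileset up to one of them. Rounding up to the smallest class that fits is my assumption for illustration; the names are hypothetical.

```python
BYTES_PER_16X16_TILE = 128  # four 8x8 tiles at 32 bytes each
TILES_PER_VRAM_ROW = 8      # one 16px-high VRAM row holds 8 16x16 tiles

# The four supported sizes, in 16x16 tiles, smallest first.
SIZE_CLASSES = [1, 2, TILES_PER_VRAM_ROW, 2 * TILES_PER_VRAM_ROW]


def size_class_for(tiles_needed):
    """Smallest supported size that holds the tileset, or None if too big."""
    for size in SIZE_CLASSES:
        if tiles_needed <= size:
            return size
    return None


def vram_bytes(size_class):
    return size_class * BYTES_PER_16X16_TILE


for needed in (1, 3, 9):
    size = size_class_for(needed)
    print(needed, size, vram_bytes(size))
```

Under that assumption a tileset of three 16x16 tiles would occupy a whole VRAM row, which is the price paid for never having to defragment VRAM.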
I haven’t figured out how I should solve the VBlank DMA limit for dynamic tilesets yet. I’ll tackle that problem when I encounter it in a few months’ time.
[1] Donkey Kong Country allocates 1 KiB of VRAM (as 32 8x8 tiles) per character. Secret of Mana allocates 1536 bytes (as 12 16x16 tiles) per character.
[2] The active window is a bit larger than the display. This is to handle the case where a character is constantly transitioning on and off-screen.
[3] Secret of Mana has a limit of 3 Players and 3 Enemies on screen when in a combat level.
[4] Less if you take too much CPU time to set up the various transfers.
[5] I believe this will not be a problem for me. Secret of Mana has 324 KiB of uncompressed sprite tiles.