Super Mario 64 has been decompiled

tehz

Member
Oct 30, 2017
8
Another noob question: Would this make the game easier to mod, compared to what we had before?
That's a question that will require a surprisingly nuanced answer.

The biggest thing that a C source, or even a disassembly, brings over standard ROM hacking is flexibility. If you want to create some Kaze-level super hack with crazy custom mechanics and with new enemies and with big modifications of the engine, and so on, doing it with this code will be far easier. If you just want to replace some levels with your own custom ones, there's already a large number of good tools that work fine (console compatibility aside) with just the standard ROM. Since no tooling exists for this code leak, doing things like that will be harder until people have time to create the tooling.

Once the tooling's there, which could take a while, using the C code will make anything above a super-trivial ROM hack far easier.

edit: also, having the source code will make it far easier to document and understand the game. The knowledge gained from that documentation may or may not be relevant to any given mod someone decides to do. SM64 is already one of the most examined games (see any pannenkoek video or Kaze hack), so there is already a good amount known about this game.
 

KojiKnight

Member
Oct 25, 2017
8,548
I wouldn't put it past Nintendo to have lost the source code if it weren't for the DS port. Also I highly doubt we'll get a port of SM64 anytime soon.
Nintendo has been way better than most at archiving their stuff. The entire NES Mario series was ported first to SNES then gbc/a for example.

But yeah, Nintendo probably isn't porting vanilla Mario 64 any time soon but an emulated rerelease is inevitable at some point.
 

Advc

Member
Nov 3, 2017
2,035
To add to what caff said - to put it in more layman terms, the game you get on a disc/download is like a very "mixed down" version of the game's actual files/content.

Imagine when you hear a music track on the radio - you just hear a WAV or an MP3, right? It's just a sound file. A flat waveform. Some 1s and 0s.

But on the producer's computer are all the "stems" - each individual sound file for each instrument/layer of the song at very, very high resolution. These are all "flattened" into one sound for the final output. In addition, more importantly, there will be an intelligent "project" file for whatever production software they use, which grabs all those "stems" and puts them in at the right places, with the right programming logic, with the right effects and compressors, etc, etc - and they can hit the Export button to "compile" all the sounds and logic into one flat audio file.

A game file is pretty much the same. On a disc/download, you've got the very purest "output" of the game's content. Just the engine logic required to run, just the assets required to show, just the script/logic files required to make it happen - all mixed down into a few assets and a flat "executable" for the game's running.

But on the developer's computer? A mass of other files - the full game engine, many versions and iterations of assets, ten times more script files used to pull everything together.

So the fact they've managed to reverse-engineer a "decompile" of SM64 basically means they, for the first time ever, have managed to get SOME version of what was on the developer's computer at the time of final compile. They can pull apart all the parts and scripts and really look at them straight-on, without any obfuscation. Basically they've opened the black box that is the game's software. This will let them dig into stuff better - and more importantly (yet more dangerously) they can recompile/reconfigure the game basically however they want. With enough porting work, this could include native builds (.exe files and the like) for other machines, the ability to tweak/adjust any systems and assets, etc, etc.

For Nintendo - pretty terrifying.
Great explanation. I'm a total noob on this stuff but I find it fascinating regardless. It's amazing what a dedicated and smart fan can do nowadays.
 

God_Of_Phwoar

Member
May 29, 2018
2,107
Near London UK
Am I right in thinking that emulation allows you to upscale, change the palette, add filters, add hacks such as infinite lives... But you can only make MODS when you get access to the source code?
 

Skyfireblaze

Member
Oct 25, 2017
3,603
For everyone asking about mods and how this makes things easier, I'm no developer myself but from my understanding, see it like this. Imagine SM64 as a cake where you don't know the ingredients or the recipe. Now before this you could still alter the cake by sprinkling stuff on it or covering it in chocolate, you could also go ahead and cut the cake and rearrange it, but at the base level it always stayed the same cake and you couldn't change anything about that.

Now what happened is, the people responsible for that basically tried various ingredients and recipes, baking things over and over trying different things till they finally got something that resembles the original SM64 cake made by Nintendo perfectly. They might have done some things a little differently than Nintendo, like maybe adding milk first while Nintendo added the milk last, but at the end of the day the output is 100% identical. And since we now know the ingredients and the recipe we need to arrive at an SM64 cake, we can now also alter the ingredients and the recipe at a base level, allowing for much more intricate things.
 

Ahnez

Member
Oct 27, 2017
565
This is amazing lol
As someone who knows only very basic assembly, I can't even imagine how much work was put into this
 

Krejlooc

Dreamcast Porno Party
Banned
Oct 27, 2017
16,526
This is quite unfinished; many of the routines still have machine-disassembled symbols for them, for example:

extern s32 func_80252F98(struct MarioState *, u32, u32, s16);
extern s32 func_80252FEC(struct MarioState *);
extern s32 func_802530D4(struct MarioState *);
extern s32 func_802531B8(struct MarioState *);
extern s32 func_8025325C(struct MarioState *);
funny though, there are references to gLuigiObject in the code lol
 

Garlic

Member
Oct 28, 2017
725
This is cool, but I do wish some other games could get like 1/1000th of the attention that Mario 64 gets
 

MP!

Member
Oct 30, 2017
2,499
Las Vegas
time to port it to PS1

Someone do this

also reverse engineer FFVII and port it to N64 and maybe we can unify the split timelines that happened when cloud was put in smash
 

parski

Member
Nov 13, 2017
28
I started playing around with porting it to Switch yesterday but those unmapped identifiers were too frustrating. Might take another stab at it when (if) they release a finished source code. I wanna see how it runs and study the engine at runtime to see if everything is tied to the runloop interval like in Ocarina of Time (since the engines are related) or if it can allow for a fairly simple 60 fps.

Ideally I’d like to:
  • Port to Switch
  • 720p60
  • Widescreen
Fixing the camera would be cool too but I wouldn’t even know where to start. I just hope the team didn’t lose their motivation due to the leak.
 

UltraJay

Avenger
Oct 25, 2017
723
Cool stuff, but I would want to see the real code with the comments in it.

The lead dev quit the industry after the game went gold due to the stress of it, his code comments are probably amazing.

"Miyamotosan said he wants the baby penguin to be throwable off the cliff"

"@todo add Luigi select in here"
They recreated source code. They didn't get the original. Even if the process was a 1 to 1 reversing, comments wouldn't exist as they don't execute and the compiler just skips those lines. Source code is only used to generate machine code. It isn't "hidden" in the game's code or anything.
 

elzeus

Member
Oct 30, 2017
1,315
Raytraced sm64 would be bonkers. It's a shame Nintendo doesn't go for power anymore...
 

Krejlooc

Dreamcast Porno Party
Banned
Oct 27, 2017
16,526
Just out of curiosity, if this were the case, how easy would it be to modify the code to have it run at 60 fps or unlocked fps?
it is, you can see the main thread loop here:

/sm64/src/game/main.c:

Super Mario 64 has an operating system inside to simulate handling multiple threads running at once. The main game begins by creating a thread here:

Code:
    create_thread(&gIdleThread, 1, thread1_idle, NULL, gIdleThreadStack + 0x800, 100);
    osStartThread(&gIdleThread);
thread1_idle is a method that initializes the hardware:

Code:
static void thread1_idle(UNUSED void *arg)
{
#if VERSION_US
    s32 sp24 = osTvType;
#endif

    osCreateViManager(OS_PRIORITY_VIMGR);
#if VERSION_US
    if (sp24 == TV_TYPE_NTSC)
        osViSetMode(&osViModeTable[OS_VI_NTSC_LAN1]);
    else
        osViSetMode(&osViModeTable[OS_VI_PAL_LAN1]);
#else
    osViSetMode(&osViModeTable[OS_VI_NTSC_LAN1]);
#endif
    osViBlack(TRUE);
    osViSetSpecialFeatures(OS_VI_DITHER_FILTER_ON);
    osViSetSpecialFeatures(OS_VI_GAMMA_OFF);
    osCreatePiManager(OS_PRIORITY_PIMGR, &gPIMesgQueue, gPIMesgBuf, ARRAY_COUNT(gPIMesgBuf));
    create_thread(&gMainThread, 3, thread3_main, NULL, gThread3Stack + 0x2000, 100);
    if (D_8032C650 == 0)
        osStartThread(&gMainThread);
    osSetThreadPri(NULL, 0);

    // halt
    while (1)
        ;
}
This sets up the video drawing mode, but the most important part is the creation of a main game thread:

Code:
    create_thread(&gMainThread, 3, thread3_main, NULL, gThread3Stack + 0x2000, 100);
    if (D_8032C650 == 0)
        osStartThread(&gMainThread);
Peeking at the function static void thread3_main(UNUSED void *arg) reveals the main system loop:

Code:
    while (1)
    {
        OSMesg msg;

        osRecvMesg(&gIntrMesgQueue, &msg, OS_MESG_BLOCK);
        switch ((u32)msg)
        {
        case MESG_VI_VBLANK:
            handle_vblank();
            break;
        case MESG_SP_COMPLETE:
            handle_sp_complete();
            break;
        case MESG_DP_COMPLETE:
            handle_dp_complete();
            break;
        case MESG_START_GFX_SPTASK:
            start_gfx_sptask();
            break;
        case MESG_NMI_REQUEST:
            handle_nmi_request();
            break;
        }
        Dummy802461DC();
    }
From a cursory glance, Super Mario 64 doesn't offer any kind of variable timestep in this loop; it just runs over and over again, handling each event as it arrives. This is pretty easily confirmed on a real N64 by using, say, a GameShark and spawning too many objects in one level. The game slows down to process them, because each internal tick is taking longer to finish. The operating system that controls all the modules of the engine communicates through a message queue, and the loop blocks on that queue (note the OS_MESG_BLOCK passed to osRecvMesg) until some system sends a message, so different events get handled asynchronously, it seems. There are semaphore implementations in the code, so I assume the different subsystems communicating through this messaging system have a blocking scheme going on to prevent something crazy.

Regarding how easy or hard it'd be to futz with the framerate of something like this, after digging through it for a day, I'd imagine your very first step would be to create some sort of fixed timestep for the physics system as those look like they update at a fixed expected rate. A fixed timestep system is one that subdivides real time into "tics" to be processed. How it works is that you sample a time stamp at the beginning of a single full frame of your game, then at the end of that full frame, you sample a second time stamp. Then, you find the difference in time between the two stamps to see how long your frame took to create in real time. From that real time difference, you convert that to see how many fixed intervals have passed and update the physics by that many steps.
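
A minimal sketch of that time stamp bookkeeping, in plain C. The names here (StepClock, tics_due, TICK_US) are made up for illustration and are not from the SM64 source, and the tic length of roughly 30 Hz is only an assumed value:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names throughout; none of these come from the SM64 source. */
#define TICK_US 33333u /* assumed length of one simulation tic, in microseconds */

typedef struct {
    uint64_t prev_us;     /* time stamp sampled at the previous frame boundary */
    uint64_t leftover_us; /* elapsed time not yet consumed by a whole tic */
} StepClock;

/* Returns how many fixed tics to simulate for the frame that ends at now_us. */
static unsigned tics_due(StepClock *clk, uint64_t now_us)
{
    assert(now_us >= clk->prev_us); /* time stamps must not run backwards */

    /* Difference between the two stamps, plus whatever was left over before. */
    clk->leftover_us += now_us - clk->prev_us;
    clk->prev_us = now_us;

    /* Convert real time into whole tics and keep the remainder for later. */
    unsigned tics = (unsigned)(clk->leftover_us / TICK_US);
    clk->leftover_us %= TICK_US;
    return tics;
}
```

Each frame you'd call tics_due() with the current time and run the physics update that many times; the leftover carries over, so no real time gets lost between frames.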

For example, say I want my physics to update at 120 Hz, i.e. 120 times a second. That means each "tic" of my physics calculations is 1000/120 = 8.333 ms (1000 is used because there are 1000 ms in every second). Say my drawing takes place at 30 Hz (i.e. 30 times a second). That means each draw takes 33.333 ms. Say my physics calculations are simple: every physics "tic" I move a ball 1 pixel to the right. When I draw frame 1, the ball is at pixel 1. By the time I draw the next frame, 33.333 ms have passed in real time. In the time 33.333 ms takes, I could run (33.333/8.333) = 4 physics tics. So, for the next frame, I update my physics 4 times - the ball moves right 1 pixel on tic 1, then 1 pixel on tic 2, then 1 pixel on tic 3, then 1 pixel on tic 4. So in all, it's moved right by 4 pixels from one draw frame to the next.
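
The same arithmetic as a tiny C sketch. Ball, physics_tic and run_frames are hypothetical names, and it assumes the physics rate is a whole multiple of the draw rate, as in the 120 Hz / 30 Hz example:

```c
#include <assert.h>

enum { PHYSICS_HZ = 120, DRAW_HZ = 30 }; /* rates taken from the example */

typedef struct {
    int x; /* horizontal position in pixels */
} Ball;

/* One physics tic: move the ball 1 pixel to the right. */
static void physics_tic(Ball *b)
{
    b->x += 1;
}

/* Simulates `frames` drawn frames and returns the total number of tics run. */
static int run_frames(Ball *b, int frames)
{
    assert(PHYSICS_HZ % DRAW_HZ == 0); /* sketch only handles whole ratios */
    int tics_per_frame = PHYSICS_HZ / DRAW_HZ; /* 120 / 30 = 4 */
    int total = 0;

    for (int f = 0; f < frames; f++) {
        for (int t = 0; t < tics_per_frame; t++) {
            physics_tic(b);
            total++;
        }
        /* the frame would be drawn here, after its tics have run */
    }
    return total;
}
```

Starting the ball at pixel 1 and running one frame leaves it at pixel 5, i.e. 4 pixels further right.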

Doing this decouples time-specific functions: simulated time advances at a constant rate regardless of how long each update interval is. This works both ways with little effort. You can check against a time threshold and prevent a tic from happening if too small of a step occurred. Say your physics operate at 120 Hz, but you manage to get it drawing at 240 Hz. You'd want 1 physics update for every 2 draws in that case, which is mathematically simple to do. Using fixed timestep, you can decouple the time-sensitive parts of Mario 64, like its physics engine, from the stuff you want to operate at a variable interval, like the framerate.
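
One way to get the "1 physics update for every 2 draws" behaviour is to carry a remainder from draw to draw. A rough sketch with made-up names (RateDivider, tics_this_draw), not anything from the game's code:

```c
#include <assert.h>

/* Hypothetical helper: tracks the fraction of a physics tic owed so far. */
typedef struct {
    unsigned acc; /* accumulated "physics_hz units", always less than draw_hz */
} RateDivider;

/* Call once per drawn frame; returns how many physics tics to run now. */
static unsigned tics_this_draw(RateDivider *d, unsigned physics_hz, unsigned draw_hz)
{
    assert(draw_hz > 0); /* avoid dividing by zero */

    /* Every draw adds physics_hz/draw_hz of a tic, kept as an integer fraction. */
    d->acc += physics_hz;
    unsigned tics = d->acc / draw_hz;
    d->acc %= draw_hz;
    return tics;
}
```

With physics at 120 Hz and drawing at 240 Hz this returns 0, 1, 0, 1, ... on successive draws; with drawing at 30 Hz it returns 4 every time.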

This isn't quite a magic bullet though. If you have a high framerate, but physics or controller polling is still done at, say, a 15 hz interval, then the game will feel sluggish and floaty. To make every system in the game operate independently requires much, much more work. Several systems would have to operate at a variable timestep, which requires much more engineering than implementing fixed timestep.
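
For contrast, a variable timestep update scales each movement by the measured frame time instead of running whole tics. A hedged sketch (Body and step_variable are illustrative names only, not from the decompiled code); the extra engineering comes from time-dependent game logic needing this kind of scaling throughout:

```c
#include <assert.h>

typedef struct {
    float x;  /* position in pixels */
    float vx; /* velocity in pixels per second */
} Body;

/* Advance the body by however much real time the last frame took. */
static void step_variable(Body *b, float dt_seconds)
{
    assert(dt_seconds >= 0.0f); /* a frame cannot take negative time */
    b->x += b->vx * dt_seconds;
}
```

A body moving at 120 pixels per second covers about 4 pixels in a 1/30 s frame and about 2 pixels in a 1/60 s frame, so the motion looks the same at either framerate.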

Also, as I mentioned in an earlier post, this is really, really incomplete. OP kind of jumped the gun here; I'd say a good 30-40% of the code is still indecipherable machine translation. So there could be parts of the engine that are timestep dependent that one might not even see or notice, because their function is obscured by auto-generated symbol names. Only the annotated parts of the source code are discernible; the parts that aren't are basically gibberish.

edit: checking out the code for the profiler, every system is run in its own thread - audio, video, etc. The profiler is already set up for fixed timestep in the way it profiles these systems, so something like I describe for the main game loop is totally doable.
 

Blackpuppy

Member
Oct 28, 2017
1,476
^^ thanks for the detailed post! I’m going to have to re read it a couple of times to fully digest it.

If I understood correctly, an uncapped frame rate might be difficult BUT.. what if you were satisfied with just getting to 60fps... couldn’t you just go through the code and double all the tick rates and polling? (30 Hz to 60 Hz, 15 Hz to 30 Hz etc)
 

Krejlooc

Dreamcast Porno Party
Banned
Oct 27, 2017
16,526
If I understood correctly, an uncapped frame rate might be difficult BUT.. what if you were satisfied with just getting to 60fps... couldn’t you just go through the code and double all the tick rates and polling? (30 Hz to 60 Hz, 15 Hz to 30 Hz etc)
Nope, what I'm saying is that your first step in decoupling framerate is to get the rest of the game operating at consistent rates using a programming pattern that turns calculations into small steps that can be simulated more or fewer times per cycle to account for variances in time. From what I see, Mario 64 could accommodate this. The number of simulations per cycle is derived mathematically; I used simple, easily divisible figures to make the math more obvious.
 

Blackpuppy

Member
Oct 28, 2017
1,476
Nope, what I'm saying is that your first step in decoupling framerate is to get the rest of the game operating at consistent rates using a programming pattern that turns calculations into small steps that can be simulated more or fewer times per cycle to account for variances in time. From what I see, Mario 64 could accommodate this. The number of simulations per cycle is derived mathematically; I used simple, easily divisible figures to make the math more obvious.
Ahhh. Gotcha.

Edit: I just saw your edit above! Thanks for the detailed look. I have a real layman’s knowledge of programming so I appreciate you taking the time to explain this in depth.
 

Ex-Psych

Member
Oct 25, 2017
746
This is fascinating to see.

That youtube video definitely gave me a nice idea of what to expect.
 

JahIthBer

Member
Jan 27, 2018
3,429
Hopefully we get a decompiled Zelda 64 as well one day, so many things you could do with that. Mario 64's source code is more of a preservation thing, since the rom hacks for it are pretty advanced & it even has its own model editor/importer. The best I could see for this is ports, but probably an Android port on the Play Store before a PC version with 144fps & improved controls.
 

Burbank

Member
Sep 9, 2018
281
Yes, it means someone can make a native PC port with updated visual effects, modern controller support and full mod support. No more fiddling with an emulator.
Now that would ruffle some feathers over at Nintendo HQ. This game could look nice ray-traced and I wanna see a simple playable 3D Mario-Maker mod.

Are similar attempts being made for any other N64 games?
 

JahIthBer

Member
Jan 27, 2018
3,429
Now that would ruffle some feathers over at Nintendo HQ.

Are similar attempts being made for any other N64 games?
Metroid Prime & the older Pokemon games. Generally Nintendo will ignore it if it's not popular, but let's say Mario 64 gets a PC port & some popular speedrunner on YouTube makes a video about it & it goes viral, then yeah, Cease & Desist time. I guarantee this is going to happen.
 

Filipus

Avenger
Dec 7, 2017
1,281
What a good writeup. It's so important to understand how this works, so people see why many times you can't just pick old games and "Run them at 60fps".

I remember this was a problem with the old Halo games when they were porting them to the MCC, since increasing the fps would mess up the physics.