Friday, January 28, 2011

More N64 emulator updates

Over the last few weeks, I worked a bit on improving my mupen64 port. Here are the things I did.

As SM64 started to work well, I switched to the Zelda ROM for my testing. It is a much more complex game to emulate graphics-wise, so obviously it was completely buggy and painfully slow on the first run. The biggest problem was that, unlike SM64, most of the rendering was done with my slowest pixel shader, and fixing the bugs would have made it even slower, so I decided to do a complete rewrite of the shader. This time I designed it around something I had just discovered: constant boolean registers, which allow flow control without a big performance hit. It took me a few tries to get it fast and accurate, but now that pixel shader is almost as fast as the old one on simple cases, while being more accurate and much faster on complex cases. I also made my old shader as accurate as I could; it is now used for some rare cases the new one can't emulate. With a few more fixes to libxenon and the emulator (implementing 2D rendering, for example), this makes Zelda reasonably fast and playable.

The next game was Mario Kart 64. This time it was fast and looked good on the first try, but it crashed after a few races with an 'out of memory' message. It turns out something very important was missing from the libxenon 3D driver: a way to free what you allocate (textures, vertex buffers, ...)! So I replaced the very basic GFX memory allocator with a malloc-like one I found in the libxenon source code, and modified the emulator's texture cache to actually free old textures when needed.
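To make the "free old textures when needed" idea concrete, here is a minimal sketch of a texture cache that evicts its least recently used entry once a memory budget is exceeded. It is not the emulator's actual code: `TextureCache`, `cache_get` and the other names are hypothetical, and plain `malloc`/`free` stand in for the GPU memory allocator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define CACHE_SLOTS 4

/* Hypothetical cache entry; malloc'd memory stands in for GPU memory. */
typedef struct {
    unsigned key;       /* identifies the source texture (e.g. a hash) */
    void    *data;      /* NULL means the slot is empty */
    size_t   size;
    unsigned last_used; /* frame number of the most recent use */
} CachedTexture;

typedef struct {
    CachedTexture slots[CACHE_SLOTS];
    size_t used_bytes;
    size_t budget_bytes;
} TextureCache;

void cache_init(TextureCache *c, size_t budget_bytes) {
    for (int i = 0; i < CACHE_SLOTS; i++)
        c->slots[i] = (CachedTexture){0};
    c->used_bytes = 0;
    c->budget_bytes = budget_bytes;
}

/* Frees the least recently used texture. Returns 0 if the cache is empty. */
static int cache_evict_oldest(TextureCache *c) {
    CachedTexture *oldest = NULL;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        CachedTexture *t = &c->slots[i];
        if (t->data && (!oldest || t->last_used < oldest->last_used))
            oldest = t;
    }
    if (!oldest)
        return 0;
    free(oldest->data);
    c->used_bytes -= oldest->size;
    *oldest = (CachedTexture){0};
    return 1;
}

/* Returns the texture for `key`, allocating it (and evicting) if needed. */
void *cache_get(TextureCache *c, unsigned key, size_t size, unsigned frame) {
    assert(size > 0 && size <= c->budget_bytes);
    for (int i = 0; i < CACHE_SLOTS; i++) {
        CachedTexture *t = &c->slots[i];
        if (t->data && t->key == key) {   /* cache hit: refresh its age */
            t->last_used = frame;
            return t->data;
        }
    }
    for (;;) {                            /* cache miss: make room first */
        CachedTexture *slot = NULL;
        for (int i = 0; i < CACHE_SLOTS; i++)
            if (!c->slots[i].data) { slot = &c->slots[i]; break; }
        if (slot && c->used_bytes + size <= c->budget_bytes) {
            slot->data = malloc(size);
            if (!slot->data)
                return NULL;
            slot->key = key;
            slot->size = size;
            slot->last_used = frame;
            c->used_bytes += size;
            return slot->data;
        }
        if (!cache_evict_oldest(c))
            return NULL;
    }
}
```

With a 100-byte budget, requesting two 60-byte textures in a row evicts the first one instead of running out of memory, which is the behavior the old allocator could not provide.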

I think it's time for a new video so here it is :)

Sunday, January 9, 2011

Xenos

As you probably already know, I added support for Jasper EDRAM init in libxenon not long ago.

Next was adding support for more display resolutions. To add one in libxenon, you basically need 2 things: an ana chip register dump and a Xenos GPU register dump. From the GPU dump, you can guess the video timings. You then add those timings to libxenon along with the ana dump, and I thought that was supposed to be enough.
It turns out it wasn't that easy: you have to edit the ana dumps a bit before they work in libxenon. I had to do some reverse engineering on the official kernel to find out exactly how.
Anyway, I was able to add 1280x720, 1440x900 and 1280x1024 (all VGA). 1440x900 is rendered as 1440x896 because EDRAM isn't big enough to render it. This is the highest resolution I can add for now because, unlike the official games, libxenon doesn't render 3D at 1280x720 and then scale it up; instead it renders at the native resolution directly.
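One way the 1440x896 limit could come about is sketched below. The numbers are assumptions, not taken from the post: 10 MiB of EDRAM, a 32-bit color buffer plus a 32-bit depth buffer (8 bytes per pixel), and surfaces padded to 80x16-pixel tiles. Without the padding assumption, 1440x900 at 8 bytes per pixel (10,368,000 bytes) would actually fit, so the tile rounding is what makes the difference in this model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* All constants below are assumptions made for this sketch. */
#define EDRAM_BYTES     (10u * 1024u * 1024u) /* 10 MiB of EDRAM */
#define TILE_W          80u                   /* assumed tile width in pixels */
#define TILE_H          16u                   /* assumed tile height in pixels */
#define BYTES_PER_PIXEL 8u                    /* 32-bit color + 32-bit depth */

/* Rounds v up to the next multiple of m. */
static unsigned round_up(unsigned v, unsigned m) {
    return (v + m - 1) / m * m;
}

/* Bytes needed for a tile-padded color + depth surface. */
size_t edram_bytes_needed(unsigned width, unsigned height) {
    assert(width > 0 && height > 0);
    return (size_t)round_up(width, TILE_W) * round_up(height, TILE_H)
           * BYTES_PER_PIXEL;
}

bool fits_in_edram(unsigned width, unsigned height) {
    return edram_bytes_needed(width, height) <= EDRAM_BYTES;
}
```

Under these assumptions 900 lines are padded to 912, which needs 10,506,240 bytes and overflows the 10,485,760 available, while 896 lines need 10,321,920 bytes and fit. 1280x1024 comes out at exactly 10,485,760 bytes.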

Next was adding HDMI support. I quickly had an idea for this: logging I2C/SMBus and GPIO accesses inside an official kernel (I thought those were the 3 ways of talking to the HDMI hardware). I need to thank cOz for doing the kernel patching and logging. From those logs I was able to guess which registers I needed to write to activate HDMI output.

So we now have Jasper support, HDMI, and more available resolutions in libxenon :)

Now I think I'll work on updating Xell a bit, so my next post will probably be about it.

Friday, January 7, 2011

N64 Emulator updates

I have already made quite a few improvements to my mupen64 port since the video.

About half an hour after making it, I fixed almost all the missing screens/skies and the clipped Mario by adjusting the Z buffer range (the previous range was 0..1, now it is -1..1).

I then improved the N64 gfx combiner emulation (combiners are sort of early, primitive pixel shaders). I use 360 pixel shaders to emulate them. At first it was really slow because I used many switches and loops, and it seems doing that in a pixel shader isn't such a good idea. I got everything back to playable speeds by using mainly 3 techniques:
  • a color lookup table to emulate the combiner 'source' (i.e. vertex color/texture color/constant/...)
  • a math formula that handles all the possible cases for the combiner operation (i.e. mul/add/sub/...)
  • having different pixel shaders (one fast that can only do simple things, one intermediate, and one slow that emulates everything) and switching between them when needed.
So now gfx emulation is quite fast, but the emu still ran slowly when it emulated floating point intensive scenes like the Mario head demo at the beginning of SM64. So I started looking at how mupen64 emulates floating point operations, and I quickly discovered that the whole floating point unit was running in interpreter mode, oops!
A few #defines later, the emu was up to 50% faster.
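The first two techniques can be illustrated on the CPU with a small C sketch. It relies on the N64 color combiner equation (A - B) * C + D, where each of the four inputs is picked from a table of sources, so no switch on the operation is needed. The `SRC_*` selectors and the `combine` function are hypothetical names for this sketch, and the real shader code is certainly organized differently.

```c
#include <assert.h>

typedef struct { float r, g, b; } Color;

/* Hypothetical source selectors; the real combiner has more inputs. */
enum { SRC_ZERO, SRC_ONE, SRC_TEXEL, SRC_SHADE, SRC_PRIM, SRC_COUNT };

/* Which source feeds each of the four combiner inputs. */
typedef struct { int a, b, c, d; } CombinerMode;

/* One formula covers mul/add/sub/lerp: result = (A - B) * C + D.
   The inputs are fetched from a lookup table instead of a switch. */
Color combine(const Color lut[SRC_COUNT], CombinerMode m) {
    assert(m.a >= 0 && m.a < SRC_COUNT && m.b >= 0 && m.b < SRC_COUNT);
    assert(m.c >= 0 && m.c < SRC_COUNT && m.d >= 0 && m.d < SRC_COUNT);
    Color A = lut[m.a], B = lut[m.b], C = lut[m.c], D = lut[m.d];
    Color out = {
        (A.r - B.r) * C.r + D.r,
        (A.g - B.g) * C.g + D.g,
        (A.b - B.b) * C.b + D.b
    };
    return out;
}
```

For example, "texture times vertex color" is the mode {SRC_TEXEL, SRC_ZERO, SRC_SHADE, SRC_ZERO}, and "texture plus primitive color" is {SRC_TEXEL, SRC_ZERO, SRC_ONE, SRC_PRIM}; only the table indices change, not the code path.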

Next I had an idea: why not try to get the X360 GPU to render my current frame in background instead of actively waiting for it to finish rendering.
Usually you do this:

/* resolve (and clear) */
Xe_Resolve(xe);
/* wait for render finish */
Xe_Sync(xe);

Now I do this:

/* resolve (and clear) */
Xe_Resolve(xe);
/* begin rendering in background */
Xe_Execute(xe);

and then I call Xe_Sync() at the last moment, right before beginning my next frame.
I got a huge speed boost with this: Super Mario 64 now runs at around 100fps in-game!
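Why deferring the sync helps can be seen with an idealized timing model. This is only a sketch: the function names are hypothetical, the per-frame CPU and GPU times are inputs you would have to measure, and real frames also pay for synchronization and contention that the model ignores.

```c
#include <assert.h>

/* Resolve then sync immediately: the CPU idles while the GPU renders,
   so the two costs add up. */
double frame_ms_blocking(double cpu_ms, double gpu_ms) {
    assert(cpu_ms >= 0.0 && gpu_ms >= 0.0);
    return cpu_ms + gpu_ms;
}

/* Resolve, execute in the background, sync just before the next frame:
   the CPU emulates frame N+1 while the GPU renders frame N, so the
   slower of the two sets the pace. */
double frame_ms_overlapped(double cpu_ms, double gpu_ms) {
    assert(cpu_ms >= 0.0 && gpu_ms >= 0.0);
    return cpu_ms > gpu_ms ? cpu_ms : gpu_ms;
}
```

With made-up costs of 6 ms of CPU work and 4 ms of GPU work per frame, the blocking version takes 10 ms per frame and the overlapped version 6 ms.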

Hello

Welcome to my 360 blog.

Obviously it isn't about my (boring) life. Instead, I will try to post updates about what I work on for the Xbox 360, so it will probably be about open source libraries, emulators, reverse engineering and hacking.

Just to make it clear, my end goal is to revive the free/legal homebrew scene on the 360, because it's the only next-gen console without a proper one. It's a shame that even the PS3, which has been hacked only recently, has a bigger one.