Tuesday, July 29, 2014

Endianness Fixes

I had some time this weekend where I wasn't at home able to work on ResidualVM. This provided an excellent opportunity to take a look at some of the issues that ResidualVM has on big-endian architectures like my PowerBook G4. I'm currently running Gentoo Linux on my Powerbook, and was able to play other OpenGL games without issue and Grim appears to work fairly well in ResidualVM. However, EMI is a mess!

The LucasArts intro has the wrong colors!
Even worse, the set wed has incorrect colors and alpha channel issues!
As can be seen above, the colors for many objects appear to be swapped from the intended colors. Let's investigate these issues, starting with the intro screen.

When debugging graphics issues like the intro screen, I often examine the colors between the working, little-endian version big-endian version to see if there the colors are easily identified as being swapped. Unfortunately, this wasn't immediately obvious as being the case. The colors were completely different:
No direct byte swap will make the right turn into the left...
However, I then checked and found that when loading bitmaps, two formats are present in ResidualVM, a packed 1555 format and a 32 bit format. With this information, it seems that the data was specifying an alpha channel, which was modifying the color components that were swapped. After some experimentation, I found that the native endianness is the preferred byte order for the Escape from Monkey Island binary format containing a bitmap image, a .til file. This worked well for 32-bit .til files, as found in the Windows version of EMI. However, the PS2 version uses 16-bit .til files. I found that it was easiest to convert the 16-bit input to a 32-bit output and doing the bit manipulation when the image was loaded.

With the .til files fixed and the intro displaying the correct colors, let's re-examine the set wed, to see if that was effected by the change:
That's a whole lot better!
While the output is now a whole lot closer, there's still a bunch of issues to be addressed. Let's focus on the background first. In the background, the enemy pirate ship is bouncing in the waves. This effect is actually achieved by an Actor object rotating and moving up and down on a timer. The Actor is drawn with a Sprite object, which are drawn using an image from a .tga file. After some inspection, I found that the desired behavior is to again read from the file in the native format. Another tweak was required to change the display format from BGR to RGB and this problem was solved. Here's the output now:
Getting closer!
Of note, this change also fixed the backgrounds for dialog boxes and other components that use .tga files. The next most obvious issue is that Guybrush and the brazier are both rendered as completely black with no color for the textures. I noticed that in the Act 1 title card, the actors were actually visible, although dim. This gave me the hunch that the lighting was to blame for the dark actors. To test this, I tried commenting out the lighting code when drawing the actor and found that the texture was drawn mostly correctly, albeit without lighting:
With lighting commented out
I then inspected the values used to dim Guybrush and found that the in the lighting object, the x, y, and z values were all NaN, causing the output to be black. As most endian issues appear when using data from the other endian systems, I looked at the loadBinary method for lighting and found that the floats used by the lighting were read in directly, rather than being interpreted by get_float() which performs the endian swap if needed. After adjusting both the Light and Shadow loadBinary methods, I got the correct output for the set wed as seen below:
Looks like it's all working correctly now!
The only outstanding issue right now on big-endian machines is that the 1555 ARGB color for the video frame isn't unpacked correctly. I suspect that an approach similar to how I fixed the 16 bit .tga files would work here as well, but I need to see if there's a better way to do this first. This work is in PR #974.

Tuesday, July 22, 2014

More on Text Positioning

Continued from the previous entry.

Let's take stock of the current state of the text positioning changes after the last blog post. First, let's look at the differences in the set wed when Guybrush says "There's an enemy pirate fighting over there.":

State of the text difference at the start of this post in the set wed
With the changes from before, we can see that there is less difference than when we started, but that the text for Guybrush's speech bubble is positioned to the right by 1 pixel and towards the bottom by 5 pixels. Likewise, the text for "Look at enemy pirate" is closer, but still not quite right. It appears to be 11 pixels too high, towards the top compared to the retail version.

Another place with a similar issue is the text at the top of the inventory. It appears that the text is drawn 12 pixels too high, resulting in this difference:
 Misplaced Inventory text
As an aside, the differences in object rotation in the inventory appear to be related to be related to the rotation and attachment work I've been blogging about. That work wasn't in this branch when I took the screen shot!

Anyway, there's definitely a problem with the placement of text on the Y axis, with a similar problem displayed in both sets. As it turns out, there's code that specifically moves the text in the Y direction by 12 pixels in ResidualVM. This code is found in textobject.cpp and appears to have been added with this commit. Commenting out this adjustment results in the text for the inventory being placed in the correct location and bringing the command line in the set wed within 1 pixel of the correct location. Let's take a new screen shot with this change:
Current text difference at this point
As can be seen, the action sentence at the bottom is very nearly correct and only off by one pixel. However, the speech bubble got worse. Now, the ResidualVM text is still 1 pixel to the right, but 17 pixels lower than it should be.

My first thought was that there was simply a blank line inserted below the text, but the font height for EMI was 26 pixels, not 17. I next checked the bounding boxes for the actors to see if they made sense by drawing a green line around Guybrush representing the bounding box whenever he spoke. Here's what the output of that was:
Guybrush's bounding box
However, in order to get the bounding box to draw properly, I had to subtract the Y components from 480. In order to make the code simpler, I removed the subtraction from the Y component when computing the bounding box from the bounding box calculation and re-ran the test. Here's the result after modifying the bounding box code:
The green line is the bounding box for Guybrush's actor
Although the bounding box looks a little worse (see: Guybrush's feet and hair), I checked the text position with this new bounding box and found that it was off by 26 pixels, which if you remember, was the height of the font. Subtracting 1 from the number of lines that the text was pushed up by gets the fonts for this scene rendered very closely to the original:
Very close now!
We can see that the fonts are still off by 1, but for now, this is close enough! So, let's check to see if there are problems in other scenes. Unfortunately, there are:
The text is in the wrong place again!
Here, you can see that Guybrush's speech is rendered at the wrong place in the set gmi.

As a check, I also tried this branch with Grim Fandango and found that it too had fonts drawn too low, with text 13 pixels too low and offset by 2 pixels to the right for the top line and 1 for the bottom line:

Grim also has differences in the font location
As an aside, bgK also tested this with the French version of Grim, which was reported to be problematic with using the kerned font sizes. I'm happy to say that the crashing issues from that appear to be resolved!

In the next blog post, we'll take a look at these two issues and see if we can resolve them.

Monday, July 21, 2014

Making the Correct Rotations Work

Continued from the previous entry.

In the previous posts, I've managed to get the relative position and rotation values to match the ones from the retail version of EMI. In this post, I'm going to discuss how to present these rotations on the screen, fixing the messed up image we left off with in the last post:
Where we left off...
Let's start by taking a look at the apitrace output of this scene again. I started by running the retail version of the game with the options "-w -gl" to start in a windowed mode using OpenGL. After capturing the apitrace of jumping ahead in the script and moving to the set pph using the jump target "160",  I then loaded the trace into qapitrace. First, I had the trace build thumbnails of all of the frames so that I could identify the first frame that drew the scene pictured above. With the scene identified, I then looked through the data captured for this frame to inspect the rotation matrices that were responsible for rotating and placing the rendered objects.

I decided to focus on the trap first because it is a constant object in the scene and isn't moving. To find the part of the trace responsible for drawing the trap, I browsed through the API calls, checking the Surfaces tab to see if the trap had been rendered yet:
APITrace output, examining the drawing of the trap
With this section of trace found, I could then inspect the OpenGL state (in the Parameters tab) when this was rendered, providing us with the transforms used to place the object in the scene. With these transforms in hand, I then proceeded back to the ResidualVM code and checked the matrices against the ones I found in the trace. From this analysis, it appears that the Projection Matrix is correct, but the ModelView Matrix is incorrect. Let's take a look at what ResidualVM is doing to arrive at the ModelView Matrix for the model of the trap.

First, I inspected the value of the ModelView matrix at the start of drawing the trap and found that it was set to an identity matrix. This was then multiplied by another matrix using glMultMatrixf, then translated using glTranslatef, then multiplied once more, again with glMultMatrixf. In the retail version, the only command run is to load the matrix using glLoadMatrix, indicating that the matrix was probably prepared ahead of time or done in software. That said, with the correct result for the ModelView matrix, we can go back through the steps taken in ResidualVM and correct the problem.

So, where in the code do we find this sequence for setting the ModelView? In the OpenGL path (engines/grim/gfx_opengl.cpp), there's a function called startActorDraw which prepares the ModelView matrix for drawing the object. In the non-overworld case, it multiplies the current camera rotation, then translates with the camera position. Then, it takes the Actor's final matrix, transposes it and multiplies the ModelView by the result. There are a few likely possibilities for where this implementation is wrong:
  1. The camera rotation is wrong because we changed the Euler Order for the actor as part of this fix
  2. The actor's final rotation matrix is incorrect
Since I suspect that the final rotation matrix is correct because of the work we've done up to now, I'm going to first start with checking into the camera rotation. To begin, where does the camera rotation come from? In the same file, there's two functions that are responsible for setting up the camera, positionCamera and setupCamera. When the set is loaded, there are a number of parameters that define the camera for each setup in the set:
  • position - The point position of the camera
  • interest - The point that the camera is pointed at
  • roll - The roll of the camera
  • fov - The field of view of the camera
  • nclip - The near clipping plane of the viewing frustum
  • fclip - The far clipping plane of the viewing frustum
However, after some investigation, I found that the interest and roll parameters are really just quaternion components, unlike in Grim. The existing ResidualVM code was using them as Quaternion  components, but they were using the original Grim Fandango names for them.

Also, when creating a set with these parameters, the retail version of EMI creates an actor object that's kept as part of the setup to represent the camera. This allows it to keep track of the position and rotation information in the same way as the actors in the set.

Following the usage of these parameters in ResidualVM to the function setupCamera, I compared the calculated frustum to the apitrace results and found that ResidualVM was computing this correctly. So the issue was most likely related to the positionCamera function. So, how can we fix it?

I started by examining the retail version again in IDA Pro and found that there are a number of camera functions to retrieve the camera rotation and position. Unfortunately, these weren't yet functional in ResidualVM, so I added them as a part of this work. These functions are:
  • GetCameraYaw
  • GetCameraRoll
  • GetCameraPitch
  • PitchCamera
  • RollCamera
  • YawCamera
To implement this functionality, I decided to keep the rotation of the camera internally as a matrix, and calculate the Euler Angles as needed. When writing out the value for saved games, I use the Quaternion representation of this rotation so that it's compatible with the existing version. Why did I choose to keep the rotation as a matrix? In all of the graphical code that uses the camera rotation, it previously had to convert between a quaternion and a matrix each time the rotation was used. By keeping the rotation as a matrix, we avoid that overhead for the common case and only convert when reading or writing the setup rotation values for saved games. After checking these functions against the retail version, we see that the camera rotation values are placed and rotated properly.

Still, at this point, after all of this work, the graphics themselves haven't been fixed. We haven't change any of the output since the beginning of this post, just the representation in the code. So, how are the rotation matrices computed for display? First, here's the desired ModelView matrix for the trap, extracted with apitrace:

Crawdad Trap: Retail ModelView Matrix
-0.516999 -0.234287 0.823299 0.0
-0.0168343 0.964411 0.263873 0.0
0.85582 -0.122562 0.502544 0.0
-2.60927 -1.20781 -19.2888 1.0

What can we see just from this inspection? Well, we can see that there's a 3x3 rotation matrix in the upper right corner, and a 1x3 position vector in the last row. Let's compare this with the matrix from ResidualVM:

Crawdad Trap: ResidualVM ModelView Matrix
-0.542569 0.212863 -0.812594 0.0
-0.016834 0.964411 0.263873 0.0
-0.839843 -0.156848 0.519676 0.0
-2.60927 -1.20781 -19.2888 1.0

First, we can see that the ResidualVM position agrees with the position from the Retail version of EMI. We can also see that the middle row of the rotation matrix matches, but the others are slightly off. What might be causing this problem? Well, if we inspect the rotation for the trap, we can see that only the Yaw parameter is rotated, which corresponds to a matrix where the middle row is an identity. This explains why the middle row matches! Now, how about the other rows?

Let's start by going back to startActorDraw in gfx_opengl.cpp. I first checked to see what the current value of the ModelView matrix was before this function acted on it and found that the matrix was an identity, but the 3rd row was multiplied by -1. This was caused by the previous call to glScalef(1.0, 1.0, -1.0). I also checked the stored rotation for the trap's actor and compared that with ResidualVM and found that both ResidualVM and the retail version had the same values. With this information in hand, I then tried changing the order of the operations, as a rotation followed by a translation might not have the same result as the opposite operation. In the end, performing the rotation first, then computing the translation resulted in the correct rotation result. Here's the output now:
Looks pretty good?
Awesome! It looks like it's working properly again! How about the pole after detaching? What are the differences remaining in this scene?
I knew something was still missing! :(

At first glance, it looks fine, but a glaring difference remains: the pole! It's completely missing in ResidualVM. It appears that there's still some users of getFinalMatrix which seems to be producing an incorrect result. In the next post, we'll dig into those and see if we can fix those rotations as well.

As an aside, one interesting observation that can be made from the apitrace results is that ResidualVM makes a lot more OpenGL calls to produce the same screen. It appears that in the original version, instead of always drawing each vertex as ResidualVM does, the game instead sometimes passes a pointer to a list of vertices and uses the following sequence to draw the object:

     glEnable(GL_CULL_FACE)
     glFrontFace(GL_CW)
     glBindTexture
     glTexCoordPointer
     glColorPointer
     glVertexPointer
     glDrawElements 

This indicates that the original engine passes these coordinates in a list rather than individually. It would probably be a good idea to make the same optimization in ResidualVM as well because as of right now, the original takes ~5k calls to draw the set pph, while ResidualVM takes ~58k. Maybe a later project? We'll see!

Tuesday, July 15, 2014

Positioning textObjects

There are a lot of differences with the positioning of textObjects between the retail version of EMI and ResdiualVM's implementation. Let's take the scene wed and compare the text positions. First, here's the scene in the retail version: 
Retail screen shot from the set wed
And here's the same scene in ResidualVM:
ResidualVM screen shot from the set wed
And finally, here's the difference between the two:
Difference between the two images
We can ignore the ship in the background, since it's moving, and I didn't capture it in exactly the same place. We also can ignore the shadows, Akz has already addressed that problem. That leaves us with the text errors, exactly what we're trying to work out!

While it's obvious to the eye that the text "There's an enemy pirate over there." is in the wrong place and has a different wrapping point, there's also a mismatch between where the text is placed for "Look at enemy pirate".

Additionally, in some cut scenes, with subtitles on, the text is placed in the wrong location. A good example of this problem is the scene during travel to Jambalaya Island, where the exclamations of the crew are drawn right on top of each other! Another common issue is that the text is placed in the wrong place, obscuring the game play. A good example of this is when reading the directions to get to Pegnose Pete's house.

In ResidualVM, text elements are drawn using a textObject. In the Lua scripts, textObjects are created by calling the Lua/C function MakeTextObject. The parameters for the textObject are the string that's going to be displayed and a parameters table, with various options for configuring the text. We can remove this textObject from the screen by calling KillTextObject and passing the handle that was returned by MakeTextObject.  Let's try drawing some text with the retail version of EMI and observe the behavior. Let's start by just writing some text to the 0, 0 coordinate position:

textObj = MakeTextObject("Hello Blog Readers!", { x = 0.0, y = 0.0, font = verb_font, fgcolor = White })

Here's the output of running that command:
There's the text!
An interesting aspect to the positioning of the text is that it appears to be using a coordinate system with a basis in the center of the screen. Next, lets try exploring the extents of the text placement. Do negative values work? (Yes!) What is the highest and lowest X and Y value for placing text that still appears on the screen? (It appears that -1, -1 is the lower left corner and 1,1 is the upper right corner). Is this mechanism the only way text is added to the screen? (Nope, there's also SayLine, PrintLine and other methods, so when debugging, the method used will be important).

With this information in hand, I started by rebasing Botje's changes (to allow the X and Y values to be passed as floats) against the current master. These changes, made the text appear more correct for scenes where the X and Y values are passed from the Lua, such as the directions to Pegnose Pete's House, but this change didn't effect most of the erroneous text placement.

I then inspected the SayLine Lua/C function and found that it sets a variable when X and Y aren't passed in from the Lua script, which has the actor set the position relative to the actor's bounding box. Simply using the other side of the bounding box for the Y position moves the actor's speech to a much more correct location:
After the bounding box side switch
The next task I tackled was the incorrect word wrapping, as seen in the speech bubble, "There's an enemy pirate fighting over there.". Although the ResidualVM version actually looks a little bit better than the retail version here, there are other scenes where the word wrapping is worse, so let's try to fix it!

The word wrapping is done in the function setupText, which is found in the textobject.cpp file. Inside this function, the line width is computed by adding each character width in the line together. Strangely, this uses the maximum value between the character width and something called the dataWidth. Without a comment, it's hard to tell what dataWidth actually means, so, I first started by identifying when this code was added to ResidualVM. This was a harder task than anticipated because the file was moved twice over the course of development. To trace its history, I first reset git to the first commit that moved the file and then, pushed the HEAD of the repository back one more commit. This restored the original version of the moved file and let me explore its history further. After one more file move, I found that this was the commit that added the dataWidth field. Unfortunately, there wasn't much information in the commit message either, so I decided to try to test some strings in the retail version and measure what the width of the displayed text in pixels.

So, it appears that the dataWidth is the actual width of the character, but the width includes adjustments to the spacing (or kerning) between the letters to make them look nice. As an example, let's look at a simple string, "Wa":

     MakeTextObject("Wa", { x = 0.0, y = 0.0, fgcolor = White))

Which results in:
Demonstrating the font kerning
As can be seen, the letter "a" actually overlaps with the letter W by two pixels. It appears that ResidualVM gets this right and does use the correct values for drawing the characters as this matches perfectly with the retail version. So, our issue is with determining when to word wrap, which depends on the width of the rendered text. Let's examine the width variables to figure out what combination results in the correct width being calculated.

For each font character, there's the dataWidth and width sizes, which of these is the actual character width and which is the kerned width? Here's the dataWidth and width values for "W" and "a":

Value       W     
width 18
dataWidth 21
Value a
width 7
dataWidth 9

So, it appears that the "width" value is the kerned value, while dataWidth is the actual size of the font bitmap image. Checking this against the image above shows that the dataWidth values match the pixel size of the letters exactly. So, if we add the dataWidth values together, our characters will be wider than what's actually displayed to the screen. We can fix this by always using the width value instead of the maximum of the two as is done now.

So, let's take a look at the result difference after this change, when Guybrush is speaking the line "There's an enemy pirate fighting over there.":
Getting closer!
We can see now that the lines are the same length now, and appear to be in the correct location on the X axis, but shifted in the Y axis. So, first, is there a common shift between all of these lines? Or in other words, are the lines shifted by a constant value? Unfortunately, it appears that this isn't the case. The top line of Guybrush's speech is 1 pixel too far to the right and 7 pixels too far towards the bottom of the screen, while the bottom line is offset by 1 pixel to the right and 13 pixels too far towards the bottom of the screen. The text "Look at enemy pirate" is also offset by 14 pixels too high towards the top of the screen.

Let's look into the difference in line spacing between the two lines for Guybrush's speech. It turns out that in ResidualVM, the height of the font is modified when loaded if the character is taller than the specified font height. As before, this is intentional because of kerning. We do want some of the characters to sink lower or rise higher than the rest because it's pleasing to the eye. Simply commenting the section of code that modifies the height results in the correct behavior, so I looked up why that work around was inserted in the first place, which led to this commit. So, it appears that there was a crash bug associated with the height. I suspect that the problem would be fixed if the renderer uses the dataHeight instead of the height to draw the text.

As this blog post is getting long and I need to check with the other developers about this code, we'll wrap it up here for now. I'll continue in the next blog post with fixing the remaining font issues. All of the changes in this post can be found in PR #964.

Monday, July 14, 2014

Fixing the Position

Continued from the previous entry.

So now, let's return to position! First, let's examine the output of each of the actors for getpos(), getworldpos() and getrot() in the retail version of EMI, after running the rotation script:

actor1 Output
getpos (1, 4, 5)
getworldpos (1, 4, 5)
getrot (5, 15, 25)
actor2 Output
getpos (0.854459, -6.7806, -5.41233)
getworldpos (3, -1, -2)
getrot (3.98862, 13.1276, 7.01816)
actor3 Output
getpos (-6.7911, 1.63313, -0.462462)
getworldpos (-3, -4, 2)
getrot (6.82211, -9.04857, -16.7055)

After a lot of debugging with IDA Pro, I discovered that the retail version actually maintains the computed or provided values for the Euler Angles, the same rotation as a Quaternion and a 3x3 Rotation Matrix. Additionally, it keeps the relative position (to the attached actor) and the world position. Instead of doing this in ResidualVM, these values are computed when needed.

It's important to note that there are functions in the retail version which keep these values synchronized, so the matrix, quaternion, or Euler Angle that you're inspecting might not be up to date. So, in the retail version, the status of each of the different rotation representations are stored in a bitfield which should be checked before assuming that a rotation value is correct. With those caveats, I then inspected the three actors' Quaternion and Matrix representations after attachment and compared them to the internal values in ResidualVM. For example, here is the 3x3 Rotation Matrix representation of the rotation of Actor 2 after attachment, taken from the retail version of EMI:

actor2: Retail Rotation Matrix
0.7370587 0.49632958 -0.45869532
-0.47202796 0.86379832 0.17618749
0.48366731 0.086646488 0.87095153

So, how did I fix the rotation? After each change to the rotation, I added breakpoints to check the Rotation Matrix value and changed ResidualVM to match these intermediate values. One major structural change that I made was to separate the generation of a rotation matrix from the generation of the final matrix (including the actor's position for translation). As these operations must be applied in the same order, it was better to do them separately.

Now, with a working rotation implementation, we can finally fix the position. As discussed before, getworldpos() should reflect the original location of the actor before attachment. To find the world position, we take the relative positions of each of the attached actors and combine them to find the actual position coordinates.

Finally, what are the position coordinates supposed to represent? It's the actor's relative position to the parent actor when attached, or the global position, when unattached. With correct rotation, the previous math that implemented the position actually matches with the retail version and no further work was required!

In addition to attaching, detaching doesn't work quite right in the current code either. When detached, in ResidualVM, an actor simply reverts to a rotation of (0,0,0) and doesn't have the correct position, instead of using the same position and rotation, but in a global basis. We can fix the rotation problem by simply getting the rotation quaternion for this actor, which already sums up all of the combined rotation of all of the actors and retrieving the Euler Angles from this quaternion. This fixes the problem and matches the retail version!

All of the the above changes can be found in the updated PR #948.

To test this implementation, klusark created a lua script that iterates over a bunch of angles and writes out the result to a file. I added the script to the local.m4b file and ran it on both the retail version and ResidualVM, then compared the two with another script that checks one against the other and ignores rounding errors. While there are some rounding differences, the output now appears to be correct!

So, in the end, after all of this work, does everything look correct? Was the location of the pole in mot fixed?

Nope, not rendering correctly still! And we broke the other actors!
Unfortunately, no. We still need to correct the rendering position of the actors now to match the changes that we made to the position and rotation of the actor. We'll tackle this in the next blog post.

Sunday, July 6, 2014

Rotation, Position and Debugging

I've been having a hard time trying to reconcile approaches that seem mathematically correct with the actual output from the Retail version of EMI. To figure out exactly what was going wrong, I decided to go back to the debugger and probe the internal values generated by the Retail version of EMI to compare with the results that are being produced by ResidualVM.

To begin with, I needed to figure out how the Retail version maintained the actor's position. To find this out, I investigated the Lua entry point for SetActorRot to find where in memory the rotation information was stored. After some inspection, it turns out that in EMI, the rotation information is converted from the Euler Angles that were passed in from the Lua to a rotation matrix located 0xA0 bytes from the beginning of the allocated Actor. How do we know where the start of the allocated Actor is? For C++, the X86 ABI uses the convention that the object being referenced is passed in the ECX register when calling an object method. We can be fairly certain that in this case, it is the Actor object because the Lua/C++ interface passes in the actor handle as the first parameter, and when that actor's handle is used to retrieve the Actor object, the return value is saved into the ECX register.

How did I know that the rotation was stored in a rotation matrix? I observed that after the Euler Angles were passed in, a series of floating point operations occurred, resulting in writing to nine values in the Actor object. Writing nine values was likely to be a 3x3 matrix, so I checked these values for Guybrush in the set wed by setting the debugger to break when SetActorRot is called and the actor handle matched the integer stored in hActor for Guybrush, which seems to always be 4.  To be sure that this was Guybrush, I also checked the actor's position at offset 0x94, which matched with the results from guybrush:getpos().

Here's the rotation matrix for Guybrush in the scene wed, before any movement, in the Retail version:

Guybrush Rotation Matrix
-0.34202039 0.0 -0.93969256
0.0 1.0 0.0
0.93969256 0.0 -0.3402039

This rotation matrix was created by setrot(0, 110, 0). So, I then tested and found that there were multiple Euler Angle Orders that can reproduce this matrix. This is because there's only one rotation axis that's being changed, so there is ambiguity between the other two.

So, lets investigate the more complex attaching and detaching scenario that I outlined in the previous post. We'll start with only the first actor, and check that rotation matrix first. First, I needed to identify the actor handle ID, so I ran the rotation script and checked the value of hActor for at1, at2 and at3. The actor handles for each, after running the script for the first time in the set wed, are: 106, 107 and 108, respectively. Recall that the first actor is rotated with the call setrot(5, 15, 25). This results in this matrix in the Retail version:

Actor1 Rotation Matrix
0.87542605 0.40821794 -0.25881907
-0.40056604 0.9123922 0.084185995
0.27051076 0.029975513 0.96225017

In this case, there should be no ambiguity, with only one rotation that produces the correct result because each of the rotations are different. The rotation that produces the correct result in ResidualVM is:

     Math::Matrix4 m;
     m.buildFromXYZ(getRoll(), getYaw(), getPitch(), Math::EO_ZYX);
     m.transpose();

With this approach now reproducing the same results, in the next post, we'll examine attaching again with the extra information we've learned here.

Wednesday, July 2, 2014

The Simplified Quaternion Solution

Continued from the previous entry.

It turns out that the Euler Angle Order changes from the last blog post had a bug, causing the heads in Grim Fandango to not point in the right direction. This was odd because I had shown that the output was the same with both the original approach and the new Rotation3D approach, so what was the bug? After some time in my debugger, it turns out that the original implementation maintains the values in the matrix for position (the last column in the 4x4 rotation matrix) when building a new rotation. My new code simply discarded these values. With this changed in PR #944, the heads rotate properly in Grim Fandango again! Sorry about that oversight!
Manny is looking in the right direction again!
With Euler Angle Order taken care of by the changes in the Rotation3D class, I then focused on finishing the new Quaternion implementation without the added complexity of Euler Angles. The few remaining uses for Quaternions are largely in the EMI portions of the engine, which helps to reduce the impact that this patch has on the other supported games.

In the change set for improving Quaternions, I removed the direct implementation of conversion from Euler Angles to Quaternions. I then replaced it by creating a rotation matrix from the angles using the  new functionality we added to Rotation3D and then converting that to a Quaternion. Using this approach, there is only one implementation of Euler Angle conversion instead of the two (different!) versions that were in ResidualVM previously. I also added new functionality useful for Quaternion work and cleaned up the Quaternion class by improving the comments and fixing style issues. This work is in PR #943.

With a working Quaternion implementation, we can now finally work on fixing attaching and detaching for good! From the work before, I had thought that the position was computed properly. Things seemed to be positioned properly, Guybrush rode the manatee and there was no problem, right?

Well, testing with more complicated position vectors has shown that the code here isn't exactly right! Why didn't I notice the problem before? The examples that we used to demonstrate attaching/detaching like the candles, the raft and the manatee all have one attached part that's zeros or have values that are the same, hiding the problem.

So, what's a good vector for testing? Instead of using something from the game, we'll test with a vector that has different components for all of the vector elements. The following table shows the three actors we'll be testing, with the position and rotation vector for each:

actor1
Position(1,4,5)
Rotation(5,15,25)
actor2
Position(3,-1,-2)
Rotation(15,25,35)
actor3
Position(-3, -4, 2)
Rotation(12,16,18)

It would be easy to write a script that sets up this scenario, but instead, I've made it so that tests can be run on objects from the GRIM engine in ResidualVM, as discussed in the previous post. In the tables below, I've attached these actors, with actor3 attached to actor2, who is attached to actor1.

actor1Retail OutputResidualVM
getpos(1, 4, 5)(1,4,5)
getworldpos(1, 4, 5)(1,4,5)
getrot(5, 15, 25)(5, 15, 25)
actor2Retail OutputResidualVM
getpos(0.854459, -6.7806, -5.41233)(0.437805, -7.30678, -4.7349)
getworldpos(3, -1, -2) (-0.746586, -3.64594, 1.19355)
getrot(3.98862, 13.1276, 7.01816)(15, 25, 35)
actor3Retail OutputResidualVM
getpos(-6.7911, 1.63313, -0.462462)(-1.46815, 1.76189, -0.770648)
getworldpos(-3, -4, 2)(0.271631, -2.42397, -0.629527)
getrot(6.82211, -9.04857, -16.7055)(12, 16, 18)

Due to the influence that the rotation of the actor has on the position, I decided to fix the rotation first, then investigate if the position code needed to be modified further. In this next table, the actors' rotation is checked between the Retail version and ResidualVM:

actor1Retail OutputResidualVM
getrot(5, 15, 25)(5, 15, 25)
actor2
getrot(3.98862, 13.1276, 7.01816)(15, 25, 35)
actor3
getrot(6.82211, -9.04857, -16.7055)(12, 16, 18)

So, what does this tell us? Well, we need to change the rotation from a global rotation (the original angles) to a relative one (relative to the rotation of the parent object). How do we do this? By removing the rotation of the parent from the child. With Quaternions, there's no subtraction operator. Instead, we invert the rotation we want to remove, then combine it with the current by multiplying both rotations to find the difference.
This difference is our new rotation. Unfortunately, it was still wrong! After some testing and checking against the retail version, I found that the Euler Order was incorrect. Changing the Euler Order to XZY and swapping the input parameters around gets us here:

actor1Retail OutputResidualVM
getrot(5, 15, 25)(5, 15, 25)
actor2

getrot(3.98862, 13.1276, 7.01816)(3.98862, 13.1276, 7.01815)
actor3

getrot(6.82211, -9.04857, -16.7055)(2.79709, -9.00632, -16.1663)

Well, it's an improvement. The angle is now calculated properly for the first and second actors. So, why didn't the third attached actor come out correctly? For quaternions, the order of multiplication is not commutative. As it turns out, the order that the parent rotations were being applied was backwards, resulting in an incorrect rotation quaternion. After fixing this, the result matches the retail version. This was submitted as part of PR #948. In the next post, I'll discuss fixing the position and the changes required to make it work correctly.