Cowboy Programming: Game Development and General Hacking by the Old West

August 29, 2007

Speeding up slow Vista

Filed under: Game Development — Mick West @ 12:09 pm

Vista must have been designed by committee, with everyone wanting to get their own pet features in there and nobody caring about the effect on the overall user experience. I had some trepidation about Vista when I saw a preview video a few months before it was released, and in about 15 minutes all they were able to show was:

1) Transparent Title Bars on windows
2) Buttons that glow a bit when the mouse is over them
3) Preview windows on the task bar
4) A 3D window manager on Windows-Tab

That’s it? That’s what I’m expected to pay $200 for? Sheesh! Anyway, long story short, I buy it, install it, and discover that that is in fact just about all you get, and that Vista is a much slower user experience than XP, despite my computer having a Vista Experience Index Base Score of 5.1, with a 5.9 for graphics (the highest possible score).

So how to speed it up? Well, first I upgraded from 2GB to 4GB of RAM, but I can’t really say it makes much of a difference. In fact Windows can’t actually give that extra 2GB to your applications, but I thought it would at least keep it for itself and use it for caching or something. It seems it needs a lot of the address space for trivialities, so I suspect a lot of my megabytes are sitting idle.

So what else? Google is a good place to start; here’s a useful article that suggests things you can do.

http://www.extremetech.com/article2/0,1558,2110595,00.asp
Some handy tips there. Under Control Panel->Uninstall a Program->Turn Windows features on or off, I remove:
Tablet PC Optional Components (Handwriting? No thank you)
Telnet client (I’d use PuTTY if I wanted telnet)
Telnet Server (why?)
Windows DFS Replication Service (Don’t use it)
Windows Fax and Scan (Fax is DEAD!)
Windows Meeting Space (Don’t use it)

This, of course, makes Vista churn for ten minutes, and then take another ten minutes to reboot. While this is going on, I also discover from reading more of the ExtremeTech article that I can use a USB drive as extra disk cache, using an actual feature of Vista called “ReadyBoost”. Sounds like a good idea. For some reason my “High Speed” 1GB Sony Memory Stick Pro Duo is not suitable (read speed 1943 KB/s), but my rather old 1GB Lexar thumb drive is (read 4483 KB/s, write 6913 KB/s). It’s flashing away right now, but I’m unsure of what it is doing. I think I might try buying a really fast USB drive and see if it helps.

I also went into Control Panel\System and Maintenance\Performance Information and Tools, where there are a bunch of options on the left you can adjust for improvements. First, “Manage Startup Programs”: I turn off anything I don’t want running at startup. In my case this was Picasa, Steam, and a few others to do with some specific hardware devices I had but did not really use.

Then, and this seemed to be a big one, under “Adjust Visual Effects”, I turned off EVERYTHING except for “Show thumbnails”, “Show translucent”, “Show window contents” and “Smooth edges of screen fonts”. This made quite a significant difference in the speed of navigating folders. This basically turns off all the flashy features that I saw in that video. Windows now looks boring again, but runs faster, which is what I want.

Under Adjust Indexing Options, I limited it to the Start Menu, and my documents folder.

Under Adjust Power Settings, I changed it from “Balanced” to “High Performance”, which, according to their little infographic, gives you TWICE THE PERFORMANCE of “Balanced”. I doubt that, but I want speed! Speed! Speed! I’m bamboozled as to what it’s doing here. Why not default to the fastest setting?

Next we have “Open Disk Cleanup”, which scans my C: drive (may take a few minutes), then tells me of a whole range of files I can delete, including 8GB of temporary files. WTF? Anyway, I select 13GB of files and delete them. Does this speed up my computer? No, not really, but I was getting a little low on disk space, so it’s all good. After it deleted that I ran it again and clicked on “More Options”, where it handily told me I could free up more space by removing programs I do not use. Now this is something I don’t do much, in part because it takes so long. But now that my computer is zipping along, I might delete that demo of Harry Potter, et al. I also deleted “all but the most recent restore point”, although it failed to tell me how much space this saved.

Onwards. I click on the enticing “Advanced Options”, and then “View performance details in Event Log”. This turns out to be quite the gold mine of information. Some little elf in the system does actually care about performance, and makes a note whenever the computer slows down, and even figures out why. Now I’d love for that elf to actually TELL ME, but no, it just makes a note. But the notes are sometimes telling.

-\Garmin\Training Center.exe – This process is doing excessive disk activities and is impacting the performance of Windows: (Hmm, I’ll shut that one down after using it from now on. Normally I just let things sit there. I’d already made it not start automatically.)

One from a few days ago:

The Desktop Window Manager is experiencing heavy resource contention.
Reason : Graphics subsystem resources are over-utilized.
Diagnosis : A consistent degradation in frame rate for the Desktop Window Manager was observed over a period of time.

Hopefully that’s fixed now, with me turning off all those fancy “Aero” features. I just wish they would tell me. And, hey, look, there’s an option – “Attach Task to This Event”, which adds a task to the Task Scheduler, triggered by this event, to pop up a window. Hmm, we’ll see how that goes – if the system is slowing down, it seems unlikely that popping up a window is going to help much.

Next, “Open Reliability and Performance Monitor”. This is kind of like the Performance tab on the Task Manager (and you can get here from there), but it gives a lot more detail about what is doing what. Right now I see my documents are still being indexed, and Firefox is using 100MB.

Finally, “Generate System Health Report”. Supposedly this will tell me how well my system is doing and recommend ways of improving it. That might have been useful YESTERDAY before I started removing everything. But anyway, it does not tell me anything useful.

One final tip, for which I really should get a new UPS battery: “Enable write caching on the disk”. Makes me slightly nervous, but, okay, I’ll order a new battery (my UPS has a red light telling me I should do this).

Blimey! Photoshop just started up in under ten seconds, and that’s after a reboot, so it’s not in the cache! It re-started in four seconds. View a PDF – 1 second, yay!

I’d also removed the anti-virus software and the firewall, and turned off the User Account Control prompts, some time ago.

Finally, my Windows Vista is nearly as good as my Windows XP!

Actually, I think it’s possibly better. There are actually useful things in Vista. I like the fast indexed searching (once I set indexing to just the folders I want). I like being able to just type in the name of a program in the start menu (even just a bit of it, like “middle earth”). Now, although I’ve not used it much, it does actually seem, possibly, to be faster than XP.

May 11, 2007

Why is Vista Slow?

Filed under: Game Development — Mick West @ 5:47 pm

(The following is just my tale of woe and rants; for speed-up tips, see here.)

I’ve always been a little hasty in upgrading my computer to the latest version of Microsoft Windows, so this time I let it go a while to see if there were many reported problems. Nothing major seemed to transpire, so I did the upgrade.

I’ve been using it for a month or so now, and my impression is that it was a mistake to upgrade. It’s pretty. But it’s slow. Mind numbingly inexplicably frustratingly slow.

An operating system has two primary functions: It has to provide a common interface to the hardware for the software that you use. It also has to provide a fluid user experience.

I read somewhere that they spent a billion dollars on Vista. This is quite disturbing. A billion dollars to take a perfectly good operating system and make it look prettier, while slowing everything down.

I tried to like it. I tried to get along with the “do you know this program that you are trying to run, and you have been running every few hours for several days, but maybe someone changed it, or something?” hand-holding. But after I was asked this for the hundredth time I finally gave up and switched it off. It’s just so stupid. Sure, the basic idea is sound, but why can’t I have a “Don’t ask me about this particular program again” button?

But the simple slowness is the biggest problem in my mind. Take something as simple as opening a folder, like, to look at the contents. I’ll even do it the way Microsoft wants me to do it. Click on “Start”, click on “Documents”. And wait. For three seconds.

THREE SECONDS! Do you know how much a typical computer can do in three seconds? Never mind that, how much can MY computer do in three seconds? It’s a 3.6 GHz dual-core Pentium Extreme Edition. With four gigabytes of RAM. It can load 50 megabytes of uncached data from the hard drive in three seconds; it can perform at least FIVE BILLION FLOATING POINT OPERATIONS.

So what am I asking it to do? Simply show me the contents of a folder. What the heck is it doing?

This kind of thing goes on and on. Open Internet Explorer (to a blank page, no internet access required): three seconds. Open a folder with some video in it and get an error “COM Surrogate has stopped working”. Press ctrl-alt-delete (three seconds).

Here’s one that really bugs me. I have a folder open showing some photos. I right click on white space in the folder to bring up the context menu so I can change the view. FIVE SECONDS! WTF! That is just totally insane. What is it doing?

Obviously, in all these cases it is doing something that someone at Microsoft thought was a good idea. The problem is: they have let the functionality get in the way of the responsiveness. When I open a folder I don’t give two shakes about all that crap going on behind the scenes. I expect the operating system to DROP EVERYTHING (with the exception of any Audio or Video critical things, like playing music, or video chat) and bring the full resources of the machine to the task I have just demanded of it. It should display that window.

I’m in charge here. I want Vista to do what I ask, and I want it to display a window listing my files, and do it in less than a tenth of a second. There is no technical reason why not. Video games create an entire frame’s worth of presentation in 1/60th of a second. I’m prepared to allow a little time for hard disk seeking, but what exactly is the computer doing that is so important that it takes THREE SECONDS to show me what’s in a folder, and FIVE SECONDS to bring up a menu?

What’s this rant doing in Cowboy Programming? Well, game programmers need to remember that the player is the boss. The most important thing you can do as a game programmer is translate the intentions of the player directly into results, with no impediments. No pauses. Never.

March 12, 2007

Optimized Asset Processing

Filed under: Game Development,Inner Product — Mick West @ 12:10 pm

This article originally appeared in Game Developer Magazine, December 2006.

OPTIMIZING ASSET PROCESSING

The fundamental building block of any game asset pipeline is the asset processing tool. An asset processing tool is a program or piece of code that takes data in one format and performs some operations on it, such as converting it into a target-specific format, or performing some calculation on it, such as lighting or compression. This article discusses the performance issues with these tools, and gives some ideas for optimization with a focus on minimizing I/O.

THE UGLY SISTER

Asset conversion tools are too often neglected during development. Since they are usually well-specified and discrete pieces of code, they can be easily tasked to junior programmers. It is generally easy for any programmer to create a tool that works to a simple specification, and at the start of a project the performance of the tool is not so important, as the size of the data involved is generally small and the focus is simply on getting things up and running.

However, towards the end of the project, the production department often realizes that a large amount of time is being wasted in waiting for these tools to complete their tasks. The accumulation of near-final game data and the more rapid iterations in the debugging and tweaking phase of the project make the speed of these tools of paramount importance. Further time may be wasted in trying to optimize the tools at this late stage, and there is a significant risk of bugs being introduced into the asset pipeline (and the game) by making significant changes to processes and code during the testing phase.

Hence it is highly advisable to devote sufficient time to optimizing your asset pipeline at an early stage in development. The process of doing this should include the involvement of personnel who have advanced experience in the types of optimization skills needed. This early application of optimization is another example of what I call “Mature Optimization” (see Game Developer Magazine, January 2006). There are a limited number of man-hours available in the development of a game. If you wait until the need for optimization becomes apparent, then you will already have wasted hundreds of those man-hours.

THE NATURE OF THE DATA

Asset processing tools come in three flavors: converters, calculators and packers. Converters take data which is arranged in a particular set of data structures, and re-arrange it into another set of data structures which are often machine or engine specific. A good example here is a texture converter, which might take a texture in .PNG format and convert it to a form that can be directly loaded into the graphics memory of the target hardware.

Secondly we have asset calculators. These take an asset, or group of assets, and perform some set of calculations on them such as calculating lighting and shadows, or creating normal maps. Since these operations involve a lot of calculations, and several passes over the data, they typically take a lot longer than the asset conversion tools. Sometimes they take large assets, such as high resolution meshes, and produce smaller assets, such as displacement maps.

Thirdly we have asset packers. These take the individual assets and package them into data sets for use in particular instances in the game, generally without changing them much. This might involve simply gathering all the files used by one level of the game and arranging them into a WAD file. Or it might involve grouping files together in such a way that streaming can be effectively performed when moving from one area of the game to another. Since the amount of data that is involved can be very large, the packing process can take a lot of time and be very resource intensive – requiring lots of memory and disk space, especially for final builds.

TWEAKING OPTIMIZATION

You may be surprised how often the simplest method of optimization is overlooked. Are you letting the content creators use the debug version of a tool? It’s a common mistake for junior programmers, but even the most experienced programmers sometimes overlook this simple step. So before you do anything, try turning the optimization settings on and off, and make sure that there is a noticeable speed difference. Then, in release mode, try tweaking some settings, such as “Optimize for speed” and “Optimize for size”. Depending on the nature of the data, and on the current hardware you are running the tools on, you might actually get faster code if you use “Optimize for size”. The optimal optimization setting can vary from tool to tool.

Be careful when testing the speed of your code when doing things like tweaking optimization settings. In a multi-tasking operating system like Windows XP, there is a lot going on, so your timings can vary a lot from one run to the next. Taking the average is not always a useful measure either, as it can be greatly skewed by random events. A more accurate way is to compare the lowest times of two different settings, as that will be closest to the “pure” run of your code.
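
For example, here is a minimal sketch of that approach on Windows (the ConvertAssets function and the run count of five are placeholders, not from the article):

[source:cpp]
#include <windows.h>
#include <cstdio>

// Placeholder for the tool stage being timed.
void ConvertAssets() { /* ... the real conversion work goes here ... */ }

// Time one run, in seconds, using the high resolution counter.
double TimeOneRun()
{
    LARGE_INTEGER freq, start, end;
    ::QueryPerformanceFrequency(&freq);
    ::QueryPerformanceCounter(&start);
    ConvertAssets();
    ::QueryPerformanceCounter(&end);
    return double(end.QuadPart - start.QuadPart) / double(freq.QuadPart);
}

int main()
{
    // Keep the minimum of several runs; the fastest run is the one least
    // disturbed by other processes, so it is closest to the "pure" time.
    double best = 1e30;
    for (int run = 0; run < 5; run++)
    {
        double t = TimeOneRun();
        if (t < best) best = t;
    }
    printf("Best of 5 runs: %f seconds\n", best);
    return 0;
}
[/source]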

PARALLELIZE YOUR CODE

Most PCs now have some kind of multi-core CPU and/or hyper-threading. If your tools are written in the traditional mindset of a single processing thread, then you are wasting a significant amount of the silicon you paid for, as well as wasting the time of the artists and level designers as they wait for their assets to be converted.

Since the nature of asset data is generally large chunks of homogeneous data, such as lists of vertices and polygons, it is generally very amenable to data-level parallelization with worker threads, where the same code is run on multiple chunks of similar data concurrently, taking advantage of the cache. For details on this approach see my article “Particle Tuning” in Game Developer Magazine, April 2006.
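
As a rough illustration of the chunk-per-thread idea, here is a minimal sketch using C++11 std::thread (my own example, not the approach from the Particle Tuning article; the Vertex type and ProcessVertex function are made up):

[source:cpp]
#include <algorithm>
#include <thread>
#include <vector>

struct Vertex { float x, y, z; };                // made-up example data
void ProcessVertex(Vertex &v) { v.x *= 0.5f; }   // made-up per-element work

// Run the same code over contiguous chunks of one big homogeneous array,
// one chunk per hardware thread, so each worker streams through its own
// cache-friendly range of data.
void ProcessVerticesParallel(std::vector<Vertex> &verts)
{
    unsigned num_threads = std::thread::hardware_concurrency();
    if (num_threads == 0) num_threads = 2;

    size_t chunk = (verts.size() + num_threads - 1) / num_threads;
    std::vector<std::thread> workers;

    for (unsigned t = 0; t < num_threads; t++)
    {
        size_t begin = t * chunk;
        size_t end = std::min(begin + chunk, verts.size());
        if (begin >= end) break;
        workers.push_back(std::thread([&verts, begin, end] {
            for (size_t i = begin; i < end; i++)
                ProcessVertex(verts[i]);
        }));
    }
    for (std::thread &w : workers)
        w.join();
}
[/source]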

TUNE YOUR MACHINES

Anti-virus software should be configured so that it does not scan the directories that your assets reside in, and also does not scan the actual tools. Poorly written anti-virus and other security tools can significantly impact the performance of a machine that does a lot of file operations. Try running a build both with and without the anti-virus software, and see if there is any difference. Consider removing the anti-virus software entirely.

If you are using any form of distributed “farm” of machines in the asset pipeline, then beware of any screensaver other than “Turn off monitor”. Some screensavers can use a significant chunk of processing power. You need to be especially careful of this problem when repurposing a machine – as the previous user may have installed their favorite screen-saver, which does not kick in for several hours, and then slows that machine down to a crawl.

WRITE BAD CODE

In-house tools do not always need to be up to the same code standards as the code you use in your commercially released games. Sometimes it is possible to get performance benefits by making certain dangerous assumptions about the data you are processing, and about the hardware it will be running on.

Instead of constantly allocating buffers as needed, try just allocating a “reasonable” chunk of memory as a general purpose buffer. If you’ve got debugging code, make sure you can switch it off. Beware of logging or other instrumenting functions, as they can end up taking more time than the code they are logging. If earlier stages in the pipeline are robust enough, then (very carefully) consider removing error and bounds checking from later stages if you can see they are a significant factor. If you’ve got a bunch of separate programs, consider bunching them together into one uber-tool to cut down on load times. All these are bad practices, but for a tool with a limited lifetime the rewards may outweigh the risks.
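
As a (deliberately rough) sketch of a couple of these tricks, with a made-up buffer size and macro names:

[source:cpp]
#include <cstdio>
#include <cstdlib>

// One "reasonable" general purpose buffer, grabbed once at startup, instead
// of constantly allocating per-asset buffers. Dangerous assumption: no single
// asset ever needs more than this.
static const size_t SCRATCH_SIZE = 256 * 1024 * 1024;
static char *s_scratch = (char *)malloc(SCRATCH_SIZE);

// Logging that compiles away completely in the fast build, so the
// instrumentation can never cost more than the code it is measuring.
#ifdef TOOL_VERBOSE
#define TOOL_LOG(...) printf(__VA_ARGS__)
#else
#define TOOL_LOG(...) ((void)0)
#endif
[/source]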

MINIMIZE I/O

Old programmers tend to write conversion tools using the standard C I/O functions: fopen, fread, fwrite, fclose, etc. The standard way of doing things is to open an input file and an output file, then read in chunks of data from the input file (with fread or fgetc), and write them to the output file (with fwrite or fputc).

This approach has the advantage of being simple, easy to understand, and easy to implement. It also uses very little memory, so you quite often see tools written like this. The problem is that it’s insanely slow. It’s a hold-over from the (really) bad old days of computing, when processing large amounts of data meant reading from one spool of tape and writing to another.

Younger programmers will learn to use C++ I/O “streams”, which are intended to make it easy for data structures to be read and written in a binary format. But when used to read and write files, they still suffer from the same problems that our older C programmer has: they are still stuck in the same serial model of “read a bit, write a bit”, which is excessively slow and mostly unnecessary on modern hardware.
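
For illustration (my sketch, not from the original article), here is the same kind of byte-at-a-time converter written with C++ streams; it is structurally identical to the C version in Listing 1 below, and just as slow:

[source:cpp]
#include <fstream>

// Replace every zero byte with 0xFF, one byte at a time, via C++ streams.
void ConvertWithStreams()
{
    std::ifstream in("IMAGE.JPG", std::ios::binary);
    std::ofstream out("IMAGE.BIN", std::ios::binary);
    char c;
    while (in.get(c))
    {
        if (c == 0) c = (char)0xff;
        out.put(c);
    }
}
[/source]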

Unless you are doing things like encoding MPEG data, you will generally be dealing with files that are smaller than a few tens of megabytes. Most developers will now have a machine with at least a gigabyte of memory. If you are going to be processing the whole file a piece at a time, then there is no reason why you should not load the entire file into memory. Similarly, there is no reason why you should have to write your output file a few bytes at a time. Build the file in memory, and write it out all at once.

You might counter that that’s what the file cache is there for. It’s true, the OS will buffer reads and writes in memory, and very few of those reads or writes will actually cause physical disk access. But the overhead associated with using the OS to buffer your data versus simply storing it in a raw block of memory is very significant.

For example, listing 1 shows a very simple file conversion program that takes a file, and writes out a version of the file with all the zero bytes replaced with 0xFF. It’s simple for illustration purposes, but many file format converters do not do significantly more CPU work than this simple example.

Listing 1: Old fashioned file I/O

[source:cpp]
FILE *f_in = fopen("IMAGE.JPG","rb");
FILE *f_out = fopen("IMAGE.BIN","wb");
fseek(f_in,0,SEEK_END);
long size = ftell(f_in);
rewind(f_in);
for (long b = 0; b < size; b++) {
    int c = fgetc(f_in);
    if (c == 0) c = 0xff;
    fputc(c,f_out);
}
fclose(f_in);
fclose(f_out);
[/source]

Listing 2 shows the same program converted to read in the whole file into a buffer, process it, and write it out again. The code is slightly more complex, yet this version executes approximately ten times as fast as the version in Listing 1.

Listing 2: Reading the whole file into memory

[source:cpp]
FILE *f_in = fopen("IMAGE.JPG","rb");
if (f_in==NULL) exit (1);
fseek(f_in,0,SEEK_END);
long size = ftell(f_in);
rewind(f_in);
char* p_buffer = (char*) malloc (size);
fread (p_buffer,size,1,f_in);
fclose(f_in);
unsigned char *p = (unsigned char*)p_buffer;
for (long x=0; x<size; x++, p++)
    if (*p == 0) *p = 0xff;
FILE *f_out = fopen("IMAGE.BIN","wb");
fwrite(p_buffer,size,1,f_out);
fclose(f_out);
free(p_buffer);
[/source]

MEMORY MAPPED FILES

The use of serial I/O is a throwback to the days of limited memory and tape drives. But a combination of factors means it’s still useful to think of your file conversion as an essentially serial process. Firstly, since file operations can proceed asynchronously, you can be processing data at the same time as it is being read in, and begin writing it out as soon as some is ready. Secondly, memory is slow and processors are fast. This can lead us to think of normal random access memory as just a very fast hard disk, with your processor’s cache memory as your actual working memory.

While you could write some complex multi-threaded code to take advantage of the asynchronous nature of file I/O, you can get the full advantages of both this and optimal cache usage by using Windows’ memory mapped file functions to read in your files.

The process of memory mapping a file is really very simple. All you are doing is telling the OS that you want a file to appear as if it is already in memory. You can then process the file exactly as if you just loaded it yourself, and the OS will take care of making sure that the file data actually shows up as needed.

This gives you the advantage of asynchronous I/O, since you can immediately start processing once the first page of the file is loaded, and the OS will take care of reading in the rest of the file as needed. It also makes the best use of the memory cache, especially if you process the file in a serial manner. The act of memory mapping a file also ensures that the very minimum amount of data is moved around: no extra buffers need to be allocated.

Listing 3 shows the same program converted to use memory mapped IO. Depending on the state of virtual memory and the file cache, this is several times faster than the “whole file” approach in listing 2. It looks annoyingly complex, but you only have to write it once. The amount of speed-up will depend on the nature of the data, the hardware and the size and architecture of your build pipeline.

Listing 3: Using memory mapped files
[source:cpp]
// Open the input file and memory map it
HANDLE hInFile = ::CreateFile(L"IMAGE.JPG",
    GENERIC_READ,FILE_SHARE_READ,NULL,OPEN_EXISTING,FILE_ATTRIBUTE_READONLY,NULL);
DWORD dwFileSize = ::GetFileSize(hInFile, NULL);
HANDLE hMappedInFile = ::CreateFileMapping(hInFile, NULL,PAGE_READONLY,0,0,NULL);
LPBYTE lpMapInAddress = (LPBYTE) ::MapViewOfFile(hMappedInFile,FILE_MAP_READ,0,0,0);
// Open the output file, and memory map it
// (Note we specify the size of the output file)
HANDLE hOutFile = ::CreateFile(L"IMAGE.BIN",
    GENERIC_WRITE | GENERIC_READ,0,NULL,CREATE_ALWAYS,FILE_ATTRIBUTE_NORMAL,NULL);
HANDLE hMappedOutFile = ::CreateFileMapping(hOutFile, NULL,PAGE_READWRITE,0,dwFileSize,NULL);
LPBYTE lpMapOutAddress = (LPBYTE) ::MapViewOfFile(hMappedOutFile,FILE_MAP_WRITE,0,0,0);
// Perform the translation
// Note there is no reading or writing, the OS takes care of that as needed
char *p_in = (char*)lpMapInAddress;
char *p_out = (char*)lpMapOutAddress;
for (DWORD x=0; x<dwFileSize; x++) {
    char c = *p_in++;
    if (c == 0) c = (char)0xff;
    *p_out++ = c;
}
// Unmap the views (which also flushes the output) and close the files
::UnmapViewOfFile(lpMapInAddress);
::UnmapViewOfFile(lpMapOutAddress);
::CloseHandle(hMappedInFile);
::CloseHandle(hMappedOutFile);
::CloseHandle(hInFile);
::CloseHandle(hOutFile);
[/source]

RESOURCES

Noel Llopis, Optimizing the Content Pipeline, Game Developer Magazine, April 2004
http://www.convexhull.com/articles/gdmag_content_pipeline.pdf

Ben Carter, The Game Asset Pipeline: Managing Asset Processing, Gamasutra, Feb 21, 2005
http://www.gamasutra.com/features/20050221/carter_01.shtml

March 5, 2007

Falling Sand Game from the ’80s

Filed under: Game Development — Mick West @ 6:09 pm

I just found out there is a mini casual-game genre called “Falling Sand Games“, based upon a Japanese programmer’s game from around 2005.

The funny thing is, I remember writing a very similar “game” back in about 1986, on the Sinclair ZX Spectrum, with help from my friend John Lord. Unfortunately no trace of this original version remains. A couple of years later I converted it to the Atari ST, and eventually, in about 1992, it was one of the first programs I wrote on the PC, in assembly language. The screenshot to the right is a photo of my screen today; I can’t figure out how to do a screen capture of a Mode-X program.

Here’s the executable: snow.exe (which does not work on Windows Vista, bah humbug!). It’s 45K because it contains a lot of other code, since I dumped all my PC toy programs into one source file. The original PC version was about 6K of code.

Here’s the source: ALL.ASM, it’s also got the source for a load of other programs, like a simple cloth simulator, Brownian motion, and fireworks.

It runs in full-screen “Mode-X”. Draw with the mouse: left click to draw lines, right click to erase. Press the spacebar to start the snow falling.

The earliest evidence of this on the internet is my Neversoft web page back in 1998, from archive.org.

February 22, 2007

What is invalid parameter noinfo and how do I get rid of it?

Filed under: Cowboy Programming — Mick West @ 6:40 pm

_invalid_parameter_noinfo most often shows up as a problem in the form of an “Unresolved external symbol” error when you get some mix-up between DEBUG and RELEASE modes. But it’s also a performance problem in situations where it’s compiled in unnecessarily.

So what is it? Well, it’s basically the function that gets called when code compiled with Microsoft Visual C++ detects a certain type of run-time error, such as a vector index out of range, or incrementing an iterator past the end of a container.

There are two functions: _invalid_parameter, which gets called in debug mode, and _invalid_parameter_noinfo, which gets called in non-debug mode. If you have a module compiled without _DEBUG defined, and you link with the debug library, then you will get the “Unresolved external symbol” error. This could be a problem with inconsistent #includes. See this thread on devmaster.net for some practical experiences.

What is _invalid_parameter_noinfo then? Well, first ask what _invalid_parameter is. _invalid_parameter is basically an assert. When an error is detected, _invalid_parameter is called and is passed debug information, like the file name and line number pointers. It then calls _crt_debugger_hook and _invoke_watson. In non-debug mode (release mode), the debug info is not available, so _invalid_parameter_noinfo simply calls _invalid_parameter with NULL parameters. It’s actually an optimization: rather than having the NULL parameters passed from every bit of your code, your code “just” needs to call this one function.
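
In other words, the relationship is roughly this (my paraphrase of the behavior described above, not the actual CRT source):

// Paraphrase only - not the real CRT code.
// Debug mode: the call site passes the expression, function, file and line.
void _invalid_parameter(const wchar_t *expression, const wchar_t *function,
                        const wchar_t *file, unsigned int line, uintptr_t reserved);

// Release mode: no debug info is available, so every call site just calls
// this one tiny function instead of pushing five NULL parameters itself.
void _invalid_parameter_noinfo()
{
	_invalid_parameter(NULL, NULL, NULL, 0, 0);
}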

You might also be having problems with _invalid_parameter_noinfo if you have code that crashes (on this function) in release mode. That’s most likely some form of release-only bug, such as uninitialized memory, and the call to _invalid_parameter_noinfo is the end result. DO NOT IGNORE IT, or try to work around it. You need to find out exactly why it is being called.

But suppose your code works fine, and _invalid_parameter_noinfo is never called. You might be peeking through the disassembly, trying to figure out why the code is so slow, and you see all these calls to _invalid_parameter_noinfo.

Consider this code:

void	CVerletPoint::SatisfyConstraints()
{

	vector<CVerletConstraint*>::iterator i;
	for (i = mp_constraints.begin(); i != mp_constraints.end(); i++)
	{
		(*i)->Satisfy(this);
	}

}

A nice simple bit of code that just iterates over a vector of CVerletConstraint objects, and calls some function on each one. It compiles to this:

00405C20  push        ebx  
00405C21  push        ebp  
00405C22  push        esi  
00405C23  mov         ebp,ecx 
00405C25  mov         esi,dword ptr [ebp+20h] 
00405C28  cmp         esi,dword ptr [ebp+24h] 
00405C2B  push        edi  
00405C2C  lea         edi,[ebp+1Ch] 
00405C2F  jbe         CVerletPoint::SatisfyConstraints+16h (405C36h) 
00405C31  call        _invalid_parameter_noinfo (407A81h) 
00405C36  mov         ebx,dword ptr [edi+8] 
00405C39  cmp         dword ptr [edi+4],ebx 
00405C3C  jbe         CVerletPoint::SatisfyConstraints+23h (405C43h) 
00405C3E  call        _invalid_parameter_noinfo (407A81h) 
00405C43  test        edi,edi 
00405C45  je          CVerletPoint::SatisfyConstraints+2Bh (405C4Bh) 
00405C47  cmp         edi,edi 
00405C49  je          CVerletPoint::SatisfyConstraints+30h (405C50h) 
00405C4B  call        _invalid_parameter_noinfo (407A81h) 
00405C50  cmp         esi,ebx 
00405C52  je          CVerletPoint::SatisfyConstraints+5Fh (405C7Fh) 
00405C54  test        edi,edi 
00405C56  jne         CVerletPoint::SatisfyConstraints+3Dh (405C5Dh) 
00405C58  call        _invalid_parameter_noinfo (407A81h) 
00405C5D  cmp         esi,dword ptr [edi+8] 
00405C60  jb          CVerletPoint::SatisfyConstraints+47h (405C67h) 
00405C62  call        _invalid_parameter_noinfo (407A81h) 
00405C67  mov         ecx,dword ptr [esi] 
00405C69  mov         eax,dword ptr [ecx] 
00405C6B  mov         edx,dword ptr [eax] 
00405C6D  push        ebp  
00405C6E  call        edx  
00405C70  cmp         esi,dword ptr [edi+8] 
00405C73  jb          CVerletPoint::SatisfyConstraints+5Ah (405C7Ah) 
00405C75  call        _invalid_parameter_noinfo (407A81h) 
00405C7A  add         esi,4 
00405C7D  jmp         CVerletPoint::SatisfyConstraints+16h (405C36h) 
00405C7F  pop         edi  
00405C80  pop         esi  
00405C81  pop         ebp  
00405C82  pop         ebx  
00405C83  ret              

Yikes! What exactly is going on there? Lots of run-time error checking, that’s what. Why is it doing this? Well, it’s to make your code “secure”: if something goes out of bounds, the program will halt, preventing it from doing any harm (or being exploited by a hacker), and allowing you to debug it.

Is this a good thing? It depends on what you want. If you are writing code with lots of convoluted data structures and container interactions, then maybe this is something you want. But for code that operates on a data structure that does not change from frame to frame, this code is just getting in the way. If it works the first frame, it will work forever. In release mode I do not expect this kind of error checking, and it certainly looks like it would hurt performance: it is all tests that always pass, branching around a function that is never called.

So, you can turn it off with:

#ifndef _DEBUG
#define _SECURE_SCL 0
#endif

You will need that defined in every file (before the STL headers are included), or have _SECURE_SCL defined as 0 in the release build settings.
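
For example (the header name here is made up), a shared header that every file pulls in before any STL headers does the trick:

// tool_defines.h - included at the top of every file, before <vector> etc.
#ifndef _DEBUG
#define _SECURE_SCL 0	// must be seen before the STL headers are included
#endif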

Effect: Our code from above now shrinks to:

00405AF0  push        esi  
00405AF1  push        edi  
00405AF2  mov         edi,ecx 
00405AF4  mov         esi,dword ptr [edi+20h] 
00405AF7  cmp         esi,dword ptr [edi+24h] 
00405AFA  je          CVerletPoint::SatisfyConstraints+21h (405B11h) 
00405AFC  lea         esp,[esp] 
00405B00  mov         ecx,dword ptr [esi] 
00405B02  mov         eax,dword ptr [ecx] 
00405B04  mov         edx,dword ptr [eax] 
00405B06  push        edi  
00405B07  call        edx  
00405B09  add         esi,4 
00405B0C  cmp         esi,dword ptr [edi+24h] 
00405B0F  jne         CVerletPoint::SatisfyConstraints+10h (405B00h) 
00405B11  pop         edi  
00405B12  pop         esi  
00405B13  ret

Much better. Six tests have been eliminated, saving at least 12 lines of assembly code. And the big news: the framerate of my blob program goes from 150fps to 170fps.

Check here for an investigation of different ways of iterating over STL vectors:
http://einaros.blogspot.com/2007/02/iteration-techniques-put-on-benchmark.html

So, is turning off _SECURE_SCL a bad cowboy practice? Well, for games I think it’s quite reasonable to switch it off for a “FINAL” build (i.e. the one you are going to release to the consumer). Leaving it on might be a useful debugging tool. Just be aware of the potential for significant performance degradation in instances like the one above. That kind of case might be ripe for some kind of refactoring with error checking that is only performed when the container changes, as in the sketch below.
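
For instance, one way to do that (a sketch only; the mp_first/mp_last members and AddConstraint function are made up, not from my actual code) is to re-cache a raw pointer range whenever the container changes, and walk that range in the per-frame loop:

// Do the validating/flattening once, when the container changes, so the
// per-frame loop has nothing left for _SECURE_SCL to check.
void CVerletPoint::AddConstraint(CVerletConstraint *p_constraint)
{
	mp_constraints.push_back(p_constraint);
	mp_first = &mp_constraints[0];                  // re-cache the raw range
	mp_last  = mp_first + mp_constraints.size();    // (only runs on change)
}

void CVerletPoint::SatisfyConstraints()
{
	// Hot loop: a plain pointer walk over the cached range.
	for (CVerletConstraint **pp = mp_first; pp != mp_last; ++pp)
	{
		(*pp)->Satisfy(this);
	}
}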
