RELATIONAL DATABASES

Relational Databases are sometimes viewed as being in the domain of business applications and web development. You would use a relational database for boring applications such as inventory, accounting, or implementing a shopping cart system for a commercial web site. Databases can be viewed by game-programmers as old-fashioned, large, slow, and replete with imponderable terminology such as “inner joins”, “foreign key” and “tuple”. However, modern databases are actually fast, easy to use, and can be quite useful in various stages of game development. While relational databases can be immensely complex and powerful, their robustness and ubiquity means they can also be useful for relatively simple tasks. In this article I’ll attempt to de-mystify databases a little, show how they can easily be incorporated into development code, and discuss a few potential usages.

WHAT IS IT?

A relational database is simply a set of tables with named columns where each row is an individual record. Game programmers essentially use many forms of relation databases at run-time to store and organize things like game objects and various resources. Those databases are generally rather ad-hoc, and tuned for a specific purpose and operating environment, and are implemented using custom code. While technically this is a database, the programmer would probably not refer to it as such. The usages of relational databases I discuss below are not intended to replace these custom in-game “databases”, but rather to add new functionality to be used during the development process.

While the examples I give are all for Windows based development, the nature of communication with a database server is basically text based, so a minimum amount of work would be required to implement similar functionality on console platforms.

SETUP

Databases are run by a server, so the first thing you need to do is set up a server. The server can be local, in that it’s part of your code, and you access it directly via a relatively low level API. Alternatively the server can be remote, and you access it via a network connection. This distinction can be blurred a little as you can have a server on your local machine, accessed via the network (using localhost) – but that’s still essentially a remote server, just somewhat quicker. Here we’ll be discussing remote servers.

There are several ways of setting up a database, and what you settle on will vary with your needs and situation. If you have developers in various locations, then you might benefit from having your database hosted by a third party, as this should ensure everyone has sufficiently fast access. If your developers are all on the same network, then you’d more typically have the databases hosted on the network server. If you are a lone developer, then you’d be more likely to have the database on your local machine, to take advantage of the additional speed.

Setting up a database is very easy. If you have remote hosting you are often supplied with a web interface such as phpMyAdmin that allows you to create databases and users. On a local network your network server will often already have some database server software installed, and you can just add a database to that. Lacking this, you can very simply install a database by downloading and installing the MySQL software, which takes only a few minutes to get up and running.

Once you have a database up and running, it’s very important that you have some way of testing your connection and the database, so you can more easily debug problems with your code. A useful tool here is HeidiSQL, a free program that let’s you connect to your database server, and setup, examine and modify databases in a visual manner. There is also more fully featured software such as PremiumSoft’s NaviCat, which performs similar functions. In the examples below I give the SQL query definitions for the database tables. While it’s quite possible to set these up using a command line tool or web interface, it’s generally easier to use a tool like NaviCat, as it allows you to more easily adjust individual parameters in your tables.

A database server can have various users. If you are just doing some initial experimentation, then you can just log in as the “root” user that you set up when you installed the server. However, as you expand the usage of the databases, then you will want to add additional users with fewer privileges to prevent inadvertent modifications to the database.

Once this is set up, you can now connect to your database from your code. The simplest way (from game code) is to use the C API. The code to connect is shown in listing 1. The SERVER_NAME would be the URL or ip address of your server, or “localhost” if it’s on your local machine. This setup needs only be done once when your program runs, and the other examples assume this has already been done, and there is a valid value in the “handle” variable. Error checking is omitted for clarity, but is something you will need to add, especially if connecting over the internet.

LISTING 1 – Code to connect to a database

#define SERVER_NAME “localhost”
#define DB_USER “user_name”
#define DB_USERPASS “password”
#define DB_NAME “db1″
MYSQL *handle=NULL;
handle = mysql_init(NULL);
mysql_real_connect(handle,SERVER_NAME,
DB_USER,DB_USERPASS,DB_NAME,0,NULL,0);

UNIQUE ASSET IDS

A useful example of using a database during development is for the generation of asset IDs. I touched on this briefly in my article “Practical Hash IDs” (Game Developer, December 2005). The idea is that for the purpose of efficiency (both speed and space) it’s best to refer to assets using a unique 32 bit ID. In the previous article, I suggested using 32 bit CRCs for the ID. That approach has a number of advantages, but there is still the problem with collisions, and if you are going to use databases in a broader manner it makes more sense to generate the IDs using the database.

The simplest way to do this is to have a table that consists of an ID and an asset name (as a string). The ID field will be an integer, and set to autoincrement. (See Listing 2) Then whenever you add an asset name to the table, a unique ID will automatically be generated. To find the 32-bit ID number of any string, we simply look to see if it’s in the database, and if not, we add it. Then we just query the database for this string. See Listing 3 for a function that implements this.

Listing 2 – SQL that defines the simple table of IDs

CREATE TABLE `table1` (
`id` int(11) NOT NULL auto_increment,
`name` text,
PRIMARY KEY (`id`)
)

Listing 3 – Get a unique ID for a string identifier

uint32 GetID(const char *name)
{
char select_query[1024];
char add_query[1024];
MYSQL_RES *result=NULL;
MYSQL_ROW row;
uint32 id = -1;
sprintf(select_query,“SELECT * FROM table1 WHERE name=’%s’”,name);
if (!mysql_query(handle,select_query)) {
result = mysql_use_result(handle);
row = mysql_fetch_row(result);
if (!row) {
sprintf(add_query,“INSERT INTO `table1` (`id`,`name`) VALUES (NULL,’%s’)”,name);
mysql_query(handle,add_query);
mysql_query(handle,select_query);
result = mysql_use_result(handle);
row = mysql_fetch_row(result);
}
id = atoi(row[0]);
mysql_free_result(result);
}
return id;
}

These IDs are typically baked into the data as part of the build process, but can also be used directly in the code in the exact same way as was outlined in “Practical Hash IDs”. Note here that what we are doing is not a run-time process. The database is only intended to be used during game development, for the initial creations of ID by the team. If you are using CRCs, then it’s quite easy to modify this code to check for collisions.

TRACKING ASSERTS

A common issue during game development is what to do about asserts and warnings in the code. Warnings are often ignored by non-technical staff, and manual solutions such as “when you see this warning, come and tell me” are not very reliable. The line between asserts and warnings is often blurred in order to facilitate uninterrupted development. Asserts (which should indicate some fatal error which requires immediate attention) sometime have an “ignore” option, which gets switched on by the creative staff, who don’t care so much about tracking down your bugs. In a development environment with a large number of people, there could be a lot of asserts or warnings being fired off by your code, and it swiftly becomes very difficult to separate signal from noise, and to get the information to the correct person. Clearly some automated system would be useful.

Here we can easily use a simple database table to track these things. Listing 4 and 5 show a simple implementation of this.

Listing 4 – SQL Table for storing asserts

CREATE TABLE `asserts` (
`assert` text,
`message` text,
`file` text,
`line` int(11) default NULL,
`machine` text,
`time` timestamp NULL default CURRENT_TIMESTAMP
)

<<LISTING END>>

Listing 5 – Assert replacements that log asserts to a database table

char assert_buffer[1024];

void assert_printf( const char* text, … )

{

va_list a;

va_start(a, text );

vsprintf( assert_buffer,text,a);

va_end(a);

}

void SQLAssert(const char *assert, const char *file, int line)

{

printf (“%s, %s, %d\n”,assert_buffer, file, line);

char query[2048];

sprintf(query,”INSERT INTO `asserts` (`assert`,`message`,`file`,`line`,`machine`) VALUES (‘%s’,'%s’,'%s’,'%d’,'%s’)”,

assert,assert_buffer,file,line,GetMachineName());

mysql_query(handle,query);

if (0 /*don’t ignore*/ ) __asm int 3

}

#define NewAssert( test)\

if( !(test)) { \

assert_buffer[0]=0 ; \

SQLAssert(#test,__FILE__,__LINE__); \

}

#define NewAssertM( test, params )\

if( !(test)) { \

assert_printf params ; \

SQLAssert(#test,__FILE__,__LINE__); \

}

// Usage, note extra parentheses:

// NewAssertM(p==NULL,(“p not NULL (%p)”,p));

// NewAssert(p==NULL);

<<LISTING END>>

The macro NewAssert is a drop-in replacement for the standard assert() macro. The only parameter is a test that must return true. If it returns false, then the macro calls SQLAssert with #test (a string containing the actual test code), and the standard file and line numbers. SQLAssert then formats a string that will add these to the database.

In addition there is a more sophisticated assert macro, NewAssertM, that takes an additional parameter which is actually a list of parameters enclosed in parentheses. These are passed to the assert_printf() function which treats it as a sprintf into the assert_buffer. The assert_buffer is then passed to the database. This allows you to add an arbitrary string to the assert info in the database, usually this would contain the values of various variables involved in whatever you are testing. See the example usage at the end of listing 5.

So what we have now is an assert macro (or a warning macro) that you can track every single instance where it fires, and no longer have to rely on the artists and level designers (or even the testers) to accurately report what is going on. You can even leave it in for beta versions, and gather a large amount of data from a geographically diverse set.

The example shown includes a field for “machine”, which is intended to hold the machine name. Using this you can identify if a particular warning is going off a lot for one particular user. You could quite easily extend the assert logging to hold any additional information that might be useful, such as the IP address, or the current level name. Since this is a standard database, it’s very easy to query, extract reports into spreadsheets, and even generate graphs and web-pages from the information in the database. Interesting metrics can be generated, such as which source files trigger the most assertions, or even what day of the week has the highest rate of problems.

The 'time’ field in the table is set to “default CURRENT_TIMESTAMP”, which means that whenever the assert fires, this field is set to the server’s current date and time. This can be very useful in tracking down bugs, as you can see when an error or warning first occurred, and attempt to correlate that with whatever was changed around that time. This can be useful for prioritizing things. If a particular assert has been triggering for several days (or weeks) and nothing is being done about it, then it might be that it needs to be downgraded to a warning, or an informational message (or you might need to fire someone.) This kind of high level overview of issues can be useful when there are a large number of developers on a team.

OTHER USES

The two examples above make very little usage of the vast power of a relational database server. The data structures are essentially flat, and there is nothing “relational” about them. Normally a database would have multiple tables, cross indexed with each other to avoid data duplication. However, there is nothing wrong with using a database in this simple manner. It might seem like overkill to use it for logging – which you might do to a CSV text file – but it does not cost us anything to do it like this, and you immediately get the benefits of multiple remote connections to a robust repository, and very sophisticated filtering and report generation. The fact that we are barely using any of the features of the database is beside the point.

With that in mind, there might be other obvious areas where a database could be used instead of plain logging. User input could be logged if your database is fast enough. Many gameplay metrics could be logged in early test versions, such as how long it takes to complete each goal in each level. This data could be collated over thousands of runs, and used to fine-tune the difficulty level of a game. Remember that once you have the basic database set up and the connection software in place, it is very easy to add arbitrary new tables, and to start recording whatever you like.

Resources:

MySQL – A powerful open source database server. dev.mysql.com

Paul DuBois – Writing MySQL Programs Using C, MySQL, Third Edition, Sample Chapter: http://www.kitebird.com/mysql-book/ch06-3ed.pdf