hannibal

All Benchmarks Lie

Disclaimer: This is probably not 100% accurate, but it is some function of the truth.
There is a lot of misinformation out there about products, especially high-end PC products. Much of it is reported as anecdote; some of it purports to be conclusive scientific experiments benchmarking performance. This blog entry deals with the latter. The truth is that benchmarks are not entirely useful. The main problem is that these benchmarks are often relative measures of performance, and the PC platform is so varied that it is hard to get an apples-to-apples comparison. There are two reasons for this: hardware and software. Let us first deal with the latter.

PC games do get better over time, but they have been overtaken by consoles with respect to the rate at which they get better. The reason for this is the problem of software optimization. Consoles are fixed objects: they have not only the same processors and video cards, but also the same system bus components and even the same hard drives. When things are so uniform, you can in theory tailor all of your software by hand to the platform. This has gotten some people in trouble (Sony). Another advantage of the console method is that, once the boot ROM has done its job, the game on the disc is basically interacting with the hardware directly, with very little overhead (i.e. processing cycles, bus bandwidth) spent on talking to the hardware. (see Attachments)

For PCs, things are not so simple. The bus components from different Original Equipment Manufacturers (ASUS, Dell, ASRock, MSI, DFI, etc.) perform differently under different circumstances. Different chipsets do things differently, and AMD and Intel CPUs do not always have the same instruction set extensions. The result is that an X58 motherboard is very different from an nForce 750i SLI motherboard. When writing software for PCs, you need an operating system (Windows) to make sure all of the components work together. Then, in order to play the game, you need a Hardware Abstraction Layer (HAL). Why? Because you never know what video card the game is going to be interacting with. On Windows, DirectX plays this role for graphics. Each DirectX version imposes different minimum requirements on cards, but it does not define exactly how a card should behave under a given set of circumstances, i.e. bus type, bus speed and processor. There is still wide variation even amongst video cards with the exact same Graphics Processing Unit (GPU), since different manufacturers clock them differently.
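To make the abstraction layer idea concrete, here is a minimal sketch in Python. It is not DirectX's actual API: the class names `GraphicsDevice`, `VendorACard` and `VendorBCard` and the function `render_frame` are all hypothetical, invented only to show the shape of the arrangement.

```python
from abc import ABC, abstractmethod


class GraphicsDevice(ABC):
    """Hypothetical interface: what an abstraction layer exposes to a game."""

    @abstractmethod
    def draw_triangles(self, count: int) -> str:
        ...


class VendorACard(GraphicsDevice):
    """Hypothetical card from one vendor, with its own way of doing the work."""

    def draw_triangles(self, count: int) -> str:
        return f"vendor A path: {count} triangles"


class VendorBCard(GraphicsDevice):
    """Hypothetical card from another vendor that handles the same call differently."""

    def draw_triangles(self, count: int) -> str:
        return f"vendor B path: {count} triangles in batches"


def render_frame(device: GraphicsDevice) -> str:
    # The game is written once against the interface and never names a specific card.
    return device.draw_triangles(1000)


if __name__ == "__main__":
    for card in (VendorACard(), VendorBCard()):
        print(render_frame(card))
```

The game code in `render_frame` is identical for both cards, but what happens underneath is not, which is why the same game can perform differently on two cards that both meet the interface's requirements.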

Things get much worse when you choose to talk about CPUs. The first thing you need to know is that the steady per-core speed gains people associate with Moore's law stalled for Intel CPUs around 2007, and putting two processors on the same die is not the same thing as doubling the speed. Why? Using two, four or six processors at one time is much harder than one might think, because it is very hard (in general, the underlying scheduling problems are NP-hard) to create a compiler that can determine how to parallelize code. (Don Knuth will show us how in the next volume of The Art of Computer Programming.) That means that well-optimized threaded code mostly has to be written by hand. It gets worse: processors are very complicated objects, and no two families of processors have the exact same design. The result is that if you were to optimize your software for one processor, say an i7, your game would run like greased lightning on that processor but could be much slower on Core2 processors or any AMD processor. Also, the guys who can do that type of optimization need six figures to get out of bed in the morning. So what do you do? You buy your team some very expensive Software Development Kits (SDKs), and you produce the best code you can with those toolkits.
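The point that two cores are not the same as double the speed can be put in numbers with Amdahl's law, a standard textbook model that this entry does not use by name. The sketch below assumes the work splits cleanly into a serial part and a perfectly parallel part; the function name `amdahl_speedup` and the 0.5 fraction are illustrative choices, not measurements of any real game.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Ideal speedup when only `parallel_fraction` of the work can use all `cores`."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part takes the same time no matter how many cores exist;
    # only the parallel part is divided among them.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # 0.5 is an arbitrary illustrative value, not a measurement.
    for cores in (1, 2, 4, 6):
        print(f"{cores} cores: {amdahl_speedup(0.5, cores):.2f}x")
```

Under this model, if only half of the work can be threaded, two cores give about 1.33x rather than 2x, and no number of cores ever reaches 2x.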

tl;dr
Writing games/3D software for the PC platform is akin to herding cats.

So now we come to why benchmarks lie. The good news is that graphics card benchmarks are not hard to do. As long as you stick to benchmarking cards with the same system bus interface, like PCI Express, you can throw the cards into the same machine, use the standard drivers, and get a decent apples-to-apples comparison. Even multi-GPU tests have become rather civilized recently, since Intel's X58 and P55 chipsets support both ATI CrossFireX and NVIDIA SLI. The only difference between the two is the actual design of the two multi-GPU systems, not the system bus or the processor.
The bad news is that CPU and system benchmarks are bullshit. Comparing i7 processors to each other is fine, as long as they are all LGA1366 or all LGA1156 and on the same motherboard. Comparing Core2s to each other is fine. But comparing i7 to Core2, or AMD to Intel, is very nearly impossible, for the litany of reasons stated above. The golden rule of science is that the only way to get a result is to keep all things equal and change only one thing. That means that if you are comparing processors, you should have all of the same system specs except the processor. This is only possible for processors using the same socket and system bus specification. I will give you an example. According to 3DMark Vantage, my CPU score is 24,444. Now if you go to this chart (http://www.tomshardware.com/charts/2...-CPU,1398.html) my CPU would be second from the bottom; however, on this chart (http://www.guru3d.com/article/core-i...ssor-review/13) my CPU would be third from the top. That is a contradiction. Which one is true? Well, neither.
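The golden rule above can be sketched as a filter over benchmark records: only results that match on everything except the one thing being varied get compared with each other. This is a minimal sketch; the function name `comparable_groups`, the field names, and every part name and score in the sample data are hypothetical placeholders, not real results.

```python
from collections import defaultdict


def comparable_groups(results, varied="cpu", measured="score"):
    """Group records so that, within a group, only `varied` (and the score) differs."""
    groups = defaultdict(list)
    for record in results:
        # Everything except the varied field and the measurement must match exactly.
        key = tuple(sorted(
            (name, value) for name, value in record.items()
            if name not in (varied, measured)
        ))
        groups[key].append(record)
    # A group with only one distinct value of `varied` has nothing to compare.
    return [
        group for group in groups.values()
        if len({record[varied] for record in group}) > 1
    ]


if __name__ == "__main__":
    # Hypothetical records: part names and scores are placeholders, not real data.
    results = [
        {"cpu": "cpu_a", "board": "board_1", "gpu": "gpu_1", "score": 100},
        {"cpu": "cpu_b", "board": "board_1", "gpu": "gpu_1", "score": 120},
        {"cpu": "cpu_c", "board": "board_2", "gpu": "gpu_1", "score": 90},
    ]
    for group in comparable_groups(results):
        print([(r["cpu"], r["score"]) for r in group])
```

In the sample data, `cpu_a` and `cpu_b` share a board and a video card, so they can be compared; `cpu_c` sits on a different board, so its score is left out rather than ranked against the other two.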

The problem is there are too many different variables to determine what it is we are actually measuring. The only thing we can take away from this is that you must run the benchmark yourself, with the free version of 3DMark to figure out what is going on.

What is the good news?

The good news is that no matter what game you want to play there are some things that you can do to get your system to the right place:
1. Aim for 60 FPS. Most monitors refresh at only 60Hz, so frames beyond that are never fully displayed and you are very unlikely to notice them.

2. The video card is more important than anything. If your processor is a nice dual or quad core, it is probably fine as of this writing (7/14/2010). The author's definition of good: Core2Duo E8400 or above, or Phenom X4 or above.

3. Consider a hybrid system. A cheap NVIDIA card for PhysX processing with an AMD card could be both affordable and powerful.

4. Ignore Crysis benchmarks, unless you want to run Crysis. Crysis and games like Metro 2033 are beautiful, but the reason for this is that they take HDR rendering to an extreme. This taxes your system unnecessarily. Play these games on medium settings and you should be fine.

5. Do not be swayed by new features. Things like DX11 compatibility and hardware tessellation are great buzzwords. But there are very few DX11 games out, and hardly any that use hardware tessellation, so unless you want to watch demos all day, don't go out and buy the latest and greatest card. New cards almost always have quality and driver issues, and sometimes you can get away with using an ATI 4870x2 for a couple of years.

6. You get what you pay for. High-end stuff can usually last up to a year longer than mid-range stuff. CPUs can last up to 4 years as of this writing (7/14/2010).
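The 60 FPS target in tip 1 comes down to simple arithmetic, sketched below. The function names `frame_budget_ms` and `displayed_fps` are hypothetical, and the model is deliberately simplified: it assumes the monitor shows at most one complete frame per refresh and ignores tearing and frame pacing.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Milliseconds available to produce each frame at the target frame rate."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def displayed_fps(rendered_fps: float, refresh_hz: float = 60.0) -> float:
    """Complete frames per second that reach the screen (simplified model)."""
    # Assumption: the monitor shows at most one complete frame per refresh.
    return min(rendered_fps, refresh_hz)


if __name__ == "__main__":
    print(f"Budget at 60 FPS: {frame_budget_ms(60):.1f} ms per frame")
    for rendered in (45, 60, 120):
        print(f"Rendered {rendered} FPS -> displayed {displayed_fps(rendered):.0f} FPS")
```

At 60 FPS the whole system has roughly 16.7 ms to produce each frame. Under this model, a card rendering 120 FPS on a 60Hz monitor still puts only 60 complete frames on screen each second.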
Attachments: pc-stack.png, console-stack.png

Updated 07-14-10 at 02:03 PM by hannibal

Categories
Video Games , Provocative Thought , Programming

Comments

    WileECyte
    There aren't a ton of game titles out there using PhysX. Every game that TTP hosts uses a Havok-based engine, which doesn't use the GPU. Unless you're going to play a lot of games based on Unreal Engine 3, there's not a huge benefit to a hybrid system.
    Imisnew2
    Good writeup.

    Good pointers on the benchmark tests.