Thread: HardOCP Article on Canned benchmarks VS actual gameplay
-
02-11-08, 10:00 AM #1
HardOCP Article on Canned benchmarks VS actual gameplay
Excellent read on the difference between canned benchmarks and real world results in games. It breaks down the Crysis benchmark and shows how what the benchmark reports can be radically different from what you can expect when actually playing.
http://enthusiast.hardocp.com/articl...50aHVzaWFzdA==
Introduction
HardOCP has been thoroughly mired in real world testing scenarios for years now, and there is no doubt that we have taken a lot of heat from various sources. Most of it is generated through ignorance and the human quality of being resistant to change. Some of it is generated by websites and forums that will proclaim that there is no need for tools beyond a timedemo or cutscene in order to understand graphics cards’ performance.
Derek Wilson, Anandtech 01/28/08: I'll still stand by the fact that it is not necessary to look at gameplay situations in order to build an accurate picture of the relative performance of a graphics card.
There is no doubt that there are also some highly intelligent, easy going folks, that simply do not find value in our real world testing as they think it to be too subjective. All of this is good with us as it has spurred a lot of conversation and site growth over the last few years and in our opinion has certainly made HardOCP a more valuable resource to the enthusiast hardware and gaming communities, or at least parts of them. We value differing opinions and have posted links to opinions other than ours on our news page for almost a decade now. We understand the world of computer hardware is hardly ever as easy as black and white. We have never been a website that has tried to be all things to all people and never will be.
I have no problem stating that we think the real world computer hardware testing methodology is vastly superior to most of the synthetic and canned benchmarks we see today. That is not to say that synthetic and canned benchmarks do not have their places in testing; we just don’t usually find those metrics to be indicative of what the end user has in terms of actual experiences. Some websites want to tell you the relative performance of a graphics card based on a timedemo that in no way represents playing the game. That is not what we want to focus on here at HardOCP.
Conclusions
I think we have proven that timedemo benchmark results do not represent real world gameplay. We have also seen one card enjoy a benchmark advantage when comparing those benchmark numbers to actual gameplay framerates. If you are looking at other sites' framerate results and thinking they are showing you a framerate, resolution, or graphical setting you should expect in a particular game, you are likely being totally misled.
Let’s take Anandtech’s canned Crysis GPU benchmark results for example. They show a 3870 X2 averaging over 31fps at 1680 resolution with High Quality settings and about 28fps at 1920 resolution at High Quality settings in Crysis. We trust Anandtech to relate benchmark data to us that is correct. But I also believe the benchmarking tools it has used in no way relate to real world video card performance you are going to experience at home playing the game. We have actually used the ATI HD 3870 X2 to play the game Crysis and our gaming experience in no way mirrors those graphical settings at those framerates. What if we play Crysis with the resolution and quality settings represented in their review?
To put it plainly, it was a painful gaming experience.
Anandtech’s results in no way suggest to the reader what the video card might actually perform like in Crysis or what resolutions or quality settings might be configured. As quoted on page one of this article:
Derek Wilson, Anandtech – 01/28/08: “ I'll still stand by the fact that it is not necessary to look at gameplay situations in order to build an accurate picture of the relative performance of a graphics card.”
Do you want video card reviews that suggest “relative performance of a graphics card” based on timedemo benchmarks when some cards benchmark better than others, or do you want an evaluation of those video cards' in-game performance in the latest and greatest computer games that you are going to be playing with it?
HardOCP is very firm in its commitment to give our readers video card evaluations that will allow them to make good purchasing decisions based on real world expectations of the product. We have no interest in showing you “relative performance” based on a “benchmark.” We sit down and spend hours and hours playing the games on each video card and then share our thoughts and analysis. The simple fact of the matter is that HardOCP’s video card evaluation experiences cannot be replicated by clicking a single mouse button and putting a number on a graph.
The Bottom Line
Timedemo benchmarking of video cards is broken. We have proven this on the preceding pages with today’s most graphically intensive gaming title. Many will argue that timedemo benchmarking is the only scientific approach to video card performance analysis that can be trusted. Why you would want to trust a performance metric that is in no way shape or form going to relate to your gaming experience is beyond me. There is also no doubt that there are some games out there that benchmark perfectly in relation to their real world gameplay. We just don’t know what they are, and quite frankly we don’t care. Today's Crysis benchmarks that in no way reflect real world gameplay are enough validation for us to keep on doing it our way. If you want someone’s idea of overall "relative performance" of a graphics card based on timedemo benchmarks, HardOCP.com is not for you. We are going to make sure that our video card evaluations give you a solid idea of the actual gaming performance you will experience at home when playing the game.
I'm not trying to rain on the ATI parade here; the exact same thing applies to Nvidia cards as well. This article is about canned benchmarks, not one brand or another. It's one of the reasons I never bothered linking to my 3dmark06 score: it's just a number and does not equate to anything I am actually playing.
I just wish more sites took the time to do reviews like this as opposed to just the standard benches.
-
-
02-11-08, 10:46 AM #3
Re: HardOCP Article on Canned benchmarks VS actual gameplay
I don't think anyone said canned benches are irrelevant; in fact I said:
Canned marks are good for comparing one card vs another, but they do not tell the full story.
Read a lot of reviews. No one site should determine what you should buy, period.
But far and away in my experience no other site breaks down what you can expect when you stick "HARDWARE_THINGY_038" in your machine with more detail than hardocp.
-
02-11-08, 11:42 AM #4
Re: HardOCP Article on Canned benchmarks VS actual gameplay
That's an unfair criticism. "Canned" benchmarks, like any game's time demo (such as Crysis above), or even 3dMark06, are meant for relative comparisons. Any application review that HardOCP is going to spit out is going to be the same relative comparison, and I'd say identical to the performance relativity established by the canned benchmarks.
Translation: if something scores 8000 3dMark06's, but something else scores 12,000 3dM's.....relatively, the 8000 card is not as good. If HardOCP were to run any given "application" review of these two cards, it is entirely likely, and perhaps even EXPECTED, that their application review should mimic the same findings in the 3dMark.
The same kind of trend would be seen in CPU processing reviews, where you take a winbench or synthetic application "canned" benchmark, and compare that to a real world decoding application for any given DVD->>WMV or mpeg.
I'd challenge HardOCP to produce results that don't closely match the synthetic benchmarks, before I decide to believe an "application" review is worth more than a standard canned benchmark.
Applications have too many variables, settings, driver quirks.....inconsistent performance. 3dMark is a set standard that doesn't change and even prevents your average user from messing with settings (like resolution or effects) to get a mark.
If it's about comparing one card to another.....I trust a 3dMark more than I trust anything. We all remember the sway chip makers have/have had with game developers.....half life runs better on ATI, Quake runs better on Nvidia, etc....even 3dMark was swaying results a few times in its existence.
BUT.....at least 3dMark was created and maintained to be a fair battleground between the card manufacturers.
HardOCP is making a good point....but such a point is rhetorical. Sure...it's nice to say "if you buy this card, this is your average framerate in _____ game". Assuming you want to play ____ game.
For me....I want to play a lot of games. Old games. New games. Games that aren't even out yet. So the 3dMark relativity stands the test of time. You have a card that has more 3dMarks than mine, and your computer is going to run _____ game better than mine, relative to your increase in 3dMarks (assuming we have the same processor/setup/etc....bear with me). Will there be some small % outliers here and there......of course. Comparing an 11,500 card to a 12,000 card might show some games favoring one over the other....but they also only have a 3dM difference of less than 5%. Compare that 8000 card to the 12000 card......and the relative performance differences will be maintained.
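For what it's worth, the arithmetic behind those score gaps can be checked with a quick sketch. The helper names `relative_gap` and `uplift` are made up for illustration, and the scores are just the example numbers from this post, not measured results.

```python
def relative_gap(lower, higher):
    """Gap between two scores, as a fraction of the higher score."""
    return (higher - lower) / higher


def uplift(lower, higher):
    """How much higher the better score is, as a fraction of the lower score."""
    return (higher - lower) / lower


# Example scores taken from the post above (illustrative, not measurements)
for low, high in [(11500, 12000), (8000, 12000)]:
    print(f"{low} vs {high}: gap {relative_gap(low, high):.1%}, "
          f"uplift {uplift(low, high):.1%}")
```

By this measure the 11,500 vs 12,000 pair differs by a bit over 4% either way you count it, while 8000 vs 12000 is a gap of about a third (or, put the other way, the 12000 score is 50% higher).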
-
02-11-08, 11:51 AM #5
Re: HardOCP Article on Canned benchmarks VS actual gameplay
Canned marks are good for comparison, yes, but ever since the whole driver cheating thing with 3dmark05 I have been very skeptical of 3dmark in general and standard benchmarking as a whole.
I frequent anandtech, toms, techreport, and bittech and use their reviews to gauge a piece of hardware's relative power, but when I am looking to buy and want to know exactly what I can expect, I go to hardocp.
-
02-11-08, 11:54 AM #6
Re: HardOCP Article on Canned benchmarks VS actual gameplay
HardOCP reviews and articles come across as unprofessional and lack the depth of places like Anandtech and Tom's.
Any application is going to have driver cheats...so I don't blame 3dMark. And I have yet to see any major disagreement from 3dMark scores and application performance, personally as well as on the net.
-
02-11-08, 12:03 PM #7
Re: HardOCP Article on Canned benchmarks VS actual gameplay
I definitely don't blame them; I'm just leery of numbers produced through known benchmarks and tend to feel I get more quality data from the less consistent but less corruptible hand run-throughs.
And that's odd that you feel that way. I have always thought they sounded less professional due to arrogance, but have more depth than many review sites.
That doesn't change the fact that when they say Card X will run games well and can be enjoyed at a given resolution with certain settings, I can expect to see those on my system, whereas if I see '30 average fps at 1680 with everything on high' I have no idea if that's a playable 30fps or if it's actually an average of swings between 1 and 60 fps, 1 being unplayable.
I don't mean to say don't use canned marks; it's just an interesting article from a review site I have learned to trust.
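To illustrate the "30 average fps" worry, here is a minimal sketch with made-up frame rate samples. Both runs are invented for the example, and the 25 fps "playable" floor is an assumption on my part, not a number from any review; the function name `summarize` is likewise hypothetical.

```python
from statistics import mean

PLAYABLE_FLOOR = 25  # assumed threshold in fps, purely for illustration


def summarize(fps_samples, floor=PLAYABLE_FLOOR):
    """Return average fps, minimum fps, and the share of samples below floor."""
    below = sum(1 for fps in fps_samples if fps < floor)
    return {
        "avg": mean(fps_samples),
        "min": min(fps_samples),
        "below_floor": below / len(fps_samples),
    }


# Two invented runs that report the same average
steady = [30] * 10
spiky = [60, 1, 60, 1, 60, 1, 60, 1, 55, 1]

print(summarize(steady))
print(summarize(spiky))
```

Both runs would show up as the same average on a graph, but only the minimum and the share of time under the floor reveal that the second one spends half its samples at 1 fps.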
-
02-11-08, 12:07 PM #8
Re: HardOCP Article on Canned benchmarks VS actual gameplay
Originally Posted by Mcstrange
Maybe it's a more "personal touch"...but that's just like a car made by hand, instead of by machines. Sure, it's something to respect, but are you going to trust someone installing your kid's airbags by hand, or a machine doing it? Which one is easier to monitor, control, and QA?
-
02-11-08, 12:23 PM #9
Re: HardOCP Article on Canned benchmarks VS actual gameplay
Just a thought here...
In the 2007 CPU comparison on tomshardware, the 3dmark06 program gives the Q6600 processor a higher mark than the E6850. However, in every game they tested save one, and at every resolution, the E6850 outperformed the Q6600, in many cases substantially. Or how about things like SLI/Crossfire? I get extra points for having a second video card even though when I actually use it in many cases it hurts performance. What does that say about 3dmark?
I personally much prefer to see how the hardware stacks up in specific application testing: it is much more revealing. 3dmark06 would have you believe the Q6600 outperforms the E6850, which it does for a handful of multi core optimized applications such as 3dsmax. But even in "multi core optimized games" such as Supreme Commander, the E6850 beat out the Q6600, and no canned benchmark reveals that.
A couple of cents...
Just some thoughts...
-
02-11-08, 12:48 PM #10
Re: HardOCP Article on Canned benchmarks VS actual gameplay
Originally Posted by SoySoldier
Why would you use 3dMark and SM2/3 results for a CPU test? That's not really picking the benchmark for the task, is it? And declaring the test invalid, because it gives more 3dmarks to a system with a quad core over one with a dual core...makes no sense.
That would be like using PC-mark to determine which graphics card is the best.....graphics has nothing to do with the pcmark, and most synthetic benches. Eventually, the CPU is going to reach the performance limit of the 3d hardware, so no matter what power you add to the CPU, the "3d" aspect of 3dMark is going to come out the same on the upper limits of performance.
Of course, that's NOT true for the CPU aspect
http://www23.tomshardware.com/cpu_20...=871&chart=419
http://www23.tomshardware.com/cpu_20...=871&chart=435
http://www23.tomshardware.com/cpu_20...=871&chart=411
where a quad core creams a dual core, obviously, unless you pick out unoptimized benches that don't support quad cores (such as pcMark05 http://www23.tomshardware.com/cpu_20...=871&chart=416 where they both come out equal....which is ridiculous, and evidence that the benchmark isn't performing correctly).
But comparing a 3dmark score on an E6850 to a Q6600.....is totally unfair. You don't rate CPUs on 3dMarks. You rate systems. That, or you isolate the piece of hardware in question, keep all other pieces the same, and run the bench for a side by side comparison. However, the results of that comparison have to take into account that the 3dMark score is system wide, meaning cooperation between the GPU, the RAM, the motherboard, and the CPU. And when doing that, rate limitations are imposed on the CPU, given the slower ability of other components related to the benchmark (translated: the CPU is too fast for the G-Card, and thus it's an invalid method of comparison).
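The "CPU is too fast for the G-Card" point can be sketched as a simple bottleneck model. This is a deliberate simplification that assumes the delivered frame rate is capped by whichever component is slower; the function name `delivered_fps` and all the numbers are hypothetical, not taken from any benchmark.

```python
def delivered_fps(cpu_fps_cap, gpu_fps_cap):
    """Simplified model: the slower component sets the frame rate."""
    return min(cpu_fps_cap, gpu_fps_cap)


GPU_CAP = 60  # hypothetical: frames per second the graphics card can render

# Two hypothetical CPUs that could each feed more frames than the GPU can draw
slower_cpu = delivered_fps(cpu_fps_cap=90, gpu_fps_cap=GPU_CAP)
faster_cpu = delivered_fps(cpu_fps_cap=120, gpu_fps_cap=GPU_CAP)

print(slower_cpu, faster_cpu)
```

Under this model both CPUs land on the same number once the graphics card is the limit, which is why a GPU-bound score says little about the CPUs being compared.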