How We Test Laptops for Review


For well over a decade, Laptop has been testing notebooks in our lab to help you decide which ones rise above the rest. During that time, we've both adopted synthetic benchmarks and created real-world tests to give shoppers the most complete picture of a given laptop's performance. Today, we evaluate everything from speed and battery life to display brightness, speaker volume and system heat. We then combine this data with other factors, such as design, usability and value, to determine a system's rating.

Below is a breakdown of all the tests we use to evaluate laptops and how we employ the results to compare similar systems before handing down our final verdict.

Notebook Categories


Each score is recorded and compared with the averaged scores of all notebooks in the same category. Those categories include:

● Desktop Replacements (16-inch and bigger displays, weighing 7 pounds or more)
● Mainstream (15- and 16-inch displays, weighing less than 7 pounds)
● Thin-and-Lights (12- to 14-inch displays, weighing less than 6 pounds)
● Ultraportables (10- to 13-inch displays, weighing less than 4 pounds)
● Netbooks (low-cost, highly portable systems; usually 10 to 12 inches and based on low-power processors)

Category Averages

A notebook's results on each test are compared to results from other systems in its category. The category average for any given test and category (example: battery-life test for netbooks) is calculated by taking the mean score from the prior 12 months of test results.
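The category average described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (`category`, `test`, `date`, `score`), the function name `category_average` and the sample scores are all hypothetical, and "prior 12 months" is approximated as the previous 365 days.

```python
from datetime import date, timedelta
from statistics import mean


def category_average(results, category, test, today):
    """Mean score for one test in one category over the prior 12 months.

    `results` is a list of dicts with hypothetical keys:
    "category", "test", "date" (datetime.date) and "score".
    Returns None when no result falls inside the window.
    """
    cutoff = today - timedelta(days=365)  # assumption: 12 months ~ 365 days
    scores = [
        r["score"]
        for r in results
        if r["category"] == category
        and r["test"] == test
        and cutoff <= r["date"] <= today
    ]
    return mean(scores) if scores else None


# Made-up sample data, for illustration only.
results = [
    {"category": "netbook", "test": "battery", "date": date(2014, 3, 1), "score": 400},
    {"category": "netbook", "test": "battery", "date": date(2014, 6, 1), "score": 440},
    {"category": "netbook", "test": "battery", "date": date(2012, 1, 1), "score": 300},  # outside the window
    {"category": "mainstream", "test": "battery", "date": date(2014, 5, 1), "score": 360},
]

print(category_average(results, "netbook", "battery", today=date(2014, 9, 1)))
```

Only the two netbook battery results from the last year contribute to the average; the older result and the mainstream result are ignored.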


General Performance Tests


These tests measure overall system performance in a single score, stressing the processor, graphics and storage drive.

Geekbench 3

Geekbench, developed by Primate Labs, runs on a variety of platforms, including Apple and 32-bit or 64-bit Windows machines. It tests the performance of each of the notebook's processor cores, as well as memory performance. This test returns both a single-core and a multi-core score; we use only the latter.

Laptop Spreadsheet Macro Test

The Spreadsheet Macro Test was designed inside the Laptop labs to stress the CPU. During this test, 20,000 names are matched to their corresponding addresses in OpenOffice. We time how long it takes the notebook to complete this task: the shorter, the better.

SunSpider JavaScript

This benchmark, typically run only on Chrome notebooks, measures the JavaScript performance of a browser. The shorter the time it takes to complete the test (measured in milliseconds), the better.

WebGL Cubes

Another Chrome-centric test, the WebGL Cubes Chrome experiment by AlteredQualia, renders 150,000 cubes illuminated by three light sources. A notebook better able to handle the graphics workload will have a higher frames-per-second score.



Graphics Tests


These benchmarks measure the ability of a notebook to provide smooth video playback, gaming and other video-centric tasks.

3DMark Professional

Futuremark's latest 3D benchmarking tool, 3DMark Professional, tests notebooks on a variety of graphics-related tasks, including physics rendering, real-time lighting and heavy particle effects at 720p and 1080p resolutions. 3DMark Pro is well suited to testing a range of systems because it offers three different tests (Ice Storm Unlimited, Fire Strike and Fire Strike Ultra), each more demanding than the last. For instance, Ice Storm Unlimited best measures entry-level machines with integrated graphics chips, whereas Fire Strike and Fire Strike Ultra are aimed at testing the most powerful dedicated GPUs.

World of Warcraft

A key ingredient in the enduring popularity of "World of Warcraft" is the fact that even relatively low-end notebooks can run the game with ease. On both Windows and Mac, we use the game's built-in benchmarking script, first with the graphics effects set to autodetect, and then at the maximum. We run the test at up to three resolutions: 1366 x 768; 1920 x 1080; and the notebook's native resolution, if it's higher. We consider a frame rate of 30 fps or higher to be playable. A score of 30 to 50 frames per second on a mainstream notebook is relatively good.


BioShock Infinite

Developed by Irrational Games, BioShock Infinite is one of the more graphically demanding games of this generation. We use the game's built-in benchmarking tool to measure the performance of notebooks with dedicated graphics processors. The benchmark runs through two separate scenes at a steady clip. We run this test six times: three times at the lowest settings (at 1366 x 768, 1080p and native resolution, if applicable) and three times at the highest settings with DirectX 11 enabled (at the same three resolutions).

A frame rate of 30 frames per second is generally accepted as playable. At native resolutions and high settings, a score between 35 and 45 fps is considered respectable.

Metro: Last Light

This game, developed by 4A Games, is an even more intense test of a notebook's graphics performance than BioShock Infinite. As with that game, we run the included benchmark tool, which runs through a single scene three times and provides the average frame rate across all three runs. To give a rounded idea of performance, we run the benchmark several times: twice at 1366 x 768, twice at 1920 x 1080 and twice at the notebook's native resolution. At each resolution, we run the test once with settings at low and with DirectX 11 tessellation and Nvidia PhysX deactivated, then again with all settings maxed and activated.

As always, 30 frames per second is considered playable. However, given Metro: Last Light's demanding nature, a score between 17 and 27 frames per second at native resolution and the highest settings is considered a good result, even though it falls short of playable.
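The run matrix for the game benchmarks (each resolution at a low and a maximum settings preset) and the 30 fps playability cutoff can be sketched as follows. This is a rough illustration, not the lab's actual tooling: the names `benchmark_runs` and `is_playable` are hypothetical, and the rule for when the native resolution counts as a separate run is an assumption.

```python
from itertools import product

PLAYABLE_FPS = 30  # playability threshold stated in the article


def benchmark_runs(native_resolution):
    """List every (resolution, preset) pair to benchmark.

    Assumption: the native resolution is added as a third resolution
    only when it differs from the two standard ones.
    """
    resolutions = [(1366, 768), (1920, 1080)]
    if native_resolution not in resolutions:
        resolutions.append(native_resolution)
    presets = ["low", "max"]
    return list(product(resolutions, presets))


def is_playable(fps):
    """True when the average frame rate meets the 30 fps threshold."""
    return fps >= PLAYABLE_FPS


for resolution, preset in benchmark_runs((2880, 1800)):
    print(resolution, preset)
```

A notebook with a native resolution above 1080p yields six runs, while a 1080p notebook yields four, since its native resolution is already covered.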

Battery, Hard Drive, Heat, Audio and Display Tests 


Laptop Battery Test

This test, developed in the Laptop labs, replicates continuous Web surfing over Wi-Fi until the battery is completely drained. Starting with a full battery, a notebook runs a script that visits 60 popular websites in a loop, pausing for 30 seconds on each, then closing and reopening the notebook's native browser with the next page. The test is run with the screen at 100 nits (as measured by the X-Rite colorimeter), and the notebook's settings are tweaked to prevent it from entering standby mode or going into hibernation.
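The structure of such a loop might look like the sketch below. It is not the lab's script: the function name and its parameters (`open_page`, `close_browser`, `battery_alive`) are hypothetical stand-ins for whatever browser and power hooks a real implementation would use, and the demo replaces them with dummies so it runs instantly.

```python
import itertools
import time


def run_battery_loop(urls, open_page, close_browser, battery_alive,
                     dwell_seconds=30, sleep=time.sleep):
    """Cycle through `urls` until `battery_alive()` returns False.

    Each page is opened, left on screen for `dwell_seconds`, and the
    browser is closed before the next page is opened.
    Returns the number of pages visited.
    """
    pages_visited = 0
    for url in itertools.cycle(urls):
        if not battery_alive():
            break
        open_page(url)
        sleep(dwell_seconds)
        close_browser()
        pages_visited += 1
    return pages_visited


# Dry run with stand-ins: no real browser, no real waiting,
# and a "battery" that survives four checks.
checks = iter([True, True, True, True, False])
visited = []
count = run_battery_loop(
    urls=["https://example.com/a", "https://example.com/b"],
    open_page=visited.append,
    close_browser=lambda: None,
    battery_alive=lambda: next(checks),
    sleep=lambda seconds: None,
)
print(count, visited)
```

On real hardware the script cannot report anything after the battery dies, so a practical version would presumably log a timestamp on every page visit and take the last one as the runtime.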

Laptop File Transfer Test

This benchmarking test was developed inside the Laptop labs. During this test, a 4.97GB folder of mixed media files, including photos, documents, videos and music files of varying sizes, is copied from one folder on the notebook's hard drive to another. We record the time the copy takes, and then calculate the transfer rate in MBps by dividing 5089.28 (the folder size in megabytes) by the time in seconds.
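The conversion is simple enough to show directly. In this minimal sketch the function name `transfer_rate_mbps` is hypothetical; the 5089.28 figure comes from the article and equals 4.97 x 1024.

```python
FOLDER_SIZE_MB = 5089.28  # 4.97 GB x 1024 MB per GB, as used in the article


def transfer_rate_mbps(elapsed_seconds):
    """Transfer rate in MBps for copying the test folder in `elapsed_seconds`."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed time must be positive")
    return FOLDER_SIZE_MB / elapsed_seconds


print(round(transfer_rate_mbps(100), 1))
```

A copy that takes 100 seconds therefore works out to roughly 50.9 MBps; halving the time doubles the rate.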

Heat Test

To test the system's external temperature, we stream a Hulu video at full screen for 15 minutes, and then use a Raytek MiniTemp laser temperature gauge to measure the temperature (in Fahrenheit) of the touchpad, the space between the G and H keys, and the underside of the notebook. We also measure any other hot spots on the notebook.

In the case of gaming laptops, we play a game, such as BioShock Infinite, for 15 minutes, and then retake the temperatures in those same areas.

We consider anything above 95 degrees Fahrenheit to be uncomfortable and anything above 100 degrees too hot.
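Those two thresholds can be applied to a set of readings as in the sketch below. The function name, the "acceptable" label for readings at or below 95 degrees and the sample temperatures are assumptions made for illustration.

```python
def classify_temperature(degrees_f):
    """Label a surface temperature (Fahrenheit) using the article's thresholds."""
    if degrees_f > 100:
        return "too hot"
    if degrees_f > 95:
        return "uncomfortable"
    return "acceptable"  # assumed label for readings at or below 95 degrees


# Made-up readings for the three standard measurement spots.
readings = {"touchpad": 82.0, "G/H keys": 96.5, "underside": 103.0}
for spot, temp in readings.items():
    print(spot, temp, classify_temperature(temp))
```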


Display Brightness and Quality

To measure the brightness of a notebook's display, we enable the machine's high-contrast white background and then use an X-Rite colorimeter and the Dispcal app to measure the brightness of each of the four corners of the screen, as well as the center. We then average the five readings to determine the display's overall brightness in nits.
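The averaging step is sketched here. The function name, the position labels and the sample readings are hypothetical; only the five-point average itself comes from the article.

```python
from statistics import mean

POSITIONS = {"top_left", "top_right", "bottom_left", "bottom_right", "center"}


def average_brightness(readings_nits):
    """Average the four corner readings and the center reading, in nits."""
    if set(readings_nits) != POSITIONS:
        raise ValueError("expected exactly one reading per position")
    return mean(readings_nits.values())


# Made-up readings, for illustration only.
sample = {
    "top_left": 250,
    "top_right": 260,
    "bottom_left": 240,
    "bottom_right": 250,
    "center": 300,
}
print(average_brightness(sample))
```

Because the center of a panel is often brighter than its corners, the five-point average usually lands below the center reading alone.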

We're using the X-Rite colorimeter for more than just measuring brightness; we're also using it to see how good a screen is at rendering colors. With the Dispcal app, we measure a screen's RGB color gamut and Delta-E.

RGB color gamut is expressed as a percentage; the closer a screen is to 100 percent, the more colors it can display in the RGB color space. Some displays can exceed the RGB color space, which is not a bad thing, but at the very least, a good display should be able to render 100 percent of it.

Delta E (dE) measures how accurately the screen displays different colors, with 0 being a perfect match and higher numbers reflecting lower accuracy. While one could get dE numbers for individual colors, our test measures the average of about 70 colors. Generally, a dE of 1.0 is regarded as the smallest difference a human eye can see. 
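To make the averaging concrete, here is a minimal sketch that computes Delta E for a set of color pairs and takes the mean. The article does not say which Delta E formula the software applies; this example assumes the simplest one, CIE76, which is the straight-line distance between two colors in L*a*b* space. The function names and sample colors are hypothetical.

```python
import math
from statistics import mean


def delta_e_cie76(lab_reference, lab_measured):
    """CIE76 color difference: Euclidean distance between two (L*, a*, b*) triples."""
    return math.dist(lab_reference, lab_measured)


def average_delta_e(pairs):
    """Mean Delta E over (reference, measured) color pairs."""
    return mean(delta_e_cie76(ref, meas) for ref, meas in pairs)


# Made-up patches: one rendered perfectly, one slightly off.
pairs = [
    ((50.0, 0.0, 0.0), (50.0, 0.0, 0.0)),
    ((50.0, 0.0, 0.0), (50.0, 3.0, 4.0)),
]
print(average_delta_e(pairs))
```

An average hides outliers, so a screen with a low overall dE can still miss badly on a few individual colors.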

Keyboard Test

We measure two metrics on keyboards: key travel and actuation. Key travel measures the distance a key moves from its resting state to when it is fully depressed. Thinner notebooks will have less travel (perhaps 1 millimeter), while gaming notebooks will have greater travel. Key actuation measures the amount of force, in grams, required to depress a key. Generally, we prefer key travel of 1.5 to 2mm or greater, and an actuation force of at least 50 grams.

Laptop Audio Test

Developed within the Laptop labs, this test uses a decibel meter to measure the maximum loudness of notebook speakers. We play a steady tone file on a loop and record the loudness in decibels with the meter placed 23 inches away (the distance we determined a user typically sits from a laptop screen). We conduct this test within a studio lined with foam paneling to minimize the effect of bouncing sound waves on the decibel reading.

Generally speaking, decibel outputs above 80 dB are considered satisfactory. A difference of 3 dB corresponds to a doubling or halving of sound power, but it is only barely perceptible to a listener; it takes a difference of roughly 10 dB for a sound to be perceived as twice or half as loud. (For example, 90 dB is commonly perceived to be about twice as loud as 80 dB.)
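As a point of reference on the decibel scale, the sketch below uses two standard relationships: a change in decibels equals 10 times the base-10 logarithm of the power ratio, and, as a common rule of thumb, perceived loudness roughly doubles for every 10 dB. The function names are hypothetical, and the loudness rule is an approximation rather than an exact law.

```python
import math


def db_change_from_power_ratio(power_ratio):
    """Decibel change corresponding to a ratio of sound powers."""
    return 10 * math.log10(power_ratio)


def perceived_loudness_ratio(db_change):
    """Rule-of-thumb loudness ratio: roughly doubles for every +10 dB."""
    return 2 ** (db_change / 10)


print(round(db_change_from_power_ratio(2), 2))  # doubling the power
print(round(perceived_loudness_ratio(3), 2))    # how much louder +3 dB sounds
print(perceived_loudness_ratio(10))             # how much louder +10 dB sounds
```

Doubling the power adds about 3 dB, which under this rule of thumb sounds only about a quarter louder.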

What Laptop's Ratings Mean

After we complete our lab testing, a product is turned over to a writer who spends a significant amount of time using the device, software or service. The writer and Laptop's editors determine a rating based on design, ease of use, features, performance and overall value. We also take into consideration the target audience of a product, what it is trying to accomplish and how it stacks up compared to the competition. Each product is rated on a scale of 1 to 5 stars, with half-star ratings possible.

The ratings should be interpreted as follows:

[Image: key to Laptop's star ratings]

The Editor's Choice award recognizes products that are the very best in their categories at the time they are reviewed. Only those products that have received a rating of 4 stars and above are eligible. Laptop's editors carefully consider each product's individual merits and its value relative to the competitive landscape before deciding whether to bestow this award.

  • steve Says:

    Re: Audio perceived loudness error:
    +3dB SPL is two times power, but not two times perceived loudness to a person listening.
    +3dB SPL is a barely perceptible loudness subjective increase.
    +10dB SPL is close to a subjectively perceived doubling of loudness.

    And, info on your test tone frequency & a quality factor would be interesting.

  • C Wutzke Says:

    I wonder if this spreadsheet macro test can be obtained or downloaded. I'd like to run it on my aging Dell E6420 with an i7-2760QM and see how it compares.