AnandTech Storage Bench 2013

Our Storage Bench 2013 focuses on worst-case multitasking and IO consistency. Similar to our earlier Storage Benches, the test is still application trace based – we record all IO requests made to a test system, play them back on the drive we are testing, and run statistical analysis on the drive's responses. There are 49.8 million IO operations in total with 1583.0GB of reads and 875.6GB of writes. For better readability I'm not including the full description of the test here, so make sure to read our Storage Bench 2013 introduction for the full details.
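For those curious what trace playback involves mechanically, here is a minimal, hypothetical Python sketch (not our actual test harness): it replays a simple CSV trace of IO requests against a block device and records how long each request takes to complete. The trace format, field names, and device path are assumptions made purely for illustration, and real playback also preserves request timing and queue depth, which this serial version does not.

```python
import csv
import os
import time

def replay_trace(trace_path, device_path):
    """Replay a CSV trace of (op, offset, size) records against a block
    device and return a list of (bytes_moved, service_time_s) samples."""
    results = []
    fd = os.open(device_path, os.O_RDWR)  # O_DIRECT could be added on Linux to bypass the page cache
    try:
        with open(trace_path, newline="") as f:
            for row in csv.DictReader(f):
                offset, size = int(row["offset"]), int(row["size"])
                start = time.perf_counter()
                if row["op"] == "write":
                    os.pwrite(fd, b"\x00" * size, offset)  # destructive: only run against a scratch drive
                else:
                    os.pread(fd, size, offset)
                results.append((size, time.perf_counter() - start))
    finally:
        os.close(fd)
    return results
```

The table below lists the workloads that went into the trace.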

AnandTech Storage Bench 2013 - The Destroyer

| Workload | Description | Applications Used |
| --- | --- | --- |
| Photo Sync/Editing | Import images, edit, export | Adobe Photoshop CS6, Adobe Lightroom 4, Dropbox |
| Gaming | Download/install games, play games | Steam, Deus Ex, Skyrim, Starcraft 2, Bioshock Infinite |
| Virtualization | Run/manage VM, use general apps inside VM | VirtualBox |
| General Productivity | Browse the web, manage local email, copy files, encrypt/decrypt files, backup system, download content, virus/malware scan | Chrome, IE10, Outlook, Windows 8, AxCrypt, uTorrent, Ad-Aware |
| Video Playback | Copy and watch movies | Windows 8 |
| Application Development | Compile projects, check out code, download code samples | Visual Studio 2012 |

We are reporting two primary metrics with the Destroyer: average data rate in MB/s and average service time in microseconds. The former gives you an idea of the throughput of the drive during the time that it was running the test workload. This can be a very good indication of overall performance. What average data rate doesn't do a good job of is taking into account response time of very bursty (read: high queue depth) IO. By reporting average service time we heavily weigh latency for queued IOs. You'll note that this is a metric we have been reporting in our enterprise benchmarks for a while now. With the client tests maturing, the time was right for a little convergence.
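As a rough illustration only (the published numbers come from our full analysis tools, not this snippet), here is how per-IO samples from a replay like the sketch above would reduce to the two headline metrics, assuming average data rate is total data moved divided by total service time and average service time is the plain mean:

```python
def summarize(results):
    """Reduce (bytes_moved, service_time_s) samples into an average data
    rate in MB/s and an average service time in microseconds."""
    total_bytes = sum(size for size, _ in results)
    total_time = sum(t for _, t in results)
    avg_data_rate_mb_s = (total_bytes / 1e6) / total_time
    avg_service_time_us = (total_time / len(results)) * 1e6
    return avg_data_rate_mb_s, avg_service_time_us
```

Because the mean service time is dominated by the slowest, most deeply queued IOs, it penalizes drives with poor consistency even when their raw throughput looks respectable.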

Storage Bench 2013 - The Destroyer (Data Rate)

Wow, this actually looks pretty bad. The 256GB M600 is slower than the 256GB MX100, likely because under sustained workloads the M600 has to move data from SLC to MLC at the same time it is taking in host IOs, so performance drops due to the internal IO overhead. The 1TB drive does better thanks to its higher parallelism, but even then the M550 and 840 EVO are faster.

Storage Bench 2013 - The Destroyer (Service Time)

Comments

  • Kristian Vättö - Monday, September 29, 2014 - link

    I thought I had that there, but looks like I forgot to add it in a hurry. Anyway, I've added it now :)
  • MarcHFR - Tuesday, September 30, 2014 - link

    Shodanshok,

    In fact DWA is not better for endurance, it's worse:
    - Writing random writes in sequential form is already done on all SSDs by write combining.
    - DWA increases write amplification, since the data is first written in "SLC" mode and then rewritten in "MLC" mode.

    For 2 bits of data:
    - 2 cells are used in SLC mode
    - then 1 cell is used in MLC mode
    vs
    - 1 cell is used in MLC mode w/o DWA

    Since write speed is rarely a problem in daily usage and since there is a trade-off, I don't understand the positive reception for TurboWrite, nCache 2, Dynamic Write Acceleration, etc...
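A quick, purely illustrative tally of the cell-programming arithmetic MarcHFR describes above (it ignores garbage collection and any other controller overhead, and simply counts program operations per two bits of host data):

```python
def program_ops_per_two_bits(dwa_enabled):
    """Count NAND program operations needed to store 2 bits of host data,
    following the simplified accounting in the comment above."""
    if dwa_enabled:
        return 2 + 1   # 2 cells programmed in "SLC" mode, then folded into 1 MLC cell
    return 1           # written straight to 1 MLC cell

print(program_ops_per_two_bits(True))   # 3
print(program_ops_per_two_bits(False))  # 1
```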
  • shodanshok - Tuesday, September 30, 2014 - link

    Hi,
    it really depends on how the write acceleration is implemented. While it is true that a badly designed WA cache can have a bad effect on flash endurance, a well-designed one (and under a favorable workload) can lessen the load on the flash as a whole.

    Micron is not discussing their pSLC implementation in detail, so let's talk about SanDisk's nCache, which is better understood at the moment.

    nCache works by reserving a fixed portion of the NAND die for pSLC. This pSLC slice, while built on top of MLC cells, is good for, say, 10X the cycles of standard MLC (so ~30,000 cycles). The reason is simple: by using them as SLC, you have a much higher margin for voltage drop.

    Now, let's follow a write down to the flash. When a write arrives at the drive, the new data is placed in the pSLC array. After that we have two possibilities:

    1. No new write for the same LBA arrives within a short time, so the pSLC array is flushed to the main MLC portion. Total writes with WA: 2 (1 pSLC / 1 MLC) - without WA: 1 (MLC)

    2. A new write is recorded for the same LBA _before_ the pSLC array is flushed, so the new write overwrites the data stored in the pSLC portion. After some idle time, the pSLC array is flushed to the MLC one. Total writes with WA: 3 (2 pSLC / 1 MLC) - without WA: 2 (MLC)

    In the rewrite scenario (n.2) the MLC portion sees only a single write vs the two MLC writes of the no-WA drive. While it is true that the pSLC portion sustains increased stress, its endurance is much higher than the main MLC array's, so it should not be a problem if its cycles are "eaten" faster. On the other hand, the MLC array is much more prone to flash wear, so any decrease in writes is very welcome.

    This rewrite behavior is the exact reason behind SanDisk's quoted write amplification number, which is only 0.8: without write acceleration, a write amplification below 1.0 can be achieved only by using some compression/deduplication scheme.

    Regards.
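Here is a rough sketch of the write counting in the two scenarios above (purely illustrative, an assumed model of how such a cache behaves rather than a description of Micron's or SanDisk's actual firmware), treating the pSLC staging area and the MLC array as simple counters for a single LBA:

```python
def count_writes(rewrites_before_flush, cache_enabled):
    """Return (pslc_writes, mlc_writes) for one LBA written once and then
    rewritten `rewrites_before_flush` times before the cache is flushed."""
    host_writes = 1 + rewrites_before_flush
    if cache_enabled:
        return host_writes, 1   # every host write lands in pSLC, only the last copy is folded to MLC
    return 0, host_writes       # no cache: every host write hits MLC directly

print(count_writes(0, True), count_writes(0, False))  # scenario 1: (1, 1) vs (0, 1)
print(count_writes(1, True), count_writes(1, False))  # scenario 2: (2, 1) vs (0, 2)
```

Under rewrite-heavy workloads the MLC array sees a single write no matter how many times the LBA is rewritten in pSLC, which is how an effective write amplification below 1.0 on the MLC portion becomes possible.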
  • MarcHFR - Tuesday, September 30, 2014 - link

    As you said, it really depends on the workload for nCache 2.0: write vs rewrite.

    But another point of view is that, for example, a 120 GB Ultra II with 5 GB of nCache 2.0 could instead be a 135 GB Ultra II without any NAND die reserved for nCache 2.0.
  • shodanshok - Tuesday, September 30, 2014 - link

    True, but rewrite is quite pervasive.

    For example, any modern, journaled filesystem will constantly rewrite an on-disk circular buffer.
    Databases use a similar concept (double-write) with another on-disk circular buffer.
    The swapfile is constantly rewritten
    ...

    Anyway, it surely remains a matter of favorable vs unfavorable workloads.

    Regards.
  • Cerb - Tuesday, September 30, 2014 - link

    Only some filesystems, usually with non-default settings, will double-write any file data, though. What most do is some form of metadata journaling, where new writes preferably go into free space (one more reason not to fill your drives all the way up!) and the journal logs the writing of the new state. But the data itself is not in the journal. EXT3/4 can be set to write data twice, but don't by default. NTFS, JFS, and XFS, among others, simply don't have such a feature at all. So the additional writing is pretty minimal, being just metadata. You're not likely to be writing GBs per day to the FS journal.

    Databases generally should write everything twice, though, so that they are never in an unrecoverable state if the hardware is functioning correctly.
  • AnnonymousCoward - Monday, September 29, 2014 - link

    I have yet to get an answer to this question: what's the point of doing purely synthetic and relative-performance tests, and how does that tell the reader the tangible difference between these drives?

    You don't test video cards in terms of IOPS or how fast they pound through a made-up suite. You test what matters: fps.

    You also test what matters for CPUs: encoding time, gaming fps, or CAD filter time.

    With phones, you test actual battery time or actual page loading time.

    With SSDs, why would you not test things like how fast Windows loads, program load time, and time to transfer files? That matters more than any of the current tests! Where am I going wrong, Kristian?
  • Kristian Vättö - Monday, September 29, 2014 - link

    Proper real world testing is subject to too many variables to be truly reproducible and accurate. Testing Windows boot time and app load time is something that can be done, but the fact is that in a real world scenario you will have more than one app running at a time and countless Windows background processes. Once more variables are introduced to the test, the results become less accurate unless all the variables can be accurately measured, which cannot really be done (at least not without extensive knowledge of Windows' architecture).

    The reasoning is the same as why we don't test real-time or multiplayer gaming performance: the test scenarios are not fully reproducible unless the test is scripted to run the exact same scenario over and over again (like the built-in game benchmarks and our Storage Benches).

    That said, I've been working on making our Storage Bench data more relevant to real world usage and I already have a plan on how to do that. It won't change the limitations of the test (it's still trace-based with no TRIM available, unfortunately), but I hope to present the data in a way that is more relevant than just pure MB/s.
  • AnnonymousCoward - Tuesday, September 30, 2014 - link

    Thanks for your reply.

    You said it yourself: boot time and app load time can be done. These are 2 of the top 5 reasons people buy SSDs. To get around the "uncontrolled" nature, just do multiple trials and take the average.

    Add a 3rd test: app load time while heavy background activity is going on, such as copying a 5GB file to an HDD.

    4th test: IrfanView batch conversion; time to re-save 100 JPEG files.

    All of those can be done on a fresh Windows install with minimal variables.
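For what it's worth, the repeated-trial timing proposed above is straightforward to script; the sketch below is a hypothetical example (the command line is a placeholder, not a real tool invocation) that times a batch job several times and reports the mean, i.e. the averaging approach the commenter suggests.

```python
import statistics
import subprocess
import time

def time_command(cmd, trials=5):
    """Run `cmd` several times and return the per-trial wall-clock times.
    Caches are not flushed between runs, so this mostly measures warm performance."""
    samples = []
    for _ in range(trials):
        start = time.perf_counter()
        subprocess.run(cmd, check=True)
        samples.append(time.perf_counter() - start)
    return samples

# Placeholder command: substitute the actual batch-conversion or app-launch job being measured.
samples = time_command(["batch_convert_photos.exe", "input_folder"], trials=5)
print(f"mean {statistics.mean(samples):.2f}s, stdev {statistics.pstdev(samples):.2f}s")
```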
  • AnnonymousCoward - Tuesday, September 30, 2014 - link

    To expand on my 3rd test: kick off a program that scans your hard drive (like anti-spyware or anti-virus) and then test app load time.

    You might be overestimating the amount of disk transfers that go on during normal computer usage. Right now, for example, I've got 7 programs open, and task manager shows 0% CPU usage on all 4 cores. It takes the same time to launch any app now as when I have 0 other programs open. So I think the test set I described would be quite representative of real life, and a massive benefit over what you're currently testing.
