Benchmarking: CPU Performance Analysis of i Instances | Part 1: CoreMark
Amazon recently introduced new types of storage-optimized instances. This new generation of instances is available within the I2 and HI1 families. All provide high storage and better IO performance compared to other instance families in AWS. Flux7 Labs decided to benchmark these new instances to better understand the tradeoffs between them that our customers face.
The I2 and HI1 families provide fast IO due to their SSD-backed instance stores. The instances are optimized for applications with specific storage and random IO requirements. The HS1 type provides cost-optimized storage of 48TB and is optimized for sequential IO. The following table provides a quick view of the various instances in the I2, HI1 and HS1 families.
Faced with so many options, it’s often difficult to decide which instance type to choose. Performance is important, obviously, but cost is also a key consideration. So we decided to run a few basic microbenchmarks on each instance type in order to calculate performance-per-dollar. We chose to use two types of benchmarks: CPU benchmarking using CoreMark, covered in this post, and IO benchmarking using the FIO tool, covered in the next post.
CPU Benchmarking Through CoreMark
CoreMark is an industry-standard microbenchmark within the EEMBC suite that’s used for testing CPU performance. Like all benchmarks, it has limitations: the only way to get a truly accurate picture of performance is to run the target application with a representative dataset. Still, CoreMark is quite good at providing simple CPU performance comparisons. You can find more details on CoreMark here, and the software can be downloaded from the CoreMark website. Its strength is its simplicity of use, as it provides a single number by which different CPUs can easily be compared.
We ran the CoreMark benchmark 10 times on each instance and discarded the two fastest and the two slowest results in order to remove major outliers. We averaged the remaining 6 results, and then calculated the standard deviation of all 10 results in order to measure the variation between runs. Although it’s hard to judge performance accurately from just 10 runs on the same instance, we did so in order to get quick results and a fair sense of performance.
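The averaging procedure described above can be sketched in a few lines of Python. This is only an illustration, not the script we used: the function name `summarize_runs` and the sample scores are hypothetical, and the choice of the sample standard deviation (rather than the population one) is an assumption.

```python
import statistics


def summarize_runs(scores, trim=2):
    """Return (trimmed_mean, stdev_of_all_runs) for a list of benchmark scores.

    The `trim` lowest and `trim` highest scores are dropped before averaging;
    the standard deviation is computed over every run, outliers included.
    """
    if len(scores) <= 2 * trim:
        raise ValueError("need more runs than the number of discarded results")
    ordered = sorted(scores)
    kept = ordered[trim:len(ordered) - trim]
    trimmed_mean = statistics.mean(kept)
    # Assumption: sample standard deviation (n - 1 in the denominator).
    spread = statistics.stdev(scores)
    return trimmed_mean, spread


# Hypothetical scores for 10 runs, not measured data.
example_scores = [100.0, 102.0, 98.0, 101.0, 99.0,
                  100.0, 150.0, 50.0, 100.0, 100.0]
mean_score, spread = summarize_runs(example_scores)
print(f"trimmed mean: {mean_score:.2f}, std dev of all runs: {spread:.2f}")
```

Because the standard deviation is taken over all 10 runs while the mean uses only the middle 6, a large spread flags a noisy instance even when the trimmed mean looks stable.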
For our benchmarking we used CoreMark version 1.0. All benchmarks were run on 64-bit Ubuntu 12.04 LTS. The generic command that we used to run CoreMark is this:
For $NUM_CORES we used the number of vCPUs in that instance type. Here’s an example of a result file for a CoreMark run on an 8-core machine:
The number to look for is in the last line of this file, and it’s generally in this form:
CoreMark 1.0 : N / C [/ P] [/ M]
● N → Number of iterations per second (with seeds 0,0,0x66, size=2000)
● C → Compiler version and flags
● P → Parameters such as data and code allocation specifics
● M → Type of Parallel algorithm execution (if used) and number of contexts
In the run shown above the CoreMark number is 74484.428099.
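If you collect many result files, pulling the score out by hand gets tedious. Here is a minimal Python sketch that extracts the fields from a result line in the form described above. The function name `parse_coremark_line` is hypothetical, the sample line uses placeholder text for every field except the score, and the parser assumes the fields are separated by " / " and that the optional fields appear in the order P, then M.

```python
import re

# Matches the prefix of the final result line, e.g. "CoreMark 1.0 : ..."
RESULT_RE = re.compile(r"^CoreMark 1\.0\s*:\s*(.+)$")


def parse_coremark_line(line):
    """Split a 'CoreMark 1.0 : N / C [/ P] [/ M]' line into named fields."""
    match = RESULT_RE.match(line.strip())
    if match is None:
        raise ValueError("not a CoreMark 1.0 result line")
    # Assumption: fields are separated by ' / ' (slash with surrounding spaces).
    fields = [field.strip() for field in match.group(1).split(" / ")]
    if len(fields) < 2:
        raise ValueError("expected at least the score and the compiler fields")
    return {
        "iterations_per_sec": float(fields[0]),
        "compiler": fields[1],
        "parameters": fields[2] if len(fields) > 2 else None,
        "parallel": fields[3] if len(fields) > 3 else None,
    }


# Placeholder line: only the score is taken from the run discussed above.
sample = ("CoreMark 1.0 : 74484.428099 / <compiler and flags> "
          "/ <parameters> / <parallel info>")
print(parse_coremark_line(sample)["iterations_per_sec"])
```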
Here are the average CoreMark numbers of each instance type, followed by a graph of the same data:
From the chart above we can easily see that i2.8xlarge offers the highest CoreMark score, as expected. However, the price of that performance is steep. Comparing the CoreMark score against the dollar-per-hour cost for each instance, we see that performance-per-dollar is nearly the same for all ‘i’ instances, but the HS1 and HI1 instances are particularly cost-effective. Fortunately, the standard deviation across runs was quite small, so run-to-run variability on AWS had very little effect on our results.
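The performance-per-dollar comparison is simply the average CoreMark score divided by the hourly price. A minimal Python sketch follows; the instance names, scores and prices in it are placeholders chosen for illustration, not our measurements or actual AWS pricing, and `coremark_per_dollar` is a hypothetical helper name.

```python
def coremark_per_dollar(score, hourly_price_usd):
    """CoreMark score obtained per dollar of hourly instance cost."""
    if hourly_price_usd <= 0:
        raise ValueError("hourly price must be positive")
    return score / hourly_price_usd


# Placeholder data: (average CoreMark score, price in USD per hour).
instances = {
    "instance-a": (20000.0, 1.00),
    "instance-b": (80000.0, 5.00),
}

# Rank instances from most to least cost-effective.
ranked = sorted(
    ((name, coremark_per_dollar(score, price))
     for name, (score, price) in instances.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, value in ranked:
    print(f"{name}: {value:.0f} CoreMark per dollar-hour")
```

With these placeholder numbers the larger instance has the higher raw score but the lower score per dollar, which is the kind of tradeoff the comparison is meant to expose.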
Watch for the next post, in which we use the FIO tool to benchmark IO.