Monday, December 22, 2014

Disk / Storage Timeline

First cut at a timeline of significant events in Disk and Storage, ignoring "historical" devices like floppies and bubble memory. Edward Grochowski's 2012 "Flash Memory Summit" talk tracks capacity, price & technology for multiple storage types from 1990.

The first commercial computers were built in 1950 and 1951: LEO[UK], Zuse[DE] and UNIVAC[US].
LEO claim the first working Application in 1951.
 [1949: BINAC built by the Eckert–Mauchly Computer Corporation for Northrop]

Ignored technologies include:
Tapes: used in the first computers as large, cheap linear access storage.
Drums: in use a little later and continued for some time, often in specialist roles (paging).

Friday, July 04, 2014

OS/X Time Machine, performance comparison to command line tools.

A performance comparison for Mac Owners:

Q: Just how quick is Apple’s Time Machine?
A: Way faster than you can do with OS/X command line tools.

The headline is that command line tools take 80 minutes to do what Time Machine does in 3-10 mins.

Wednesday, June 18, 2014

RAID-1: Errors and Erasures calculations

RAID-1 Overheads (treating RAID-1 and RAID-10 as identical)

N = number of drives mirrored. N=2 for duplicated
G = number of drive-sets in a Volume Group.
\(N \times G\) is the total number of drives in Volume Group.
An array may be composed of many Volume Groups.

  • Effective Capacity
    • N=2. \( 1 \div 2 = 50\% \) [duplicated]
    • N=3. \(1 \div 3 = 33.3\% \) [triplicated]
  • I/O Overheads & scaling
    • Capacity Scaling: linear to max disks.
    • Random Read: \(N \times G \rm\ of\ rawdisk = N \times G \rm\ singledrive = RAID-0\)
    • Random Write: \(1 \times G \rm\ of\ rawdisk = 100\% \rm\ singledrive\)
    • Streaming Read: \(N \times G \rm\ of\ rawdisk = N \times G \rm\ singledrive = RAID-0\)
    • Streaming Write: \(1 \times G \rm\ of\ rawdisk = 100\% \rm\ singledrive\)
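The overheads listed above can be sketched as a small calculation. This is a minimal illustration, not a measurement: the class name `Raid1Group` and its fields are hypothetical, and the scaling figures are read as multiples of a single drive's throughput (N × G for reads, 1 × G for writes), which is one interpretation of the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Raid1Group:
    """One RAID-1 / RAID-10 Volume Group (hypothetical helper for illustration)."""

    n: int  # N: drives mirrored per drive-set (2 = duplicated, 3 = triplicated)
    g: int  # G: number of drive-sets in the Volume Group

    @property
    def total_drives(self) -> int:
        # N x G drives in the Volume Group
        return self.n * self.g

    @property
    def effective_capacity(self) -> float:
        # Usable fraction of raw capacity: 1 / N
        return 1 / self.n

    @property
    def read_scaling(self) -> int:
        # Any mirror can serve a read: N x G single-drive units
        return self.n * self.g

    @property
    def write_scaling(self) -> int:
        # Every write lands on all N mirrors of a set: 1 x G single-drive units
        return self.g


if __name__ == "__main__":
    vg = Raid1Group(n=2, g=4)
    print(f"drives={vg.total_drives} capacity={vg.effective_capacity:.1%} "
          f"read=x{vg.read_scaling} write=x{vg.write_scaling}")
```

Under these assumptions, triplication (N=3) improves read scaling at the cost of capacity, while write scaling depends only on G.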

Thursday, June 12, 2014

mathjax test & Demo

MathJax setup in Blogger:

MathJax Examples

  1. I had to hunt for the "HTML/Javascript" gadget, a ways down the list.
  2. I ended up putting the gadget in as a footer.
  3. You'll have to add that gadget to all blogs you want it to work for.
  4. Preview and Edit mode don't compute the TeX. You need to save the doc, then view the post.
  5. In compose "Options", "Line Breaks", I'm using 'Press "Enter" for line breaks'.
  6. The "MyTechMemo" author doesn't use the exact code he suggests, though it works for me. His actual gadget is:
Powered by <a href="">MathJax</a>

<script type="text/javascript" src="">
Alternate Hub Config in gadget, replace just first line.
        TeX: { equationNumbers: { autoNumber: "AMS" } },
         tex2jax: {
                    inlineMath: [ ['$','$'], ["\\(","\\)"] ],
                   displayMath: [ ['$$','$$'], ["\\[","\\]"] ],
                   processEscapes: true }

Using "all" numbers all equations.
"AMS" numbers only specified equations.
<script type="text/x-mathjax-config">
TeX: { equationNumbers: {autoNumber: "all"} }

Monday, June 09, 2014

RAID++: Erasures aren't Errors

A previous piece in this series starts as quoted below the fold, raising the question: The Berkeley group in 1987 were very smart, and Leventhal in 2009 no less smart, so how did they both make the same fundamental attribution error? This isn't just a statistical "Type I" or "Type II" error, it's conflating and confusing completely different sources of data loss.

Sunday, June 08, 2014

RAID, Archives and Tape v Disk

There's a long-raging question in I.T. Operations: How best to archive data? [What media to use?]
This question arose again for me as I was browsing a retail site.


  1. The break-even for 2.5TB/6.25TB tapes is 85 and 140 tapes (compressed/uncompressed), or
    • $13,150 and $17,400 capital investment.
  2. At just 2 times data duplication, uncompressed tapes are not cost effective.
    • Enterprise backups show data duplication rates of 20-50 times.
  3. Compressed tapes are cost-effective up to 5-times data duplication.
    • If you run 10 Virtual Machines and do full backups, you've passed that threshold.
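The shape of the break-even reasoning above can be sketched as follows. This is an illustration under stated assumptions, not the calculation behind the figures quoted: the function name, every price in the usage lines and the cost model are hypothetical. The model assumes tape needs a one-off drive purchase plus a cost per cartridge, that tape stores every duplicate in full, and that disk with de-duplication stores each duplicated block once.

```python
import math
from typing import Optional


def break_even_tapes(drive_cost: float, tape_cost: float, tape_tb: float,
                     disk_cost_per_tb: float, dedup: float = 1.0) -> Optional[int]:
    """Smallest tape count at which tape costs no more than disk.

    Tape total: drive_cost + n * tape_cost
    Disk total: n * tape_tb * disk_cost_per_tb / dedup
    Returns None when tape never catches up with disk.
    """
    # Cost of the disk needed to hold one tape's worth of logical data
    disk_cost_per_tape = tape_tb * disk_cost_per_tb / dedup
    saving_per_tape = disk_cost_per_tape - tape_cost
    if saving_per_tape <= 0:
        return None  # each extra tape costs at least as much as the disk it replaces
    return math.ceil(drive_cost / saving_per_tape)


if __name__ == "__main__":
    # All prices below are made up purely to exercise the function.
    print(break_even_tapes(3000, 50, 2.5, 40))             # no duplication
    print(break_even_tapes(3000, 50, 2.5, 40, dedup=1.25)) # mild duplication
    print(break_even_tapes(3000, 50, 2.5, 40, dedup=2.0))  # tape never breaks even
```

In this model, raising the duplication factor shrinks the per-tape saving, so the break-even count climbs and then disappears, which matches the direction of the thresholds listed above.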

Thursday, June 05, 2014

Retail Disk prices, Enterprise drives, grouped by manufacturer & type

Table of current retail prices for various types of disk with cost-per-GB.
Only Internal drives, Hard Disks.

Disclaimer: This table is for my own point-in-time reference and does not carry any implicit or explicit recommendation or endorsement of the retailer, vendor or technologies.

Most drives are from a single manufacturer, Western Digital, to allow like-for-like comparisons.
Most manufacturers are close to the same pricing for the same specs.
  • There is ~$25 extra for SAS interface over SATA [1TB WD 'RE', SAS vs SATA]
  • There's ~$30/TB extra for higher spec drives [2TB & 3TB, WD SATA, NAS vs RE]
  • WD sell four 3.5" 1TB drives [03, 04, 26, 41]
    • SAS vs SATA, ~$25
    • about double for 10,000RPM over 7,200RPM (Velociraptor vs RE)
    • about 25% less for the Intellipower, 'Capacity' drive
  • While it's cheaper with Seagate to go from 15,000RPM/3.5" to 10,000RPM/2.5", there's no simple relation for the discount.
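To put the dollar premiums above on the same footing as the cost-per-GB column, they can be converted to dollars per GB. A minimal sketch, assuming decimal units (1TB = 1000GB); the helper name `premium_per_gb` is hypothetical and the inputs are the approximate figures from the list above.

```python
def premium_per_gb(extra_dollars: float, capacity_tb: float) -> float:
    """Price premium in dollars per GB, assuming 1 TB = 1000 GB."""
    return extra_dollars / (capacity_tb * 1000)


if __name__ == "__main__":
    sas_premium = premium_per_gb(25, 1)   # ~$25 extra for SAS on a 1TB drive
    spec_premium = premium_per_gb(30, 1)  # ~$30 per TB for higher-spec drives
    print(f"SAS over SATA: ${sas_premium:.3f}/GB")
    print(f"Higher spec:   ${spec_premium:.3f}/GB")
```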
Western Digital list these "Purchase Decision Criteria" for drives:
  • Capacity [GB]
  • Workload Capability [duty cycle or TB read/write per year]
  • Reliability [MTBF and BER]
  • Cost/GB
  • Performance [sustained throughput, latency or IO/sec = {RPM, seek time}]
  • Power used [not included by WD]
  • Racking density [not included by WD]