Tuesday, February 14, 2012

charging a Nokia phone (C2-01) from USB. Need enough power.

[this piece is a  place marker for people searching on "how to" charge from USB.
 Short answer: Plug in any modern phone and it's supposed to "Just Work".]

A couple of weeks ago I bought an unlocked Nokia C2-01 from local retailer Dick Smith's.

I wanted bluetooth, 3G capability and got a direct micro-USB (DC-6) connector too.
I bought a bluetooth "handsfree" for my car as well. It came with a cigarette-lighter charger with a USB socket and a USB mini (not micro) cable.

I remembered that all mobiles sold in the European Union were mandated to use a USB charger [MoU in 2009, mandate later] and thought I'd be able to use the car-charger for everything: phone, camera, ...

The supplied 240V external phone charger worked well for the C2-01.
But I couldn't get it to charge from the in-car USB charger.

Turns out the charger that came with the handsfree could only supply 400mA, not the 500mA of the USB standard. Bought another in-car USB charger from Dick Smith's: works fine with both.

What had confused me was the phone wouldn't charge when I tested it with my (old) powered USB hub. Is it old and tired or was the phone already fully charged?? Need to properly test that.

I hadn't tried it with my Mac Mini, jumping to the unwarranted conclusion "this phone doesn't do USB charging". When tested, worked OK directly with the Mac...

There's one little wrinkle.
Devices like the iPad that charge from USB draw more than 500mA (700mA?) - which Macs supply, but which is more than the standard. [This is why some USB adaptors for portable HDD's have two USB-A connectors.]

I know a higher current has been specified for USB - but can't remember the variant. Is it just the new "USB 3" or current "USB 2" as well?

Sunday, February 12, 2012

shingled write disks: bad block 'mapping' not A Good Idea

Shingled-write disks can't update sectors in-place. Plus they are likely to have sectors larger than the current 2KB. [8KB?]

The current bad-block strategy of rewriting blocks in another region of the disk is doubly flawed:
  • extremely high-density disks should be treated as large rewritable Optical Disks. They are great at "seek and stream", but have exceedingly poor GB/access/sec ratios. Forcing the heads to move whilst streaming data affects performance radically and should be avoided to achieve consistent/predictable good performance.
  • Just where and how should the spare blocks be allocated?
    Not the usual "end of the disk", which forces worst case seeks.
    "In the middle" forces long seeks, which is better, but not ideal.
    "Close by", i.e. for every few shingled-write bands or regions, include spare blank tracks (remembering they are 5+ shingled-tracks wide).
My best strategy, in-place bad-block identification and avoidance, is two-fold:
  • It assumes large shingled-write bands/regions: 1-4GB.
  • Use a 4-16GB Flash memory as a full-region buffer, and perform continuous shingled-writes in a band/region. This allows the use of CD-ROM style Reed-Solomon product codes to cater for/correct long burst errors at low overhead.
  • After writing, re-read the shingled-write band/region and look for errors or "problematic" recording (low read signal), then re-record. The new write stream can put "sync patterns" in the areas not to be used, space the heads over problematic tracks, or widen the track-width for the whole or part of the band/region.
This moves the cost of bad-blocks from read-time to write-time. It potentially slows the sequential write speed of the drive, but are you writing to a "no update-in-place" device for speed? No. You presumably also want the best chance possible of retrieving the data later on.
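
A minimal sketch of that write-time verify/re-record loop, assuming a hypothetical drive-firmware interface (write_band, read_band, mark_problem_tracks and the quality score are my inventions, not a real API):

```python
# Hypothetical sketch of paying the bad-block cost at write time, not read time.
MAX_REWRITES = 3

def commit_band(drive, band_no, flash_buffer):
    """Stream a whole shingled band from the Flash buffer, then verify it."""
    for attempt in range(MAX_REWRITES):
        drive.write_band(band_no, flash_buffer)      # one continuous shingled write
        data, quality = drive.read_band(band_no)     # re-read; per-track signal quality
        if data == flash_buffer and min(quality) >= drive.quality_floor:
            return True                              # band recorded cleanly
        # Flag low-signal tracks so the next attempt can insert sync patterns,
        # step over them, or widen the track width for part of the band.
        drive.mark_problem_tracks(band_no, quality)
    return False   # give up: caller records the data in another band
```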

Should the strategy be tunable for the application? I'm not sure.
Firmware size and complexity must be minimal for high-reliability and low defect-rates. Only essential features can be included to achieve this aim...

Monday, February 06, 2012

modern HDD's: No more 'cylinders'

Another rule busted into a myth. How does this affect File Systems, like the original Berkeley Fast File System?

There's a really interesting piece of detail in a 2010 paper on "Shingled Writes".

Cylinder organisations are no longer advantageous.
It's faster to keep writing on the one surface than to switch heads.

With very small feature/track sizes, the time taken for a head-switch is large. The new head isn't automatically 'on track', it has to find the track... "settling time".

Bands consist of contiguous tracks on the same surface.
 At first glance, it seems attractive to incorporate parallel tracks on all surfaces (i.e., cylinders) into bands.

 However, a switch to another track in the same cylinder takes longer than a seek to an adjacent track:
 thermal differences within the disk casing may prevent different heads from hovering over the same track in a cylinder.

 To switch surfaces, the servo mechanism must first wait for several blocks of servo information to pass by to ascertain its position before it can start moving the head to its desired position exactly over the
desired track.

 In contrast, a seek to an adjacent track starts out from a known position.

 In both cases, there is a settling time to ensure that the head remains solidly over the track and is not oscillating over it.

 Because of this difference in switching times, and contrary to traditional wisdom regarding colocation within a cylinder, bands are better constructed from contiguous tracks on the same surface.

Saturday, February 04, 2012

Intra-disk Error Correction: RAID-4 in shingled-write drives

High density shingled-write drives cannot succeed without especial attention being paid to Error Correction, not just error detection.
Sony/Philips realised this when developing the Compact Digital Audio Disk (CD) around 1980 and then again in 1985 with the "Yellow Book" CD-ROM standard for data-on-CD. The intrinsic bit error rate of ~1 in 10⁵ becomes "infinitesimal" to quote one tutorial, with burst errors of ~4,000 bits corrected by the two lower layers.

Error rates and sensitivity to defects increase considerably as feature sizes reach their limit. The 256Kbit DRAM chips took years to come into production after 64Kbit chips because manufacturing yields were low. Almost every chip worked well enough, but had some defects causing it to be failed in testing. The solution was to overbuild the chips and swap defective columns with spares during testing.

Shingled-write disks, with their "replace whole region, never update-in-place", allow for a different class of Error Protection. RAID techniques with fixed parity disks seem a suitable candidate when individual sectors are never updated. Network Appliance very successfully leveraged this with their WAFL file system.

That shingled-write disks require good Error Correction should be without dispute.
What type of ECC (Error Correcting Code) to choose is an engineering problem based on the expected types of errors and the level of Data Protection required. I've previously written that for backup and archival purposes, the probable main uses of shingled-write disks, bit error rates of 1 in 10⁶⁰ should be a minimum.

One of the advantages of shingled-write disks, is that each shingled-write region can be laid down in one go from a Flash memory buffer.
It can then be re-read and rewritten catering for the disk characteristics found:
  • excessive track cross-talk,
  • writes affected by excessive head movement (external vibration),
  • individual media defects or moving contamination,
  • areas of poor media, and
  • low signal or high signal-to-noise ratio due to age, wear or production variations.
Depending on the application, multiple rewrites may be attempted.
It would even be possible, given spare write-regions, for drives to periodically read and rewrite all data to new areas. This is fraught: the extra "duty cycle" will decrease drive life, and if the drive finds uncorrectable errors when the attached host(s) weren't addressing it, what should be done?

Reed-Solomon encoding is well proven in Optical Disks: CD, CD-ROM and DVD and probably in-use now for 2Kb sector disks.
Reed-Solomon codes can be "tuned" to the application, the amount of parity overhead can be varied and other techniques like scrambling and combined in Product Codes.

R-S codes have a downside: complexity of encoders and decoders.
[This can mean speed and throughput as well. Some decoding algorithms require multiple passes to correct all errors.]

For a single platter shingled-write drive, Error Correcting codes (e.g. Reed-Solomon) are the only option to address long burst errors caused by recording drop-outs.

For multi-platter shingled-write disks, another option is possible:
 RAID-4, or block-wise parity (XOR) on a dedicated drive (in this case, 'surface').

2.5 in drives can have 2 or 3 platters, i.e. 4 or 6 surfaces.
Dedicating one surface to parity gives 25% and 16.7% overhead respectively, higher than the ~12.5% Reed-Solomon overhead in the top layer of CD-ROM's.
With 4 platters, or 8 surfaces, overhead is 12.5%, matching that of CD-ROM, layer 3.

XOR parity generation and checking is fast, efficient and well understood; this is its attraction.
But despite a large overhead, it:
  • can at best correct a single sector in error, and fails on two dead sectors in the sector set (see the sketch below),
  • relies on the underlying layer to flag drop-outs/erasures, and
  • relies on the CRC check to be perfect.
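
To make that single-erasure limit concrete, here's a minimal XOR-parity sketch across a few data 'surfaces' (2KB sectors; the layout and function names are illustrative only, not drive firmware):

```python
# Surface-level RAID-4: one parity 'surface' can rebuild a single flagged
# (erased) sector per set, but no more.
from functools import reduce

def parity(sectors):
    """XOR of the sectors, one from each surface in the set."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), sectors)

def rebuild(sectors, parity_sector, missing_index):
    """Recover the one sector flagged as unreadable by the layer below."""
    survivors = [s for i, s in enumerate(sectors) if i != missing_index]
    return parity(survivors + [parity_sector])

data = [bytes([i] * 2048) for i in range(3)]         # 3 data surfaces, 2KB sectors
p = parity(data)                                      # parity surface
assert rebuild(data, p, missing_index=1) == data[1]   # works for exactly one erasure
```
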
If the raw bit error rate is 1 in 10¹⁴ with 2Kb sectors, the probability of any sector having an uncorrected error is 6.25 × 10⁻⁹.
The probability of two sectors in a set being in error is:
6.25 × 10⁻⁹ × 6.25 × 10⁻⁹ ≈ 4 × 10⁻¹⁷

This is well below what CD-ROM achieves.
But, to give intra-disk RAID-4 its due:
  • corrects a burst error of 16,000 bits. Four times the CD limit.
  • will correct every fourth sector on each surface
  • is deterministic in speed. Reed-Solomon decoding algorithms can require multiple passes to fully correct all data.
I'm thinking the two schemes could be used together and would complement each other.
Just how, not yet sure. A start would be to group together 5-6 sectors with a shared ECC in an attempt to limit the number of ganged failed sector reads in a RAID'd sector set.

Multiple arms/actuators and shingled-write drives

Previously, I've written on last-gen (Z-gen) shingled-write drives and mentioned multiple independent arms/actuators:
  • separating "heavy" heads (write + heater) from lightweight read-only, and
  • using dual sets of heads, either read-only or read-write.
There is some definitive work by Dr. Sudhanva Gurumurthi of University of Virginia and his students on using multiple arms/actuators in current drives, especially those with Variable Angular Velocity rather than Constant Angular Velocity - maybe nearer the Constant Linear Velocity approach used in Optical drives.
E.g. "Intra-disk Parallelism" thesis and "Energy-Efficient Storage Systems" page.

The physics are good and calculations impressive, but where is the commercial take-up?
Extra arms/actuators and drive/head electronics are expensive, plus need mounting area on the case.
What's the "value proposition" for the customer?

Both manufacturers and consumers have to be convinced it's a worthwhile idea and there is some real value. Possibly the problem is two-fold:
  • extra heads don't increase capacity, only reduce seek time (mainly useful for compensating slow-spin "green" drives), a hard sell.
  • would customers prefer two sets of heads mounted in two drives, with double the capacity, the flexibility to mirror data and the ability to replace individual units?
Adopting dual-heads in shingled-write drives might be attractive:
  • shingled-write holds the potential to double or more the track density with the same technology/parts. [Similar to the 50% increase early RLL controllers gave over MFM drives.]
    Improving drive $$/GB is at least an incentive to produce and buy them.
  • We've no idea how sensitive to vibration drives with these very small bit-cells will be.
    Having symmetric head movement will cancel most vibration harmonics, helping settling inside the drive and reducing impact externally.
To appreciate the need for dual heads:

Ed Grochowski, in 2011, compared DRAM, Flash and HDD, calculating bit-cell sizes for a 3.5 inch disk and the 750GB platter used in 3TB drives (max 5 platters in a 25.4 mm thick drive).

The head lithography is 37nm, tracks are 74nm wide and, with perpendicular recording, bit-cells are 13nm long.
The outside track is 87.5mm diameter, or 275 mm in length, holding a potential 21M bit-cells and yielding 2-2.5MB of usable data. With 2KB sectors, ~1,000 sectors/track maximum.

The inner track is 25.5mm diameter, 80mm in length: 29% of the outside track length.
The 31mm wide write-area contains up to 418,000 tracks at that width - some tens of kilometres of track in total.
Modern drives group tracks in "zones" and vary rotational velocity. The number of zones, and how closely they approximate "Constant Linear Velocity" like early CD drives, isn't discussed in vendors' data sheets.

Grochowski doesn't mention clocking or sector overheads (headers, sync bits, CRC/ECC) or inter-sector gaps.
Working backwards from the track length needed to hold the stated capacity, the track 'pitch' comes out at around 150nm, leaving a gap of roughly a full track width between tracks.
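
A back-of-envelope check of those numbers. The capacity figure is my assumption (750GB/platter taken as ~375GB of user data per surface, overheads ignored), the geometry is as quoted above:

```python
import math

bit_len  = 13e-9        # m, bit-cell length (perpendicular recording)
track_w  = 74e-9        # m, written track width
outer_d  = 87.5e-3      # m, outermost track diameter
inner_d  = 25.5e-3      # m, innermost track diameter
band     = (outer_d - inner_d) / 2               # 31 mm recordable band

outer_len = math.pi * outer_d                    # ~275 mm
bits_per_outer_track = outer_len / bit_len       # ~21 M bit-cells
print(f"outer track: {outer_len*1000:.0f} mm, {bits_per_outer_track/1e6:.0f} Mbit")

max_tracks = band / track_w                      # ~419,000 at 74 nm pitch
print(f"max tracks at 74 nm pitch: {max_tracks:,.0f}")

# Work backwards from capacity: how long must the recorded track be,
# and what pitch does that imply across the 31 mm band?
user_bits = 375e9 * 8                            # ~375 GB per surface (assumed)
total_len = user_bits * bit_len                  # ~39 km of track
mean_circ = math.pi * (outer_d + inner_d) / 2
pitch     = band / (total_len / mean_circ)       # ~140-150 nm
print(f"total track length ~{total_len/1000:.0f} km, implied pitch ~{pitch*1e9:.0f} nm")
```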

I've not seen mention of bearing runout and wobble, which require heads to constantly adjust tracking and are a major issue with Optical disks. Control of peripheral disk dimensions and the ensuing problems is discussed.

As tracks become thinner, seeking to, and staying on, a given track becomes increasingly difficult. These are extremely small targets to find on a disk and tracking requires very fine control needing both very precise electronics and high-precision mechanical components in the arms and actuators.

Dr Gurumurthi notes in "HDD basics" that this "settling-time" becomes more important with smaller disks and higher track density.

This, as well as thinner tracks, is the space that shingled-writing is looking to exploit.
The track width and pitch become the same, around 35nm for a 4-fold increase in track density using current heads, less inter-region gaps and other overheads.

Introducing counter-balancing dual heads/actuators may be necessary to successfully track the very small features of shingled-write disks. A 2-4 times capacity gain would justify the extra cost/complexity for manufacturers and customers.

Wednesday, February 01, 2012

Z-gen hard disks: shingled writes

We are approaching the limits of magnetic hard disk drives, probably before 2020, with 4-8TB per 2.5 inch platter.

One of the new key technologies proposed is "shingled writes", where new tracks partially overwrite  an adjacent, previously written track, making the effective track-width of the write heads much smaller. Across the disk, multiple inter-track blank (guard) areas are needed to allow the shingling to start and finish, creating "write regions" (super-tracks?) as the smallest recordable area, instead of single tracks. The cost of discarding the guard distance between tracks is higher cross-talk, requiring more aggressive low-level Error Correction and Detection schemes.

In the worst case, a single sector update, the drive has to read the whole write-region into local memory, update the sector and then rewrite the whole write-region. These multi-track writes, with one disk revolution per track, are not only slow and make the drive unavailable for the duration, but require additional internal resources to perform, including memory to store a whole region. Current drive buffers would limit the size of regions to a few 10's of MB, which may not yield a worthwhile capacity improvement.

The shingled-write technique has severe limitations for random "update-in-place" usage:
  • write-regions either have to be very small with many inter-region gaps/guard areas, considerably reducing the areal recording density and obviating its benefits, or
  • have relatively few very large write-regions to achieve 90+% theoretical maximum areal recording density at the expense of update times in the order of 10-100 seconds and significant on-drive memory. This substantially increases cost if SRAM is used and complexity if DRAM is used.
Clearly, shingled writes are not an optimum solution for drives used for random writes and update-in-place; they are perhaps the worst solution for this sort of workload.
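
A rough, illustrative estimate of that update cost (the track capacity and region size below are assumptions, and settling time is ignored):

```python
# Worst-case single-sector update: read the whole region, modify, rewrite it.
rpm            = 7200
rev_time       = 60.0 / rpm            # ~8.3 ms per revolution
track_capacity = 1.5e6                 # bytes per track, mid-radius guess
region_size    = 2e9                   # 2 GB write-region (assumed)

tracks_per_region = region_size / track_capacity     # ~1,300 tracks
rmw_time = 2 * tracks_per_region * rev_time          # one rev to read + one to rewrite each track
print(f"{tracks_per_region:.0f} tracks, ~{rmw_time:.0f} s per single-sector update")
```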

A tempting solution is to adopt a "log structured" approach, such as used in Flash Memory in the SSD FTL (Flash Translation Layer), and map logical sectors to physical locations:
write sector updates to a log, don't do updates-in-place, and securely maintain a logical-to-physical sector map.
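
A bare-bones sketch of that mapping idea (names and structure are mine; a real FTL is far more involved):

```python
# Append-only logical-to-physical sector mapping, FTL-style (illustrative only).
class LogStructuredMap:
    def __init__(self, sectors_per_region):
        self.map = {}                        # logical sector -> (region, offset)
        self.region, self.offset = 0, 0      # next free physical slot (the log head)
        self.sectors_per_region = sectors_per_region

    def write(self, logical_sector):
        """Record an update by appending at the log head and remapping."""
        self.map[logical_sector] = (self.region, self.offset)   # never update in place
        self.offset += 1
        if self.offset == self.sectors_per_region:   # region full; compaction not shown
            self.region, self.offset = self.region + 1, 0
        return self.map[logical_sector]

    def lookup(self, logical_sector):
        return self.map[logical_sector]      # where the latest copy physically lives
```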

Contiguous logical sectors are initially written physically adjacent, but over time, as sectors are updated multiple times, contiguous logical sectors will be spread widely across the disk, radically slowing streaming read rates, leaving many "dead" sectors that reduce the effective capacity, and requiring active consolidation, or "compaction".

The drive controllers still have to perform logical-to-physical sector mapping, optimally order reads and reassemble the contiguous logical stream.

Methods to ameliorate disruption of spatial proximity must trade space for speed:
  • either allow low-density (non-shingled) sets of tracks in the inter-region gaps specifically for sector updates, or
  • leave an update area at the end of every write-region.
    Larger update areas lower effective capacity/areal density, whilst smaller areas are saturated more quickly.
Both these update-expansion area approaches have a "capacity wall", or an inherent hard-limit:
 what to do when the update-area is exhausted?

Pre-emptive strategies, such as predictive compaction, must be scheduled for low activity times to minimise performance impact, requiring the drive to have significant resources, a good time and date source and to second-guess its load-generators. Embedding additional complex software in millions of disk drives that need to achieve better than "six nines" reliability creates an administration, security and data-loss liability nightmare for consumers and vendors alike.

The potential for "unanticipated interactions" is high. The most probable are defeating Operating System block-driver attempts to optimally organise disk I/O requests and duplicating housekeeping functions like compaction and block relocation, resulting in write avalanches and infinite cascading updates triggered when drives near full capacity.

In RAID applications, the variable and unpredictable I/O response time would trigger many false parity recovery operations, offsetting the higher capacity gained with significant performance penalties.

In summary, trying to hide the recording structure from the Operating System, with its global view and deep resources, will be counter-effective for update-in-place use.

Shingled-write drives are not suitable for high-intensity update-in-place uses such as databases.



There are workloads that are a very good match to large-region update whole-disk structures:
  • write-once or very low change-rate data, such as video/audio files or Operating System libraries etc,
  • log files, when preallocated and written in append-only mode,
  • distributed/shared permanent data, such as Google's compressed web pages and index files,
  • read-only snapshots,
  • backups,
  • archives, and
  • hybrid systems designed for the structure, using techniques like Overlay mounts with updates written to more volatile-friendly media such as speed-optimised disks or Flash Memory.
Shingled-write drives with non-updateable, large write-regions are a perfect match for an increasingly important HDD application area: "Seek and Stream".

There are already multiple classes of disk drives:
  • cost-optimised drives,
  • robust drives for mobile application,
  • capacity-optimised Enterprise drives,
  • speed-optimised Enterprise drives, and
  • "green" or power-minimised variants of each class.
Pure shingled-write drives could be considered a new class of drive:
  • capacity-optimised, write whole-region, never update. Not unlike CD-RW or DVD-RAM.
With Bit-Patterned-Media (BPM), another key technology needed for Z-gen drives, a further refinement is possible for write whole-region drives:
  • continuous spiral tracks per write-region, as used by Optical drives.
Lastly, an on-drive write-buffer of Flash memory, sized for 1 or 2 write-regions, would, I suspect, improve drive performance significantly and allow additional optimisations or Forward Error Correction in the recording electronics/algorithms.

For a Z-gen drive with 4TB/platter and 4Gbps raw bit-rates, 2-8GB write-regions may be close to optimal. Around 1,000 regions per drive would also fit nicely with CLV (Constant Linear Velocity) and power-reducing slow-spin techniques.
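
The rough sizing behind those figures (4TB/platter, 4GB regions and a 4Gbps raw rate are the assumptions from the text):

```python
# Region count and per-region streaming time for the assumed Z-gen drive.
platter_bytes = 4e12                 # 4 TB per platter
region_bytes  = 4e9                  # 4 GB write-region
raw_rate_bps  = 4e9                  # 4 Gbit/s raw bit-rate

regions     = platter_bytes / region_bytes       # ~1,000 regions
stream_secs = region_bytes * 8 / raw_rate_bps    # ~8 s to stream one region
print(f"{regions:.0f} regions/platter, ~{stream_secs:.0f} s per region write")
```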

A refinement would be to allow variable size regions to precisely match the size of data written, in much the same way that 1/2 inch tape drives wrote variable sized blocks. This technique allows the Operating System to avoid wasted space or complex aggregation needed to match file and disk recording-unit sizes. This is not quite the "Count Key Data" organisation of old mainframe drives (described by Patterson et al in 1988 as "Single Large Expensive Drives").

Like Optical disks, particularly the CD-ROM "Mode 1" layer, additional Forward Error Correction can be cheaply built into the region data to achieve protection from burst-errors and achieve unrecoverable bit-error rates in excess of 1 in 10⁶⁰ both on-disk and for off-disk transfers.

For 100-year archival data to be stored on disks, it has to be moved and recreated every 5-7 years, forcing errors to be crystallised each time. Petabyte RAID'd collections using drives with 1 in 10¹⁶ bits-in-error only achieve a 99.5% probability of successful rebuild with RAID-6. Data migrations are, in effect, RAID rebuilds. Twenty consecutive rebuilds have a 10% probability of complete data loss, an unacceptably high rate. Duplicating the systems reduces this to a 1% probability of complete data loss, but at a 100% overhead. The modern Error Correction techniques suggested here require modest (10-20%) overheads and would improve data protection to less than 0.1% data loss, though not obviating the problems of failing hardware.
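
Reproducing the arithmetic in that paragraph:

```python
# ~100 years at 5-year refresh intervals, each migration a RAID-6-style rebuild.
p_rebuild  = 0.995                   # chance one migration/rebuild succeeds
migrations = 20

p_loss = 1 - p_rebuild ** migrations          # ~0.095 -> ~10% chance of complete data loss
p_loss_duplicated = p_loss ** 2               # ~0.009 -> ~1% with a second, independent copy
print(f"single system: {p_loss:.1%}, duplicated: {p_loss_duplicated:.1%}")
```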

In a world of automatic data de-duplication and on-disk compression, data protection/preservation becomes a very high priority. Storage efficiency comes at the cost of creating a "single point of failure". This adds another impetus to add good Error Correction to write-regions.

A 4GB region, written at 4Gbps would stream in 8-10 seconds. Buffering an unwritten region in local Flash Memory would allow fast access to the data both before and during the commit to disk operation, given sufficient excess Flash read bandwidth.
Note: this 4-8Gbps bandwidth is an important constraint on the Flash Memory organisation. Unlike SSD's, because of the direct access and sequential-write, no FTL is required, but bad-block and worn-block management are still necessary.

4GB, approximately the size of a DVD, is known to be a useful and manageable size with a well understood file system (ISO 9660) available; it would match shingled-disk write-regions well. Working with these region-sizes and file system is building on well-known, tested and understood capabilities, allowing rapid development and a safe transition.

Using low-cost MLC Flash Memory with a life perhaps of 10,000 erase cycles would allow the whole platter (1,000 regions) to be rewritten at least 10 times. Allowing a 25% over-provisioning of Flash may improve the lifetime appreciably as is done with SSD's, which could be a "point of difference" for different drive variants.
Specifically, the cache suggested is write-only, not a read-cache. The drive usage is intended to be "Seek and Stream", which does not benefit from an on-drive read-cache. For servers and disk-appliances, 4GB of DRAM cache is now an insignificant cost and the optimal location for a read cache.

Provisioning enough Flash Memory for multiple uncommitted regions, even 2 or 3, may also be a useful "point of difference" for either Enterprise or Consumer applications. Until this drive organisation is simulated and Operating and File Systems are written and trialled/tested against them, real-world requirements and advantages of larger cache sizes are uncertain.

Depending on the head configuration, the data could be read back after writing as a whole region and re-recorded as necessary, with the recording electronics perhaps adjusting for detected media defects and optimising recording parameters for the individual surface-head characteristics in the region.

If regions are only written at the unused end of drives, such as for archives or digital libraries, maximum effective drive capacity is guaranteed, there is no lost space. Write-once, Read-Many is an increasingly common and important application for drives.
A side-effect is that individual drives will, like 1/2 inch tapes of old, vary in achieved capacity, though of the same notional size, and you can't know by how much until the limit is reached. Operating and File Systems have dealt with "bad blocks" for many decades and can potentially use that approach to cope with variable drive capacity, though it is not a perfect match. Artificially limiting drive capacity to the "Least Common Denominator", either by the consumer or vendor, is also likely. "Over-clocking" of CPU's shows that some consumers will push the envelope and attempt to subvert/overcome any arbitrary hardware limits imposed. If any popular Operating System can't cope easily with uncertain drive capacity and variable regions, this will limit uptake in that market, though experience suggests not for long if there is an appreciable price or capacity/performance differential.

When re-writing a drive, the most cautious approach is to first logically erase the whole drive and then start recording again, overwriting everything. The most optimistic approach is to logically erase 2 or 3 regions, the region you'd like to write and enough of a physical cushion to allow defects etc not to cause an unintended region overwrite.

This suggests two additional drive commands are needed:
  • write region without overwriting next region (or named region)
  • query notional region size available from "current position" to next region or end-of-disk.
This raises an implementation detail beyond me:
Are explicit "erase region" or "free region" operations required?
Would they physically write to every raw bit-location in a region or not?
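
A sketch of how the two commands proposed above might look to the host. Everything here is hypothetical (names, signatures and semantics are illustrative, not any real ATA/SCSI extension):

```python
# Hypothetical host-side view of the two proposed region commands.
from dataclasses import dataclass

@dataclass
class RegionSpace:
    start_lba: int           # where the next region would begin
    sectors_available: int   # notional size to the next region or end of disk

class ShingledDrive:
    def write_region(self, data: bytes, protect_next: bool = True) -> None:
        """Append a whole region at the current position; with protect_next,
        refuse the write rather than overwrite the next (or a named) region."""
        raise NotImplementedError

    def query_region_space(self) -> RegionSpace:
        """Report the notional region size available from the current position
        to the next recorded region or the end of the disk."""
        raise NotImplementedError
```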



On Heat Assisted Magnetic Recording (HAMR), drive vibration, variable speed and multiple heads/arms.

HAMR is the other key technology (along with BPM and shingled-writes) being explored/researched to achieve Z-gen capacities.
It requires heating the media, presumably over the Curie Point, to erase the existing magnetic fields. Without specific knowledge, I'm guessing those heads will be bigger and heavier than current heads, and considerably larger and heavier than the read heads needed.

Large write-regions with wide guard areas between, would seem to be very well suited to HAMR and its implied low-precision heating element(s).  Relieving the heating elements of the same precision requirements as the write and read heads may make the system easier to construct and control and hence record more reliably. Though this is pure conjecture on my part.

Dr. Sudhanva Gurumurthi and his students have extensively researched and written about the impact of drive rotational velocity, power-use and multiple heads. From the timing of their publications and the release of slow-spin and variable-speed drives, it's reasonable to infer that Gurumurthi's work was taken up by the HDD manufacturers.

Being used for "Seek and Stream", not for high-intensity Random I/O, HDD's will exhibit considerably less head/actuator movement, resulting in much less generated vibration if nothing else changes. This improves operational reliability and greatly lowers induced errors by removing most of the drive-generated vibration. At the very least, less damping will be needed in high-density storage arrays.

Implied in the "Seek and Stream" is that I/O characteristics will be different, either:
  • nearly 100% write for an archive or logging drive with zero long seeks, or
  • nearly 100% read for a digital library or distributed data with moderate long seeking.
In both scenarios, seeks would reduce from the current 250-500/sec to 0.1-10/sec.
For continuous spiral tracks, head movement is continuous and smooth for the duration of the streaming read/write, removing entirely the sudden impulses of track seeks. For regions of discrete, concentric tracks, the head movements contain the minimum impulse energy. Good both for power-use and induced vibration.

Drawing on Gurumurthi's work on multiple heads compensating for the performance of slow-spin drives, this head/actuator arrangement for HAMR with shingled-write may be beneficial:
  • Separate "heavy head", either heating element, write-head or combined heater-write head.
  • Dual light-weight read heads, mounted diagonally from each other and at right-angles to the "heavy head".
Because write operations are infrequent in both scenarios, the "heavy head" will be normally unloaded, even leaving the heating elements (if lasers aren't used) normally off. The 2-10 second region-write time, possibly with 1 or 2 rewrite attempts, means 10 msec heater ramp-up would not materially affect performance.

A single write head can only achieve half the maximum raw transfer rate of dual read heads. Operating Systems have not had to deal with this sort of asymmetry before and it could flush out bugs due to false assumptions.

Separating the read and "heavy" heads reduces the arm/bearing engineering requirements and actuator power for the usual dominant case - reading. By slamming around lighter loads, lower impulses are produced. Because the drives are not attempting high-intensity random I/O, lower seek performance is acceptable. The energy used in accelerating/decelerating any mass is proportional to velocity². Reducing the arm seek velocity to 70% (a 30% reduction) halves the energy needed and the impulse energy needing to be dissipated. (Lower g-forces also reduce the amplitude of the impulse, though I can't remember the relationship.)
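
The velocity-squared claim is just the standard kinetic-energy relation; reducing velocity to 70% roughly halves the energy:

```latex
E_k = \tfrac{1}{2}\,m\,v^{2}, \qquad
\tfrac{1}{2}\,m\,(0.7\,v)^{2} = 0.49 \times \tfrac{1}{2}\,m\,v^{2} \approx \tfrac{1}{2}\,E_k
```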

With "Seek and Stream" mode of operation, for a well tuned/balanced system, the dominant time factor is "streaming". The raw I/O transfer rate is of primary concern. The seek rate, especially for read, can be scaled back with little loss in aggregate throughput.

Optimising these factors is beyond my knowledge.

By using dual opposing read heads, impulses can be further reduced by synchronising the major seek movements of the read heads/arms.
As well, both heads can read the same region simultaneously, doubling the read throughput. This could be as simple as having each head read alternate tracks, or in a spiral track, the second head starts halfway through the read area, though to simply achieve maximum bandwidth, the requesting initiator may have to be able to cope with two parallel streams, then join the fragments in its buffer. Not ideal, but attainable.

Assuming shingled-writes, dual spiral tracks would allow simple interleaving of simultaneous read streams, but would need either two write heads similarly diagonally opposed, or a single device assembled with two heads offset by a track width and possibly staggered in the direction of travel. Would a single laser heating element suffice for two write heads? This arrangement sounds overly complicated, difficult to consistently manufacture to high precision, and expensive.

For a single spiral track with dual read-heads, a dual spiral can be simulated, though achieving full throughput requires more local buffer space.
The controller moves the heads to adjacent tracks and reads a full track from each into a first set of buffers; it then concatenates the buffers and streams the data. After the first track, the heads are leap-frogged and read into an alternate set of buffers, which are then concatenated and streamed while the heads are leap-frogged again and switch back to the first set of buffers, etc.

This scheme doesn't need to buffer an exact track, but something larger than the longest track at a small loss of speed. If a 1MB "track" size is chosen, then 4MB of buffer space is required. Data can begin being streamed from the first byte of track 0, though only after both buffers are full can full-speed transfers happen.
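
A simplified, sequential sketch of that leap-frog scheme. In a real controller, streaming one buffer pair would overlap with the heads filling the other; head.read_track and emit are hypothetical:

```python
# Dual read heads leap-frogging over adjacent tracks with two buffer pairs.
TRACK_BUF = 1 * 1024 * 1024                   # 1 MB nominal track buffer -> 4 MB total

def stream_region(head_a, head_b, first_track, last_track, emit):
    buffer_pairs = [(bytearray(TRACK_BUF), bytearray(TRACK_BUF)),
                    (bytearray(TRACK_BUF), bytearray(TRACK_BUF))]
    track, active = first_track, 0
    while track < last_track:
        buf_a, buf_b = buffer_pairs[active]
        head_a.read_track(track,     buf_a)   # the two heads read adjacent tracks
        head_b.read_track(track + 1, buf_b)
        emit(bytes(buf_a) + bytes(buf_b))     # concatenate and stream the pair
        track += 2                            # heads leap-frog to the next pair...
        active ^= 1                           # ...and fill the alternate buffer pair
```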

It's possible to de-interleave the data when written, reordering before writing so that alternate sectors are offset by half the write-buffer size (2MB for a 4MB buffer). On reading, directly after the initial seek to the same 4MB segment but offsets zero and 2MB, the heads will read alternate sectors, which can then be interleaved easily and output at full bandwidth. When a head reaches the end of a segment (4MB), it jumps to the next segment and starts streaming again. Some buffering will be required because of the variable track size and geometric head offsets. I'm not sure if either scheme is superior.



Summary:
  • shingled-write drives form a new class of "write whole-region, never update" capacity-optimised drives. As such, they are NOT "drop-in replacements" for current HDD's, but require some tailoring of Operating and File Systems.
  • abandon the notional single-sector organisation for multi-sector variable blocking similar to old 1/2 inch tape.
  • large write-regions (2-8GB) of variable size with small inter-region gaps maximise achievable drive capacity and minimise file system lost-space due to disk and file system size mismatches. If regions are fixed-sector organised, lost space will average around a half-sector, under 1/1000th overhead.
  • Appending regions to disks is the optimal recording method.
  • Optimisation techniques used in Optical Drives, such as continuous spiral tracks and CD-ROM's high resilience Error Correction, can be applied to fixed-sectors and whole shingled-write regions.
  • Integral high-bandwidth Flash Memory write caches would allow optimal region recording at low cost, including read-back and location optimised re-recording.
  • Shingled-writes would benefit from purpose-designed BPM media, but could be usefully implemented with current technologies to achieve higher capacities, though perhaps exposing individual drive variability.
  • shingled-writes and large, "never updated" regions work well with HAMR, BPM, separated read/write heads and dual light-weight read heads.