SMR, Encryption, Captive ASICs & Machine Learning: Challenges of Data Recovery Industry
“Most human beings have an almost infinite capacity for taking things for granted.” – Aldous Huxley, Brave New World
Data storage technology is advancing at an incredible pace. Machine learning, multi-cloud storage, zero downtime and smart storage are all hot topics. Many forget that no matter which storage technology one adopts, the data will ultimately be stored on either a hard drive or some type of NAND flash based device. The eternal quest for more capacity, speed and convenience has forced hard drive manufacturers to consolidate their businesses, while many scientific discoveries and patents are being employed for the first time. Big tends to get bigger, so it was no surprise when hard drive manufacturers started acquiring solid-state drive controller manufacturers. While many believe traditional hard drive manufacturers are simply entering new and lucrative markets, such as those based on NAND flash media, it turns out that the main reason is more technical in nature.
Data recovery service providers began seeing more and more Shingled Magnetic Recording (SMR) drives. Almost 90 percent of these drives are Drive Managed (DM-SMR). From Seagate to Western Digital to HGST and Toshiba, these archive drives are prone to firmware corruption. That is no surprise, since two translations need to be performed, including one very similar to the translator found, you guessed it, in SSD controllers. So DM-SMR drives are, like most current technologies around us, hybrids. It seems that the transition from one technology to another requires an era of products that are, so to speak, only halfway done.
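To make the "two translations" idea concrete, here is a minimal toy sketch in Python. It is not any vendor's real firmware layout, which is proprietary; the class name, the defect list and the append-only log are all assumptions made purely for illustration. Stage one is a dynamic, SSD-like map from host LBA to an internal block, and stage two is a classic hard drive style translation that skips defective physical sectors.

```python
class ToyDmSmrTranslator:
    """Toy two-stage translator (hypothetical, for illustration only)."""

    def __init__(self, defects):
        # Stage 2 data: sorted defective physical sectors that must be skipped.
        self.defects = sorted(defects)
        # Stage 1 data: dynamic map from host LBA to internal block number.
        self.lba_map = {}
        # Next free internal block in the append-only log.
        self.next_free = 0

    def write(self, lba):
        # Shingled tracks cannot be rewritten in place, so every write is
        # appended and the map is repointed at the newest copy.
        self.lba_map[lba] = self.next_free
        self.next_free += 1

    def internal_to_physical(self, block):
        # Slide the candidate sector forward past each defect at or before it.
        physical = block
        for defect in self.defects:
            if defect <= physical:
                physical += 1
            else:
                break
        return physical

    def locate(self, lba):
        # Both stages must succeed; a lost stage 1 map hides intact data.
        block = self.lba_map.get(lba)
        if block is None:
            return None
        return self.internal_to_physical(block)


drive = ToyDmSmrTranslator(defects=[2, 5])
drive.write(100)
drive.write(7)
drive.write(100)          # rewrite: LBA 100 now lives in internal block 2
print(drive.locate(100))  # internal block 2 skips defect 2, so sector 3

drive.lba_map.clear()     # simulate corruption of the dynamic map
print(drive.locate(100))  # None: the data is still on the platter, but unreachable
```

In this toy model the sectors themselves are never damaged; losing the dynamic first-stage map alone is enough to make the data unreachable, which is the kind of firmware-level failure the paragraph above describes.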
Since FIPS was first introduced, we have seen a major shift in data storage technologies. Self-encrypting drives led to encrypted firmware, which led to encrypted or disabled diagnostic ports, and to less and less useful updates coming from the usual reverse engineering companies located in Russia and China. Much as Tesla tries to discourage do-it-yourself car repair by not providing parts, data storage manufacturers, under the “FIPS umbrella”, are trying to lock out third-party data recovery service providers.
However, more encryption is not necessarily a good thing. Most new SSDs are now self-encrypting, although users never asked for such a feature, and in case of failure they are forced to spend thousands of dollars to recover their data. About half of the roughly 15 SSD controller manufacturers are currently captive, meaning they develop their own controllers and do not share them with others. This is why data on some Samsung EVO SSDs is notoriously difficult to recover without the manufacturer's assistance. Projects like OpenSSD shine a light into this especially dark and lucrative game.
It was not long ago, perhaps two years, that machine learning algorithms were proposed to deal with the complexity of the flash translation layer (FTL). Thus far we are seeing just a handful of such controllers, but this technology is expected to catch on no later than 2020. Traditional reverse engineering will not be capable of resolving these issues, and more innovative solutions are needed. Artificial intelligence has great potential in data recovery, as the next generation of data recovery engineers will have to deal with more complex math, adaptive electronics and “thinking machines”. Until then it is our job to lay the foundation, as today’s machine learning designs are already adaptive enough, able to “learn on their own” and, finally, great at distinguishing data patterns.
Who is ready for the challenge?