Archive for the ‘Storage’ Category

True IT – Storage Stories: 7 (Data Wipe on the wrong machine)

October 26th, 2009 2 comments


Yes, you read that right: exactly as the title of this post says, a data wipe was performed on the wrong machine.

The CE (customer engineer) got permission from the customer to perform a data wipe on a storage system. The hosts were retired and the storage was ready to be turned off, but as part of the retirement procedure, customers typically require that all the data on the drives be wiped.

The CE took this opportunity to connect remotely into the customer's storage system, reasoning that the wipe would take several hours to finish, after which he would go onsite to physically power off the storage system. Though he believed he had logged into the machine he intended to, he had actually connected to another one. He logically took the ports down through soft commands and then, within 15 minutes, kicked off the data wipe.

An hour later, a SEV 1 ticket was opened at the customer site for major issues in the storage environment. As it happened, the CE figured that while he was out taking care of the issue, he would also check on the data wipe and physically power off the retired storage subsystem.

On his way to the customer site, he got a call from the vendor's Level 3 support team about what they had just found on this storage system: it was busy performing a data wipe, and there was no way to stop it.

The realization set in for the CE… he had started a data wipe on the wrong storage system without following the correct procedures.

Lesson Learnt

Set a corporate-wide policy on how storage and server teams can perform certain tasks onsite and remotely. Set similar procedures with vendor teams as well.
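One simple safeguard such a policy can mandate is a pre-flight check that refuses to run any destructive command unless the serial number of the system you are actually logged into matches the one on the change ticket. A minimal sketch of the idea (the function name, serial numbers and ticket fields here are hypothetical, not from any vendor tool):

```python
# Hypothetical pre-wipe safety check: refuse to proceed unless the
# serial number reported by the live session matches the change ticket.

def confirm_wipe_target(session_serial: str, ticket_serial: str) -> bool:
    """Return True only when the logged-in system matches the ticket."""
    if session_serial.strip().upper() != ticket_serial.strip().upper():
        raise RuntimeError(
            f"ABORT: connected to {session_serial!r}, "
            f"but ticket authorizes wipe of {ticket_serial!r}"
        )
    return True

if __name__ == "__main__":
    # Matching serials (case-insensitive): the wipe may proceed.
    print(confirm_wipe_target("HK192601234", "hk192601234"))
```

Had a check like this been enforced before the ports were taken down, the session would have aborted the moment the serials failed to match.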

The CE ended up losing his job, and there was no way to recover the data on the storage subsystem except to bring out the backup tapes.

One could have the best storage execution plan in place to manage the storage environment, but is there a way to avert these exceptional cases?

Policy! Policy!! Policy!!!

October 20th, 2009 6 comments

It has been an exciting month; new details are emerging around automated storage tiering, workload distribution, workflow automation, SLAs, QoS, and how policy-based storage management can help solve these challenges. "Policy," as it is known in the business world, or "advanced algorithms," as known in the scientific community, is used to solve complex storage challenges. This has been one of the favorite topics of discussion in the storage blogosphere these days.

There are two distinct camps: one favoring automation, and the other holding that this technology brings no value-add in terms of how storage is utilized and managed today. This game was initially started by Compellent (with its Data Progression technology) about 4 years ago, then joined by Pillar Data Systems, and now other OEMs (including EMC, HDS, IBM) are starting to catch up on policy-based automated storage tiering.

With private clouds in the near future and hybrid clouds (a mesh of private and public clouds) on the horizon, automation, workload distribution, SLAs and QoS will need to be monitored and managed to run IT infrastructures optimally. Policy-based management will create a new wave of storage management and automation, and will act as a principal ingredient of hybrid clouds.

Generation 1 of policy-based storage tiering works within a single storage subsystem.
Generation 2, in the near future, should work across heterogeneous storage subsystems (from the same manufacturer).
Generation 3, over the next year or two, will work across storage platforms irrespective of manufacturer, and will include the entire management stack. These products will be capable of not only managing the storage, but also interacting through policies with the virtualization, networking, application, OS, middleware and other layers of the infrastructure management stack.

We should see a rise of new technologies that will create these external policy-based engines for data movement automation. All infrastructure components, including storage, virtualization, networking, applications, OS and middleware, will provide the necessary APIs for these external engines to interact with, enabling data automation and workflow automation in hybrid clouds (irrespective of manufacturer).
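As a rough illustration of the shape such an external engine might take, here is a sketch of a policy engine that talks to each infrastructure layer through a registered callback standing in for that layer's API (all names, metrics and thresholds here are invented for illustration):

```python
# Sketch of an external policy engine that sits outside any one vendor's
# box and evaluates rules against metrics pulled from each layer's API.

from typing import Callable, Dict, List, Tuple

class PolicyEngine:
    def __init__(self) -> None:
        # layer name -> callable returning that layer's current metrics
        self.sources: Dict[str, Callable[[], dict]] = {}
        # rules: (layer, predicate over metrics, action description)
        self.rules: List[Tuple] = []

    def register(self, layer: str, get_metrics: Callable[[], dict]) -> None:
        self.sources[layer] = get_metrics

    def add_rule(self, layer: str, predicate, action: str) -> None:
        self.rules.append((layer, predicate, action))

    def evaluate(self) -> List[str]:
        """Return the actions whose predicates fire on current metrics."""
        actions = []
        for layer, predicate, action in self.rules:
            metrics = self.sources[layer]()
            if predicate(metrics):
                actions.append(f"{layer}: {action}")
        return actions

engine = PolicyEngine()
engine.register("storage", lambda: {"tier0_util": 0.92})
engine.add_rule("storage",
                lambda m: m["tier0_util"] > 0.85,
                "demote cold LUNs from Tier 0 to Tier 1")
print(engine.evaluate())
```

In a real heterogeneous environment, the registered callbacks would be replaced by whatever APIs each storage, virtualization or networking component exposes; the point is that the policy logic itself lives outside all of them.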

www links

Here are a few articles from the past month related to the topics of Policy, Automated Storage Tiering, Workloads, SLAs and QoS.

Pillar (OEM)


Compellent (Partner Blog)



Your thoughts always welcome!!!


Enhancements to EMC Symmetrix V-Max Systems coming!!

October 14th, 2009 No comments

Enhancements to the EMC Symmetrix V-Max system are possibly around the corner (FY09 Q4).

FAST (Fully Automated Storage Tiering) is due this quarter and will be one of the most awaited software releases from EMC in the enterprise storage space.

Bundled together with FAST will possibly be a new microcode version that enables FAST (and its associated features), along with other expected enhancements.

Though this will be a major software release and functionality upgrade, I don't think it would qualify as a 2nd-generation EMC Symmetrix V-Max system.

But I fully expect EMC to release FAST v2 and a V-Max G2 somewhere around mid-year 2010.

Here are a few new features to possibly expect on the EMC Symmetrix V-Max Systems this quarter.

1. Introduction of FAST v1, which should allow automated data movement within a single Symmetrix V-Max system. Some features of FAST have been discussed on GestaltIT and by Barry Burke (The Storage Anarchist) on his blog.

2. FAST v1 data movement should possibly be policy-driven around factors like time (how old the data is), SLA (promised SLAs), tier (from Tier 0 to Tier 1 to Tier 2) and possibly I/O or IOPS.

3. FAST v1 should allow automated policy-based data movement, or prompt a user for manual intervention before data movement.
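Purely as an illustration of what such a policy evaluation might look like (the thresholds, tier labels and function name below are my own invention, not EMC's), a FAST-v1-style decision could combine age and activity into a tier recommendation, plus a flag for automatic versus operator-approved movement:

```python
# Hypothetical FAST-v1-style policy check: recommend a tier for a chunk
# of data based on its age and recent IOPS, and flag whether the move
# may run automatically or needs operator approval.

def recommend_tier(age_days: int, avg_iops: float, auto_approve: bool):
    if avg_iops > 1000:
        target = "Tier 0"   # hot data on flash
    elif age_days < 90:
        target = "Tier 1"   # recent data on FC drives
    else:
        target = "Tier 2"   # aged data on SATA
    mode = "move automatically" if auto_approve else "prompt operator"
    return target, mode

print(recommend_tier(age_days=200, avg_iops=40.0, auto_approve=False))
```

The real product would of course weigh many more inputs (SLA commitments, busy windows, capacity headroom); this only sketches the policy-in, recommendation-out shape of the feature.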

4. Do not expect FAST v1 to come for free; it will possibly be licensed based on the total number of TBs in the storage subsystem.

5. Expect some integration between the IONIX platform and FAST v1, and possibly some very tight integration between future releases of FAST and IONIX.

6. Expect FAST and IONIX to integrate very tightly with Atmos through APIs and policies. We should expect to see this with FAST v2, not with FAST v1.

7. So when does EMC retire Symmetrix Optimizer? With FAST v1, probably not; with FAST v2, probably yes.

8. 2TB SATA II drives will be introduced (according to a keynote from Joe Tucci in NYC). Though Joe Tucci didn't mention which platforms the 2TB SATA II drives will be available on, the V-Max upgrade would seem the most logical platform.

9. The 2TB SATA II drive upgrade should take the V-Max to roughly 4.8 PB of total raw storage (2,400 drives x 2TB), possibly the single largest enterprise storage subsystem.

10. RapidIO speed upgrade from 2.5 Gbps to 4 Gbps (the interconnect between the engines), delivered either through MIBE hardware (new processors) and/or through microcode upgrades. Edit 10/15/2009 – 12:50 PM: Not sure which RapidIO technology EMC currently uses, since Parallel RapidIO supports 250 MHz to 1 GHz clock speeds while Serial RapidIO supports 1.25 GHz to 3 GHz.

11. Drive connect speed upgrade from 4 Gbps to 8 Gbps.

12. FC and FICON (host connect) port speed upgrade from 4 Gbps to 8 Gbps.

13. Interconnect between two separate Symmetrix V-Max systems (8 engines each), expanding into possibly 16 or 32 (max) engines. The more I think about this concept, the more I feel there are no added benefits to this architecture; rather, it will add more complexity to data management and higher latency. We may not see anything related to interconnects in this upgrade, but remember how the V-Max was initially marketed as scaling to hundreds of engines and millions of IOPS; the only way to achieve that vision is through interconnects. The longer the distance, the more latency on cache and I/O. If interconnects do end up making it into this release, the distance between two Symmetrix V-Max system bays would likely be limited to around 100 feet.

14. To the point above, another way of connecting these systems could merely be federation through external policy-based engines. Ed Saipetch and I have speculated on that concept on GestaltIT.

15. With the use of larger drives, possibly expect a cache upgrade. Currently the Symmetrix V-Max supports 1TB of total cache (512GB usable), which may get upgraded to 2TB total (1024GB usable).

16. A possible new microcode version, 5875, that will help bring features like FAST, SATA II drives and additional cache to the Symmetrix V-Max.

17. Processors: the 4 x quad-core Intel processors on the V-Max engines may not get an upgrade in this release; that should possibly come with FAST v2 as a midlife enhancement next year.

18. Further enhancements related to FCoE support.

19. Upgrade of the iSCSI interface on Symmetrix V-Max engines from 1Gb to 10Gb (now available on the Clariion CX4 platforms).

20. Really do not expect this to happen, but imagine the RapidIO interconnects changing to FCoE. I am really not sure what made EMC go with RapidIO instead of 40 Gbps InfiniBand (which most storage industry folks think is dead) or FCoE for the engine interconnects, but if the engineers at EMC chose RapidIO as the means to connect the V-Max engines, there has to be a reason behind it. Edit 10/15/2009 12:50 PM: Enginuity more or less doesn't care about the underlying switching technology, so a switch from RapidIO to FCoE or InfiniBand could be accomplished without a lot of pain. Though for customers already invested in RapidIO technology (with existing V-Max systems), changing the underlying fabric might mean offline time, which in most cases is unacceptable.

21. Virtual Provisioning on Virtual LUNs, which is currently not supported with the existing generation of microcode on V-Max systems.

22. Atmos is currently running as a beta release, and we should expect a market release this quarter. Should we expect to see integration between V-Max and Atmos? I am not aware of any integration today.

23. A very interesting feature to have in the EMC Symmetrix V-Max would be system partitioning, where you could run half the V-Max engines at a certain microcode level with a certain set of features, while the other half is treated as a completely separate system with its own identity (almost like a mainframe environment). Shouldn't this be a feature of a modular storage array?

24. Symmetrix Management Console (SMC) and VMware integration (like VMware-aware Navisphere and Navisphere-aware VMware). There is already quite a bit of VMware-related support in SMC for provisioning and allocation.

25. Also, a much tighter integration between IONIX, FAST, SMC, Navisphere and Atmos may after all be the secret sauce enabling workflow, dataflow and, importantly, automation. Though do not expect this integration now; it is something to look forward to next year.


Though I am still a bit confused about where FAST will physically sit.

FAST v1 may merely be a feature integrated within the microcode, configurable and driven through policy from the Symmetrix Management Console.

FAST v2 (sometime mid-2010) will support in-box and out-of-box (e.g. Symmetrix to Clariion to Celerra to Centera) data movement through a policy engine.

Ed Saipetch and I have speculated on GestaltIT about how that may work. Though after some thought, I do believe a policy engine could merely be a VM or a vApp sitting outside the physical storage system in the storage environment.

To promote sales of the EMC Symmetrix V-Max systems, Barry Burke notes in his blog post that Open Replicator, Open Migrator and SRDF/DM (Data Mobility) are now available at no cost to customers purchasing a new EMC Symmetrix V-Max system. These are some of the incentives EMC is offering to further promote sales of its latest-generation Symmetrix technology.

It remains to be seen what path of success FAST will carve for the Symmetrix V-Max systems.

True IT – Storage Stories: 6 (Storage Subsystem Move)

October 13th, 2009 2 comments


This true customer story relates to the physical move of a storage subsystem. The need for a data center move can arise for a wide variety of reasons; in this case, the customer was moving all IT assets from one building to another as part of a cost-savings effort (from a large facility to a smaller one).

The customer's annual revenues were around 250 million a year, with several groups within their IT business organization.

The customer was moving all IT data center assets from Building 1 to Building 2. Typically during these moves, the vendors of the IT assets are brought in to verify power shutdown procedures, label all the cables, supervise the move, recertify assets, reconnect cables, power up, run data consistency checks, and so on. This customer decided to make the move without involving all the necessary vendors. The move was scheduled in various phases, with all the primary servers and storage assets being moved in phase 1. Project plans were put in place, resources scheduled, etc. Things were moving along fine with the phase 1 move until it came to one of the storage subsystems: it was too heavy to push across the raised floor, since some storage assets need reinforced raised floor, which is typically not the case throughout the entire data center. Someone associated with the move decided to remove every drive installed in the storage subsystem, pack them in boxes and move them to Building 2. They didn't label the drives as to where they came from. The storage subsystem move to Building 2 finished without any issues.

The customer quickly realized this very big mistake once it was time to power the system back on: no one had any idea which slot each disk drive should be inserted into.

At this crucial point, the vendor was contacted and asked for help. This storage subsystem held all the data used by the customer's CRM and application development teams. Because the subsystem had been moved without the vendor's prior knowledge, the vendor first had to come in and certify the system before they could start working on the issue. The vendor knew a daunting task lay ahead. The onsite CE asked for all the logs taken prior to the system shutdown, which the customer was able to provide; based on slot numbers and disk drive serial numbers, they inserted one drive at a time. Most of the drive serial numbers matched up fine, but some drives had recently been replaced and could not be matched to slot IDs because their serial numbers were not in the logs. The vendor took the extra step of going back through their own service ticketing records to find every drive that had been replaced at this customer site, and in which slot.
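The matching exercise the vendor went through amounts to a simple join between the pre-shutdown logs (slot to serial) and the pile of unlabeled drives (serials only); anything left over has to be chased through service records. A rough sketch with made-up slot IDs and serial numbers:

```python
# Rebuild a drive insertion plan from pre-shutdown logs.
# slot_map: slot id -> drive serial, as captured in the last logs.
# loose_drives: serials of the unlabeled drives pulled from the boxes.

def build_insertion_plan(slot_map: dict, loose_drives: list):
    drives = set(loose_drives)
    # Slots we can repopulate with confidence from the logs.
    plan = {slot: sn for slot, sn in slot_map.items() if sn in drives}
    # Drives with no logged slot (e.g. recently replaced units).
    unmatched = sorted(drives - set(plan.values()))
    return plan, unmatched

slot_map = {"0A-01": "SN1001", "0A-02": "SN1002", "0B-01": "SN1003"}
loose_drives = ["SN1002", "SN1001", "SN9999"]  # SN9999: replaced drive,
                                               # absent from the logs
plan, unmatched = build_insertion_plan(slot_map, loose_drives)
print(plan)       # slots that can be repopulated from the logs
print(unmatched)  # drives needing the vendor's service records
```

In the real incident, the `unmatched` pile is exactly what forced the vendor back to their ticketing system, and why the whole exercise took 12 hours instead of minutes.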

After 12 hours of tedious work matching serial numbers to slot IDs, the system was finally back up and running, with some failed drives. The escalations, vendor meetings, customer meetings and 24 hours of downtime could all have been averted.

Lesson Learnt

Data center moves should be taken very seriously; 99% of the time, plug and play is not an option.

Label every cable that was pulled out of the storage subsystem before the move.

Every IT Asset vendor should be involved in the process.

Systems should be powered off correctly based on manufacturer specifications, and by the manufacturer itself, especially all storage subsystems.

Every system should be certified prior to the move and recertified after the move; these services are typically provided for free by all the major vendors.

Vendors recommend using movers who handle storage subsystems on a daily basis, and it may be a good idea to involve them in this process, as they can take extra precautions during the move.

Backup data on the storage subsystem before the move.

After the move, run data consistency checks on the storage subsystem and from the associated host systems to verify data integrity.