DDN Announces the Biggest Big Data Object System


DataDirect Networks has announced WOS 2.0, which is positioned as the world's fastest object storage system. WOS 2.0 is a fully integrated system including DDN storage, erasure coded data protection mechanisms, a replication strategy for distributing object data, and extensions to the interface options to include S3 APIs and a NAS interface.

Founded in 1998 by French entrepreneurs and based in California, DataDirect Networks (DDN) is best known for its high-performance storage systems for the HPC world.

To meet the demand of customers who must store and archive very large amounts of data, the company designed its object storage system, WOS (Web Object Scaler), in 2009. In its latest version, this distributed storage system can handle up to 1 exabyte of data (by assembling clusters of 128 PB each).

The data is protected by an erasure coding mechanism coupled with replication. The supported access mechanisms are the native API (WOS API), the Amazon S3 API, and the Microsoft SMB protocol (and NFS via clustered gateways). The technology is offered either as an appliance on DDN nodes or in software form, deployable on servers from HPE, Dell, SuperMicro, and others.

The strong point of WOS is DDN's focus on performance, which is reportedly among the best on the market. Compared with a system based on the OpenStack Swift API, DDN estimates that WOS offers 12 times better performance per hard disk in the cluster and 5 times lower write latency. The manufacturer also estimates that it can read 10 times more objects per second per server than a Swift-based system.

In fact, according to DDN, a WOS cluster can provide up to 9.8 TB/s of bandwidth across a namespace and can access up to 256 million files or objects per second. Another strong point of WOS is its integration with the manufacturer's high-performance storage systems via a gateway and tiering mechanism that allows it to host data from a Spectrum Scale (formerly GPFS) cluster, a Lustre cluster, or a large DDN NAS system.

According to Gartner, one of the main weaknesses of the system is that WOS does not implement authentication or encryption mechanisms on its native REST API. Any application that knows the identifier of an object can access it. WOS assumes that the data is protected by the fact that object identifiers are hard to guess, but for Gartner, this approach is not enough.
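The access model Gartner describes can be illustrated with a toy in-memory store. This is only a sketch: `ToyObjectStore` and its methods are hypothetical and are not the WOS API, and the 128-bit random identifier length is an assumption made for the example.

```python
import secrets


class ToyObjectStore:
    """Hypothetical in-memory store; NOT the WOS API."""

    def __init__(self) -> None:
        self._objects: dict[str, bytes] = {}

    def put(self, data: bytes) -> str:
        # Assumption: identifiers are random 128-bit values (32 hex characters).
        object_id = secrets.token_hex(16)
        self._objects[object_id] = data
        return object_id

    def get(self, object_id: str) -> bytes:
        # No credentials are checked: knowing the identifier is the only gate.
        return self._objects[object_id]


store = ToyObjectStore()
oid = store.put(b"sensor frame")

# Any caller that obtains the identifier can read the object.
leaked_id = oid
print(store.get(leaked_id))
```

In this model the identifier doubles as the access credential, so the protection lasts only as long as the identifier stays secret.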

DataDirect Networks WOS appears in Gartner's Magic Quadrant for distributed file systems and object storage (source: Gartner, October 2016), where it is classified as a Niche Player.

Contents
1 Big Data Growth and Challenges
2 DataDirect Networks WOS 2.0
3 WOS 2 Proof Points
4 Summary and Conclusions

Big Data Growth and Challenges
The growth in data is coming from machines, not humans, and the biggest growth is coming from sensors. Data from video, acoustic, pressure, heat, chemical, proximity, speed, and many other sensors is flooding in, in addition to the computer generation of text, tables, and graphics. Organizations that are most affected by data growth operate in the fields of video, surveillance, high performance computing, life sciences, cloud & web content, environmental monitoring, rich media, and government intelligence.

The problems of storing this big data tsunami are the normal ones of writing, indexing, provenance, security, protection, and retrieval, only on a massive scale. In traditional IT, file systems have been built to handle this. The list of file systems has grown extensively, and traditional networked file systems (NAS) have improved dramatically, with global namespaces and better metadata management. However, there is a computer science consensus that these types of systems cannot scale to meet the performance and availability requirements of petabyte- to exabyte-scale stores holding billions or trillions of records, at least not cost-effectively. The world's fastest POSIX file system in 2012 is a DARPA Lustre system, which achieves about 3 billion reads and writes per day.

The biggest big data systems are now object based rather than file based. One key advantage of object storage is that the data and metadata are stored together, which eliminates many of the locking, metadata traversal, directory crawling, and file allocation table issues of traditional file systems. For example, Google claims that Google Megastore achieves 3 billion writes and 20 billion reads per day. As of Q2 2011, the Amazon S3 system stores about half a trillion objects and reads a peak of 290,000 objects per second (25 billion per day).

DataDirect Networks WOS 2.0
DataDirect Networks (DDN) has introduced WOS 2.0 (Web Object Scaler), an object storage system which it claims delivers up to 55 billion small object reads and 25 billion writes per day. This is twice as many as the Amazon S3 system and twenty times the throughput of the DARPA system.
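The comparison can be checked with simple arithmetic. The sketch below uses only the figures quoted in this article; the variable and function names are illustrative, and it assumes the comparison is made on read rates, which the article does not state explicitly.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def per_second(per_day: float) -> float:
    """Convert an operations-per-day figure to operations per second."""
    return per_day / SECONDS_PER_DAY


# Figures quoted in the article, in operations per day.
wos_reads_per_day = 55e9
wos_writes_per_day = 25e9
s3_peak_reads_per_day = 25e9     # quoted as 290,000 objects/s at peak
darpa_lustre_ops_per_day = 3e9   # reads and writes combined

wos_reads_per_sec = per_second(wos_reads_per_day)
wos_writes_per_sec = per_second(wos_writes_per_day)

# Assumption: the ratios compare WOS reads against the other systems.
reads_vs_s3 = wos_reads_per_day / s3_peak_reads_per_day
reads_vs_darpa = wos_reads_per_day / darpa_lustre_ops_per_day

print(f"WOS reads:  {wos_reads_per_sec:,.0f} per second")
print(f"WOS writes: {wos_writes_per_sec:,.0f} per second")
print(f"WOS reads vs. S3 peak reads: {reads_vs_s3:.1f}x")
print(f"WOS reads vs. DARPA Lustre:  {reads_vs_darpa:.1f}x")
```

On these figures the claimed read rate is about 2.2 times the S3 peak and about 18 times the DARPA figure, which is in line with the rounded "twice" and "twenty times" above.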

The components of the DDN system include high density storage appliances, which deliver 2 petabytes per rack and 23 petabytes per cluster. Up to 25 billion objects can be stored in a rack.
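As a rough illustration of what these capacity figures imply, the sketch below derives the average object size of a completely full rack and the number of racks in a maximum-size cluster. It assumes decimal petabytes (10^15 bytes), which the article does not specify, and the names are illustrative.

```python
PB = 10**15  # assumption: decimal petabyte

rack_capacity_bytes = 2 * PB
cluster_capacity_bytes = 23 * PB
max_objects_per_rack = 25e9

# Average object size if a rack holds its maximum object count at full capacity.
avg_object_bytes = rack_capacity_bytes / max_objects_per_rack

# Number of 2 PB racks needed to reach the quoted cluster capacity.
racks_per_cluster = cluster_capacity_bytes / rack_capacity_bytes

print(f"Average object size at full rack: {avg_object_bytes / 1000:.0f} KB")
print(f"Racks per maximum cluster: {racks_per_cluster:.1f}")
```

The result, about 80 KB per object, suggests that the per-rack object limit is sized for small objects.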

The sustained performance and data protection are achieved by combining the traditional DDN 8+2 hardware-enhanced data striping with de-clustered erasure coding. In addition, DDN has introduced an asynchronous replication capability that writes a second copy locally before replicating to a remote site.
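The capacity cost of an 8+2 layout can be worked out with generic k+m arithmetic. The sketch below is not DDN's implementation; it only assumes that "8+2" means 8 data units plus 2 protection units per stripe, and compares that with keeping two full copies.

```python
def usable_fraction(data_units: int, protection_units: int) -> float:
    """Fraction of raw capacity that holds user data in a k+m layout."""
    return data_units / (data_units + protection_units)


def raw_per_usable(data_units: int, protection_units: int) -> float:
    """Raw capacity consumed per unit of user data in a k+m layout."""
    return (data_units + protection_units) / data_units


# 8+2 striping: 8 data units, 2 protection units per stripe.
print(f"8+2 usable fraction:  {usable_fraction(8, 2):.0%}")
print(f"8+2 raw per usable:   {raw_per_usable(8, 2):.2f}x")

# Two full copies modeled as 1 data unit plus 1 extra copy.
print(f"2-copy usable fraction: {usable_fraction(1, 1):.0%}")
print(f"2-copy raw per usable:  {raw_per_usable(1, 1):.2f}x")
```

Under that assumption, an 8+2 stripe can lose any two units while spending 25% extra raw capacity, whereas a second full copy spends 100% extra.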

WOS 2 Proof Points
The most interesting proof point that DDN offers is the work being done with a Department of Defense multi-agency partnership to provide large-scale systems to analyze and distribute high-definition sensor data. Figure 1 shows some examples of the data challenges involved. The DDN solution offered was a geo-distributed WOS object storage system to address the requirements of high speed and low latency.

Summary and Conclusions
DDN has introduced a fully integrated, end-to-end, geo-distributed, scale-out object storage system with a single namespace and a single global cluster interface. According to DDN's figures, it matches or exceeds the performance of the largest bespoke object systems currently deployed. In addition, the data protection mechanisms allow recovery in place and accommodate the introduction of very large disks.

The DDN system has the potential for wide-scale adoption by cloud providers, large organizations and government agencies.
Action Item: This announcement integrates a number of critical technology components to provide a geo-distributed, scale-out object storage system with the potential to address high-performance read/write applications. DDN claims that objects can be retrieved in 40 milliseconds. For the first time, this changes the positioning of object-based systems from archive-only to general purpose. CIOs and CTOs of large organizations and service providers should look long and hard at the WOS 2.0 system architecture as a potential basis for much lower cost internal and external cloud deployments.