DICE: Data Intensive Computing Environment

Real Testing * Real Data * Real Results

DICE Projects

Parallel File System Comparison Project

File system performance is a key component of user application efficiency in high performance computing (HPC) data centers. Storage administrators and data center directors must select the file system that meets their users’ requirements and tune its settings for optimal performance. But which file system is best for each situation? Today, there is no direct basis for comparing parallel file systems, and many decisions are based on vendor marketing. The DICE Program, in partnership with the Department of Energy’s Sandia National Laboratories, is conducting a project to develop a comprehensive benchmarking framework for evaluating file systems. The project also seeks to develop a normalization factor between different file system architectures. HPC community involvement is key to providing a viable tool and to understanding all the factors needed for an accurate comparison, especially for file systems at large scale.

Click here for the project pages.

Power & Cooling Practices and Planning

The explosive growth in the number and size of data centers and in computing power is creating an unsustainable demand for power and cooling. At the current rate, America’s power infrastructure will not be able to support the growth in computing needed to maintain, let alone grow, America’s global competitiveness.

The HPC Research Division of Avetec (DICE) has partnered with the IDC HPC Group since early 2008 to study power and cooling practices and planning for data centers, including high performance computing (HPC) and enterprise centers as well as vendors. Research to date has included a two-phase survey of administrators representing more than 170 data centers and vendors worldwide that explored current actions, attitudes, approaches, plans and expectations related to energy efficiency and the growing problem of data center power and cooling. The studies have produced many interesting findings and clearly validated that both the HPC and IT communities are struggling with power consumption and meeting ever-increasing cooling needs.

To view Phase I and Phase II study results of this important project, go to Power & Cooling Planning and Practices for Data Centers in Reports and Publications.

Information Lifecycle Management

Information lifecycle management (ILM) is a comprehensive approach to managing the flow of data and associated metadata from creation and initial storage to the time when the data becomes obsolete and is deleted. ILM enables more complex criteria for storage management than data age and frequency of access alone.

ILM products automate the processes involved, typically organizing data into separate tiers according to specified policies and automating data migration from one tier to another based on those criteria. As a rule, newer data and data that must be accessed more frequently are stored on faster, more expensive storage media, while less critical data is stored on cheaper, slower media. However, the ILM approach recognizes that the importance of data does not depend solely on its age or how often it is accessed. Users can specify different policies for data that declines in value at different rates or that retains its value throughout its life span.
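The tiering idea described above can be illustrated with a minimal sketch. It is not taken from any particular ILM product: the tier names ("fast", "capacity", "archive"), the thresholds, and the `retention_class` label are all hypothetical and chosen only to show how a policy can look beyond age and access frequency.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    name: str
    age_days: int
    accesses_per_month: int
    retention_class: str  # hypothetical label: "standard" or "critical"


def choose_tier(rec: FileRecord) -> str:
    """Pick a storage tier for a file under an illustrative policy."""
    # Data that retains its value throughout its life span stays on
    # fast storage regardless of age or access frequency.
    if rec.retention_class == "critical":
        return "fast"
    # Newer or frequently accessed data goes to the fast tier
    # (30 days and 10 accesses/month are assumed thresholds).
    if rec.age_days <= 30 or rec.accesses_per_month >= 10:
        return "fast"
    # Older, rarely accessed data moves to cheaper, slower media.
    if rec.age_days <= 365:
        return "capacity"
    return "archive"


records = [
    FileRecord("run_output.dat", age_days=5, accesses_per_month=0, retention_class="standard"),
    FileRecord("old_logs.tar", age_days=400, accesses_per_month=0, retention_class="standard"),
    FileRecord("reference_set.h5", age_days=400, accesses_per_month=0, retention_class="critical"),
]
for rec in records:
    print(rec.name, "->", choose_tier(rec))
```

In this sketch the critical-data rule is checked first, so a policy for data that keeps its value overrides the age-based rules rather than competing with them.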

Data Deduplication

Data deduplication eliminates redundant data and can improve storage utilization. DICE has conducted vendor-specific projects to investigate various data deduplication technologies and identify their capabilities, user and center impacts, security issues and interoperability issues within a single location. Projects for Quantum StorNext and Data Domain supported the Department of Defense (DoD) High Performance Computing Modernization Program’s (HPCMP) Storage Initiative team and its investigation of the program’s current storage architectures across five major centers and two disaster recovery sites to address its challenge of storing, managing and organizing user data.
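As a general illustration of how redundant data can be eliminated, the sketch below uses fixed-size chunking with SHA-256 content hashes. This is an assumed, simplified technique for explanation only; it does not represent how Quantum StorNext or Data Domain implement deduplication, and the function names and 4096-byte chunk size are hypothetical.

```python
import hashlib


def deduplicate(data: bytes, chunk_size: int = 4096):
    """Split data into fixed-size chunks and store each unique chunk once."""
    store = {}   # hash -> chunk bytes (unique chunks only)
    recipe = []  # ordered list of hashes needed to rebuild the data
    for offset in range(0, len(data), chunk_size):
        chunk = data[offset:offset + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)  # keep only the first copy
        recipe.append(digest)
    return store, recipe


def restore(store, recipe) -> bytes:
    """Rebuild the original data from the chunk store and the recipe."""
    return b"".join(store[digest] for digest in recipe)


data = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
store, recipe = deduplicate(data)
stored_bytes = sum(len(chunk) for chunk in store.values())
print(f"original: {len(data)} bytes, stored: {stored_bytes} bytes")
assert restore(store, recipe) == data
```

Fixed-size chunking is the simplest choice for a sketch; it misses duplicates that are shifted by even one byte, which is one reason technologies can differ in the savings they achieve.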

For more information click on one of the following reports:
Data Deduplication Quantum StorNext (pdf)
Data Deduplication Data Domain (pdf)