Point of View | IBM FlashSystem 9100 Combines Performance of NVMe and Capability of Spectrum Virtualize

IBM’s FlashSystem 900 is a high-performing all-flash array. If you want extremely low latency for your application, this is the box for you. However, if you want advanced functionality, such as data replication, sub-volume tiering, or point-in-time copies, you need something more than a FlashSystem 900. One solution might be to place your FlashSystem 900 behind an IBM SAN Volume Controller (SVC) cluster. This is a proven solution, providing SVC’s feature-rich Spectrum Virtualize software stack and FlashSystem’s performance in 6U of rack space.

But what if IBM combined these two products, threw in Non-Volatile Memory Express (NVMe), and put it all in one tidy package? With the new FlashSystem 9100, IBM has done just that. It combines the robust feature set of the Spectrum Virtualize software stack with the extreme performance of FlashSystem, in just a 2U rack footprint. The benefits of this marriage are many; let’s discuss a few of them.

Hardware Performance

The flash drives in the FlashSystem 9100 control enclosure are connected via Non-Volatile Memory Express (NVMe). NVMe is an efficient storage protocol, with less overhead than previous protocols such as SCSI. A future code upgrade will allow the FlashSystem 9100 to provide NVMe over Fibre Channel (FC-NVMe); the hardware is capable today. This is a significant advancement, as it extends the NVMe protocol between the storage system and the host, eliminating the overhead of SCSI on the host side. High SCSI workloads can consume significant CPU resources on the host. NVMe is designed to be significantly more efficient, allowing higher workloads and deeper queues with lower CPU utilization than SCSI.
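To make the queueing difference concrete, here is a rough illustration; the NVMe figures come from the NVMe specification, while the SCSI figure is a typical per-device queue depth rather than a hard protocol limit:

```python
# Rough illustration of NVMe's queueing advantage over SCSI. The NVMe limits
# come from the NVMe specification (up to ~64K I/O queues, each up to 64K
# commands deep); the SCSI figure is a typical per-device queue depth.
SCSI_QUEUES, SCSI_DEPTH = 1, 254          # one shared queue per device
NVME_QUEUES, NVME_DEPTH = 65_535, 65_536  # per the NVMe specification

print(f"SCSI outstanding commands: {SCSI_QUEUES * SCSI_DEPTH:,}")
print(f"NVMe outstanding commands: {NVME_QUEUES * NVME_DEPTH:,}")
# NVMe also lets each CPU core own its own submission/completion queue pair,
# avoiding the lock contention of a single shared queue.
```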

FlashSystem 9100 can be outfitted with either industry-standard NVMe drives or IBM FlashCore Modules. FlashCore Modules have hardware-based compression built in, which imposes no performance penalty. This can provide significant capacity savings with no impact on throughput or latency. Customers can get a savings guarantee from IBM: up to 5:1 if IBM can analyze the customer’s data, and 2:1 with no data analysis.
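The capacity math is straightforward; a quick illustration using a hypothetical 100 TB of usable flash and the two guarantee tiers mentioned above:

```python
# Effective capacity at the compression-guarantee tiers described above.
# The 100 TB usable figure is hypothetical; real savings depend entirely
# on how compressible the data actually is.
usable_tb = 100

for ratio in (2.0, 5.0):  # 2:1 with no analysis, up to 5:1 with analysis
    print(f"{ratio:.0f}:1 compression -> ~{usable_tb * ratio:.0f} TB effective")
```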

Software Capabilities

Hardware is only half of the story. The addition of the Spectrum Virtualize software stack really enhances the appeal of the FlashSystem 9100, and most of the features described below are included in the base software license.

Storage Virtualization

IBM’s sub-volume tiering technology, Easy Tier, is included with FlashSystem 9100. This opens up the opportunity to better utilize the flash capacity installed in the system by adding a second or third tier of storage. Easy Tier can place the hot portions of a volume on flash capacity and cold portions on a slower technology, reserving flash capacity for workloads that can actually benefit from the performance.

In regard to adding storage tiers to FlashSystem 9100, we have two options. First, we can add internal storage to the system. The NVMe drives installed in the FlashSystem 9100 control enclosure are considered “Tier 0” flash by the system. SAS-attached expansion enclosures can be added to the system, housing “Tier 1” flash drives in the traditional 2.5” SSD form factor. However, FlashSystem 9100 is an all-flash system, and installing spinning media is not supported, even in expansion enclosures. It would seem that this limits us to Tier 0 and Tier 1 flash, but that brings us to the second option for additional storage tiers.

Since the full Spectrum Virtualize stack is present, we can virtualize any of approximately 400 supported IBM and non-IBM arrays. Externally virtualized capacity can be any tier of storage, so this is a method for adding enterprise-tier or nearline-tier drives to the system. Spectrum Virtualize’s image mode is also available as an easy way to migrate data from an existing array onto the FlashSystem 9100, as sketched below. While external virtualization is enabled on all FlashSystem 9100s, the feature is licensed by the capacity virtualized, so it carries an additional license cost.
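As a sketch of what an image-mode import might look like from the Spectrum Virtualize CLI (reached over SSH; the cluster address, pool, mdisk, and volume names here are hypothetical):

```python
# Hypothetical sketch: import an existing LUN (seen as mdisk7) in image mode,
# preserving its data, by invoking the Spectrum Virtualize CLI over SSH.
# Cluster address, pool, mdisk, and volume names are all assumptions.
import subprocess

cmd = ("mkvdisk -mdiskgrp migration_pool -iogrp 0 "
       "-vtype image -mdisk mdisk7 -name legacy_vol0")
subprocess.run(["ssh", "superuser@fs9100-cluster", cmd], check=True)
# After import, migratevdisk can move the volume non-disruptively into a
# managed pool so the old array can be retired.
```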

Copy Services: Point-in-Time Copies, Disaster Recovery, & High Availability

DR and HA options are significantly enhanced with the addition of the Spectrum Virtualize software. On its own, a FlashSystem 900 has no real DR or HA functionality. That all changes with the FlashSystem 9100.

FlashSystem 9100 supports the same replication suite as the other Spectrum Virtualize products. Synchronous replication via Metro Mirror and asynchronous replication via Global Mirror or Global Mirror with Change Volumes are available and built into the base FlashSystem 9100 license. As a full-fledged member of the Spectrum Virtualize family, FlashSystem 9100 can replicate with any other device running the Spectrum Virtualize software stack, such as other FlashSystem 9100s, SVC clusters, Storwize arrays, or IBM Spectrum Virtualize for Public Cloud.

Replication from an on-premises FlashSystem 9100 to a cloud-based Spectrum Virtualize for Public Cloud instance provides a powerful way to implement disaster recovery without the overhead of a second facility. In addition, since it’s the same software stack running in the cloud, you get most of the Spectrum Virtualize features you’ve become accustomed to with your on-premises systems.

On the HA front, FlashSystem 9100 supports HyperSwap, providing active-active high availability between two sites separated by up to 300 km. Hosts can access a HyperSwap-protected volume at either site, providing continuous availability even in the scenario of a full site failure.

In addition to the above DR and HA features, FlashSystem 9100 supports FlashCopy, which is IBM’s point-in-time copy technology. FlashCopy is a welcome addition to the FlashSystem lineup. The FlashSystem 900 does not include point-in-time copy technology, while some of its competitors’ products do.

FlashCopy works especially well on all-flash arrays like this. Back when all storage was spinning disk, making extensive use of FlashCopy could add enough overhead to overload the underlying disks; with the performance of flash storage, this danger is significantly reduced. FlashCopy is great for creating test copies of databases, taking backups, cloning boot volumes, and many other uses. FlashSystem 9100 brings this useful feature to the FlashSystem family. And, once again, it’s included in the base license at no additional cost.
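For illustration, a minimal sketch of creating and starting a FlashCopy mapping through the Spectrum Virtualize CLI over SSH; the volume and mapping names are hypothetical, and both volumes must already exist and be the same size:

```python
# Hypothetical sketch: create and start a point-in-time copy via the
# Spectrum Virtualize CLI over SSH. Volume and mapping names are assumptions;
# copyrate 0 would give a snapshot-style (no background copy) map, while
# higher rates copy the whole source in the background.
import subprocess

host = "superuser@fs9100-cluster"
subprocess.run(["ssh", host, "mkfcmap -source prod_db -target prod_db_clone "
                             "-name db_clone_map -copyrate 50"], check=True)
subprocess.run(["ssh", host, "startfcmap -prep db_clone_map"], check=True)
```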

Clustering

FlashSystem 9100 brings with it the ability to cluster multiple systems together into one larger system, managed as a single entity. Just like Storwize and SVC, FlashSystem 9100 can cluster up to eight nodes (four control enclosures) into a single system. A host accessing the clustered system can use capacity housed anywhere on the cluster: control enclosures, expansion enclosures, or externally virtualized storage. And the host’s data can be non-disruptively moved around as needed.

Furthermore, if you are a current Storwize V7000 Gen2 or Gen2+ customer, then you’re in luck! Those systems can be clustered with FlashSystem 9100s, either for expansion purposes or migration purposes.

But Wait, There’s More…

This article touched on some of the ways in which the combination of Spectrum Virtualize software and FlashSystem performance makes for a much more versatile product. FlashSystem 9100 has numerous other features as well, such as thin provisioning, compression, deduplication, volume mirroring, and iSCSI.

For all the reasons described here, and more, the combination of FlashSystem performance and Spectrum Virtualize capabilities makes the new FlashSystem 9100 a very capable product.

About ATS

As new tech emerges offering business advantages, enterprises need support and expertise that will enable them to reap the benefits. Based near Philadelphia, the ATS Group offers agile services aligned with modern IT innovations, providing a critical competitive edge. For almost 20 years, our consultants have worked together to provide independent and objective technical advice, creative infrastructure consulting, and managed support services for organizations of all sizes. Our specialists help clients store, protect, and manage their data while optimizing performance and efficiency. The ATS Group specializes in server and storage system integration, containerized workloads, high performance computing (HPC), software defined infrastructure, DevOps, data protection and storage management, cloud consulting, infrastructure performance management, and real-time monitoring for cloud, on-premises, and hybrid solutions. The ATS Group supports solutions from today’s top IT vendors including IBM, VMware, Oracle, AWS, Microsoft, Cisco, Lenovo, Pure Storage, and Red Hat.

Point of View: IBM Follows New Recipe to Deliver Performance Gains with POWER9 Servers

By Blake Basom | Sr. Systems Engineer (The ATS Group)

IBM has been producing servers based on its line of Power processors for close to 30 years. These servers continue to improve with each new generation, and POWER9 is no exception. The difference this time derives from the ways in which IBM achieved those performance gains.

With POWER9, IBM continues to separate itself from the competition, offering a host of improvements over the POWER8 predecessors. In fact, IBM claims that POWER9 servers offer the following benefits over their competitors:

  • 5x max I/O bandwidth vs. x86
  • 2x high performance cores vs. x86
  • 6x more RAM supported vs. x86
  • 8x more memory bandwidth vs. x86

In this document, I will delve into some of the more significant features and changes of the POWER9 servers and offer my opinion on the benefits. This document is intended to be an overview of POWER9’s new and enhanced features, and while it will be fairly detailed, it is not meant to be a deep dive into any specific technology. The features that we look at will be grouped into the categories of Processor, Memory, and I/O, but first, an overview of the new server line.

POWER9 Server Line

IBM began their launch of POWER9 servers with the AC922 server in late 2017. This server is designed specifically for compute-heavy Artificial Intelligence (AI) and cognitive workloads, rather than for general computing. This system was the first to embed PCIe 4.0, Nvidia NVLink, and OpenCAPI technologies. As a result, IBM claims that the AC922 enables data to move 9.5 times faster than on PCIe 3.0 based x86 systems.

Servers designed for more general workloads began to launch in early 2018, starting with the “Scale Out” versions – one- and two-socket rack-mountable servers in sizes from 1U to 4U. The S914, S922, and S924 PowerVM-based servers are in the traditional mold, supporting AIX, IBM i, and Linux workloads. The L922 server is a Linux-only model, while the H922 and H924 servers have been optimized for SAP HANA. Also offered are the LC921 and LC922 models, which are processor- and storage-dense servers designed for Linux clusters.

Finalizing the POWER9 server line, IBM announced the larger and more powerful “Scale Up” models in August 2018. These enterprise servers offer increased computing capability, along with enhanced security and availability, and simplified cloud management. The 4-socket E950 offers up to 48 processor cores and up to 16TB of memory in a 4U package. Last, but far from least, the E980 represents the top of the POWER9 server line, offering up to 192 processor cores and 64TB of memory.

Processor

The most obvious place to start when looking at the POWER9 servers is the POWER9 processor itself. In years past, performance improvements were achieved in part by improving the fabrication process to reduce transistor sizes, allowing the clock to run faster. While the fabrication process continues to improve and transistor sizes continue to shrink, server manufacturers are no longer greatly increasing clock speeds, so performance improvements must be achieved through other means. In the case of the POWER9 processor, the gains come primarily from improving processor pipeline efficiency, increasing the data flow between components, and allowing faster access from external sources.

The POWER9 processor is fabricated via a highly advanced 14nm finFET Silicon-On-Insulator lithography process (using a 17-layer metal stack), an improvement from the 22nm process used for POWER8. This allowed IBM to pack a total of 8 billion transistors into each chip, compared to 4.2 billion in POWER8. Clock speeds run up to 4 GHz, similar to POWER8.

The POWER9 chip is a more modular design, and performance was improved by shortening the pipeline, improving fixed-point and floating-point operations, and improving instruction management. These changes allow more instructions to be completed per clock cycle, leading to performance improvements without raising the clock speed. Increasing the amount of on-chip memory (particularly L3 cache) helps as well, and on-chip switching bandwidth of over 7 TB/s allows data to move in and out of the processor cores at 256 GB/s in the SMT8 model.

You may be thinking, “SMT8 model? Aren’t all POWER9 chips SMT8, as the POWER8 chips were?” Actually, no. IBM is producing two main variants of the POWER9 processor – the PowerVM based general purpose servers will use a full SMT8 processor (which allows up to 8 threads per core), while certain non-PowerVM based Linux models will use SMT4 versions of the POWER9 processors (which only allow up to 4 threads per core). This may sound like a step backward, but it is a result of IBM listening to its customers and partners. Basically, IBM learned that a segment of the Linux market desired the reduced SMT version, which allowed more cores to be packaged in a single chip.  In fact, the SMT4 versions will allow up to 24 cores per die, while the SMT8 models only offer up to 12 cores.

Speaking of SMT4 vs. SMT8, what is the best multi-threading mode in which to run the new processors? When IBM introduced SMT8 with POWER8 processors, there were some performance problems initially. The problems weren’t necessarily severe, but running in SMT8 mode didn’t necessarily equate to much improvement in processing power, and in some cases IBM was recommending running POWER8 servers in SMT4 mode. This issue has seemingly been fixed in POWER9, with SMT8 being the preferred mode for most applications, offering a distinct performance boost over running in SMT4 mode (under most circumstances).
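If you want to verify which mode suits your workload, SMT can be toggled on the fly. A minimal sketch, assuming a Linux on Power LPAR with powerpc-utils installed (on AIX the equivalent is `smtctl -t <threads> -w now`); the placeholder loop should be replaced with a real multi-threaded workload, since SMT effects only appear under concurrent load:

```python
# A minimal sketch of cycling SMT modes for testing, assuming a Linux on
# Power LPAR with the powerpc-utils package installed (run as root). On AIX,
# the equivalent command is `smtctl -t <threads> -w now`. The toy loop below
# is only a placeholder; substitute your real multi-threaded workload.
import subprocess, time

def set_smt(threads: int) -> None:
    # Changes the partition's SMT mode on the fly, no reboot required
    subprocess.run(["ppc64_cpu", f"--smt={threads}"], check=True)

def placeholder_workload(n: int = 5_000_000) -> float:
    start = time.perf_counter()
    total = 0
    for i in range(n):
        total += i * i
    return time.perf_counter() - start

for mode in (4, 8):
    set_smt(mode)
    print(f"SMT{mode}: placeholder run took {placeholder_workload():.2f}s")
```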

IBM also introduced Workload Optimized Frequency with POWER9, where the processor can dynamically change clock speeds based on the running workload, allowing for energy savings when the workload is low, with the ability to quickly ramp up when needed. This feature can be controlled through processor mode settings and can be changed without a reboot.

All of that sounds nice, but what does it really mean? How much faster are the POWER9 processors? Well, of course it varies by server model and workload, but in general you can expect a 30-50% improvement over comparable POWER8 models, along with 20-30% improvement in price/performance ratio (more bang for your buck).

Note that when migrating workload from POWER8 to POWER9, you will likely want to reduce the number of virtual CPUs, which may improve performance while reducing software licensing costs. Each case will be unique, so testing a specific workload with different numbers of VCPUs will reveal the optimal allocation. Likewise, running tests in both SMT4 and SMT8 modes will show which threading mode is best.
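As a rough starting point, here is the arithmetic, assuming the conservative end of the quoted range (the LPAR size is hypothetical):

```python
# First-pass VCPU sizing for a POWER8 -> POWER9 migration, assuming the
# conservative end of the quoted 30-50% per-core improvement. The LPAR size
# is hypothetical; always validate with real workload tests as noted above.
import math

p8_vcpus = 16          # current POWER8 virtual processors (hypothetical)
per_core_gain = 1.30   # conservative end of the quoted range

p9_vcpus = math.ceil(p8_vcpus / per_core_gain)
print(f"Starting point: {p9_vcpus} VCPUs on POWER9 (was {p8_vcpus})")
# 16 / 1.3 = 12.3 -> 13 VCPUs; fewer VCPUs can also trim per-core licensing.
```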

Memory

The POWER9 servers use top-of-the-line DDR4 memory (some of the later POWER8 models used this as well). The SMT4 processor models allow for direct-attached memory DIMMs, while the SMT8 versions allow more memory to be attached via buffers. The SMT4 models offer up to 120 GB/s of sustained memory bandwidth, while the SMT8 models offer up to 230 GB/s of sustained bandwidth with theoretical peaks of 340 GB/s. Memory capacity varies by model, up to 4 TB for Scale Out models and up to 64 TB for Scale Up models.

I/O

Most modern servers are not self-contained, meaning they are connected to external devices for storage, networking, and increasingly for hardware acceleration devices. With the blazing speeds of current processors and memory, the computing bottleneck has shifted to Input/Output devices. IBM has spent a lot of effort in this area with the POWER9 servers, offering a number of options to improve the speed and bandwidth to attached devices.

The latest generation of PCIe (Gen4) is available in POWER9 servers, offering up to twice the bandwidth of PCIe Gen3 (note that Gen3 adapters will work in Gen4 slots, albeit with Gen3 bandwidth). 48 lanes of PCIe Gen4 add up to 192 GB/s of duplex bandwidth to attached devices. In addition to traditional PCI adapters for network and SAN connectivity, some PCIe Gen4 slots are enabled for CAPI 2.0 devices such as ASICs and FPGAs. CAPI 2.0 using PCIe Gen4 offers 4x the bandwidth of CAPI 1.0 on POWER8.

Additional connectivity is provided by a 25 Gb/s Common Link – 48 lanes provide up to 300 GB/s of bandwidth for devices attached via NVLink 2.0 or OpenCAPI 3.0 (not available on PowerVM-based servers). NVLink can be used for high-speed GPU attachment, while OpenCAPI is an emerging open hardware standard, backed by a consortium of industry heavyweights, that will be used to connect components like high-speed network and SAN adapters, as well as additional memory and GPU accelerators.
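Both of those figures follow directly from the per-lane rates; a quick sanity check (per-lane payload figures are approximate):

```python
# Sanity check of the quoted duplex bandwidth figures. PCIe Gen4 moves
# roughly 2 GB/s of payload per lane per direction (16 GT/s, 128b/130b
# encoding); the Common Link runs at 25 Gb/s per lane per direction.
def duplex_gb_s(lanes: int, gb_per_lane_per_dir: float) -> float:
    return lanes * gb_per_lane_per_dir * 2  # count both directions

print(f"PCIe Gen4, 48 lanes:       {duplex_gb_s(48, 2.0):.0f} GB/s")   # ~192
print(f"25G Common Link, 48 lanes: {duplex_gb_s(48, 25/8):.0f} GB/s")  # ~300
```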

POWER9 provides support for the next generation of SR-IOV Ethernet adapters, with increased port speeds of 10Gb, 25Gb, 40Gb, and 100Gb. Additional enhancements allow more VFs per port (a target of 60 VFs per port / 120 VFs per adapter for 100Gb adapters), as well as vNIC and vNIC failover support for Linux.

Server I/O performance is also improved by the on-chip acceleration capabilities of the POWER9 processors themselves, which speed up the common but intensive tasks of compression/decompression and encryption/decryption.

Some POWER9 servers also support internal Non-Volatile Memory Express (NVMe) devices. These bootable disks are meant primarily for operating systems, offering high-speed, low-latency access in a read-mostly usage pattern.

Conclusion

When you put it all together, it is clear that IBM put an emphasis on overall server performance with their line of POWER9 servers, rather than just trying to crank out the fastest processor that they could. By focusing on I/O enhancements, and partnering with great companies across the industry, they have achieved some impressive results.  But they didn’t forget about the processor either – the POWER9 processor improved upon an already industry leading standard.  From general purpose Scale Out servers, all the way up to the enterprise class Scale Up servers, IBM has provided a robust line of servers to meet the UNIX computing needs of users around the globe. And with certain models customized for specific technologies, users can expect optimized performance for their specific needs. As a longtime user, administrator, and consultant for IBM Power servers, I think that POWER9 represents another impressive step forward for IBM, offering endless possibilities for world class computing.



Point of View: IBM Spectrum Scale 5.0 and Above – New Features and Functionality

By Prasad Surampudi | Sr. Systems Engineer (The ATS Group)

Up until a few years ago, only performance, reliability and ease of administration were considered major factors when selecting a clustered file system for enterprise data storage. But as cloud-based systems and services are gaining more attention, companies are looking for a complete data management solution that can leverage cloud services to cost effectively scale and support explosive growth of data.

Today’s clustered file systems must be highly scalable across different server architectures, whether they reside on-premises, off-premises, or both. They also need to be highly available, with minimal downtime during maintenance, and should be able to scale across multiple tiers of storage. The file system should also be able to archive legacy and less frequently accessed data to cost-effective storage. It should protect, secure, and encrypt data to prevent unauthorized access, and should provide fine-grained access control in terms of ACLs.

Apart from the above, the file system should also provide consistent performance, in terms of throughput and IOPS, across a wide variety of data, ranging from small files to the very large data sets used by big-data and analytical applications. It should also support various data-access protocols such as POSIX, NFS, CIFS, and Object.

IBM Spectrum Scale is a clustered file system that meets all of the above requirements. Spectrum Scale is a highly scalable, secure, high-performance file system for large-scale enterprise data storage. It is widely used across the world in industries including financial analytics, healthcare, weather forecasting, and genomics.

Spectrum Scale has a development history of more than twenty years, dating back to 1998. Earlier versions of Spectrum Scale were known as General Parallel File System (GPFS); IBM rebranded GPFS as Spectrum Scale starting with version 4.1.1.

IBM added many new features and functions with the goal of delivering a complete software-defined storage solution, rather than just a clustered file system shared across several nodes. In 2017, IBM released Spectrum Scale 5.0 after a significant development effort to achieve the performance and reliability requirements set forth by the US Department of Energy’s CORAL supercomputing project.

The purpose of this document is to briefly discuss some of the exciting new features and functions of Spectrum Scale 5.0 and understand how they can be leveraged to meet today’s demanding business requirements.

New Features and Functionality

Let’s look at some of the new features of Spectrum Scale 5.0. They have been categorized based on the Spectrum Scale function group.

Core GPFS Functionality Changes

Variable Sub-Block Size
Earlier versions of Spectrum Scale (GPFS) had a fixed 32 sub-blocks per file system block, regardless of block size. With Spectrum Scale 5.0 and above, the number of sub-blocks depends on the file system block size chosen.

File System Block Size          Number of Sub-blocks    Sub-block Size
64KiB / 128KiB / 256KiB         32                      2KiB / 4KiB / 8KiB
512KiB / 1MiB / 2MiB / 4MiB     64 / 128 / 256 / 512    8KiB
8MiB / 16MiB / 32MiB            512 / 1024 / 2048       16KiB

Starting with Version 5.0, if no block size is specified, Spectrum Scale file systems are created with a default block size of 4MiB and a sub-block size of 8KiB.
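The mapping in the table is easy to express in code; a small illustration that reproduces the table’s bands (sizes in bytes):

```python
# Reproduces the sub-block table above: the sub-block size is fixed per
# block-size band, and the sub-block count is block size / sub-block size.
KIB, MIB = 1024, 1024 * 1024

def sub_block_size(block_size: int) -> int:
    if block_size <= 256 * KIB:
        return block_size // 32   # pre-5.0 behavior: always 32 sub-blocks
    if block_size <= 4 * MIB:
        return 8 * KIB            # 512KiB - 4MiB blocks
    return 16 * KIB               # 8MiB - 32MiB blocks

for bs in (256 * KIB, 4 * MIB, 16 * MIB):
    sb = sub_block_size(bs)
    print(f"block {bs // KIB}KiB -> {bs // sb} sub-blocks of {sb // KIB}KiB")
```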

NSD Server Priority
The preferred (primary) NSD server of an NSD can now be changed dynamically, without unmounting the file system.

File System Rebalancing
With version 5.0, Spectrum Scale uses a lenient round-robin algorithm, which makes rebalancing much faster than the strict round-robin method used in earlier versions.

File System Integrity Check
While doing a file system integrity check, if the mmfsck command has been running for a long period of time, another instance of mmfsck can be launched with the --stats-report option to display the current status from all the nodes that are running the mmfsck command.

Cluster Health
Spectrum Scale cluster health check commands have been enhanced with options to verify file system, SMB and NFS nodes.

IBM Support
The mmcallhome command has a new option, --pmr, which can be used to specify an existing PMR number for data upload.

Installation Toolkit

The Spectrum Scale installation toolkit was introduced with version 4.1, and many enhancements have been made in Version 5.0. The toolkit now supports deploying protocol nodes in a cluster that uses Spectrum Scale Elastic Storage Server (ESS). The installation toolkit also supports configuring Call Home and File Audit Logging. Deployment of Ubuntu 16.04 LTS nodes as part of the cluster is also supported by the installation toolkit.

Encryption and Compression

The file compression feature was added in Spectrum Scale 4.2 and has been enhanced in Spectrum Scale 5.0 to optimize read performance. Local Read-Only Cache (LROC) can now be used for storing compressed files. Spectrum Scale 5.0 also simplifies IBM SKLM configuration for file encryption.

Protocol Support

Starting with Spectrum Scale 4.1.1, data in IBM Spectrum Scale can be accessed using a variety of protocols such as NFS, CIFS, and Object. The packaged Samba version has been upgraded to 4.0. Spectrum Scale 5.0 also supports the option to use a Unix primary group in AD. You can also modify NFS exports dynamically without impacting connected clients. CES protocol node functionality is now supported on Ubuntu 16.04 and above.

File Audit Logging

Spectrum Scale File Audit Logging logs all file operations, such as create, delete, and modify, in a central place. These logs can be used to track user access to the file system.

AFM

Files can now be compressed in AFM and AFM-DR filesets. Spectrum Scale 5.0 also made improvements to load balancing across AFM gateways. Information Lifecycle Management for snapshots is now supported for AFM and AFM-DR filesets, and AFM and AFM-DR filesets can be managed using the IBM Spectrum Scale GUI.

Transparent Cloud Tiering (TCT)

TCT now supports remote mounted file systems. Clients can access tiered files on a remotely mounted file system.

Big-Data Analytics

Spectrum Scale Big-Data Analytics is now certified with Hortonworks Data Platform 2.6 on both POWER8 and x86 platforms, and is also certified with Ambari 2.5 for rapid deployment.

Spectrum Scale GUI Changes

The Spectrum Scale GUI was introduced in version 4.1, and IBM made significant upgrades to it in IBM Spectrum Scale 5.0. Call Home, monitoring of remote clusters, file system creation, and integration of Transparent Cloud Tiering are some of the significant features added in version 5.0.

Use Cases

With the enhancements included in version 5.0, Spectrum Scale has truly become an enterprise class file system for the modern cloud era. Let’s see how we can leverage some of the new features and functions.

File System Block Size

File system block size is a critical parameter that needs to be considered for optimal performance before a file system is created and loaded with large amounts of data. The wide range of file system block sizes and sub-block sizes offered by Spectrum Scale makes it possible to store files of different sizes in a single file system and still get good throughput and IOPS for various sequential and random workloads.

While a larger block size helps improve throughput performance, having a variable sub-block size and number of sub-blocks enables you to minimize file system fragmentation and use the storage effectively.

But keep in mind that only new file systems created with Spectrum Scale 5.0 and above can take advantage of the variable sub-block size enhancement.

NSD Server Priority Change

Today’s businesses expect their servers, storage, file systems, and applications to run with minimal downtime. Dynamically changing the NSD server priority of each NSD, without unmounting the file system on all NSD servers, helps minimize downtime in several scenarios, such as NSD server retirement.
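A minimal sketch of what that looks like in practice, assuming Spectrum Scale 5.0+ and hypothetical NSD and server names (the first server listed becomes the preferred server):

```python
# Sketch: promote nsd-server-b to preferred server for NSD nsd01 while the
# file system stays mounted (Spectrum Scale 5.0+). NSD and server names are
# hypothetical; the server list is given in the new priority order.
import subprocess

nsd, servers = "nsd01", ["nsd-server-b", "nsd-server-a"]
subprocess.run(["mmchnsd", f"{nsd}:{','.join(servers)}"], check=True)
```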

File System Rebalancing

Network Shared Disks (NSDs) must be added or removed to expand or shrink a Spectrum Scale file system. For optimal performance, the data in the file system needs to be rebalanced across all NSDs after an NSD addition or removal. Most of the time, system administrators let Spectrum Scale do the rebalance in the background rather than forcing it at the time the NSD is added or removed, and file system performance is not optimal until the NSDs are balanced. With Spectrum Scale 5.0, rebalancing runs faster thanks to the lenient round-robin algorithm replacing strict round-robin.

Protocol Support

Within IT organizations, it’s fairly standard for applications to run on a variety of platforms such as AIX, Linux, and/or Windows. Spectrum Scale protocol support for industry-standard protocols like CIFS, NFS, and Object allows users to access stored data in the most efficient way.

Protocol support enables businesses to consolidate all of their enterprise data into a global namespace with unified file system and object access, avoiding multiple copies of data. With Spectrum Scale 5.0, customers can now configure servers running Ubuntu 16.04 as protocol nodes, in addition to Red Hat Enterprise Linux.

Call Home

Configuring the call home feature in Spectrum Scale 5.0 enables IBM to detect cluster and file system issues proactively and enables automatic sending of logs and other required data for timely resolution. This helps customers minimize downtime and improves the reliability of the cluster.

Installation Toolkit

With the installation toolkit, clusters can be configured and deployed seamlessly by defining the cluster topology in a more intuitive way. The installation toolkit performs all necessary pre-checks to make sure the required package dependencies are met, automatically installs the Spectrum Scale RPMs, configures the cluster, and configures Protocols, Call Home, File Audit Logging, and more. It simplifies the installation process and eliminates many manual configuration tasks.

Complex tasks such as balancing the NSD servers, installing and configuring Kafka message brokers for File Audit Logging, and setting up Spectrum Scale AD/LDAP authentication are much easier and simpler with the installation toolkit.

Compression

With explosive rates of data growth, organizations are always looking for ways to reduce storage costs.

Spectrum Scale file compression, introduced in Version 4.2, addresses this need by compressing legacy and less frequently used data. File compression is driven by Spectrum Scale ILM policies and typically provides a compression efficiency of 2:1, and up to 5:1 in some cases. Compression not only reduces the amount of storage required, but also improves I/O bandwidth across the network and reduces cache (pagepool) consumption.

With Spectrum Scale 5.0, file compression supports the zlib and lz4 libraries. zlib is primarily intended for cold data, whereas lz4 is intended for active data; compression using lz4 favors read-access speed over space savings.
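As an illustration, a hedged sketch of a policy-driven compression run; the file system name, source pool, and 180-day threshold are hypothetical, and the rule follows the documented MIGRATE ... COMPRESS form:

```python
# Sketch: compress files untouched for ~180 days using an ILM policy.
# File system name, source pool, and threshold are hypothetical; 'z' (zlib)
# suits cold data, while COMPRESS('lz4') would favor read speed for active data.
import subprocess, tempfile

POLICY = """
RULE 'compress_cold' MIGRATE FROM POOL 'system' COMPRESS('z')
  WHERE CURRENT_TIMESTAMP - ACCESS_TIME > INTERVAL '180' DAYS
"""

with tempfile.NamedTemporaryFile("w", suffix=".pol", delete=False) as f:
    f.write(POLICY)
    policy_file = f.name

# Use -I test first for a dry run; -I yes applies the compression immediately.
subprocess.run(["mmapplypolicy", "gpfs0", "-P", policy_file, "-I", "yes"],
               check=True)
```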

Though regular file compression and Object compression use the same underlying technology, keep in mind that Object compression is available in CES environments, whereas file compression is available in non-CES environments.

Encryption

File Encryption was introduced by IBM in Spectrum Scale version 4.1 and is available in Spectrum Scale Advanced and Data-Management editions only.

The data is encrypted at rest; only data is encrypted, not metadata. Keep in mind that Spectrum Scale encryption protects against storage device misuse and attacks by unprivileged users, but not against deliberate malicious acts by cluster administrators.

Spectrum Scale 5.0 enables encryption of files stored in local disk (LROC) and simplifies the SKLM configuration.

Spectrum Scale encryption can be leveraged wherever organizations need to store PII, business-critical, or other confidential data. It can also be used by organizations that are required to meet federal and other security and compliance standards such as GDPR.

Spectrum Scale encryption is also certified as Federal Information Processing Standard (FIPS) compliant.

File Audit Logging

File Audit Logging was introduced in Spectrum Scale 5.0. File Audit Logging addresses the need to track the access of files for auditing purposes. It’s not an easy task to track individual file access in large scale clustered file systems with petabytes of data and billions of files that are accessed by hundreds of applications and thousands of users. Spectrum Scale File Audit Logging is designed to be highly scalable as the file system grows.

File Audit Logging does not need to be installed and configured on each and every node in the cluster, as is required by some operating systems’ audit logging processes. It only needs to be configured on a minimum of three quorum nodes in the Spectrum Scale cluster and can be scaled out to other nodes as required.

File Audit Logging also supports tracking of file access using NFS, CIFS and Object Protocols.

Addressing GDPR requirements using Spectrum Scale

IBM Spectrum Scale allows organizations to avoid multiple data islands as it provides a single name space for both structured and unstructured data. This helps to have a single point of control when protecting and managing all data that is subject to GDPR compliance.

Spectrum Scale Encryption helps to secure personal data while at rest to meet GDPR security requirements.

IBM Spectrum Scale supports industry-standard Microsoft AD and LDAP directory server authentication, along with rich ACL support, to comply with GDPR Right of Access policies.

Active File Management (AFM)

Active File Management can be used to transfer and cache data over a WAN between two Spectrum Scale clusters: one cluster acts as the home cluster that stores all data, and the other acts as a cache cluster, which can cache all of the home cluster’s data or only a limited amount of it. AFM can also be implemented as a disaster recovery solution with AFM-DR.

With Spectrum Scale 5.0, storing compressed data with Spectrum Scale file compression is supported in AFM and AFM-DR filesets. Load balancing improvements and ILM support for AFM and AFM-DR fileset snapshots have also been added.

Transparent Cloud Tiering (TCT)

As the name implies, Transparent Cloud Tiering is another way to reduce high-performance storage costs by transparently migrating aged data to less expensive cloud storage. This makes more room to ingest new data into the high-performance storage tier. Spectrum Scale ILM policies can be used to scan the file system metadata, identify files that have not been accessed for months or years, and tier them to cloud storage. Since only file data gets migrated, and not metadata, the migration process is transparent to users and applications. The data gets pulled from cloud storage back to local file system storage when users or applications access it.

Spectrum Scale Transparent Cloud Tiering was introduced in Version 4.2 and is available with the Data Management edition only.

With Spectrum Scale 5.0, TCT supports file systems mounted from remote clusters. TCT-enabled filesets can now use different containers, and multiple cloud accounts and containers are also supported.

Big-Data Analytics

As companies have started leveraging social media feeds and other unstructured data for business analytics, today’s file systems need to support both structured and unstructured data under a single namespace. Spectrum Scale introduced the File Placement Optimizer (FPO) architecture and a Hadoop plug-in starting with Version 4.1 to support big-data applications and frameworks. Later versions of Spectrum Scale enhanced Hadoop Distributed File System (HDFS) transparency. Spectrum Scale is certified on Hortonworks Data Platform (HDP) 2.6.5, provided by Hortonworks, a major big-data framework distributor.

Spectrum Scale also supports Ambari for easy and quick deployment of large-scale Hadoop clusters.

Spectrum Scale GUI Changes

Ease of deployment and administration is one of the major requirements for large clustered file systems that are deployed across hundreds of servers. The Spectrum Scale GUI, introduced in version 4.1, simplifies the installation, configuration, and administration of large-scale clusters. IBM has made many significant enhancements to the Spectrum Scale GUI to make it more intuitive for routine system administration and monitoring tasks.

The Spectrum Scale GUI can now monitor cluster performance, cluster health, individual node health, and SMB, NFS, and Object protocol health, among other enhancements. In certain cases, routine actions can be applied to fix errors with a simple click from the Spectrum Scale GUI.

The Spectrum Scale 5.0 GUI now supports monitoring of remote clusters and Transparent Cloud Tiering, and also provides IBM call home support for cluster, node, and file system issues.

Summary

IBM Spectrum Scale continues to add and align features into a complete data management solution. This solution meets the market demand for a highly scalable solution across different server architectures, on-premises or in the cloud. It supports new-era big data and artificial intelligence workloads along with traditional applications, while ensuring security, reliability, and high performance. The IBM Spectrum Scale solution also exhibits the stability, maturity, and trust that come with a development history of more than twenty years.