Storage Technology Glossary

The storage world can sometimes feel like it has its own language, and the acronyms can be daunting. We’d like to help: this page is designed to serve as a quick-reference guide for the most common terms and acronyms. If you see a term missing, or something that would be helpful for the community, please let us know via Twitter and we’ll do our best to add it!

Term: Definition
All-Flash Arrays: An all-flash array, also referred to as an AFA, is an enterprise storage system containing multiple flash drives instead of spinning hard disk drives.
Application-Aware Provisioning: Application-aware provisioning within a storage array's user interface enables end users to quickly optimize a volume's configuration for a specific application. The administrator simply selects the use case (database, server virtualization, VDI, etc.) and the array chooses the most appropriate block size, compression algorithm, deduplication setting and so on.
Bit: A bit is a single numeric value, either '1' or '0', that represents a single unit of digital information.
Bit Error Rate: Also known as BER or UBER (uncorrectable bit error rate), bit error rate is the number of bits that exhibited unrecoverable errors, expressed as a fraction (or percentage) of the total number of bits read.
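As a rough illustration of the bit error rate entry, the ratio can be computed directly. This is a minimal sketch; the function name and the sample figures are hypothetical and not taken from any vendor specification.

```python
def bit_error_rate(error_bits: int, total_bits_read: int) -> float:
    """Return the fraction of bits read that had unrecoverable errors."""
    if total_bits_read <= 0:
        raise ValueError("total_bits_read must be positive")
    return error_bits / total_bits_read


# Hypothetical figures: 3 unrecoverable bit errors in 10**15 bits read.
rate = bit_error_rate(3, 10**15)
print(f"{rate:.1e}")  # prints 3.0e-15
```

A lower value is better: it means fewer unrecoverable errors per bit read.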
Block: A block is a unit of data storage often used in storage area network (SAN) architectures, like those using Fibre Channel, iSCSI or FCoE. Blocks are composed of bits and bytes and usually span a certain length of a storage disk drive. Block storage may be presented as volumes or raw volumes that act as individual hard drives. They are typically controlled by server operating systems.
Byte: A byte is a unit of data composed of 8 bits.
Cache hit rate: Cache hit rate (or ratio) refers to how often a data read request is able to retrieve the data it needs from the memory cache of a system. Data stored in cache is often retrieved more quickly than data stored elsewhere in a system. Cache hit rate is often expressed as a percentage, such as a 90% cache hit rate. This means 90% of read requests were served from cache.
Caching: Caching refers to the act of storing data in a more quickly retrievable memory "cache" area of a system.
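The caching and cache hit rate entries can be illustrated together with a small sketch. It assumes a least-recently-used (LRU) eviction policy and a dictionary standing in for slower backing storage; both are illustrative choices, and real storage systems may use other policies.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny read cache that counts hits and misses (illustrative only)."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store  # stands in for slower storage
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return self.entries[key]
        self.misses += 1
        value = self.backing_store[key]    # slow path: fetch from storage
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return value

    def hit_rate_percent(self):
        total = self.hits + self.misses
        return 100.0 * self.hits / total if total else 0.0


store = {i: f"block-{i}" for i in range(4)}
cache = LRUCache(capacity=2, backing_store=store)
for key in [0, 1, 0, 1, 2, 0]:
    cache.read(key)
print(cache.hits, cache.misses)  # prints 2 4
```

Here two of the six reads were served from cache, so the hit rate is roughly 33%.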
Checksum: A checksum is a method of error detection in which a small value computed from the data being transmitted is included as part of the transmission. The system receiving the data recomputes the value and compares it with the one it received to verify whether the data arrived at its destination intact.
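One widely used checksum is CRC-32, which Python exposes as `zlib.crc32`. The sketch below shows the general send-and-verify pattern; the helper names are hypothetical, and CRC-32 is only one of many possible checksum algorithms.

```python
import zlib


def checksum_of(payload: bytes) -> int:
    """Compute a CRC-32 checksum over the payload."""
    return zlib.crc32(payload)


def verify(payload: bytes, expected_checksum: int) -> bool:
    """Recompute the checksum on the receiving side and compare."""
    return zlib.crc32(payload) == expected_checksum


sent = b"storage block"
sent_checksum = checksum_of(sent)  # transmitted alongside the data

print(verify(sent, sent_checksum))              # prints True
print(verify(b"storage clock", sent_checksum))  # prints False (corrupted)
```

A checksum detects corruption but does not repair it; error correction is covered under Error Correction Code (ECC).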
CIFS: CIFS stands for Common Internet File System. It is a file-sharing protocol that allows users and applications to request network server files or services. CIFS is based on an enhanced version of the Microsoft Server Message Block (SMB) protocol for Internet and intranet file sharing.
Cloud analytics: Cloud analytics refers to data analytics-related services, applications or processes that are offered or implemented from a cloud environment (i.e. public cloud or private cloud). According to Gartner, this might involve cloud-based data sources, data models, processing applications, computing power, analytic models and sharing or storage of results.
Consumer-grade Multi-Level Cell (cMLC): Consumer-grade Multi-Level Cell (cMLC) refers to a type of flash memory technology typically used in phones, cameras or USB sticks. Also known simply as MLC, it's often compared against enterprise-grade MLC (eMLC). cMLC has a shorter lifespan than eMLC. For instance, cMLC only provides 3,000-10,000 write cycles whereas eMLC handles 20,000-30,000 write cycles. Other architectural features make cMLC a less ideal choice than eMLC for enterprise flash environments.
Converged Infrastructure: Converged infrastructure refers to the vendor practice of offering specific market "bundles" or validated reference architectures that combine several multi-vendor IT systems together. Converged infrastructure is usually offered as a preconfigured, pretested set of systems (such as combined or unified IT stacks of server/compute, network and storage resources), often targeted at key application environments, such as virtual servers, virtual desktops or databases. Growing in popularity, converged infrastructure is claimed to make it easier for IT organizations to implement and manage IT systems and applications.
Deduplication (Inline): Inline deduplication removes redundant or duplicate data either before or during the act of writing or saving the data to a storage target.
Deduplication (Post Process): As opposed to inline deduplication, post-process deduplication removes redundant or duplicate data as a separate process after the data has already been stored or written to a storage or backup target.
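The two deduplication entries describe when duplicates are removed; how they are detected is commonly based on content fingerprints. The following is a minimal sketch of the inline case, assuming fixed-size blocks and SHA-256 fingerprints. The class and its methods are hypothetical and do not describe any particular array's implementation.

```python
import hashlib


class DedupStore:
    """Stores each unique block once, keyed by its content fingerprint."""

    def __init__(self):
        self.blocks = {}   # fingerprint -> block data (stored once)
        self.volume = []   # ordered fingerprints making up the volume

    def write(self, block: bytes) -> None:
        fingerprint = hashlib.sha256(block).hexdigest()
        # Inline: the duplicate check happens before the block is stored.
        if fingerprint not in self.blocks:
            self.blocks[fingerprint] = block
        self.volume.append(fingerprint)

    def read_all(self) -> bytes:
        return b"".join(self.blocks[fp] for fp in self.volume)

    def dedup_ratio(self) -> float:
        if not self.blocks:
            return 1.0
        return len(self.volume) / len(self.blocks)


store = DedupStore()
for block in (b"A" * 4096, b"B" * 4096, b"A" * 4096):
    store.write(block)
print(len(store.blocks), store.dedup_ratio())  # prints 2 1.5
```

A post-process variant would first store every block as written and run the same fingerprint comparison later as a background pass.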
Electrically Erasable Programmable Read-only Memory (EEPROM): EEPROM (electrically erasable programmable read-only memory) is a type of non-volatile memory used in computers and devices. EEPROM stores small amounts of data that must be saved when the system is not turned on (or has no power). Unlike bytes in most other kinds of non-volatile memory, individual bytes in a traditional EEPROM can be independently read, erased, and rewritten. In contrast to flash memory technology (which some have considered a type of EEPROM), EEPROM technology typically "writes" to small pieces (or "bytes") of memory, at any time. Flash memory writes to an entire chunk, or "sector", of memory at a time. This allows faster Write/Erase cycles for flash memory technology than for EEPROM, although both types of technology may still be used in the same system.
Encryption: Encryption is a process that protects digital data stored or transmitted within computer systems or across the Internet. While many methods or types of encryption exist, the overall process converts digital data into ciphertext that cannot be read or decoded by anyone other than authorized personnel. Authorized personnel, in turn, can decrypt and gain access to the original data via some type of key.
Endurance: Endurance refers to the number of Write/Erase cycles that flash memory can perform without jeopardizing data reliability.
Enterprise Multi-Level Cell (eMLC): As opposed to cMLC (consumer-grade MLC), Enterprise MLC (eMLC) is a type of multi-level cell (MLC) flash that has been enhanced to accommodate more write cycles than consumer-grade MLC flash offers. For instance, cMLC only provides 3,000-10,000 write cycles whereas eMLC handles 20,000-30,000 write cycles.
Error Correction Code (ECC): Error Correction Code (ECC) is used to verify data transmissions by locating and correcting transmission errors. ECC is also used on a per-block basis by flash systems utilizing MLC or eMLC flash technology.
Fibre Channel: Fibre Channel, or FC, is a high-speed network technology (commonly running at 2, 4, 8 and 16 gigabits per second). FC is a storage protocol commonly used in storage area networks (SANs), sometimes called FC SANs. Fibre Channel is primarily used to connect computer data storage devices together.
Flash Wear: Flash wear (or wear-out) relates to the progressive erosion of a flash memory cell after repeated Write/Erase cycles involved in writing new data. Various techniques exist to prolong the lifespan of flash memory cells and reduce flash wear, including wear leveling, which distributes Write/Erase cycles more evenly among all of the blocks in the device.
Garbage Collection: Garbage collection is a function used to improve write speed in flash memory-based solid-state drive (SSD) systems. Garbage collection erases blocks of unused storage in the background. This allows flash memory to perform faster data writes to the now-empty blocks.
Gigabit Ethernet: Gigabit Ethernet is a transmission technology that transmits Ethernet frames at a data rate of a gigabit (or 1 billion bits) per second. It is typically used in local area networks (LANs) and most enterprise networks.
Gigabyte: A gigabyte is a unit of computer memory or data storage equal to 1,024 megabytes. It is equal to roughly one billion bytes (or, exactly 1,073,741,824 bytes).
Hybrid Flash Array: A hybrid flash array is a solid state storage system that uses a mix of flash memory drives and hard disk drives.
Hyperconverged Infrastructure: A variation of converged infrastructure, hyperconverged infrastructure integrates traditional IT "stack" components (virtualization, server/compute, network and storage) into a single unit or block that is supported by one vendor. Whereas components in a converged infrastructure can also be used and managed separately, if desired, the moving parts in hyperconverged infrastructure are meant to be used together. Hyperconverged infrastructures are purported to be easier to scale out, with resource management of underlying components performed from one central, integrated dashboard or interface. In many implementations of hyperconverged infrastructure, storage management tasks like resource allocation/provisioning, data protection, data deduplication, compression, or WAN optimization are managed and automated from within the server virtualization layer.
Input/Output Operations per Second (IOPS): Input/Output Operations per Second (IOPS) is a measurement of storage system performance that is often used to contrast how fast one system performs against another. IOPS measures the number of read and write requests that a storage system can handle each second. IOPS is just one performance measurement, which should be taken in conjunction with other factors (such as latency, type of application workload, and read/write profile) to determine overall system performance.
I/O: I/O stands for Input/Output. It refers to any program, operation or device that transfers data to or from a computer and to or from a peripheral device, such as a data storage system.
I/O Blender: Also referred to as the I/O Blender Effect, this term describes a virtual server scenario that can slow storage performance. The blender effect occurs when many virtual machines (VMs) send input/output (I/O) streams to a hypervisor for processing. These streams can make I/O more random when it's sent to the underlying storage system. Random I/O, as opposed to sequential I/O, can slow storage performance, which also impacts performance of the associated virtual machines. The use of flash or solid-state storage can mitigate the I/O Blender Effect by handling random I/O faster.
iSCSI: iSCSI (Internet Small Computer System Interface) is a network data storage protocol used to transmit data and commands between servers and storage systems. It can be used in a LAN or WAN environment or over the Internet. iSCSI allows SCSI commands to be sent on top of the traditional Transmission Control Protocol (TCP). In storage networks, iSCSI can be used as an alternative storage communications protocol to Fibre Channel. As opposed to Fibre Channel, which requires a separate Fibre Channel network, iSCSI can often be used more cost-effectively with existing Ethernet-based networks.
Kilobyte: A kilobyte is a unit of computer memory or data storage. It is equal to roughly one thousand bytes (or, exactly 1,024 bytes).
Latency: Latency measures the time it takes for an I/O request to be completed. From an application point of view, low latency is more desirable than high latency. Latency is often one measurement used, along with IOPS, when evaluating storage system performance.
LUN Block Size: A LUN (logical unit number) identifies a separate portion of disk space set aside in a storage system, and LUN block size refers to the block size configured for that space. Each LUN is often presented as its own virtual hard drive to server-based applications. LUNs are configured on a storage system, often for use with storage area network (SAN) protocols such as Fibre Channel or iSCSI. LUNs can also be used to support many popular virtualization platforms, including VMware VMFS datastores. Many factors can be involved in setting optimal LUN block sizes to support an application.
Megabyte: A megabyte is a unit of computer memory or data storage. It is equal to roughly one million bytes (or, exactly 1,048,576 bytes).
Memory Cell: A memory cell is one of the key physical elements of a flash memory solid state storage system. The flash memory cell is a type of semiconductor or transistor which can hold and change information bits within it based on the voltage signals it transmits and receives.
Metadata Acceleration: Metadata acceleration is a practice patented by Tegile Systems. It refers to the separation of metadata from data. Once separated, the metadata is then organized, aggregated, and placed on high-performance, low-latency storage (e.g., DRAM or flash). This accelerates data services such as deduplication, compression, snapshots, clones and thin provisioning.
Multi Level Cell (MLC): See cMLC.
NAND Flash Memory: NAND flash memory is a type of non-volatile computer storage medium that can be electrically erased and reprogrammed. It is one of two main types of NVRAM, the other being NOR flash memory. NAND flash is typically preferred over NOR when needing to store large quantities of data. Its benefits over NOR memory include a lower cost, longer life expectancy, faster write and erase times and higher density. Unlike NAND memory, NOR memory is often used to store relatively small amounts of executable code (like firmware, boot code, operating systems and other data that seldom changes) for devices like PDAs or cell phones.
NAS: NAS stands for network-attached storage. NAS devices are a type of shared storage that's often used as file shares in organizations. They allow multiple computers, users, servers or network nodes to access the same sets of files stored on the NAS device, via standard Ethernet connections.
NFS: NFS stands for Network File System. It is a file-based network storage protocol based on TCP/IP. As a client/server protocol, NFS requires user machines to run the NFS client in order to share files with an NFS server-enabled storage device or server. NFS-enabled storage devices often act as network file shares.
Non-volatile Memory (NVM): Also known as NVRAM (Non-Volatile Random Access Memory), this term refers to a type of system memory that retains the data it stores whether the power is on or off. Examples include different types of flash memory, such as NAND and NOR, as well as EEPROM and flash-based solid-state drive (SSD) storage.
NOR Memory: NOR flash memory is one of two types of non-volatile random access memory (NVRAM) that allow data to be retained even when system power is turned off. Unlike NAND memory, the other main type of NVRAM, NOR memory is often used to store relatively small amounts of executable code (like firmware, boot code, operating systems and other data that seldom changes) for devices like PDAs or cell phones. When needing to store larger quantities of data, NAND flash is typically preferred over NOR due to its lower cost, longer life expectancy, faster write and erase times and higher density.
Page: A page is the unit in which data is written to (and stored on) a NAND flash device. There are native page sizes defined by the flash device manufacturer. These make up a fixed chunk of memory that's typically larger than a byte but smaller than a block. An average page size might be between 4 and 8 kilobytes and may be made up of more than one flash cell. Write optimization methods may be used with native flash page formats in order to reduce flash wear and reduce fragmented I/O.
RAID: RAID stands for redundant array of independent disks. The term refers to various methods to improve the availability of data stored on a storage system. In general, RAID allows the same data to be stored, copied, shared or "striped" across more than one hard disk at the same time. Thus, if one hard disk fails, data can be rebuilt or still accessed from one or more other disk locations. Different types of RAID implementations exist, from RAID-0 to RAID-53.
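To make the "data can be rebuilt" idea in the RAID entry concrete, here is a minimal sketch of parity-based rebuilding, as used by parity RAID levels such as RAID-5. It assumes three equal-sized data blocks plus one XOR parity block; the block contents and the helper name are hypothetical, and mirrored RAID levels work differently.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, position by position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data "disks" plus one parity block computed from them.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing disk 1: XOR the survivors with parity to rebuild it.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)
print(rebuilt == data[1])  # prints True
```

The same XOR works for any single lost block, because XOR-ing a value in twice cancels it out.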
Random access memory (RAM): RAM stands for random access memory. It is a form of computer memory that can be accessed randomly. This means any byte of memory can be accessed without touching the preceding bytes. RAM is the most common type of memory found in computers and other devices.
Replication: Replication is a practice of writing the same data to two separate locations (or two separate storage systems). Replication is often used as part of a disaster recovery process and is often used to copy data from one site to another. Different types of replication exist, including synchronous and asynchronous replication. Synchronous replication writes data to both Site A and Site B at the same time. Asynchronous replication writes data first to Site A before writing it to Site B. In asynchronous replication, time delay and frequency of writing data to Site B may be associated with an organization's DR policy that contains specific RTO (recovery time objective) and RPO (recovery point objective) goals.
RPO: RPO stands for recovery point objective. It is one of two common disaster recovery planning terms often used in disaster recovery service level agreements (SLAs). The other term is RTO (recovery time objective). RPO can be thought of as the optimal, prior point in time from which you can successfully restore file or application data from previous backup files. This is best considered by example. If you had an email system you suddenly needed to restore, how much data could you lose from that system if you had to restore it from prior backups or snapshots? If your users could stand to lose 30 minutes, 1 hour or 4 hours of past emails since the last backup, then your RPO for that system would be 30 minutes, 1 hour or 4 hours. That means your backup processes might need to take snapshots every 30 minutes, 1 hour or 4 hours.
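The email example in the RPO entry can be expressed as a small check. This is a sketch under the simplifying assumption that the worst-case data loss equals the interval between snapshots; the function names and timestamps are hypothetical.

```python
from datetime import datetime, timedelta


def data_loss_window(last_snapshot: datetime, failure_time: datetime) -> timedelta:
    """Data written after the last snapshot is what would be lost."""
    return failure_time - last_snapshot


def meets_rpo(snapshot_interval: timedelta, rpo: timedelta) -> bool:
    """Worst case, a failure occurs just before the next snapshot is taken."""
    return snapshot_interval <= rpo


rpo = timedelta(hours=1)
print(meets_rpo(timedelta(minutes=30), rpo))  # prints True
print(meets_rpo(timedelta(hours=4), rpo))     # prints False

lost = data_loss_window(datetime(2024, 1, 1, 12, 0), datetime(2024, 1, 1, 12, 20))
print(lost)  # prints 0:20:00
```

A snapshot schedule of every 30 minutes satisfies a one-hour RPO, while a four-hour schedule does not.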
RTO: RTO stands for recovery time objective. It is one of two common disaster recovery planning terms often used in disaster recovery service level agreements (SLAs). The other term is RPO (recovery point objective). RTO describes the ideal amount of time it should take you to restore file or application data from a previous backup. This may relate to the criticality of different tiers of data. For instance, mission-critical (or Tier 1) data may require a faster recovery time (a shorter RTO) than less commonly accessed file data, which might tolerate a longer RTO.
SAN: A storage area network (SAN) is a high-performance network that allows storage devices to communicate with computer systems or with each other. Fibre Channel, iSCSI and FCoE are different communication protocols that can be used with a SAN. SANs may also be referred to as FC SANs or IP SANs, the latter referring to the use of Ethernet LAN technology as the network communication backbone between the various storage devices.
Self-encrypting drives: A self-encrypting drive (SED) is a solid-state drive (SSD) or hard disk drive that provides hardware-based data encryption. When data is written to this type of media, it is encrypted. The encrypted data then requires either a 128-bit or 256-bit key to decrypt.
Single Level Cell (SLC): SLC refers to single-level cell. It is one of two main types of flash cells that may be used on a NAND flash memory chip. (The other main type is MLC, or multi-level cell.) As opposed to MLC, an SLC cell holds one bit of data, with the bit value being either 0 or 1. MLC holds more than one bit in a single cell, giving it a higher data density than SLC.
SMB: SMB stands for Server Message Block. The term refers to a message format used to share files, directories and devices in Microsoft Windows environments. Common Internet File System (CIFS) is one version of SMB commonly used for storage system file shares in a Microsoft Windows network environment.
Snapshots: Snapshots are read-only copies of a data set that are taken and stored digitally at specific points in time, often while applications are still up and running. Snapshots are sometimes referred to as shadow copies. Different applications or operating systems have different snapshot implementations, which can impact the amount of storage required or the speed and ease of restoring certain types of application data. One example of a snapshot implementation is Microsoft Volume Shadow Copy Service (VSS), which first appeared in Windows Server 2003.
Solid State Drives (SSD): SSD stands for solid-state drive or solid-state disk. It uses nonvolatile, solid-state flash memory in a storage device in order to store persistent data. SSDs are faster than spinning hard disk drive (HDD) technology. In contrast to HDDs, SSDs store their data on microchips and, thus, do not require a mechanical arm with a read/write "head," a drive motor or a spinning disk.
Terabyte: A terabyte is a unit of data storage or memory equal to 1,024 gigabytes, or roughly one trillion bytes. It is often shortened to TB, especially in storage system capacity measurements, such as 4 TB.
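The kilobyte, megabyte, gigabyte and terabyte entries all use the binary convention, in which each unit is 1,024 times the previous one. The arithmetic can be sketched as follows; the dictionary and function names are illustrative only.

```python
# Binary convention used in this glossary: each unit is 1,024x the previous.
UNITS_IN_BYTES = {
    "kilobyte": 1024 ** 1,
    "megabyte": 1024 ** 2,
    "gigabyte": 1024 ** 3,
    "terabyte": 1024 ** 4,
}


def to_bytes(amount: float, unit: str) -> float:
    """Convert an amount in the given unit to bytes."""
    return amount * UNITS_IN_BYTES[unit]


for name, size in UNITS_IN_BYTES.items():
    print(f"1 {name} = {size:,} bytes")
```

Under this convention a 4 TB capacity corresponds to 4 x 1,024^4 bytes, which is somewhat more than 4 trillion bytes.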
Thin clones: A thin clone is a space-efficient copy or snapshot of a virtual machine file or storage volume. What makes it "thin" or space-efficient is that the copy usually only stores unique changes from the primary volume or virtual machine. In VMware, this might be referred to as a linked clone that is able to share virtual disks with the "parent" or primary virtual machine from which it was copied.
Thin provisioning: Thin provisioning is a common storage efficiency method for provisioning or allocating storage resources within an enterprise storage system. It typically involves the use of some type of virtualization technology to make it appear as if more physical resources have been assigned in the system than are actually available.
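A minimal sketch of the thin provisioning idea follows, assuming a pool that hands out physical blocks only when a logical block is first written. The `ThinPool` class and its methods are hypothetical and greatly simplified compared with a real array.

```python
class ThinPool:
    """Allocates physical blocks on first write, not at volume creation."""

    def __init__(self, physical_blocks: int):
        self.physical_blocks = physical_blocks
        self.volumes = {}     # volume name -> provisioned (logical) size
        self.allocated = {}   # (volume, logical block) -> data

    def create_volume(self, name: str, logical_blocks: int) -> None:
        # No physical space is consumed here; only the promise is recorded.
        self.volumes[name] = logical_blocks

    def write(self, name: str, logical_block: int, data: bytes) -> None:
        if not 0 <= logical_block < self.volumes[name]:
            raise IndexError("write beyond provisioned size")
        key = (name, logical_block)
        if key not in self.allocated and len(self.allocated) >= self.physical_blocks:
            raise RuntimeError("pool is out of physical space")
        self.allocated[key] = data

    def provisioned(self) -> int:
        return sum(self.volumes.values())

    def used(self) -> int:
        return len(self.allocated)


pool = ThinPool(physical_blocks=100)
pool.create_volume("vol-a", 80)
pool.create_volume("vol-b", 80)
pool.write("vol-a", 0, b"data")
print(pool.provisioned(), pool.used())  # prints 160 1
```

The pool has promised 160 blocks while owning only 100, which works as long as actual usage stays below the physical capacity.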
Throughput: Throughput is a term used to describe the rate at which data passes through a computer or storage system. Throughput measurements are often used to compare storage system performance. A few throughput measurements include IOPS (Input/Output Operations per Second) and MBPS (Megabytes per Second).
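The two throughput measurements named above are related by the I/O size. The sketch below assumes every operation transfers the same fixed block size, which real workloads rarely do; the function name and figures are hypothetical.

```python
def throughput_mb_per_s(iops: float, block_size_kb: float) -> float:
    """Approximate MB/s for a workload with a fixed I/O size (1 MB = 1,024 KB)."""
    return iops * block_size_kb / 1024


# Hypothetical workloads: the same IOPS at two different block sizes.
print(throughput_mb_per_s(10_000, 4))   # prints 39.0625
print(throughput_mb_per_s(10_000, 64))  # prints 625.0
```

This is why an IOPS figure is hard to interpret without knowing the block size it was measured at.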
Tiering: Tiering is the process (either automated or manual) of separating, storing and distributing data on different performance/cost types of storage media, based on different criteria or service level needs. In a simple example, mission-critical or frequently accessed data (or "hot" data) may be categorized as "Tier 1" data which might require storage on a faster tier of storage media, such as DRAM or flash memory solid-state disk. Storage of other data types may make more sense on more economical hard disk technology. Tiering may involve automatic movement of data between one storage tier and another, based on preset policy rules. It can also involve automated distribution of different types of data across more than one tier at the same time.
TLC: TLC stands for triple-level cell flash. Based on NAND flash memory, TLC stores three bits of data per cell of flash media. Known to be less expensive than its counterparts, single-level cell (SLC) and multi-level cell (MLC) flash memory, TLC is often used in consumer devices needing solid-state disk (SSD) technology.
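The difference between SLC, MLC and TLC comes down to bits per cell, and each extra bit doubles the number of distinct states a cell must hold. A small sketch of that arithmetic follows; the value of 2 bits for MLC is an assumption based on common usage, since this glossary only says MLC holds "more than one bit".

```python
# Bits stored per flash cell. MLC = 2 is an assumption (common usage).
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3}


def states_per_cell(bits: int) -> int:
    """Each additional bit doubles the states a cell must distinguish."""
    return 2 ** bits


for cell_type, bits in BITS_PER_CELL.items():
    print(cell_type, bits, states_per_cell(bits))
# prints:
# SLC 1 2
# MLC 2 4
# TLC 3 8
```

More bits per cell raise density and lower cost per bit, at the price of more states to tell apart within each cell.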
TRIM: TRIM is a command issued by the operating system to tell an SSD or flash system that certain blocks of stored data are no longer in use and can be reclaimed. The TRIM command helps enable faster write operations and less flash wear by allowing such memory blocks to be erased and reclaimed in advance, before they are rewritten in one or more new write operations.
vCenter Plug-in: A vCenter plug-in refers to additional software integration that can be added to boost the operation and functionality of VMware vSphere vCenter Server virtual server environments. While VMware offers its own vCenter plug-ins, third-party vendors like Tegile also offer vCenter plug-ins, which integrate the storage array with VMware vCenter so that it can be centrally managed within vCenter. Using the plug-in, datastores and snapshots can be created, managed and monitored within vCenter. VM-aware reporting can also be performed at the VM-level.
VDI: VDI stands for Virtual Desktop Infrastructure. The term describes virtual environments in which user desktop environments do not run on traditional standalone or networked PCs. Instead, such desktop environments are hosted "virtually" within a virtual machine (VM) that runs on a centralized server.
VMFS datastores: VMFS stands for Virtual Machine File System. It is a clustered file system and volume manager developed by VMware for its virtual server environments. In these environments, VMFS datastores are storage containers that hold the templates, disk images and files (e.g., VMDK files) associated with each virtual machine. The datastore acts as a separate filesystem on top of a traditional storage volume or LUN. VMFS datastores allow the creation of virtual machines (VMs), creation of VM templates, the act of turning on a VM, as well as creating and deleting files stored within. VMFS datastores are used with SAN (block-based) storage protocols, such as iSCSI or Fibre Channel.
vStorage APIs for Array Integration (VAAI): vStorage APIs for Array Integration (VAAI) is a set of VMware APIs that allow storage tasks, like thin provisioning, to be offloaded from the VMware server to a storage array. Tegile storage arrays support VAAI, which means they allow smoother, more efficient space reclamation for storage associated with a VMware environment.
Wear-leveling: Wear leveling is the process of distributing writes evenly across the flash storage media. Without wear leveling, individual cells could be erased and rewritten continuously, which could lead to their premature failure.
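One simple way to picture wear leveling is a policy that always directs the next erase/write to the block with the fewest erase cycles so far. The sketch below assumes that policy; the class name is hypothetical, and real flash controllers use considerably more sophisticated schemes.

```python
class WearLeveler:
    """Directs each rewrite to the least-worn block (illustrative policy)."""

    def __init__(self, num_blocks: int):
        self.erase_counts = [0] * num_blocks

    def pick_block(self) -> int:
        # Index of the block with the fewest erase cycles so far.
        return min(range(len(self.erase_counts)), key=self.erase_counts.__getitem__)

    def rewrite(self) -> int:
        block = self.pick_block()
        self.erase_counts[block] += 1  # one more Write/Erase cycle consumed
        return block


leveler = WearLeveler(num_blocks=4)
for _ in range(8):
    leveler.rewrite()
print(leveler.erase_counts)  # prints [2, 2, 2, 2]
```

Without leveling, the same eight rewrites could all land on one block, consuming eight of its erase cycles while the others stay unused.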
Write Amplification: Write amplification refers to the fact that flash cells must be erased before they can be rewritten, so a single write request can cause data and metadata to be physically rewritten multiple times. As a result, more data is written to the flash than the host actually requested. This process may hinder flash performance and may reduce the life of flash cells within a solid-state drive (SSD) system.
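Write amplification is commonly expressed as a ratio: the amount of data physically written to flash divided by the amount the host asked to write. A minimal sketch, with a hypothetical function name and made-up figures:

```python
def write_amplification_factor(flash_bytes_written: int, host_bytes_written: int) -> float:
    """Ratio of physical flash writes to writes requested by the host."""
    if host_bytes_written <= 0:
        raise ValueError("host_bytes_written must be positive")
    return flash_bytes_written / host_bytes_written


# Hypothetical case: the host writes 4 KB, but the SSD also has to move
# 8 KB of still-valid data out of the block before erasing it.
host_bytes = 4 * 1024
flash_bytes = host_bytes + 8 * 1024
print(write_amplification_factor(flash_bytes, host_bytes))  # prints 3.0
```

A factor of 1.0 would mean no amplification; higher values mean the flash wears faster than the host workload alone would suggest.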
Write Cliff: A write cliff is a term used to describe a sudden decrease in flash/SSD performance. This performance degradation occurs when an SSD system's flash cells are already full and run out of free blocks for new write requests.