Storage provides large-scale, network-attached access to research data from a variety of endpoints: compute clusters, instruments, workstations, laptops, etc. Storage types are typically designed for specific I/O (literally data in/out) needs that stem from raw data collection, data processing and analysis, data transfer and sharing, and data preservation and archiving. These storage types usually have different data retention and replication policies as well.
Related: Introduction to FASRC Cluster Storage video
Key Features and Benefits
Quota / Logical Volume – Data storage may be allocated as separate logical volumes, which allows for separation of the network export of the data. Alternatively, in a shared filesystem, a quota may be applied to provide a specific allocation size of storage and/or number of files (or objects). A soft quota is a limit above which warnings may be issued, while a hard quota does not allow usage to exceed the allocated data or file capacity.
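The difference between soft and hard quotas can be illustrated with a minimal Python sketch. The function name `check_quota`, the `QuotaStatus` values, and the exact comparison rules are assumptions made for illustration; they do not describe how any particular filesystem enforces its quotas.

```python
from enum import Enum


class QuotaStatus(Enum):
    OK = "within quota"
    SOFT_EXCEEDED = "soft quota exceeded (warning only)"
    HARD_REACHED = "hard quota reached (no further writes)"


def check_quota(used: int, soft_limit: int, hard_limit: int) -> QuotaStatus:
    """Classify usage (bytes or file count) against a soft and a hard limit.

    Hypothetical helper: real filesystems apply their own rules and grace periods.
    """
    if soft_limit > hard_limit:
        raise ValueError("soft limit must not exceed hard limit")
    if used >= hard_limit:
        return QuotaStatus.HARD_REACHED
    if used > soft_limit:
        return QuotaStatus.SOFT_EXCEEDED
    return QuotaStatus.OK


# Example: a 100-unit allocation with warnings starting above 80 units.
print(check_quota(90, soft_limit=80, hard_limit=100).value)
```

The same check applies to either kind of limit mentioned above: storage size or number of files (or objects).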
Access: Files may be accessed through a number of different protocols. The following are common to Research Computing:
NFS (network filesystem) – Linux POSIX based file system served from a single storage server to internal network-attached NFS clients, usually compute nodes or VMs. Use of the service looks transparent to the end users within a POSIX filesystem and requires no authentication. Client and server share the same authentication service (Windows AD or LDAP), so UID/GID (user and group IDs) are preserved.
Parallel filesystem – Linux POSIX based file system served from multiple storage servers to internal network-attached clients via NFS protocol. Open source Lustre and IBM’s GPFS are two commonly used parallel filesystems in cluster computing due to their performance and scalability. Individual files can be split (aka striped) across multiple storage targets, and multiple clients can read or write to the same file via MPI-IO in MPI version 2.
SMB (Windows), Samba (Linux/Mac), CIFS – Linux POSIX based file system served from a single server by re-exporting a local or remote filesystem as a mapped network drive. Use of this service requires the end user to establish a connection and authenticate. Connections from clients that do not use central authentication will not have proper UID/GID mappings, and changes to file permissions are not preserved, as permissions are enforced by the server.
Backups: The following methodologies are used to protect against loss of data:
Snapshot – A point-in-time reference to a collection of files or objects is maintained for a specified duration. Thus, the total amount of space used is the current collection of data plus the changes in data kept during the snapshot duration. Use of this service allows the end user to recover files or objects to a prior point in time. A changed file can be recovered to its previous state if that state was captured by at least one snapshot and that snapshot has not yet expired. For example, if snapshots are taken daily and kept for two weeks, and you altered or deleted a file 3 days ago, you could recover this file from a snapshots directory. However, if you created a file today and removed it, there would not be a snapshot of this file, as its lifetime was shorter than the snapshot interval. Also, if you tried to recover a file that changed more than two weeks ago, the snapshot would have already expired.
Disaster Recovery (DR) Copy – A copy of the entire filesystem, transferred at a point in time to a different storage server in a remote data center for the purpose of recovering data after a complete filesystem loss. Such a loss could be due to physical damage to the system (fire, electrical, or water damage) or to too many drives being lost. As the transfer is completed asynchronously, the copy is not necessarily a point-in-time reference, since changes to the source filesystem could happen during the transfer.
No backup / Single copy – Files are located on a single storage device. When files are removed by users, they are immediately deleted from the filesystem. This storage device still has resiliency to individual drive failure through RAID or erasure coding.
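The snapshot recovery window described above can be sketched in Python. This is a simplified model, not the behavior of any specific storage system: it assumes snapshots are taken at a fixed interval starting from a hypothetical `first_snapshot` time and that each one is kept for a fixed retention period. The function name `is_recoverable` and all of its parameters are illustrative.

```python
from datetime import datetime, timedelta


def is_recoverable(created, changed, now, first_snapshot, interval, retention):
    """Return True if an unexpired snapshot holds the file as it was before `changed`.

    Assumed model: snapshots run at first_snapshot, first_snapshot + interval, ...
    and each snapshot is deleted once it is older than `retention`.
    """
    if changed <= first_snapshot:
        return False  # no snapshot was taken before the change
    # Latest snapshot taken strictly before the change or deletion.
    n = (changed - first_snapshot) // interval
    snapshot_time = first_snapshot + n * interval
    if snapshot_time == changed:
        snapshot_time -= interval
    captured = created <= snapshot_time           # file existed when the snapshot ran
    unexpired = now - snapshot_time <= retention  # snapshot is still kept
    return captured and unexpired


day = timedelta(days=1)
now = datetime(2024, 3, 15, 12, 0)
midnight_snapshots = datetime(2024, 1, 1)  # daily snapshots at 00:00, kept two weeks

# Altered 3 days ago: recoverable.
print(is_recoverable(datetime(2024, 2, 1, 9), now - 3 * day, now,
                     midnight_snapshots, day, 14 * day))  # True
# Created and removed today, between two snapshots: never captured.
print(is_recoverable(datetime(2024, 3, 15, 8), datetime(2024, 3, 15, 11), now,
                     midnight_snapshots, day, 14 * day))  # False
# Changed more than two weeks ago: the snapshot has expired.
print(is_recoverable(datetime(2024, 2, 1, 9), datetime(2024, 2, 20, 12), now,
                     midnight_snapshots, day, 14 * day))  # False
```

Only the latest snapshot before the change needs checking: if it is expired or predates the file, every earlier snapshot is too.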
Retention – Some filesystems, especially scratch, have a retention policy that sets the amount of time files can remain on the filesystem before being moved or removed.
Scratch / Temp – While performing research computations, it is common that many files are created during runtime that are not needed after the series of computations is complete. Thus, most compute nodes have a local scratch filesystem, and most computing clusters have a general-purpose scratch filesystem. These usually have a specified retention policy, whereby files are kept for only a short duration.
Performance/Tiers – With modern drives and networks, storage has a wide variety of performance characteristics and is often broken up into storage tiers. By offering multiple tiers of storage, different workflows can be accommodated, and moving data through the storage tiers will facilitate the storage lifecycle. Storage tiers typically start with tier zero as the most performant tier; subsequent tiers one, two, three, etc. step down in performance with each additional tier.
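To see which scratch files might be affected by a retention policy, something like the following Python sketch could be used. It only lists candidate files and deletes nothing. Using the modification time as the age criterion and the 90-day value in the example are assumptions; the actual policy for a given filesystem may use a different timestamp or duration. The function name `find_expired_files` is hypothetical.

```python
import os
import time
from pathlib import Path


def find_expired_files(root, max_age_days, now=None):
    """Return files under `root` whose modification time is older than max_age_days.

    Illustration only: this reports candidates and never removes anything.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds per day
    expired = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                if path.lstat().st_mtime < cutoff:
                    expired.append(path)
            except OSError:
                # The file vanished or is unreadable; skip it.
                continue
    return sorted(expired)


if __name__ == "__main__":
    # Example: list files in the current directory tree older than 90 days (assumed value).
    for path in find_expired_files(".", max_age_days=90):
        print(path)
```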
- Tier 0 – This tier is typically used for the “intermediate” files that are created, used, and deleted during a cluster compute workflow. This tier generally has the highest performance and capacity and is designed to sustain thousands of computing jobs simultaneously. Tier 0 has neither snapshots nor a DR copy; therefore, users are responsible for backing up critical data. Scratch (/n/holyscratch01) is built on the same technology as our Tier 0.
- Tier 1 – This tier is often the “general purpose” storage tier. This is where most labs who require snapshots or a DR copy will store their data as it is created and will keep the data that they are actively using. This tier is generally performant enough to use for hundreds of clustered compute jobs simultaneously and is typically the class of storage used for general file sharing, SMB and NFS.
- Tier 2 – This tier is generally used for data that is less active but could be needed in the near future. For example, data associated with a recently completed experiment that is no longer active. It is not appropriate to compute against data on this tier with more than ~10-20 computations at a time.
- Tier 3 – This tier is generally used for data that is static (no longer being changed) but needs to be kept for compliance or publishing reasons. Tier 3 data is accessible, but generally with a large delay.
Service Expectations and Limits:
Research data storage is not intended to be an enterprise service and is generally operated on a more conservative cost basis. Many factors affect storage performance, including the network, how full the filesystem is, the age of the hardware, the mixture of read/write patterns from endpoints, and whether the storage is single-server or clustered. In addition, due to the nature of the research being performed, the end user does not always understand how scaling out their computations on a cluster affects the underlying filesystem.
At FASRC, availability, uptime, and backup schedules are provided on a best-effort basis by staff who do not work rotating 24/7/365 shifts.
This service is available to all PIs with an active FASRC account from any Harvard School.
Service Manager and Owner:
Service Manager: Brian White, Associate Director of Systems Engineering and Operations
Service Owner: Scott Yockel, Director of FAS Research Computing
All storage requests should be sent via email to email@example.com
Offerings (Tiers of Service)
Tier 0: Bulk - Lustre
- Features: High-performance, single copy, network attached to cluster (Lustre/NFS), quota (files + size), Starfish data management web access, Globus transfer access.
- Mount point(s): Other (varies)
- Quota: 1-1024 TB
- Cost: $50/TB/yr (excluding scratch)
Tier 1: Enterprise - Isilon
- Features: Tiered performance, snapshot, DR copy, network attached to cluster (NFS), quota, SMB, Superna Eyeglass data management web access, Globus transfer access.
- Mount point: /n/pi_lab
- Quota: 1-1024 TB
- Cost: $250/TB/yr
Tier 2: Lab Share - Ceph
- Features: Regular performance, object store, DR copy, network attached to cluster (NFS), SMB, encrypted at rest.
- Mount point: /n/pi_lab
- Quota: 1-1024 TB
- Cost: $100/TB/yr
Tier 3: Attic Storage - Tape
- Features: Low performance, S3 object store access, single copy, network attached to data transfer nodes, Globus transfer access.
- Availability: July 2021
- Cost: estimated $8/TB/yr
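As a rough planning aid, the annual cost of a mix of allocations can be estimated from the per-TB rates listed above. The sketch below is illustrative only: the tier keys and the function name `annual_cost` are hypothetical, the Tier 3 rate is the estimate quoted above, and actual billing may differ.

```python
# Rates in USD per TB per year, copied from the offerings above.
RATES_PER_TB_YEAR = {
    "tier0": 50,   # Bulk - Lustre (excluding scratch)
    "tier1": 250,  # Enterprise - Isilon
    "tier2": 100,  # Lab Share - Ceph
    "tier3": 8,    # Attic Storage - Tape (listed as an estimate)
}


def annual_cost(allocations_tb):
    """Return the estimated yearly cost in USD for a {tier: terabytes} mapping."""
    unknown = set(allocations_tb) - set(RATES_PER_TB_YEAR)
    if unknown:
        raise ValueError(f"unknown tier(s): {sorted(unknown)}")
    return sum(RATES_PER_TB_YEAR[tier] * tb for tier, tb in allocations_tb.items())


# Example: 10 TB on Tier 0, 4 TB on Tier 1, and 100 TB on Tier 3.
# 10 * 50 + 4 * 250 + 100 * 8 = 2300
print(annual_cost({"tier0": 10, "tier1": 4, "tier3": 100}))  # 2300
```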