# Data Storage

Description 

Storage provides large-scale, network-attached access to research data from a variety of endpoints: compute clusters, instrument workstations, laptops, etc.  Various storage types are typically designed for specific I/O (literally data in/out) needs that stem from raw data collection, data processing and analysis, data transfer and sharing, and data preservation and archiving.  These storage types usually have different data retention and replication policies as well. 

 

Key Features and Benefits  

Definitions:  

Quota / Logical Volume – Data storage may be allocated as separate logical volumes, which allows the network export of the data to be separated.  Alternatively, in a shared filesystem, a quota may be applied to set a specific allocation of storage capacity and/or number of files (or objects).  A soft quota is the point above which warnings are issued; a hard quota does not allow usage to exceed the data or file capacity. 
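
As a toy illustration of the soft/hard distinction, the sketch below (with made-up limits, not actual FASRC quota values) warns when a write would pass the soft quota and refuses one that would pass the hard quota:

```python
# Toy illustration of soft vs. hard quotas. The limits below are made up
# for the example and are not actual FASRC allocation values.
SOFT_QUOTA_TB = 45   # warnings start above this point
HARD_QUOTA_TB = 50   # writes beyond this point are refused

def check_write(current_tb: float, write_tb: float) -> str:
    """Decide what happens to a write given current usage (both in TB)."""
    projected = current_tb + write_tb
    if projected > HARD_QUOTA_TB:
        return "denied: hard quota exceeded"
    if projected > SOFT_QUOTA_TB:
        return "allowed, with warning: above soft quota"
    return "allowed"

print(check_write(40.0, 2.0))   # allowed
print(check_write(44.0, 2.0))   # allowed, with warning
print(check_write(49.5, 2.0))   # denied
```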

Access: Files may be accessed through a number of different protocols. The following are common to Research Computing: 

NFS (Network File System) – a Linux POSIX-based filesystem served from a single storage server to internal network-attached NFS clients, usually compute nodes or VMs. The service is transparent to end users, appearing as a POSIX filesystem with no separate authentication step.  Client and server share the same authentication service (Windows AD or LDAP), so UID/GID (user and group names) are preserved. 
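
A quick way to see the preserved UID/GID mapping is to stat a file on an NFS mount and resolve the numeric IDs to names; this is only a minimal sketch and the path below is hypothetical:

```python
import os
import pwd
import grp

# Stat a file on an NFS-mounted path and resolve its numeric UID/GID to
# names. Because client and server share the same directory service
# (AD/LDAP), the same names appear on every client. The path is hypothetical.
path = "/n/pi_lab/example_file.txt"
st = os.stat(path)
print("owner:", pwd.getpwuid(st.st_uid).pw_name)
print("group:", grp.getgrgid(st.st_gid).gr_name)
```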

Parallel filesystem – a Linux POSIX-based filesystem served from multiple storage servers to internal network-attached clients, via the filesystem's native client or NFS.  The open-source Lustre and IBM’s GPFS are two commonly used parallel filesystems in cluster computing due to their performance and scalability.  Individual files can be split (striped) across multiple storage targets, and multiple clients can read or write to the same file via MPI-IO, introduced in MPI version 2. 
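
Below is a minimal MPI-IO sketch (assuming mpi4py and NumPy are installed and the script is launched with an MPI runner such as mpirun): every rank writes its own non-overlapping block of a single shared file, which is the access pattern a striped parallel filesystem is built to serve.

```python
# Run with an MPI launcher, e.g.:  mpirun -n 4 python mpiio_write.py
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

# Each rank prepares its own block of data.
block = np.full(1024, rank, dtype=np.int32)

# All ranks open the same file collectively and write to non-overlapping
# offsets, so a striped parallel filesystem can service them concurrently.
fh = MPI.File.Open(comm, "shared_output.dat",
                   MPI.MODE_WRONLY | MPI.MODE_CREATE)
fh.Write_at(rank * block.nbytes, block)
fh.Close()
```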

SMB (Windows), Samba (Linux/Mac), CIFS – a Linux POSIX-based filesystem served from a single server by re-exporting a local or remote filesystem as a mapped network drive. Use of this service requires the end user to establish a connection and authenticate.  Connections from clients that do not use central authentication will not have proper UID/GID mappings, and changes to file permissions from the client are not preserved, as permissions are enforced by the server. 

Backups: The following methodologies are used to protect against loss of data: 

Snapshot – a point-in-time reference to a collection of files or objects, maintained for a specified duration.  The total amount of space used is therefore the current collection of data plus the changes to data kept during the snapshot duration.  This service allows the end user to recover files or objects to a prior point in time.  A changed file can be recovered to its previous state only if the prior state existed long enough to be captured by a snapshot and that snapshot has not yet expired. For example, if snapshots are taken daily and kept for two weeks, and you altered or deleted a file three days ago, you could recover that file from a snapshots directory.  However, if you created a file today and removed it, there would be no snapshot of this file, as its lifetime was shorter than the snapshot frequency.  Likewise, if you tried to recover a file from more than two weeks ago, the snapshot would have already expired.   
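
The example above can be restated as a small, illustrative calculation; the daily/two-week schedule is just the example schedule, not a guaranteed policy:

```python
from datetime import date, timedelta

# Example schedule from the text: one snapshot per day, kept for two weeks.
FREQUENCY = timedelta(days=1)
RETENTION = timedelta(days=14)

def recoverable(state_began: date, state_ended: date, today: date) -> bool:
    """Could a snapshot still hold the file as it existed before state_ended?"""
    captured = (state_ended - state_began) >= FREQUENCY   # lived through a snapshot
    not_expired = (today - state_ended) <= RETENTION      # that snapshot still exists
    return captured and not_expired

today = date(2021, 6, 15)
# Altered three days ago, existed for weeks before that: recoverable.
print(recoverable(date(2021, 5, 1), today - timedelta(days=3), today))   # True
# Created and removed the same day: never captured by a snapshot.
print(recoverable(today, today, today))                                  # False
# Deleted 20 days ago: the snapshot holding it has expired.
print(recoverable(date(2021, 4, 1), today - timedelta(days=20), today))  # False
```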

Disaster Recovery (DR) Copy – a transfer of the entire filesystem to a different storage server in a remote data center, for the purpose of recovering data after the complete loss of a filesystem.  Such a loss could be due to physical damage to the system (fire, electrical, or water damage) or to the loss of too many drives.  As the copy is made asynchronously, it is not an exact point-in-time reference, since changes to the source filesystem can happen during the transfer. 

No backup / Single copy – files are located on a single storage device; when files are removed by users, they are immediately deleted from the filesystem.  The storage device still has resiliency against individual drive failure through RAID/erasure coding.  

Retention – some filesystems, especially scratch, have a retention policy that sets the amount of time files can remain on the filesystem before being moved or removed.   

Scratch / Temp – while performing research computations, it is common for many files to be created during runtime that are not needed after the series of computations is complete.  Thus, most compute nodes have local scratch, and computing clusters have a general-purpose scratch filesystem.  These usually have a specified retention policy, whereby files are kept for only a short duration. 
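
As a hedged sketch of how such a retention policy might be applied, the example below simply lists files older than a 90-day window under an example scratch path; it is illustrative only and not the actual cleanup tool:

```python
import os
import time

RETENTION_DAYS = 90                      # example window, not the actual policy
cutoff = time.time() - RETENTION_DAYS * 86400
scratch = "/n/holyscratch/pi_lab"        # example mount point from this document

# Walk the tree and report files whose modification time is past the window.
for dirpath, _, filenames in os.walk(scratch):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            if os.path.getmtime(path) < cutoff:
                print("expired:", path)  # a real sweep would move or delete it
        except OSError:
            pass                         # file vanished or unreadable; skip
```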

Performance/Tiers – with modern drives and networks, there is a wide variety of storage performance characteristics, often broken up into storage tiers.  By offering multiple tiers of storage, different workflows can be accommodated, and moving data through the storage tiers facilitates the storage lifecycle (see the sketch after this list).  Storage tiers typically start with tier zero as the most performant tier; subsequent tiers one, two, three, etc. step down in performance with each additional tier. 

  • Tier 0 – Often called “scratch,” this tier is typically used for the “intermediate” files that are created, used, and deleted during a cluster compute workflow.  Some workflows will also copy source files for a cluster compute job to scratch to take advantage of its performance characteristics and will delete them at the end of the job.  In scratch/temp spaces, all data that needs to be retained must be copied to another tier.  In general, tier 0 has neither snapshots nor a DR copy. 
  • Tier 1 – This tier is often the “general purpose” storage tier.  This is where most users will store their data as it is created and will keep the data that they are actively using.  This tier is generally performant enough to use for a modest number of clustered compute jobs and is typically the class of storage used for general file sharing, SMB and NFS.  This tier is generally protected with snapshots and a DR copy. 
  • Tier 2 – This tier is generally used for data that is no longer active but could be needed in the near future, for example data associated with a recently completed experiment.  It is not appropriate to compute against data on this tier with more than ~10-20 computations at a time. 
  • Tier 3 – This tier is generally used for data that is no longer needed but needs to be kept for compliance or publishing reasons.  Tier 3 data is accessible, generally with a large delay. 
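
The tier descriptions above can be summarized as a rough, illustrative placement rule; the function is only a sketch, not an FASRC tool:

```python
def suggest_tier(intermediate: bool, active: bool,
                 may_need_soon: bool, compliance_only: bool) -> int:
    """Map a dataset's state (per the tier descriptions above) to a tier."""
    if intermediate:
        return 0   # scratch: created, used, and deleted within a job
    if active:
        return 1   # general purpose: data in active use
    if may_need_soon:
        return 2   # recently completed work that may be revisited
    if compliance_only:
        return 3   # kept only for compliance or publishing reasons
    return 1       # default to the general purpose tier

# A recently completed experiment that might be revisited lands on tier 2.
print(suggest_tier(intermediate=False, active=False,
                   may_need_soon=True, compliance_only=False))
```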

 

Service Expectations and Limits: 

Research data storage is not intended to be an enterprise service and is generally operated on a more conservative cost basis.  Many factors affect storage performance, including the network, how full the filesystem is, the age of the hardware, the mixture of read/write patterns from endpoints, and single-server vs. clustered storage.  In addition, due to the nature of the research being performed, end users do not always understand how scaling out their computations on a cluster affects the underlying filesystem.   

At FASRC, availability, uptime, and the backup schedule are provided on a best-effort basis by staff who do not work rotating 24/7/365 shifts.   

 

Available to: 

All PIs with an active FASRC account from any Harvard School. 

 

Service Manager and Owner:  

Service Manager: Brian White, Associate Director of Systems Engineering and Operations  

Service Owner: Scott Yockel, Director of FAS Research Computing 

All storage requests should be sent via email to rchelp@rc.fas.harvard.edu  

 

Offerings (Tiers of Service) 

Tier 0:  Scratch -  Lustre

  • Features: High-performance, temporary, single copy, network attached to cluster via (Lustre/NFS), quota (files + size), 90-day retention policy. 
  • Mount point: /n/holyscratch/pi_lab
  • Quota: 50 TB max per Lab 
  • Cost: Included as part of Cluster Computing

Tier 0:  Bulk - Lustre 

  • Features: High-performance, single copy, network attached to cluster via (Lustre/NFS), quota (files + size), Starfish data management web access, Globus transfer access.
  • Mount point: /n/holyscratch/pi_lab
  • Quota: 1-1024 TB
  • Cost: $50/TB/yr

Tier 1: Home directories - Isilon

  • Features:  Regular performance, snapshot, DR copy, network attached to cluster (NFS), quota, SMB 
  • Mount point: /home/username 
  • Quota: 100 GB
  • Cost: Included as part of Cluster Computing 

Tier 1: Enterprise - Isilon

  • Features:  Tiered performance, snapshot, DR copy, network attached to cluster (NFS), quota, SMB, Superna Eyeglass data management web access, Globus transfer access.
  • Mount point: /n/pi_lab
  • Quota: 1-1024 TB
  • Cost: $250/TB/yr 

Tier 2: Lab Share - Ceph 

  • Features: Regular performance, object store, DR copy, network attached to cluster (NFS), SMB, encrypted at rest.
  • Mount point: /n/pi_lab 
  • Quota: 1-1024 TB
  • Cost: $100/TB/yr 

Tier 3: Attic Storage - IBM Tape 

  • Features: Low performance, S3 object store access, single copy, network attached to data transfer nodes, Globus transfer access.
  • Availability: June 2021
  • Cost: estimate $8/TB/yr