
Research Computing Storage Modernization Initiative

Improving Data Storage Infrastructure at FAS

FAS Research Computing (FASRC) is transitioning to a new storage infrastructure. The project will consolidate and modernize a significant portion of the existing storage filesystems and move them to new, improved hardware. The new storage will replace outdated filesystems, realign storage options with data lifecycle needs, incorporate enterprise-grade features, and establish a scalable and financially sustainable service model.

The project incorporates over 70 pebibytes of new data storage to ensure FASRC remains at the forefront of research and meets the needs of the Harvard community.

Strategic Outcomes

  • Sustainable Growth
    Scalable, cost-effective storage aligned to support expanding research demands and lifecycle trends.
  • Improved Service Quality
    Resilient infrastructure with enterprise-grade reliability and support, creating a user-friendly experience.
  • Operational Efficiency
    Reduces manual overhead, eliminates disruptive migrations, and reallocates staff to strategic and user-facing initiatives.
  • Financial Stability
    Establishes a predictable long-term cost-recovery model with transparent tiered pricing and vendor-aligned incentives.
  • Forward-thinking Infrastructure
    Supports AI/ML, secure multi-protocol access, and evolving scientific workflows.

Storage Growth & Future Demand

FASRC storage capacity grew from 32.72 PiB to 87.89 PiB (a 168.5% increase) over the past several years, an average growth of 11.03 PiB, or 21%, per year. Storage demand is projected to reach 141 PiB by FY30.

Key growth drivers include:

  • New lab and research group expansions
  • Data retention requirements 
  • Legacy data migrations

Improved data lifecycle enforcement and better management practices supported by analytics tools like Starfish and Coldfront will help mitigate future growth, enabling users and administrators to make more informed decisions, optimize storage usage, and enhance internal forecasting.

New Storage Architecture

  • Compute Storage

    • 20 PiB of high-performance, cluster-adjacent storage
    • Active storage for data analysis; designed to house data that is actively used and accessed
    • Intended for HPC, AI/ML, and parallel workloads
  • Lab Storage

    • 35 PiB of lab storage with backups (snapshots) and disaster recovery*
    • General purpose storage for raw and project data; can be utilized as buffer storage for lab instruments and equipment
    • Not intended for heavy computational workflows
  • FAS Secure Environment (FASSE)

    • 2.5 PiB secure storage with backups (snapshots) and disaster recovery* 
    • Secure storage environment for sensitive or regulated data, such as data covered by Data Use Agreements (DUAs) or IRB protocols
    • Encryption at rest is also included
  • Long-Term Storage (Cold)**

    • 20 PiB of disk-based S3-compatible storage with disaster recovery***
    • Long-term storage of research data to meet institutional data retention and compliance requirements.
    • On-premises long-term storage option for Harvard-affiliated labs that is easier to access and faster than alternative long-term storage options such as tape
  • Tape (NESE) 

    • Existing long-term storage option without disaster recovery
    • Provided by the Northeast Storage Exchange (NESE) to house inactive research data after project completion or for data retention purposes.
    • Stored on physical tapes in 20 TB increments. Limited to 10,000 files per folder, with file sizes between 1 GiB and 100 GiB. Data consisting of many small files must be bundled into tar archives to meet these requirements.
    • Data transferred to and retrieved from Tape using the Globus tool.  
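The tar requirement for Tape can be sketched in a few lines of Python. This is a minimal illustration, not an official FASRC tool: the function names `bundle_directory` and `within_tape_limits` and the example paths are hypothetical, and only the 1 GiB and 100 GiB bounds come from the description above. It packs a directory of small files into one uncompressed tar archive and checks whether the result falls inside the stated file-size range.

```python
import tarfile
from pathlib import Path

GIB = 1024 ** 3
MIN_FILE_BYTES = 1 * GIB     # lower file-size bound stated for Tape
MAX_FILE_BYTES = 100 * GIB   # upper file-size bound stated for Tape


def bundle_directory(src_dir, archive_path):
    """Pack src_dir into one uncompressed tar archive; return its size in bytes.

    archive_path should live outside src_dir so the archive does not
    try to include itself.
    """
    src = Path(src_dir)
    if not src.is_dir():
        raise NotADirectoryError(str(src))
    with tarfile.open(str(archive_path), "w") as tar:
        # arcname keeps paths inside the archive relative to the directory name
        tar.add(str(src), arcname=src.name)
    return Path(archive_path).stat().st_size


def within_tape_limits(size_bytes):
    """Return True if a file of this size fits the 1 GiB to 100 GiB range."""
    return MIN_FILE_BYTES <= size_bytes <= MAX_FILE_BYTES


if __name__ == "__main__":
    # Hypothetical paths, for illustration only.
    size = bundle_directory("project_2021_raw", "project_2021_raw.tar")
    print(f"Archive size: {size / GIB:.2f} GiB")
    print("Fits Tape limits:", within_tape_limits(size))
```

An archive that comes out under 1 GiB can be combined with other directories, and one over 100 GiB can be split into several archives, before the transfer with Globus.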

*Snapshots are copies of a directory taken at a specific moment in time. They offer labs a self-service recovery option for overwritten or deleted files within a specific time period. Disaster recovery is a copy of an entire file system that FASRC can use internally in case of a system-wide failure.

**ECS object storage.

***Disaster recovery is available at an additional cost.

Storage Costs 

Former Storage Tiers and Costs

| Tier | Rate ($/TB/year) |
|---|---|
| Tier 0 | $50 |
| Tier 1 | $250 |
| Tier 2 | $100 |
| Tier 3 (NESE Tape) | $5 |
 
New Storage Options and Costs

| Tier | Rate ($/TB/year) |
|---|---|
| Compute Storage | $150 |
| Lab Storage | $125 |
| FASSE (Secure Enclave) | $150 |
| Long-Term Storage | $30 |
| Tape (NESE) | $15 |
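
As a rough illustration of how the new rates translate into an annual charge, the sketch below multiplies each allocation by its per-TB rate. Only the rates are taken from the table of new storage options; the function name `annual_cost` and the example allocation sizes are hypothetical. It does not model billing details that are not described here, such as proration or the additional cost of disaster recovery for Long-Term Storage.

```python
# Rates in $/TB/year, copied from the "New Storage Options and Costs" table.
NEW_RATES = {
    "Compute Storage": 150,
    "Lab Storage": 125,
    "FASSE": 150,
    "Long-Term Storage": 30,
    "Tape (NESE)": 15,
}


def annual_cost(allocations_tb):
    """Return (per-tier cost dict, total cost) for a mapping of tier -> TB."""
    per_tier = {}
    for tier, tb in allocations_tb.items():
        if tier not in NEW_RATES:
            raise KeyError(f"Unknown storage tier: {tier}")
        if tb < 0:
            raise ValueError(f"Allocation for {tier} must be non-negative")
        per_tier[tier] = tb * NEW_RATES[tier]
    return per_tier, sum(per_tier.values())


# Hypothetical lab: 40 TB of Lab Storage, 100 TB of Long-Term Storage,
# and one 20 TB Tape increment.
example = {"Lab Storage": 40, "Long-Term Storage": 100, "Tape (NESE)": 20}
per_tier, total = annual_cost(example)
for tier, cost in per_tier.items():
    print(f"{tier}: ${cost:,.0f}")
print(f"Total: ${total:,.0f}")  # Total: $8,300
```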

 

Storage Transition Schedule 

The migration to new storage hardware will occur in phases, beginning in January 2026 and continuing through Spring 2026. We will provide detailed guidance via email in advance of each migration to ensure you are fully informed and prepared. We appreciate your patience and support through this transition.

| Current Storage | Previous Tier | New Storage Offering | Timeframe |
|---|---|---|---|
| boslfs02 | Tier 0 | Lab Storage | Phase 1 (Jan 2026)* |
| bos-isilon/holy-isilon | Tier 1 | Lab Storage | Phase 1 (Jan 2026)* |
| FAS Secure Enclave (FASSE) | Various | FASSE | Phase 2 (Winter 2026)* |
| b-nfs-01/b-nfs-10 | Tier 2 | Lab Storage | Phase 3 (Winter 2026)* |
| h-nfs15/h-nfs-20 | Tier 2 | Lab Storage | Phase 3 (Winter 2026)* |
| holystore01 | Tier 0 | Compute Storage | Phase 4 (Spring 2026)* |
| holylfs04/holylfs05/holylfs06 | Tier 0 | Compute Storage | Phase 4 (Spring 2026)* |

*Specific timelines and dates will be communicated directly to affected labs prior to the migration. Netscratch, /holylabs, and home folders will not be affected by this migration. 

Contact 

If you have any questions regarding the new storage environment, please email the FASRC Storage Migration team, and a member of FASRC will respond within two business days.

Please designate an individual within your lab or group who can act as a General Manager, or data manager, to help track storage usage and communicate with the FASRC Research Data Manager. General Managers can also review and monitor the group's storage usage using data management tools such as Starfish and Coldfront. If you have any questions or would like to discuss data cleanup efforts, please email the FASRC Research Data Manager at rdm@rc.fas.harvard.edu.