Improving Data Storage Infrastructure at FAS
FAS Research Computing (FASRC) is transitioning to a new storage infrastructure. The project will consolidate and modernize a significant portion of the existing storage filesystems and move them to new, improved hardware. The new storage will replace outdated filesystems, realign storage options with data lifecycle needs, incorporate enterprise-grade features, and establish a scalable and financially sustainable service model.
The project adds over 70 pebibytes of new data storage so that FASRC remains at the forefront of research and meets the needs of the Harvard community.
Strategic Outcomes
- Sustainable Growth
Scalable, cost-effective storage aligned to support expanding research demands and lifecycle trends.
- Improved Service Quality
Resilient infrastructure with enterprise-grade reliability and support, creating a user-friendly experience.
- Operational Efficiency
Reduces manual overhead, eliminates disruptive migrations, and reallocates staff to strategic and user-facing initiatives.
- Financial Stability
Establishes a predictable long-term cost-recovery model with transparent tiered pricing and vendor-aligned incentives.
- Forward-thinking Infrastructure
Supports AI/ML, secure multi-protocol access, and evolving scientific workflows.
Storage Growth & Future Demand
FASRC storage capacity grew from 32.72 PiB to 87.89 PiB over the past several years, an increase of roughly 168.5% that corresponds to average growth of 11.03 PiB, or about 21%, per year. Storage demand is projected to reach 141 PiB by FY30.
Key growth drivers include:
- New lab and research group expansions
- Data retention requirements
- Legacy data migrations
Improved data lifecycle enforcement and better management practices supported by analytics tools like Starfish and Coldfront will help mitigate future growth, enabling users and administrators to make more informed decisions, optimize storage usage, and enhance internal forecasting.
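The growth figures quoted in this section can be sanity-checked with a short calculation. The sketch below uses only the three numbers stated in the text. The number of years is not given in the source, so it is inferred here by dividing the total increase by the quoted average annual growth; that inference, and the use of a compound annual growth rate, are assumptions made for illustration. Results computed from the rounded published figures may differ slightly from the quoted percentages.

```python
# Figures quoted in the text, in PiB.
START_PIB = 32.72
END_PIB = 87.89
AVG_GROWTH_PIB_PER_YEAR = 11.03


def growth_summary(start, end, avg_per_year):
    """Return absolute growth, percent growth, implied years, and compound annual rate."""
    absolute = end - start
    percent = absolute / start * 100
    # Assumption: the period length is inferred from the quoted average,
    # because the source only says "the past several years".
    years = absolute / avg_per_year
    # Compound annual growth rate over the implied period.
    cagr = (end / start) ** (1 / years) - 1
    return absolute, percent, years, cagr


absolute, percent, years, cagr = growth_summary(
    START_PIB, END_PIB, AVG_GROWTH_PIB_PER_YEAR
)
print(f"Absolute growth: {absolute:.2f} PiB")
print(f"Percent growth:  {percent:.1f}%")
print(f"Implied period:  {years:.1f} years")
print(f"Compound rate:   {cagr * 100:.1f}% per year")
```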
New Storage Architecture
- Compute Storage
  - 20 PiB of high-performance, cluster-adjacent storage
  - Active storage for data analysis, designed for data that is in regular use
  - Intended for HPC, AI/ML, and parallel workloads
- Lab Storage
  - 35 PiB of lab storage with backups (snapshots) and disaster recovery*
  - General-purpose storage for raw and project data; can also serve as buffer storage for lab instruments and equipment
  - Not intended for heavy computational workflows
- FAS Secure Environment (FASSE)
  - 2.5 PiB of secure storage with backups (snapshots) and disaster recovery*
  - Secure storage environment for sensitive or regulated data, including data covered by Data Use Agreements (DUAs) or IRB protocols
  - Includes encryption at rest
- Long-Term Storage (Cold)**
  - 20 PiB of disk-based, S3-compatible storage with disaster recovery***
  - Long-term storage of research data to meet institutional data retention and compliance requirements
  - On-premises long-term storage option for Harvard-affiliated labs that is faster and easier to access than the alternative long-term option (Tape)
- Tape (NESE)
  - Existing long-term storage option without disaster recovery
  - Provided by the Northeast Storage Exchange (NESE) to house inactive research data after project completion or for data retention purposes
  - Stored on physical tapes in 20 TB increments. Limited to ten thousand files per folder, with file sizes between 1 GiB and 100 GiB. Data sets made up of small files must be bundled with tar to meet these requirements.
  - Data is transferred to and retrieved from Tape using Globus.
*Snapshots are copies of a directory taken at a specific moment in time. They offer labs a self-service recovery option for overwritten or deleted files within the specific time period. Disaster recovery is a copy of an entire file system that can be used internally by FASRC in case of system-wide failure.
** ECS object storage.
*** Disaster recovery incurs an additional cost.
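The Tape (NESE) file-size and file-count limits mean that directories of small files need to be bundled before transfer. The following is a minimal sketch of one way to do that with Python's standard `tarfile` module: it groups files greedily into uncompressed tar archives that stay under the 100 GiB upper bound and warns when an archive falls below the 1 GiB lower bound. The function names, the archive naming scheme, and the greedy batching strategy are illustrative assumptions, not an official FASRC or NESE procedure. Tar headers add a small overhead, so archive sizes are slightly larger than the sum of the file sizes.

```python
import tarfile
from pathlib import Path

GIB = 1024 ** 3
MIN_ARCHIVE_BYTES = 1 * GIB    # lower file-size bound quoted for Tape
MAX_ARCHIVE_BYTES = 100 * GIB  # upper file-size bound quoted for Tape


def plan_batches(root, max_bytes=MAX_ARCHIVE_BYTES):
    """Greedily group files under root into batches of at most max_bytes.

    A single file larger than max_bytes ends up alone in its own batch
    and would need to be handled separately.
    """
    batches, current, current_size = [], [], 0
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        size = path.stat().st_size
        if current and current_size + size > max_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(path)
        current_size += size
    if current:
        batches.append(current)
    return batches


def write_archives(root, dest_dir, max_bytes=MAX_ARCHIVE_BYTES,
                   min_bytes=MIN_ARCHIVE_BYTES):
    """Write one uncompressed tar archive per batch and return the archive paths."""
    root = Path(root)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, batch in enumerate(plan_batches(root, max_bytes)):
        # Hypothetical naming scheme: batch_0000.tar, batch_0001.tar, ...
        archive = dest_dir / f"batch_{index:04d}.tar"
        with tarfile.open(archive, "w") as tar:
            for path in batch:
                # Store paths relative to root so the layout can be restored.
                tar.add(path, arcname=path.relative_to(root).as_posix())
        if archive.stat().st_size < min_bytes:
            print(f"warning: {archive.name} is below the minimum size")
        written.append(archive)
    return written
```

A call such as `write_archives("/path/to/lab_data", "/path/to/staging")` (both paths are placeholders) would leave a small number of large archives in the staging folder, which also keeps the per-folder file count well under the ten-thousand limit. The resulting archives can then be transferred with Globus.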
Storage Costs
Former Storage Tiers and Costs

| Tier | Rate ($/TB/year) |
| --- | --- |
| Tier 0 | $50 |
| Tier 1 | $250 |
| Tier 2 | $100 |
| Tier 3 (NESE Tape) | $5 |

New Storage Options and Costs

| Tier | Rate ($/TB/year) |
| --- | --- |
| Compute Storage | $150 |
| Lab Storage | $125 |
| FASSE (Secure Enclave) | $150 |
| Long-Term Storage | $30 |
| Tape (NESE) | $15 |
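
To estimate how the new rates affect a lab's annual bill, the tables above can be turned into a small lookup. The sketch below is illustrative only: the 50 TB allocation is a hypothetical example, the function name is made up for this example, and the calculation ignores anything not listed in the tables, such as the additional cost of disaster recovery for Long-Term Storage.

```python
# Rates in dollars per TB per year, copied from the tables above.
FORMER_RATES = {
    "Tier 0": 50,
    "Tier 1": 250,
    "Tier 2": 100,
    "Tier 3 (NESE Tape)": 5,
}
NEW_RATES = {
    "Compute Storage": 150,
    "Lab Storage": 125,
    "FASSE (Secure Enclave)": 150,
    "Long-Term Storage": 30,
    "Tape (NESE)": 15,
}


def annual_cost(rates, tier, terabytes):
    """Return the yearly cost in dollars for an allocation of the given size."""
    return rates[tier] * terabytes


# Hypothetical lab with 50 TB on former Tier 1, moving to Lab Storage.
allocation_tb = 50
before = annual_cost(FORMER_RATES, "Tier 1", allocation_tb)
after = annual_cost(NEW_RATES, "Lab Storage", allocation_tb)
print(f"Former Tier 1: ${before:,}/year")   # $12,500/year
print(f"Lab Storage:   ${after:,}/year")    # $6,250/year
print(f"Difference:    ${after - before:,}/year")
```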
Storage Transition Schedule
The migration to new storage hardware will occur in phases, beginning in January 2026 and continuing through Spring 2026. We will email detailed guidance in advance of each migration so that you are fully informed and prepared. We appreciate your patience and support through this transition.
| Current Storage | Previous Tier | New Storage Offering | Timeframe |
| --- | --- | --- | --- |
| boslfs02 | Tier 0 | Lab Storage | Phase 1 (Jan 2026)* |
| bos-isilon/holy-isilon | Tier 1 | Lab Storage | Phase 1 (Jan 2026)* |
| FAS Secure Enclave (FASSE) | Various | FASSE | Phase 2 (Winter 2026)* |
| b-nfs-01/b-nfs-10 | Tier 2 | Lab Storage | Phase 3 (Winter 2026)* |
| h-nfs15/h-nfs-20 | Tier 2 | Lab Storage | Phase 3 (Winter 2026)* |
| holystore01 | Tier 0 | Compute Storage | Phase 4 (Spring 2026)* |
| holylfs04/holylfs05/holylfs06 | Tier 0 | Compute Storage | Phase 4 (Spring 2026)* |
*Specific timelines and dates will be communicated directly to affected labs prior to the migration. Netscratch, /holylabs, and home folders will not be affected by this migration.
Contact
If you have any questions regarding the new storage environment, please email the FASRC Storage Migration team, and a member of FASRC will respond within two business days.
Please designate an individual within the lab or group who can act as a General Manager, or data manager, to help track storage usage and communicate with the FASRC Research Data Manager. General Managers can also review and monitor the group's storage usage using data management tools such as Starfish and Coldfront. If you have any questions or would like to discuss data cleanup efforts, please email the FASRC Research Data Manager at rdm@rc.fas.harvard.edu.






