FASRC provides a number of services, including cluster computing, data storage, and virtual machines. All of these services share the same FASRC authentication services. FASRC also works directly with PIs to create access groups for data and services. A number of common technical components are needed to access the different services...
Researchers can take advantage of the scale of the FASRC cluster by setting up workflows that split many different tasks into large batches, which are scheduled across the cluster at the same time. Most clusters are built from commodity hardware, so note that an individual computation may run no faster than it would on a new local workstation or laptop with a current CPU and flash (SSD) storage; cluster computing is about scale. Doing computations at scale allows a researcher to test many different variables at once, thereby shortening the time to outcomes, and also makes it possible to tackle larger, more complex problems (e.g., larger data sets, longer simulation times, more degrees of freedom)...
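The idea of splitting many tasks into batches can be sketched in a few lines of Python. This is a minimal illustration only, not FASRC's workflow tooling: the parameter names (`temperature`, `pressure`, `seed`), the helper functions, and the batch size are all hypothetical, and the sketch stops at building the batches rather than submitting them to a scheduler.

```python
from itertools import product


def build_sweep(param_grid):
    """Return every combination of parameter values as a list of dicts."""
    names = sorted(param_grid)
    value_lists = [param_grid[name] for name in names]
    return [dict(zip(names, values)) for values in product(*value_lists)]


def split_into_batches(tasks, batch_size):
    """Split a list of tasks into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [tasks[i:i + batch_size] for i in range(0, len(tasks), batch_size)]


# Hypothetical parameter sweep: 3 temperatures x 2 pressures x 4 seeds.
grid = {
    "temperature": [280, 300, 320],
    "pressure": [1.0, 2.0],
    "seed": [0, 1, 2, 3],
}

tasks = build_sweep(grid)
batches = split_into_batches(tasks, batch_size=5)

for index, batch in enumerate(batches):
    # Each batch would correspond to one unit of work handed to the scheduler.
    print(f"batch {index}: {len(batch)} tasks")
```

Each batch is independent of the others, which is what allows a scheduler to run them across the cluster at the same time.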
The New England Research Cloud. Coming Spring 2021.
In a typical cluster computing environment, the OS, configuration, etc. are relatively homogeneous across the infrastructure, and root/administrator access on each host is limited to system administrators. Allowing different operating systems or giving users privileged access creates a fragmented infrastructure. Often users need the computational benefits of an entire "server" but do not have the funds for, or the need for, an entire physical server. Virtual Instances provide an avenue for users who want operating systems and configurations different from those typically used in the underlying physical infrastructure. They also enable users to own an entire "server" without having to pay the cost of a full machine...
Storage provides large-scale, network-attached access to research data from a variety of endpoints: compute cluster, instruments, workstations, laptops, etc. Various storage types are typically designed for specific I/O (literally data in/out) needs that stem from raw data collection, data processing and analysis, data transfer and sharing, and data preservation and archiving. These storage types usually have different data retention and replication policies as well...
In a data analysis environment, organized collections of data need to be hosted and accessed by a set of researchers. A database service provides an interface to store collections and provides privileged access from a variety of endpoints. Databases also offer much more structure and search capability than file-based methods. Hosted database servers can be shared, holding databases for several groups and storing different datasets. Access to the data/database is controlled by the owner of the data or based on a data use agreement...
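To illustrate the structure and search capability a database adds over flat files, here is a minimal sketch using Python's built-in `sqlite3` module with an in-memory database. The `samples` table, its columns, and the example rows are hypothetical and made up for illustration. SQLite is only a stand-in here: it has no user accounts, so the access control described above would be handled by a hosted database server and is not shown.

```python
import sqlite3

# In-memory database used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE samples (
        id INTEGER PRIMARY KEY,
        lab TEXT NOT NULL,
        organism TEXT NOT NULL,
        read_count INTEGER NOT NULL
    )
    """
)

# Hypothetical records from two groups sharing one database.
rows = [
    ("lab_a", "mouse", 1200),
    ("lab_a", "zebrafish", 450),
    ("lab_b", "mouse", 3100),
    ("lab_b", "yeast", 90),
]
conn.executemany(
    "INSERT INTO samples (lab, organism, read_count) VALUES (?, ?, ?)", rows
)
conn.commit()


def samples_above(connection, organism, min_reads):
    """Return (lab, read_count) pairs for one organism, largest first."""
    cursor = connection.execute(
        "SELECT lab, read_count FROM samples "
        "WHERE organism = ? AND read_count >= ? "
        "ORDER BY read_count DESC",
        (organism, min_reads),
    )
    return cursor.fetchall()


result = samples_above(conn, "mouse", 1000)
print(result)
```

The same question asked of a directory of files would require opening and parsing each file; here the schema and the query carry that structure.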
Hosting services provide a secure, environmentally controlled data center space with redundant power and network feeds. This means researchers do not have to house computer equipment, such as servers and storage, within their lab spaces. Professional IT staff can provide a variety of services to support these systems if needed...
Research software is the collection of tools, codes, and libraries that allow a researcher to generate new data or to analyze and make meaning of existing data. Some software is used across many disciplines, such as MATLAB, R, and Python, whereas most is very domain-specific. The majority of software is provided by the researchers; however, FAS has site or volume licenses for several packages for cluster and/or desktop use, which are provided at no cost to faculty and students. Licenses are usually available upon request. For software that is needed on the cluster, FASRC can install codes in a common location. Desktop installation of software is self-service or can be done by your local HUIT Desktop support group.
RC will also help you acquire any other software you may require for your research, or help you determine which packages are most appropriate for the types of analyses you want to do. These packages are purchased individually for each user and are charged to research accounts...
A research consultation and facilitation service can enable a research team to perform research using an expanded set of methodologies and skills, to accelerate research and complete deliverables in a more timely manner, and to expand the scope and amount of research pursued. These goals can be achieved with help from research computing staff via consultation, facilitation, scripting, project and data management, and training. A research team can often benefit from incremental help from just-in-time learning to expand its knowledge and skills, to augment its collective skill set, and/or to accomplish analysis, coding, and organizational tasks...
Data Science and Software Engineering play an important role in research by creating new capabilities to process and analyze data, helping ensure reproducibility, and aiding researchers in extracting knowledge and insight from the data. Researchers utilize software in their research by using scripts, tools, open-source software, and licensed software. Data science also covers a wide range of skills and techniques applied to cleaning (aka wrangling), processing, and statistics that are typically beyond what a researcher from a specific domain might have. Due to the rapidly evolving nature of research, codes do not always exist for every function needed, nor are there always clean data sources; therefore, the software or data pipelines are developed specifically for a given project. Traditionally, this development was done by researchers (graduate students and postdocs) or independent contractors. This approach poses several issues in terms of maintenance, optimization, reproducibility, and cost.
An RSE and/or Data Science team can overcome the issues with the traditional software development approach and provide institutional memory for research software projects, which can benefit an individual research group as well as the broader Harvard community in the long term. An RSE or Data Science team can work closely with other Research Computing Systems teams to design, develop, deploy, optimize, and maintain software packages/tools and data pipelines that are paired with specific hardware architectures to accelerate cutting-edge research at Harvard University...
Queuing System: SLURM
FASRC has built a cluster that is specifically geared towards courses and is tied into Canvas. All class accounts will be through this new system as of 2021. The main cluster can no longer host class accounts due to FERPA and other requirements.
See: https://atg.fas.harvard.edu/ondemand for details and help.
Service Center and Billing FAQ
FASRC works closely with the FAS Informatics group, which provides data management, analysis, training, and software support for the faculty, staff, and cores.
Research Support at Harvard
To see the full listing of Harvard-wide Research Computing, Research Data & Scholarship, and Research Administration services, please visit the Research Support at Harvard site.