FabMig First Questionnaire Findings

Fabian Cloud Migration Questionnaire Findings

Many thanks to all of you who responded to the recent questionnaire about the migration of the Fabian general-purpose cluster to the cloud and took part in the follow-up workshops and interviews. The following summarises your feedback and how we plan to incorporate it into the design of the new Fabian Cloud service.

First, here’s a quick reminder of the topics we asked you about: 
  • Job scheduling
  • Fabian desktop service
  • Browser-based job submission, remote desktops, notebooks and RStudio
  • Number of cores on the largest servers
  • Current performance limitations
  • Storing sensitive data
  • Sharing datasets and notebooks
  • Database hosting
  • Collaborating with external researchers
  • Applications and approaches for the future
  • Suggestions on how to improve the service

Job scheduling

One of our concerns was the possible impact on users of changing the job scheduler from GridEngine to Slurm. Based on your feedback, most users are comfortable with having to learn new versions of familiar commands. We’ll be sharing more information on this change nearer the time, including providing before-and-after examples. 
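
As a rough illustration of what a before-and-after example might look like, the Python sketch below maps three common GridEngine commands to their usual Slurm counterparts. The helper name `to_slurm` is hypothetical, and only the command name is swapped: options and job-script directives (`#$` lines in GridEngine, `#SBATCH` lines in Slurm) would still need converting by hand. It is not the official Fabian guidance, which will follow nearer the time.

```python
# Common GridEngine commands and their usual Slurm counterparts.
GRIDENGINE_TO_SLURM = {
    "qsub": "sbatch",   # submit a batch job script
    "qstat": "squeue",  # list queued and running jobs
    "qdel": "scancel",  # cancel a job
}


def to_slurm(command_line: str) -> str:
    """Swap a leading GridEngine command name for its Slurm counterpart.

    Hypothetical helper for illustration: arguments are passed through
    unchanged, so any GridEngine-specific options still need manual review.
    """
    parts = command_line.split(maxsplit=1)
    if not parts:
        return command_line
    name, rest = parts[0], parts[1:]
    # Unknown commands are returned as they were.
    return " ".join([GRIDENGINE_TO_SLURM.get(name, name), *rest])


if __name__ == "__main__":
    for before in ("qsub myjob.sh", "qstat", "qdel 12345"):
        print(f"{before:15} ->  {to_slurm(before)}")
```

Running the script prints each GridEngine command next to its Slurm equivalent, which is roughly the shape a before-and-after cheat sheet could take.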

Fabian desktop service

We already knew that some users weren’t happy with the current Fabian desktop access application, but we wanted to hear from more of you; over half of the responses agreed with that sentiment. There will be a new Fabian desktop service but, increasingly, you’ll see alternative ways of accessing services via the browser.

Browser-based job submission, remote desktops, notebooks and RStudio 

Having browser-based options is clearly supported by the questionnaire responses. Command-line access isn’t going away with the move to the Cloud, but there will be a browser-based alternative for submitting jobs: the popular Open OnDemand. In fact, OnDemand provides browser-based access to jobs, files, notebooks and clusters (via a built-in terminal).

[Screenshot: beta Fabian OnDemand homepage]

In response to very positive feedback, there will also be a new, browser-accessible RStudio service, as well as JupyterLab with the ability to run popular languages such as Julia, Python, R and Stata.

Number of cores on the largest servers 

Two options received a significant number of responses, “4 cores” and “the more, the better”, so we’re considering a move from small/medium/large core options to “regular” and “max”. Within the “max” option, we’re investigating automatically picking the best core configuration.

Fabian will likely still offer small/medium/large options for memory.

Current performance limitations 

You told us that “CPU speed”, “memory limits” and “job runtime” are the top pinch-points at the moment. The new Cloud service will enable us to offer more flexible processor speeds and memory configurations so you should see improvements there. 

We’ll still need to apply some realistic limits on job runtimes to manage costs but we are considering changing how jobs are terminated: moving to a longer hard limit coupled with soft termination at flexible default times. Hopefully this will mean a better user experience for you.

Storing sensitive data 

Whilst the secure research nodes should continue to be used for highly confidential research, we did want to find out whether you might want to store confidential data or personally identifiable information on the Fabian general-purpose cluster.

Your feedback indicated that there is interest in changing the scope of the general-purpose cluster to accommodate sensitive datasets. We’re investigating whether the Cloud encryption mechanisms will allow us to permit confidential data storage; personally identifiable information may require segregated clusters, however.

Sharing datasets and notebooks 

From the questionnaire responses and workshop sessions it is clear to us that there isn’t much interest in sharing datasets: your research is your own or your project’s, especially when writing articles based on the datasets you’re using. 

There is interest, however, in us providing shared notebooks, for example for when you are working collaboratively on a project. This is something we are investigating as a new supported feature.

Database hosting 

Whilst overall you told us that database hosting isn’t a priority for you, it is important for a few users, so we will include database hosting in our service design.

Collaborating with external researchers 

Your feedback was roughly fifty-fifty on whether support for collaborating with non-LSE researchers is needed and important. We will investigate what is possible within technical and cost constraints to see how external collaboration can take place on the new Fabian Cloud service.

Applications and approaches for the future 

There were no widely supported suggestions for new features for the general-purpose cluster, but we were already aware of interest in GPUs. This interest was repeated in both the questionnaire responses and the subsequent workshops and interviews.

There are cost constraints on the design of the new Fabian general-purpose cluster – the new service needs to operate within the current budget – but we do want to offer a better service where possible. Accordingly, we are investigating how we might incorporate GPU options. 

From your feedback we also realised that we could do a better job of communicating what versions of applications are already available!

Suggestions on how to improve the service 

Your top suggestions were for more documentation and support. We recognise that the success of the Cloud-based service will depend on how successfully we can help you transition from the current processes you know to the new, initially unfamiliar ones.

The new user groups will continue to meet during the Service design stage in order to keep you updated on how work is progressing and what final features will be available. We are hoping to be able to demo beta versions of the new Fabian general-purpose cluster once we reach that stage. 
