September 21, 2020, by Chris Handley
From Subsystem to Super System – Windows Subsystem for Linux, a Gateway to HPC
If you are seeking to leverage the full extent of computational resources for your research, you are likely to end up using a Unix machine to run your software. This is especially true if you are developing software and data analysis pipelines on High-Performance Computing (HPC) facilities, like Augusta – our own HPC facility at the University of Nottingham.
The World that Was
Unix is a powerful operating system (OS) that allows for multi-tasking, multiple users, and supports a wide variety of software and programming languages. This makes it ideal for servers, workstations, and HPC services. Linux is a Unix-like OS that is also widely used on servers and HPC systems. For example, Augusta runs CentOS, a Linux distribution.
The popularity of the Linux OS means that a great deal of research software is designed to run on it. Developing the same software for Windows users is troublesome. For some time, Apple machines have been dominant because the Mac OS is built upon a Unix-like system, which makes testing and deployment of software simple (as you are able to access a Unix-like environment via a terminal). For Windows, users previously had to install either virtual machines or emulation software, such as Cygwin.
Window into Linux
However, things changed in 2016 when Windows 10 introduced the Windows Subsystem for Linux (WSL). This addition to Windows puts in place a compatibility layer, allowing users to run Linux software and distributions, such as Ubuntu, on a Windows 10 machine. That includes running a terminal and, with an X server for the X Window System, Linux GUI applications.
Since then, things have only improved. The next generation of the WSL became available in the spring of 2020. It improved performance and offered greater compatibility with Linux software by running a fully-fledged Linux kernel in a virtualized environment built on Microsoft’s Hyper-V architecture. As a result, Linux software has access to the full set of Linux system calls. You can even launch Windows applications from within the Linux environment of the Subsystem.
So what does that mean for the end-user?
My Experience
I am a Research Software Engineer who has also taught and trained undergraduates, in particular in the field of computational chemistry. Part of the training of these students was how to use Linux and HPC computers. Students needed to be able to run simulations, perhaps many hundreds, if not thousands, of them. Some simulations need to be run for a very long time. Others require large amounts of memory and temporary write space. Knowing how to use Linux was essential to achieving these goals. Teaching the use of Linux was somewhat easier if the student was an Apple user, as the OS is built on a Unix-like system, and so opening a terminal to connect to a remote server is simple. It also means Linux-based software that runs from the command line can be installed on Apple machines, so students could install research software and use it to test and trial methods before moving to an HPC system.
Of course, a user could use a computer running a Linux OS. While that is a valid choice, many students do not encounter Linux until they come to university. Windows machines can be made to dual boot into either Linux or Windows, but the two OSs are kept separate on the machine and do not readily share files. In some cases, the Linux OS is also unable to fully utilize the hardware of the Windows computer it is booting on.
The WSL eases students and researchers into HPC usage, while retaining workflows they are familiar with, and creating a relatively cheap hardware option.
Let’s be honest, the vast majority of students and researchers will have a degree of familiarity with the Windows OS and the software that runs upon it. While we should promote the use of free-to-use software, we would be foolish to pursue that to the detriment of these users being able to perform work with tools they know. They will, for example, have used Windows and software on the platform to make spreadsheets, presentations, and documents.
Windows being commonplace means that most students and researchers have established workflows within that OS. If we want to accelerate research and the uptake of digital research tools, the last thing we want to do is hinder workflows that already work for the user. If the user is happier writing a paper using Word and EndNote, it is perverse to demand they start using LaTeX simply because we have thrust them onto a Linux machine in order to perform development for HPC.
The Subsystem removes that barrier.
A researcher, for example in materials science, can now install software, such as CASTEP and GULP, on the WSL. These programs have no GUI and run from the command line. Python code can be developed that drives these simulation codes, so the user can build pipelines on their desktop machine before moving to the HPC. Furthermore, some software that visualizes the results of these simulations runs perfectly well on a Windows machine (or an Apple one), exploiting a dedicated graphics card to render high-quality 3D images of crystal structures.
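As a rough illustration of the kind of desktop pipeline described above, here is a minimal sketch in Python. It assumes a command-line program that reads its input on standard input and writes its results to standard output; the function names, the ".in" and ".out" file extensions, and the demo command are all hypothetical stand-ins, not the actual interface of CASTEP or GULP, so check each program's own documentation for how it expects to be invoked.

```python
import subprocess
import sys
from pathlib import Path


def run_job(command, input_path, output_path):
    """Run one command-line job: input file on stdin, stdout saved to a file."""
    with open(input_path, "rb") as stdin, open(output_path, "wb") as stdout:
        completed = subprocess.run(command, stdin=stdin, stdout=stdout, check=False)
    return completed.returncode


def run_batch(command, input_dir, output_dir, pattern="*.in"):
    """Run the same command over every matching input file.

    Returns a dict mapping each input file name to the job's exit code,
    so failed jobs can be spotted before moving the workflow to the HPC.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    results = {}
    for input_path in sorted(input_dir.glob(pattern)):
        output_path = output_dir / (input_path.stem + ".out")
        results[input_path.name] = run_job(command, input_path, output_path)
    return results


# Hypothetical stand-in for a real simulation binary: a Python one-liner that
# upper-cases whatever it reads on stdin. Swap in your own command list.
DEMO_COMMAND = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]
```

Calling run_batch(DEMO_COMMAND, "inputs", "outputs") would process every ".in" file in a folder called inputs. The same script, with the command swapped for the real program, is the sort of thing that can be tried on a laptop under the WSL first and then adapted to a job script on the HPC.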
Being able to do all of the above without buying expensive Apple computers, while retaining the familiarity of the Windows OS, is a boon if you wish to do teaching and research-led teaching in the computational sciences. There is no need for virtual machines installed on Windows, or for packages like WinSCP to communicate with the HPC machine, and with that comes improved performance. Plus, a traditional virtual machine usually needs extra setup, such as shared folders, before it can see your Windows files, whereas the WSL can access them directly!
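To make the file sharing concrete: by default the WSL mounts each Windows drive under /mnt, so the C: drive appears as /mnt/c inside Linux. The WSL also ships a wslpath command that converts paths for you. The sketch below is only an illustration of the default mapping written in plain Python; the function name windows_to_wsl is made up for this example, and it assumes the default /mnt mount point has not been changed in your WSL configuration.

```python
from pathlib import PurePosixPath, PureWindowsPath


def windows_to_wsl(path, mount_root="/mnt"):
    """Translate an absolute drive-letter Windows path to its default WSL location.

    Assumes the default WSL layout, where drive C: is mounted at /mnt/c.
    """
    win = PureWindowsPath(path)
    drive = win.drive  # for example "C:"
    # Only absolute drive-letter paths are handled; network (UNC) paths
    # and drive-relative paths are rejected.
    if len(drive) != 2 or drive[1] != ":" or not win.root:
        raise ValueError(f"expected an absolute drive-letter path, got {path!r}")
    # parts[0] is the drive plus root (for example "C:\\"); the rest are folders.
    return str(PurePosixPath(mount_root, drive[0].lower(), *win.parts[1:]))
```

For instance, a spreadsheet saved in a Windows Documents folder can be read by a Python script running in the Linux environment simply by opening the translated path.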
WSL and beyond
With the advent of the recent iteration of the WSL, some interesting things are happening. One example is the Kali Linux distribution, which offers a GUI for the Linux OS running on WSL. This removes another barrier to using Linux for digital research. Users can be eased more gently into using the command line, with the fallback of a GUI to navigate the file system. Even Nvidia is getting in on the action, developing drivers that allow the WSL to communicate with an Nvidia graphics card, so that in the Linux environment you can run software, such as machine learning programs, that exploits the power of a Graphics Processing Unit (GPU).
I have been using the WSL for 4 years now, and have found that it has improved my workflows and made developing code easier, as I can easily switch between the regular desktop and applications I have experience with, and the Linux environment and interface I use for scientific computing. I’ve used the WSL on several different machines, ranging from All-in-One desktop computers and high-performance gaming laptops to Surfaces and other small, low-end Windows machines. In some cases, I run scientific software on a high-end machine, while in others the WSL just provides a good terminal with which to communicate with an HPC service.
If you want to know more about the WSL and how to use it, just contact the Digital Research Service. We can give guidance and support.