Break Out of Applications
The Capitalist mode of production has greatly influenced how we use, and think about, computers. A computer really is just logic circuitry that can be directed by software. Software is, at heart, a configuration of electrical states that represents data and determines how the circuitry will work. Technically, there is no distinction between “Operating System” and “Applications” to the computer. The computer just processes data using executable code and doesn’t “know” that this particular instruction or that particular bit of data belongs to this or that application. Likewise, there is nothing fundamental about hard drives, SSDs or NVMe drives which necessitates the existence of files. Well, that may not be entirely true regarding the CPU, but it’s close enough. The separation of tasks into programs, and the grouping of programs into “Applications”, is managed by the Operating System, but this is done for our convenience. We humans need to separate out the different tasks and manage them in our brains. If the computer is very simple we can do without these concepts, and just put in data and code to solve a problem without worrying about “files” or “applications”. But we need to be able to label and delineate the code responsible for the tasks we want to perform, and to label and delineate the data. A computer can run without the concept of files or programs, but we can’t, so we write software (the operating system) to manage files and programs as discrete entities. Note that we don’t need to create files to delineate data. We can also do this with different physical media, that is, separating the data out by tape or disk; the “filename” would be whatever label you put on the physical disk.
So these concepts, Applications and Programs, are abstractions for our benefit. It’s too much work for any individual to input all the code and data to get everything they need done, so we buy program components pre-made and install them. Even if we could write all the programs we needed ourselves, concepts like files and separate programs would still be needed to separate out the different problems we are trying to solve, and to direct the computer to execute the right code with the right data. The move to the creation and distribution of software as a business has compartmentalised things even further. We don’t want to buy “parts”, we want to buy “solutions”. The popularisation of computers among the general public, particularly from the late 80s onwards, fundamentally changed how we view them. People bought computers and wanted to use them to do things, so a solution came forward: shrink-wrapped software. Discrete bundles sold for profit, to be run to answer the questions users had, such as “How do I manage my finances on this thing?”. These solutions were almost always self-contained. Let’s say one wanted to write a resume; the “solution” was a Word Processor, and that Word Processor would do everything needed for that task, creating a formatted document. Want to manage your finances? Another application. Want to play music? Another application. Want a way to manage passwords? Another application. The data source for the application was specific to that application. You usually HAD to use that application to do anything in that problem domain, with that data. Sure, a lot of these applications are now free and aren’t sold in shrink-wrapped boxes, but the idea remains. People creating software understandably want to make money, and they can make few assumptions about your system apart from the type of Operating System it runs. So they can only sell a complete solution, a software package which usually manages all its own data and doesn’t really interact with other packages. It may not be in their interests to make it work with competitors’ software. This is even truer with mobile phones, where EVERY function is an App. Even just having vouchers from a fast food store delivered to you is done through a specialised, discrete App. Everything you may want to do is done with a discrete App. Each App is separate from the rest; there is no cohesive system. What we are sold is discrete systems. Sometimes there are “Application Suites”, such as MS Office, but that is still a closed system. Everything happens within the application; outside of it, the functionality and data may as well not exist.
Our computers (and devices) have multiple personalities. Use a computer at work, and one switches between “MS Word”, “Outlook”, “Teams” and whatever other software. In the old days before multitasking, the computer took on a different persona each time we exited one program and loaded another; multitasking just meant all these personas could exist at the same time, yet still be separate, each with its own techniques, look and feel, shortcuts, conventions, and so on. A Windows or Mac OS computer is basically just used as a portal to discrete applications, each separate and unaware of the others.
Unix, on the other hand, was developed before the era of mass shrink-wrap software. Unix still has discrete files; it after all pushed the “everything is a file” idea so far that devices and system processes also appear as files. But its approach to solving problems is different. Unix was used quite a bit for text processing, and although all-in-one applications do exist (they often have to), the Unix operating system contains lots of smaller programs that do one thing and do it well. These programs each perform one specific process, can take data that was output by another process, and can output data to yet another. The tasks they perform are simple. One program may just sort lines, one may spell check, one will replace a string with another, one might extract a particular column, one performs mathematical operations, and so on. The user can specify where to get the data from, which programs to pipe that data through, and where, at the end of the pipe, to put the final result. So with one command, the user can pull data from a file (or device!), run it through a program which extracts the second and third columns, sort by the second column, remove any rows which match a specified pattern, compute the sum of the values in the last column, and write the sorted columns with the sum to another file (or device!). This is a basic example, but with all these tools the user can compose whatever functions they need by stringing them together. Rather than rely on a pre-built application, or poke a GUI over and over, a new process is created in an instant and can be reused over and over. Up until the introduction of PowerShell, Windows just didn’t have this functionality in any useful sense. The Unix way is limited. Most of the classic tools (awk, grep, sort) work with text only, so if you want to employ them, you need the source data in text format (such as CSV). PowerShell, which finally adds this sort of functionality to Windows, takes a more modern approach and uses objects, so data does not have to be in text. However, as good as PowerShell may be, no one really uses it except system administrators. Windows is, for the most part, just a platform to start applications.
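To make that concrete, here is a rough sketch of such a pipeline. The file names, the comma delimiter and the “refund” pattern are placeholders chosen purely for illustration:

    $ cut -d',' -f2,3 sales.csv \
        | grep -v 'refund' \
        | sort -t',' -k1,1 \
        | awk -F',' '{ print; total += $NF } END { print "TOTAL," total }' \
        > summary.csv

cut extracts the two columns, grep -v drops the unwanted rows, sort orders what remains by the (now first) column, and awk passes every line through while accumulating the sum, printing a total at the end. None of these tools knows anything about the others; the shell’s pipe is the only glue.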
Unix isn’t the only model which breaks down the barrier between applications. Another model is the Lisp machine, best represented today by Emacs, which acts like a text-based Operating System: instead of passing data through programs, textual data sits in buffers in memory, and you choose which Lisp code you want to run on it.
The desktop computer today, and therefore our phones and tablets, are the way they are because the primary mode of software production and distribution was development by private companies making software which could be employed for one holistic task (spreadsheets, writing documents, budgeting, managing passwords), which they hoped people would buy. They couldn’t assume any functionality on the user’s system, so everything had to be put in. Microsoft Word has its own spell checker, its own editor, its own converter to PDF, its own functionality to sort, search and replace. These components are all built into the application, and only available WITHIN that application. MS Word wouldn’t sell as just a piece you fit together, or as just a document formatting system which you would use as part of a larger workflow. Now add an e-mail program that has its own editor, despite one already existing in MS Word, maybe its own spellchecker, its own search and replace function. A password manager? Same thing; it has to do everything itself. Sure, you can pass some of it off to libraries, but the end result is a computing environment where the different parts don’t talk to each other. Your password manager works clumsily with your web browser (unless it is built into your web browser, in which case it works clumsily with anything outside of the web browser). We store data in Excel, but can only view it IN Excel. If we just want a register of documents, we have to open Excel, add the line, save, close. Watch anyone working in an office: they are jockeying apps all day long, moving from application to application, copying and pasting ad nauseam. They are not using the COMPUTER; rather they are using applications and trying to fit square pegs into round holes all day long. Sending data by e-mail becomes awkward and clumsy. The data is stored specifically for one program, and often people will screenshot data and send it as an image, only to have someone else type it back out into their own document.
The ideal scenario is one where the interaction with the computer, the workflow, is phrased as closely as possible to how the task would be phrased as a verbal instruction. If a common task is phrased as “Close customer complaint ticket # X with comment Y”, then the use of the computer should look like an execution of that phrase. The CLI can get closest to this with something like $ complaint -close -number X -comment "Customer was wrong", but a GUI management system, which may be more appropriate, could have a screen based on that task, or, even better, give the user the ability to do this without navigating through the application at all. It would expose functionality which could be incorporated into other software. Bad design, which I have commonly seen, has you navigating to a spreadsheet, locating the correct row, and entering data into the right columns. This process is quite different from the human inclination to just say “do this”. Even applications which are designed for this are often poorly done: you still navigate to the record and have to pick the right fields to update. I want to feel like the data and tasks of my life are part of the system, not restricted to within an application’s ecosystem. Things should feel less and less like a conglomerate of different applications and more like a tool box, a system that can understand itself and be aware of itself. Software designed to inter-operate, and data designed to be open, is the way to achieve this.
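What such a command might look like underneath is almost beside the point, but as a sketch, and only a sketch: the command name, the ticket system’s URL and its parameters below are all hypothetical, assuming the ticketing back end exposes some scriptable endpoint for closing a ticket.

    #!/bin/sh
    # Hypothetical "complaint" wrapper -- the API URL and field names are
    # invented for illustration; the real back end could be anything scriptable.
    # Usage: complaint -close -number 42 -comment "Customer was wrong"
    while [ $# -gt 0 ]; do
        case "$1" in
            -close)   action=close ;;
            -number)  shift; number="$1" ;;
            -comment) shift; comment="$1" ;;
        esac
        shift
    done
    [ "$action" = close ] && curl -s -X POST \
        "https://tickets.example.com/api/complaints/$number/close" \
        --data-urlencode "comment=$comment"

The verbal phrasing maps almost one-to-one onto something you can type, bind to a key, or call from another program.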
The silo approach, while great for selling software packages, is horribly inefficient to work with. It’s easy to develop and sell Applications, but the problem that computers are meant to solve is the manipulation, computation and processing of data, NOT how to make people money through selling software. But the computing industry is driven by selling software, and sellers are COMPETING for sales: one all-singing, all-dancing “solution” has to have more features than the other to be purchased. This leads to massive applications that do EVERYTHING, or even worse, do everything inside the web browser. The user is restricted; they can only use the computing paths that the software developer put in. They cannot use any of the other code and data on their computer, and because these applications are mostly graphical, the user has to “show” the application what to do rather than just telling it what to do. The problem of getting different applications to communicate and share data is often solved by building mega-apps, a giant walled garden. These are barely an improvement, but are great money spinners.
The Alternative
Let’s step back and consider what we actually want to do when we are working on the computer. We want to edit some text. We may want to start playing some music file, or send an email, or a message, or query some data, extract some data, store it, download a file. What if instead of “going into the Application” every single time, we could just tell our computer what we want to do? What if the Operating System WAS the application? Emacs is already somewhat like this, which is why some people live in it. You can be “in Emacs” and do all of the tasks previously listed. What if instead of having a generic Desktop environment, it was tailored to your needs? A company workstation could embed widgets into the desktop to perform many basic things. Applications shouldn’t be eliminated, but if we start thinking of the computer again as a toolbox, a unified system, we could build software which doesn’t constrain us to being “within Apps”, but turns the computer itself into the tool. Remember that applications aren’t fundamental to the computer; they are abstractions. All programs and all data actually occupy the same space, the machine itself. It is the abstraction of applications which separates them, which stops the pieces interacting with each other.
This is best demonstrated by example. MPD is a piece of software that plays music, but it is a daemon. It runs in the background and takes commands from a client, any client which knows how to communicate with it. Being free software with a free and open protocol, you can call on a program to manipulate MPD from your own scripts and programs. This means that you can use any number of clients, or no client at all. The music controls can be part of your GUI. This has been done elsewhere, but the key is the decoupling of the functionality from the interface. MPD just allows your computer to play music; you can create the interface anywhere you like. The same goes for the aria2c downloader. It can run as a daemon, and any program can use it to download files. The FlashGot plugin for Firefox can use it, but you can also use a command line client, or a graphical one, or even one you created yourself. You can write a script that adds the link in the clipboard as a download, and bind it to a key. The point isn’t that these programs can exist as a system tray icon; the point is that these services separate a function from the interface. The point of entry to the function can be wherever we see fit to place it, and as direct as we need. The wall between these different programs is broken down, and any graphical or command line interface can use them, even the basic GUI itself.
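As a rough sketch of what that decoupling buys you, consider something like the following bound to a couple of keys. It assumes mpd and aria2c are already running as daemons (aria2c started with --enable-rpc and no RPC secret), and that the mpc client, xclip and curl are installed; the port is just aria2’s default.

    # Skip to the next song -- no music "application" ever comes to the front.
    $ mpc next

    # Queue whatever URL is in the clipboard as a download, by talking to the
    # aria2c daemon over its JSON-RPC interface.
    $ url="$(xclip -o -selection clipboard)"
    $ curl -s http://localhost:6800/jsonrpc \
          --data '{"jsonrpc":"2.0","id":"clip","method":"aria2.addUri","params":[["'"$url"'"]]}'

Neither command cares what, if anything, is drawn on the screen; playback and the download queue are simply services the machine offers, and any interface, or none, can drive them.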
The “pass” program for Unix follows a similar principle. Pass is billed as the “standard” Unix password manager, and it is little more than scripting. The passwords are stored in GPG-encrypted text files, easily accessible by any program that can call GPG and read text. This means that any program can use “pass”, provided the user enters their GPG passphrase. You can use an application, but you don’t have to. One can have a program which pulls up a pop-up screen with your password listings and automatically enters the password into the active window. You do not need to “open up” the password app, or even have it running all the time. The point is that using these “solutions” makes it feel like these functions, these services, are part of the computer itself, rather than confined within an app. This way, you can more directly tell the computer what you want done. “Get me this password”, “Download this file”, “Skip this song” now feel like things you tell your system, rather than things you tell a specific app. To expand further, thinking in terms of functions rather than apps opens up possibilities. Why should every application have its own editor, when you can leverage an existing one? Or spell checker? If we start thinking in terms of composition again, how we can use the pieces on our system to solve tasks, we can do things more efficiently again. We can start thinking of how to store data, how to access it, and what the most efficient interface for the tasks we need is, rather than “How do I get Excel to do this?”. If the problem is to store data and access it, we can approach the problem more directly: separate out the storage of the data from the access, and put the points of access where we need them. We can use existing functionality on the system to do this. The mutt e-mail client uses an external editor to compose emails for good reason. We already have an editor we know how to use, so why duplicate it? The editor becomes part of the system’s function, something to leverage.
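A sketch of that pop-up idea, much like the passmenu script that ships alongside pass: it assumes pass, dmenu and xdotool are installed, and the entry names are simply whatever is in your password store.

    #!/bin/sh
    # Pick an entry from the password store with dmenu, then type the
    # password straight into whichever window currently has focus.
    store="${PASSWORD_STORE_DIR:-$HOME/.password-store}"
    entry="$(cd "$store" && find . -name '*.gpg' | sed -e 's|^\./||' -e 's|\.gpg$||' | dmenu)"
    [ -n "$entry" ] || exit 0
    pass show "$entry" | head -n1 | xdotool type --clearmodifiers --file -

Bind that to a key and “get me this password” becomes something you ask the system, from anywhere, rather than a window you have to go and visit.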
This is a shift in thinking, and these examples are just examples. The shift is to move away from seeking discrete, pre-configured applications which solve abstract problems, towards combining tools to solve real, specific problems. The solving of abstract problems gets pushed down a layer. An operator that has to log calls into a system doesn’t have to open up an app, navigate to the right page, click “new”, and so on, but instead can have an easy-to-access form that pops right up, tailored for that specific task. The computer itself should feel tailored to perform the user’s tasks, not just be a means to run Applications. When I was developing a Quality Management System, this is what I had in mind: a task-oriented way of working. But during development I defaulted to the application way, and what I was creating was not my vision. I was prejudiced by the tools I was using to build it, and by the subconscious desire to sell a marketable product. The product I was making was too finalised, too polished. I should have been making capabilities which could then be pieced together to solve problems. My issue was that I was making a saleable product. In retrospect, it should have been tools and services, and then, for each use case, assembling them to match workflows, developing methods with which we can use existing tools to solve the specific problems a customer might have. I could have customised the screens, but that wasn’t the point. The point wasn’t to create forms to fill out and edit.
Of course, sometimes the all-in-one application is the best approach, for example, designing levels for a game, a lot of graphical design, an air traffic control system and so forth. It is not that applications which take care of the entire process are inherently bad, only that this model is overused, applied all the time when it is only really appropriate some of the time. Our systems, especially those used at work, just feel like chameleons, changing from persona to persona depending on the application we are using, while the system as a whole remains dumb. The applications and data make the Operating System no smarter.
One thing that may bring this about is speech-driven interfaces. “Hey Google, play Pyramid Song by Radiohead” is closer to how a computer should be used. This form of interaction bypasses the need to instruct the operating system which application to use to play it. Take a more complex task like “Add ‘clean the gutters on Saturday’ to my todo list”: we know how to make the computer do exactly that, so why not have this capability in the command line and the GUI as well? We don’t need to configure a computer for every possible task, only have the means to build in the tasks the user will want to do. For those less frequently used, by all means, do it the “manual” way. The technical means to achieve this already exists, it’s a