Apple announced a new computer last week, to great fanfare. The Vision Pro is a computer worn on your face, but the novel aspect is how you use it.
Rather than presenting its output through a physical screen, the Vision Pro projects that output directly into your eyes from two very small, high-resolution displays positioned a short distance in front of them. And rather than relying on a keyboard, mouse, or touch screen, the primary user interface is eye tracking and gestures.
Just as they removed the need for a stylus when they launched the iPhone, Apple requires no physical controller to use this computer. It senses what you are interested in interacting with by watching your eye movements and then looks at your hands to determine what you want to do.
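For developers, this gaze-plus-pinch model is largely handled by the operating system in Apple's visionOS frameworks: the system resolves which object the user is looking at, and the app simply learns which object was targeted when a pinch occurs. A minimal SwiftUI/RealityKit sketch, with hypothetical scene content, looks roughly like this:

```swift
import SwiftUI
import RealityKit

// Illustrative sketch: on visionOS, the system pairs where the user is
// looking (the targeted entity) with an indirect pinch (the action).
// Apps never receive raw eye-tracking data; they only learn which
// entity was targeted once the pinch happens.
struct TapToSelectView: View {
    var body: some View {
        RealityView { content in
            // A placeholder object to look at and pinch (hypothetical content).
            let sphere = ModelEntity(
                mesh: .generateSphere(radius: 0.1),
                materials: [SimpleMaterial(color: .blue, isMetallic: false)]
            )
            // The entity must opt in to input and carry collision shapes
            // so gaze targeting can find it.
            sphere.components.set(InputTargetComponent())
            sphere.generateCollisionShapes(recursive: true)
            content.add(sphere)
        }
        .gesture(
            SpatialTapGesture()
                .targetedToAnyEntity()
                .onEnded { value in
                    // value.entity is whatever the user was gazing at
                    // when they pinched.
                    print("Selected entity: \(value.entity.name)")
                }
        )
    }
}
```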
There have been antecedents to each of these elements — a myriad of head-worn viewing devices, such as Google Glass and Meta’s Quest Pro, and gesture-control technologies, such as Leap Motion and the Myo armband. But none of these predecessors put it all together as a coherent vision.
Apple has termed this new device a spatial computer. The name is apt, because the device can use any physical space around you as a canvas to display digital outputs. There is no need for a desk (or lap) to place the device, and there are no limits on the size of the perceived viewing area. That means you could technically sit in a small space, like an airplane seat, and watch a cinema-sized movie.
What should you be aiming to do with a spatial computer? Right now, Apple has outlined use cases that seem pedestrian. You can use it like a normal computer or iPad, but with today’s 2D information presented on a much more flexible and unconstrained display. There’s demand for that. It will be of value where you do not have much space, and to those who currently fill their space with a host of large displays. In that sense, the closest analogue is a very large-screen TV. Would people pay $3,500 for that? They do right now. Even Apple sells a display (the Pro Display XDR) that can cost up to $6,000. From that perspective, it is easily in the cost ballpark for current use cases. This strategy also has the benefit of seeding the new platform with the vast number of applications that already exist for the iPad and iPhone.
A better and more convenient display for 2D content, however, does not appear to justify the technological and R&D weight that has gone into the Vision Pro. The real question is whether this device can lead to the augmented and virtual reality applications that would justify strapping a computer to your head. It certainly has the technical capabilities to do so.
The Vision Pro can display 3D objects in your current space or even transport you to new spaces. Apple, however, barely mentioned the terms AR and VR during the announcement. In doing so, they drew a line others have not drawn before: this is not an AR or VR device. The technology is a spatial computer, and if there is a role for AR and VR, it is in applications that run on a spatial computer.
Let’s review those concepts. Augmented reality (or AR) involves taking the environment around you and changing your perception of it. Google Glass did this by showing you notifications via smart glasses. The Vision Pro does this by placing 2D displays in that environment and fixing them so that when your head moves, the display does not. It appears to be there in your current environment. This is achieved by passing a very accurate video of the real world through the device to you. You don’t see your environment directly, but you think you are doing so. Thus, technically, Apple is augmenting a video capture of your environment, not overlaying things on your direct viewing of it. To a user, there’s no real difference.
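For those building software, this “fixed in your environment” behavior corresponds to world anchoring in Apple’s RealityKit framework: content is attached to a fixed point in the room, so moving your head changes your view of it, not its position. A minimal sketch, with an illustrative flat panel standing in for a 2D display and coordinates chosen purely for illustration:

```swift
import SwiftUI
import RealityKit

// Illustrative sketch: a flat panel anchored at a point in the room.
// Because the anchor is world-locked, head movement changes your view
// of the panel, not where it sits, which is the effect described above.
struct AnchoredPanelView: View {
    var body: some View {
        RealityView { content in
            // Anchor roughly at eye height, one meter ahead of the
            // scene origin (coordinates are an assumption).
            let anchor = AnchorEntity(world: [0, 1.2, -1.0])
            let panel = ModelEntity(
                mesh: .generatePlane(width: 0.6, height: 0.4),
                materials: [SimpleMaterial(color: .white, isMetallic: false)]
            )
            anchor.addChild(panel)
            content.add(anchor)
        }
    }
}
```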
Virtual reality (or VR) involves taking the user and immersing them in a virtual environment. The Vision Pro captures the full attention of your eyes, and so you are, by definition, immersed in a virtual environment.
In one mode, that environment looks like the one you are actually in. Turn the dial, and that can change: you are transported somewhere else as the video pass-through of your current environment is replaced by a digitally created 3D environment. From that perspective, this is clearly a VR device.
The important thing to note is that although the Vision Pro is capable of both AR and VR, Apple did not emphasize these use cases. They have produced a device capable of both but have not yet found compelling applications in either domain. This is one of the reasons they announced it at their annual developer conference: Apple needs apps, and they need other people to imagine them.
In a recent paper, we outlined what we believed were the ways that AR and VR apps could add unique value. Beyond gaming and entertainment, our focus is on economic uses — specifically, those that would increase users’ productivity. In this respect, we ask: Which AR and VR applications can create real value by assisting users in making better decisions? For someone looking to build applications for the platform Apple has created, understanding the possibilities is crucial.
Most decisions involve some degree of uncertainty. Information is the cure for that, allowing you to know more and, therefore, make fewer errors. But there are two aspects to using information in decisions. First, you need to have the right information available. Second, you need the cognitive space to distill and parse that information for usefulness.
As it turns out, AR and VR map onto each of these needs. VR can present the user with more relevant information, especially when that information is not at hand or is costly to acquire. By immersing users in new contexts, it brings that information to them. In some cases, that may be a realistic view of what is going on inside a building, say, during a fire. In other cases, it presents a safe, simulated environment, such as a flight simulator that enables training without the high stakes.
By contrast, AR takes the information presented in a given context and parses it to yield what is relevant. For instance, when you meet someone at a conference, it can identify that person without you having to search your own memory. Or it can provide a helpful overlay with exit routes if you are dealing with a fire. In each case, the goal is to distill the mass of information in the user’s environment and present only what is needed.
The one thing to note is that the Vision Pro is not meant to be a portable computer to be used outside of the home or the workplace, which limits its applicability in navigating external environments (such as while driving).
This perspective highlights why many purported AR and VR use cases have been of low value. VR meetings with avatars in pretty rooms do not obviously provide participants with more useful information than a Zoom call would. AR glasses that push text notifications at you as you walk around increase your cognitive load rather than decrease it. Our framework suggests that the best use cases will be in contexts where information is normally expensive or dangerous to acquire, highlighting the value of VR, or where the environment is so complex that digital overlays that clarify it via AR are highly valuable — or both.
Think of applications like prototyping the design of a new aircraft or building, or assisting in remote medical procedures. The Vision Pro has the technical capability to do each of these things, but the job of experimenting and designing for these use cases has been left to others. Developers looking to profit from the platform Apple has created would do well to focus on applications that provide users with hard-to-access contextual information at just the right level of detail.
This is par for the course for Apple when they first introduce a device. The iPod was a digital Walkman. The iPhone was a connected iPod. The iPad was a bigger iPhone. The Apple Watch was a better smartwatch. And the Vision Pro is an unconstrained 3D screen. In each previous case, the device outgrew that initial use as developer innovation took it further. The Vision Pro is a welcome new experiment along a well-trodden path in computing.