Apple Vision Pro – Between Tom Cruise and M.C. Hammer

Date: February 2, 2024 | Author: Iga Hupalo

In an iconic scene from the 2002 movie “Minority Report,” John Anderton (Tom Cruise) used special gloves to manipulate images and videos displayed on a screen and extract the face of a person who was going to commit a crime – or a pre-crime, as it is dubbed in the movie and in Philip K. Dick’s short story, on which the movie was based. The scene, featuring teleconferencing and gesture-controlled devices, was shown years before the introduction of Skype (2003) and the iPhone (2007).

Gesture-based control of software had already appeared in games, notably Black & White (2001) and Sacrifice (2000), yet the technology had not seen professional use or widespread popularity.

With the introduction of Apple Vision Pro, augmented-reality manipulation and gesture control are becoming a practical reality, this time with the power of Apple behind them.

Enter Vision Pro

February 2nd marked the premiere of Vision Pro, the debut of a new product line Apple announced during its latest annual Worldwide Developers Conference. This is not the first attempt to advance spatial computing and integrate sophisticated software into a three-dimensional environment – predecessors include Google Glass, Microsoft HoloLens 2, and Lenovo ThinkReality A3, all adopted chiefly by professionals – but what sets the Apple headset apart is its highly developed technical specifications and sophisticated user input system, including hand gesture recognition.

However, what truly distinguishes Apple Vision Pro is its seamless integration into the broader Apple ecosystem. The ability of its software to effortlessly collaborate with other Apple devices such as iPhones, MacBooks, and Apple Watches could elevate spatial computing to new heights, potentially transforming how we interact with technology – much as the iPhone redefined the smartphone when it launched in 2007.

The device’s initial release comes with a notably high price tag, and availability is still limited. As it becomes more accessible, however, the technology is widely expected to develop rapidly, with more affordable iterations arriving in the future.

visionOS – hand gestures and (no) touch screens

Apple Vision Pro operates on a derivative of iOS and iPadOS, known as visionOS. This new operating system enables the launch of spatial applications, offering users an immersive experience. It allows for the transition of familiar experiences, such as web browsing and movie watching, into a three-dimensional environment that surrounds the user. 

Unlike traditional devices with limited display space, applications on visionOS can expand into multiple windows positioned around the user. This approach changes how users consume and interact with content, enhancing ergonomics and expanding possibilities in professional and specialized settings as well as day-to-day activities.

The device introduces diverse input options, including simple hand gestures and voice commands, aiming not to constrain users but rather to extend their reality. This emphasis on varied interaction methods enhances the overall user experience.

Reusing components 

Developing applications for a new device demands adapting to its unique requirements. This process is surprisingly accessible, however, allowing existing applications to be adapted – or new solutions created – at a relatively low cost. Apple provides tools and guides that let developers create software even without a physical headset, streamlining the development process. The same frameworks used to build mobile applications for iOS and iPadOS are also used to build applications for visionOS.

Despite the distinct appearance of the user interface layer, it is built from familiar elements such as lists, buttons, and navigation. The gestures commonly used on flat touch screens translate naturally to the spatial environment, with simple equivalents of tap or swipe actions.
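As a minimal sketch (the view and its content are hypothetical), the screen below uses nothing but standard SwiftUI components and compiles unchanged for iOS, iPadOS, and visionOS; on the headset, looking at a row and pinching triggers the same action a tap does on a phone:

```swift
import SwiftUI

// A hypothetical screen built only from standard SwiftUI components.
// The same code runs on iOS, iPadOS, and visionOS without modification.
struct DeviceListView: View {
    @State private var devices = ["iPhone", "iPad", "Apple Watch"]

    var body: some View {
        NavigationStack {
            List(devices, id: \.self) { device in
                Button(device) {
                    // On visionOS this fires when the user looks at the row and pinches.
                    print("Selected \(device)")
                }
            }
            .navigationTitle("Devices")
        }
    }
}
```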

In this post, we aim to showcase how existing resources can be utilized to deliver immersive solutions for this upcoming device, which is expected to redefine the way end-users engage with technology.

Right from its launch, Apple Vision Pro makes all compatible iPhone and iPad applications automatically available, offering immediate access to existing content unless the app owner explicitly opts out in App Store Connect. Because the headset’s operating system is largely built on the technologies of iOS and iPadOS, many applications run on the device without requiring modifications from developers. However, certain Apple Vision Pro-compatible apps may need adjustments due to hardware differences – including the most significant one:

You Can’t Touch This

It’s crucial to note that applications developed for iOS and iPadOS, unless appropriately adjusted, keep their original appearance as rectangular windows placed in the user’s surroundings. Although these apps may be fully functional, they are not necessarily ergonomically optimized for the spatial environment, which can lead to issues such as unscaled vertical windows, illegible font sizes, or buttons that are too small – all of which hurt the overall user experience.

While most functionalities transfer seamlessly, some features – such as those relying on location services, photo and video capture, or integration with the Health app – may need to be modified or disabled due to hardware incompatibility. The Apple Vision Pro documentation provides comprehensive guidelines on preparing existing applications for the new platform, highlighting features that may require special attention, including in-app purchases and sensor-related functionality.

Transforming an existing application to suit a spatial experience may require only minimal effort. Developers can expand the source code with target-specific implementations that explicitly describe behavior and appearance per platform, and conditional statements can define separate code paths so the app works seamlessly on both mobile and spatial systems, as sketched below. This adjustment allows an optimized user experience on Apple Vision Pro while preserving compatibility with mobile devices.
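A rough illustration of that conditional approach (the panel views are made up for the example) branches on the platform at compile time, keeping the compact phone layout while giving the headset a roomier arrangement:

```swift
import SwiftUI

// Placeholder panels standing in for real app content.
struct ChartsPanel: View { var body: some View { Text("Charts") } }
struct DetailsPanel: View { var body: some View { Text("Details") } }

struct DashboardView: View {
    var body: some View {
        #if os(visionOS)
        // Spatial layout: spread the panels out, since window space is generous.
        HStack(spacing: 40) {
            ChartsPanel()
            DetailsPanel()
        }
        .padding(60)
        #else
        // Mobile layout: stack the panels vertically on the smaller screen.
        ScrollView {
            VStack(spacing: 16) {
                ChartsPanel()
                DetailsPanel()
            }
            .padding()
        }
        #endif
    }
}
```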

To unlock the full potential of spatial computing for an app, Apple has provided detailed guidelines in its WWDC sessions. They offer insights into building applications that take maximum advantage of the headset: Apple elaborates extensively on its interpretation of the mixed-reality environment and explains how to create software that makes full use of the hardware’s possibilities.

Enter the Space – how to build apps for Apple Vision Pro

Upon launching the headset, users enter the Shared Space, which functions much like a personal computer’s desktop but is not constrained by screen size. It serves as a limitless canvas where any number of applications can run side by side. Virtual app content can be displayed as windows, as volumes containing 3D models, or as a fully immersive environment enveloping the user, and a single application can freely mix several of these elements.
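In SwiftUI terms, these presentation styles correspond to scene types. A minimal sketch (the identifiers and placeholder views are made up) might declare a regular window, a volumetric window for 3D content, and a fully immersive space:

```swift
import SwiftUI

@main
struct SpatialDemoApp: App {
    var body: some Scene {
        // A classic two-dimensional window shown in the Shared Space.
        WindowGroup(id: "main") {
            ContentView()
        }

        // A volume: a bounded 3D region the user can view from different angles.
        WindowGroup(id: "globe") {
            GlobeView()
        }
        .windowStyle(.volumetric)

        // A fully immersive environment that surrounds the user.
        ImmersiveSpace(id: "immersive") {
            ImmersiveView()
        }
    }
}

// Placeholder views standing in for real content.
struct ContentView: View { var body: some View { Text("Hello, Shared Space") } }
struct GlobeView: View { var body: some View { Text("3D content goes here") } }
struct ImmersiveView: View { var body: some View { Text("Immersive scene") } }
```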

How does a windowed spatial application purposefully designed for Apple Vision Pro differ from an iPad application that has simply been transitioned to visionOS? Certain UI and UX aspects can significantly enhance the app, ensuring it functions seamlessly in spatially computed surroundings while elevating ergonomics and usability.

With a dedicated focus on developing a mixed-reality headset, Apple prioritizes applications that integrate with users’ surroundings. When users are engaged in applications that are not fully immersive, it is highly advantageous for the application’s interface not to be fully opaque. Instead of solid colors for app backgrounds, which obstruct the user’s view, Apple recommends the newly introduced components that resemble frosted glass.

Sticking to reality

This design trait ensures that the app’s contents remain fully readable, while the added transparency keeps users aware of their surroundings at all times. The absence of this pass-through-friendly design becomes evident in transitioned mobile apps, whose layouts are often based on solid colors and nontransparent forms. The use of transparency contributes to a more immersive and user-friendly experience in the mixed-reality environment.
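In SwiftUI this frosted-glass look is available as a background material; in a minimal sketch (the card’s content is invented), an opaque Color background is simply replaced with it:

```swift
import SwiftUI

struct WeatherCard: View {
    var body: some View {
        VStack(spacing: 12) {
            Text("Warsaw")
                .font(.title)
            Text("21°C, partly cloudy")
        }
        .padding(40)
        // Instead of an opaque color, use the translucent glass material
        // so the user's real surroundings stay visible behind the app.
        .glassBackgroundEffect()
    }
}
```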

Window applications designed specifically for visionOS stand out thanks to the addition of a third dimension: depth. Interface components such as modal views are constructed as distinct layers positioned closer to the user. This use of depth enhances clarity within the three-dimensional environment and contributes to a more visually engaging experience, going beyond the traditional flat interfaces known from smartphones.
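A rough sketch of that layering (the content is hypothetical) uses the z-axis offset modifier that visionOS adds to SwiftUI to lift a confirmation layer slightly toward the user:

```swift
import SwiftUI

struct LayeredView: View {
    @State private var showConfirmation = true

    var body: some View {
        ZStack {
            Text("Document content")
                .padding(80)
                .glassBackgroundEffect()

            if showConfirmation {
                Text("Delete this document?")
                    .padding(30)
                    .glassBackgroundEffect()
                    // Bring the modal layer closer to the user along the z-axis.
                    .offset(z: 60)
            }
        }
    }
}
```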

To ensure ease of use, every interactive element of the user interface must be adjusted for the spatial computing environment. The primary interaction with the system relies on a combination of eye movement and hand gestures. 

Users look at the interface element they wish to interact with, such as a button, and then tap two fingers together to execute the desired action. The specialized cameras located at the bottom of the headset detect this finger gesture. This design allows users to rest their hands calmly on their desks, laps, or any other surface, eliminating the need for mid-air hand movements.

The app interface needs proper preparation so that the user experience remains at the highest level with this kind of input. Crucial elements include a hover effect that reflects the user’s eye focus and feedback actions, such as sound effects, that reflect ongoing interactions. Interface elements also need appropriate sizing: buttons, for example, have a specified minimum size so they do not become too small to interact with comfortably.
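In practice that means adding hover feedback and keeping tap targets generously sized. The sketch below illustrates the idea; the 60-point dimension follows Apple’s commonly cited minimum target size for visionOS, but treat the exact values and the view itself as assumptions for the example:

```swift
import SwiftUI

struct PlaybackControls: View {
    var body: some View {
        HStack(spacing: 24) {
            controlButton(systemImage: "backward.fill")
            controlButton(systemImage: "play.fill")
            controlButton(systemImage: "forward.fill")
        }
        .padding()
        .glassBackgroundEffect()
    }

    private func controlButton(systemImage: String) -> some View {
        Button {
            // Playback action would go here.
        } label: {
            Image(systemName: systemImage)
                // Keep the tappable area comfortably large for look-and-pinch input.
                .frame(minWidth: 60, minHeight: 60)
        }
        .buttonStyle(.borderless)
        // Highlight the control when the user's eyes rest on it.
        .hoverEffect(.highlight)
    }
}
```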

Augmented Reality Apple Vision Pro resources

Apple’s documentation outlines in depth the key areas developers should focus on when building fully optimized visionOS applications. Moreover, Apple has released design templates as Figma resources, offering a visual guide to the best approaches in spatial design.

visionOS simulator – no need for a headset

At this point, it may appear crucial to have a headset to initiate the development of spatial software or to adapt existing apps. Working in a multi-dimensional environment seems to require specific hardware. However, Apple equips developers with functional tools to enable them to work on software even without direct access to a headset. In particular, window-based applications can be easily created and tested using the visionOS simulator integrated into the Xcode environment.

The visionOS simulator allows applications to be launched in a virtual space. It provides three simulated spaces, each in two different light modes. Within the simulator, the elements of the virtual space are mapped to resemble real-world settings, including floors, walls, and furniture. This lets developers visualize how the app would appear on the device, including testing how the external environment influences the app’s layout, readability, or acoustics.

By utilizing the Mac’s pointer and keyboard, it is possible to reposition a viewpoint within a visionOS simulator window. Pressing keyboard keys enables a developer to “walk” around the simulated space, creating a game-like experience and offering enhanced control over the virtual environment.

One distinctive feature of Apple Vision Pro lies in the unique way users interact with the software, through eye movement and hand gestures. The simulator translates these standard interaction methods into pointer and keyboard input, summarized in the table below, which allows developers to test functionality based on standardized user actions while developing an app.

| Gesture | Apple Vision Pro | visionOS Simulator |
| --- | --- | --- |
| Tap | Tap index finger and thumb together | Click |
| Double-tap | Tap index finger and thumb together twice | Double-click |
| Touch and hold | Tap fingers and hold (used, for instance, to highlight text) | Click and hold |
| Swipe / Drag (left, right, up, and down) | Tap fingers and move the hand | Drag left, right, up, or down |
| Drag (forward and back) | Tap fingers and move the hand | Shift-drag up and down |
| Zoom | With two hands, tap fingers and drag together or apart | Hold the Option key to display touch points; move the pointer while pressing Option to change the distance between them, or hold Shift and Option to reposition them |
| Rotate | With two hands, tap fingers, and rotate in a circular motion | Hold the Option key to display touch points; move the pointer while pressing Option to change the distance between them, or hold Shift and Option to reposition them |

The SwiftUI framework is employed to harness the full range of hardware capabilities when developing spatial applications. Originally introduced in 2019 and used extensively across Apple platforms such as iOS, watchOS, and macOS, SwiftUI has been expanded with functionality that facilitates the implementation of three-dimensional applications.

Therefore, when creating a new app or adapting existing mobile apps for visionOS using SwiftUI, there is no requirement to introduce external add-ons. The library has been enhanced with features specifically designed to meet the demands of spatial computing.

The SwiftUI methods for handling user interaction have been extended to accommodate multidimensional input. Furthermore, visionOS app interfaces employ the same elements used in iOS and iPadOS applications, with the interface components provided by SwiftUI adjusted to the requirements of the spatial environment. For that reason, building interfaces with this library, especially for window-based applications, is not markedly different from creating mobile apps.
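As one example of those extensions, the gesture API gains spatial variants. The sketch below (the view and its handler are made up, and it assumes visionOS, where the gesture value carries a 3D location) reads where in space a pinch landed using SpatialTapGesture:

```swift
import SwiftUI
import Spatial

struct TapTargetView: View {
    @State private var lastTap: Point3D?

    var body: some View {
        Text(label)
            .padding(40)
            .glassBackgroundEffect()
            .gesture(
                SpatialTapGesture()
                    .onEnded { value in
                        // On visionOS the gesture value exposes a 3D location.
                        lastTap = value.location3D
                    }
            )
    }

    private var label: String {
        guard let tap = lastTap else { return "Pinch while looking at me" }
        return "Tapped at x \(Int(tap.x)), y \(Int(tap.y)), z \(Int(tap.z))"
    }
}
```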

Summary – The Imperfect World

While the introduction of visionOS with Apple Vision Pro is a significant leap in spatial computing, there are notable limitations that potential users and developers must consider. Firstly, the physical device is currently challenging to acquire due to limited availability, which could hinder immediate widespread adoption.

While Apple provides a simulator for developers to validate and develop applications, it’s important to note that this simulator might not fully replicate all the features and nuances of the actual hardware. Certain aspects, especially those relying on the advanced sensory capabilities of the physical device, could be difficult or impossible to simulate accurately. 

Additionally, the high price point of the first version of the Apple Vision Pro could be a barrier for many consumers and small-scale developers, making it less accessible to a broader audience at this stage. 

Also, the device’s dependency on a power source could be a concern, especially if it suffers from limited battery life, impacting its utility for extended use. The weight and comfort of the headset are also crucial, as any discomfort in wearing it for long periods could detract from the user experience. 

Finally, content availability for visionOS is essential for the platform’s success; a limited range of visionOS-enhanced apps and experiences at launch might slow down user adoption and engagement. However, this scarcity of content also presents a unique opportunity for developers to create and showcase innovative applications, potentially gaining early recognition in an emerging market.

These limitations collectively highlight the early-stage challenges of this pioneering technology.