
Thursday, November 10, 2016

W3C VR Workshop - Building a WebVR Content Pipeline

On October 19th and 20th the W3C held its very first VR Workshop. It was our first chance to look back at what we had accomplished with the early revisions of the WebVR Specification, which was working its way through a W3C Community Group, while also looking forward to the potentially large set of missing, future standards we had yet to write.

During the workshop I presented on a few topics, but mostly on accessibility of the Web in VR and on performance. In the following few posts I'll be taking one of my lightning talks and expanding on it, line by line. I considered recording the lightning talks instead, taking my time and hitting all of the points, but it seems better to get this into writing than to force everyone to put up with me 5-10 minutes at a time through a YouTube video!

My first performance talk was on the WebVR Content Pipeline. This is an area I'm supremely passionate about because it is wide open right now, with so much potential for improvement. If we look at the complicated, multi-tool build pipelines that exist in the games industry and use them as an indication of what is to come, that is a glimpse into where my thinking is headed. If you want to see the original slides you can find them here. Otherwise, continue reading; I'll cover everything there anyway.

The WebVR Content Pipeline Challenge


Not an official "fix this" challenge, but rather a challenging area to be fixed. I started this slide by reflecting on the foreign nature of build pipelines on the web. Web content has traditionally been a cobbled-together mess of documents, scripts and graphics resources, all copy-deployed to a remote site. In just the past few years, maybe 5 at most, the concept of using transpilers, optimizers and resource packers has become almost commonplace for web developers. Whereas from 2000-2010 you might have been rated on your CSS mastery and skillset, from 2010 onward we started talking more about toolchains. This is good news, because it means some of the infrastructure is already present to enable our future build environments for WebVR.

My next reflection was on the complicated nature of VR content. You have a lot of interrelated components, connected via linking or even programmatically through code dependencies. This comes across as meshes, textures, skins, shaders, animations and many other concepts. We also have a lot of experience in this area, coming from the games industry, so we know what kinds of tools are required. Unfortunately, even in games, the solutions to many of the content pipeline issues are highly specific to a given programming language, run-time or underlying graphics API. Hardware requirements make it even more custom.

My final reflection was on the graphic below. This is a content pipeline example where you start with assets on the left and they get processed into various formats before they finally get loaded on the client. The top of the graph represents what a common VR or Game pipeline might look like with many rounds of optimization and packaging. On the bottom of the graph we see what an initial web solution would be (or rather what it would lack). The difference in overall efficiency and sophistication is pretty clear. The web will need some more tools, distribution mechanisms and packaging formats if it wants to transmit VR efficiently, while retaining all of the amazing properties of deploying to the web.


Developer - Build, Optimize, Package


The graphic shows three major stages in the pipeline. First we start with a developer who is trying to create an experience. What steps in the pipeline can they perform before sending things on to the server? How much optimization or knowledge of the target device can be had at this level? Remember, deploying to the web is not the same as deploying to an application store, where you know almost exactly what your target device will be capable of. While fragmentation in the phone market does mean some progressive enhancement might be necessary, the web effectively guarantees it.

My first reflection is that existing build technologies for the web are heavily focused on solving problems with large 2D websites. These tools care mostly about the resources we currently see scaling faster than the rest, which mostly means script and images. Some of the leading tools in this space are webpack and browserify. Since some set of tools does exist, plugins are a potential short-term solution.
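Plugins for these existing bundlers are one way to get 3D assets into today's 2D-focused pipeline. The sketch below shows the shape of a webpack configuration that routes glTF files through an optimizing loader; note that "gltf-pipeline-loader" is a hypothetical name used purely for illustration, not a published package.

```javascript
// Hypothetical webpack.config.js sketch. "gltf-pipeline-loader" is an
// assumption standing in for whatever plugin would optimize 3D assets;
// url-loader is a real loader commonly used for small images.
const config = {
  entry: './src/scene.js',
  output: { filename: 'bundle.js' },
  module: {
    rules: [
      // Run 3D assets through an optimizing loader, the same way
      // babel-loader transpiles script today.
      { test: /\.gltf$/, use: ['gltf-pipeline-loader'] },
      { test: /\.(png|jpe?g)$/, use: ['url-loader'] },
    ],
  },
};
// module.exports = config; // <- what webpack itself would consume
```

The point is not the specific loaders but that the plugin seams already exist; 3D content can slot into them without inventing a new build system.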

My second reflection on this slide was that the game industry solution of packaging was also likely not the right solution. It breaks two principles of the web that we like. The first is that there is no installation required. Experiences are transient as you navigate from site to site. Even though these sites are presenting you a full application, they aren't asking you for permission to install and take up permanent space on your disk. Instead the browser cache manages them. If they want more capability, then they have to ask for it. This might come in the form of specialized device access or the ability to store more persistent data using IndexedDB or Service Workers. The second principle that packaging breaks is that of iterative development. We like our ability to change something and have it immediately available when we hit F5.

My third reflection is around leveraging existing tools. There are many tools for 3D content creation, optimization and transformation. Most of these tools need to be repackaged with the web in mind. Maybe they need to support more web-accessible formats, or they need to come with a library that allows them to be used efficiently. In the future I'll be talking about SDF fonts and how to optimize those for the web. You may be surprised to find that traditional solutions for game engines aren't nearly as good for the web as they are for the traditional packaging model.

Another option is for these tools to have proper export options for the web. Embedded meta-data that is invisible to your user but still consumes their bandwidth has been a recent focus of web influentials like Eric Lawrence. Adobe has supported export for the web for years, but often people wouldn't use the option and would instead ship an image that was 99% meta-data. Many 3D creation tools have never had the web in mind, so they often emit this extra data as well, or export in uncompressed or unoptimized formats, closer to raw. Upgrading these tools to target WebP, JPEG and PNG as image output formats, or any of the common texture compression formats supported by WebGL, would be a great start.

My final reflection for this slide was on glTF. I think this format could be the much-needed common ground for sharing and transferring 3D content along with all of its dependencies. Its well-structured format means that many tools could use it as both an import and export target. Optimization tools will find it easy to consume and rewrite. Finally, on the client side the format is JavaScript friendly, so you can transform, explore and render it however you want. I'll be keeping a close eye on this format and contributing to the GitHub repository from time to time. I encourage you to check it out.
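Since a glTF document's top level is plain JSON, exploring it client-side is ordinary object traversal. This sketch uses a minimal hand-written asset (not a complete, spec-exhaustive one) to show how a tool might list the external files a scene depends on:

```javascript
// Minimal, hand-written glTF-shaped asset for illustration only.
const gltf = {
  nodes: [{ name: 'room', mesh: 0 }, { name: 'chair', mesh: 1 }],
  meshes: [{ name: 'roomMesh' }, { name: 'chairMesh' }],
  buffers: [{ uri: 'scene.bin', byteLength: 1024 }],
  images: [{ uri: 'brick.png' }],
};

// Collect every external file the asset depends on -- exactly the kind of
// inspection pass an optimization or prefetching tool would perform.
function externalUris(doc) {
  return [...(doc.buffers || []), ...(doc.images || [])]
    .map((item) => item.uri)
    .filter(Boolean);
}

console.log(externalUris(gltf)); // -> ['scene.bin', 'brick.png']
```

That same traversal works unchanged for import, rewrite, or render-time exploration, which is what makes the format such a good common ground.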

Server - CDNs, CORS, Beyond URLs


Our second leap in the build pipeline is the server. While you can precompile everything possible and then deploy it, there will be cases where this simply isn't feasible. We should rely on the server to ingest and optimize content as well, perhaps based on usage and access. It's also a tiered approach for the developer, who might be working on the least powerful machine in their architecture. Having the server offload the optimization work means that iteration can be done much more quickly.

For the server, my first reflection is that we need smarter servers and CDNs that offload many types of optimization from the developer's machine and build environment into the cloud where it belongs. As an example, most developers don't produce 15 different streaming formats for their video today and upload each individually. We instead rely on our video distribution servers to cut apart the video, optimize, compress, resample and otherwise deliver the right content to the right place without us having to think about the myriad devices connecting to us. For VR the equivalent would be the creation of power-of-2 textures, precomputing high quality mipmaps, or even doing more obscure optimizations based on the requesting device.
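As a concrete example of a transform such a server could own, here is a sketch of the size calculation behind power-of-2 texture generation. The actual resampling would be handled by an image library; the dimensions below are purely illustrative.

```javascript
// Snap a dimension up to the next power of two so the client can mipmap it.
function nextPowerOfTwo(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p;
}

// A server-side ingest step would compute the target size per texture,
// then hand off to an image library for the actual resample.
function targetTextureSize(width, height) {
  return { width: nextPowerOfTwo(width), height: nextPowerOfTwo(height) };
}

console.log(targetTextureSize(1000, 600)); // -> { width: 1024, height: 1024 }
```

Doing this once at ingest, rather than in every developer's build script, is exactly the video-pipeline model applied to VR assets.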

For device specific tuning we can look towards compression. Each device will have a set of texture compression extensions that it supports and not all devices support every possible compression type. Computing and delivering these via the server can allow for current and future device adaptation without the developer having to think about redeploying their entire project with new formats.
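That device adaptation can be sketched as a selection policy. In a browser the supported list would come from gl.getSupportedExtensions(); it is passed in here so the logic stands alone, and the preference order is an illustrative assumption rather than a canonical ranking.

```javascript
// Real WebGL extension names; the ordering is an assumption for this sketch.
const PREFERENCE = [
  'WEBGL_compressed_texture_astc',
  'WEBGL_compressed_texture_etc1',
  'WEBGL_compressed_texture_s3tc',
];

// Given what the requesting device reports, pick the best format the
// server should deliver.
function pickTextureFormat(supportedExtensions) {
  const found = PREFERENCE.find((ext) => supportedExtensions.includes(ext));
  return found || 'png'; // fall back to an uncompressed web format
}

// A desktop GPU typically exposes S3TC; many mobile GPUs expose ETC1.
console.log(pickTextureFormat(['WEBGL_compressed_texture_s3tc'])); // -> 'WEBGL_compressed_texture_s3tc'
console.log(pickTextureFormat([])); // -> 'png'
```

A server running this per-request is what lets new devices get new formats without the developer redeploying anything.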

The second reflection is on open worlds. For this we need high quality, highly available content. There are a large number of content pieces that will be pretty common/uniform. While you could generate all possible cubes, spherical maps and other common shapes on the client, you can also just make them widely available in a common interchange format and available over CORS.

For those not familiar, CORS stands for Cross-Origin Resource Sharing, and it is a way to get access to content on other domains and be able to inspect it fully. An example would be an image hosted on a server that happens to contain your password; it would not be served via CORS. While you could retrieve that image and display it in the browser, you would not be able to read its pixels or use it with WebGL. On the other hand, if you had another image which was a simple brick texture, you might want it usable on any site in WebGL. For this you would return the resource with CORS headers from the server, and this would allow anyone to request and use the texture without worry of information leakage.
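Both halves of that handshake can be sketched briefly. The server opts the public texture in with a CORS header (the wildcard origin is only appropriate for genuinely public assets), and the browser must request the image in CORS mode before WebGL will read its pixels. The texture host URL below is hypothetical.

```javascript
// Server side: headers that opt a resource into cross-origin use.
function publicAssetHeaders(contentType) {
  return {
    'Content-Type': contentType,
    'Access-Control-Allow-Origin': '*', // public assets only, like the brick texture
  };
}

// Client side (browser only): the image must be requested in CORS mode or
// WebGL will refuse to read it even when the server sent the header.
//
//   const img = new Image();
//   img.crossOrigin = 'anonymous';
//   img.src = 'https://textures.example.com/brick.png'; // hypothetical host
//   img.onload = () => { /* gl.texImage2D(...) is now allowed */ };

console.log(publicAssetHeaders('image/png')['Access-Control-Allow-Origin']); // -> '*'
```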

I suspect a huge repository of millions or even billions of objects to be available in this way within the next few years. If you are working on such a thing, reach out to me and let's talk!

My last real reflection is that accessing content via URL is inefficient. It doesn't allow the distribution of the same resource from different locations with the caching we need to make the VR web work. We need efficient interchange of similar and reusable resources across large numbers of sites, without putting the burden on a single CDN to deliver them just so that caching works as it exists today in browsers. There are some interesting standards proposed that could solve this, or maybe we need a new one.

Not even a reflection, but rather a note that HTTP/2 pushing content down the pipe as browsers request specific resources will be key. Enlightening the stack to glTF, for instance, to allow server push of any external textures would be a huge win. But this comes with a lot of need for hinting: am I requesting the glTF to learn the cost of an object, or to render it? I will try to dedicate a future post to just HTTP/2, content analysis on the server, and things we might build to make a future VR serving stack world- and position-aware. I don't think these concepts are new; they are probably in heavy use in some modern MMORPG games. If you happen to be an expert at some company and want to have a conversation on the subject, I would love to pick your brain!

Client - Progressive Enhancement


The last stage in our pipeline is the client. This is where progressive enhancement starts to be our bread-and-butter technique. Our target environment is underpowered; at the high end of that spectrum are LTE-enabled mobile phones. Think Gear VR, Daydream and even Cardboard. Listening to Clay Bavor talk about the future, it is clear Google is still behind Cardboard; there are millions of units out there already, with many more millions to come. Many people have their first and only experiences in VR on Cardboard.

Installation of content is no longer a crutch we can rely on. The VR Web is a no-install, no-long-download, no-wait environment. Nate Mitchell alluded to this in his OC3 talk when he announced that Oculus would be focusing on some key projects to help move the VR Web forward. I still consider Oculus a team that delivers the pinnacle of VR, so taking on the challenge of adapting the best VR experiences possible to this rather harsh set of mobile and web requirements is pretty epic. That is what the rest of this slide covers.

My first reflection after noting the requirements is on progressive texture loading, with fallback all the way to vertex colors when textures aren't available yet. The goal of VR is to get people into an immersive environment as fast as possible; a long loading screen breaks this immersion. The power of the web is the ability to go from site to site without breaking your immersion the way you do today when navigating between applications (or when you have to download a new application to serve a purpose you just discovered). We also aspire to have VR-to-VR jumps work like they do in sci-fi literature or in movies: a beautifully transparent experience as you switch from world to world.

We can achieve this with good design, and since the VR Web is just starting we have the opportunity to design it right. My only contribution for now: load your geometry with simple vertex colors, follow up with lightweight textures and finally, once the full texture is loaded and ready, upgrade to the highest quality experience. But don't block the user or significantly lower your framerate to jump to that highest quality if it would be disruptive. This will require some great libraries and quite a bit of experimentation to find all of the best practices. Expect more from me on these subjects in future articles as well.
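The upgrade path above can be sketched as a tiny state machine. The frame-budget threshold is an assumption for illustration; real numbers would come from profiling on the target headset.

```javascript
// The three visual fidelity stages, in upgrade order.
const STAGES = ['vertex-colors', 'lightweight-texture', 'full-texture'];

function nextStage(current, { textureReady, frameMillis }) {
  const i = STAGES.indexOf(current);
  if (i < 0 || i === STAGES.length - 1) return current;
  // Only upgrade when the next asset is ready AND there is frame budget to
  // spare, so the swap itself doesn't cause a disruptive hitch.
  // (~11ms is the 90fps budget; 10 is an illustrative safety margin.)
  const hasBudget = frameMillis < 10;
  return textureReady && hasBudget ? STAGES[i + 1] : current;
}

console.log(nextStage('vertex-colors', { textureReady: true, frameMillis: 6 })); // -> 'lightweight-texture'
console.log(nextStage('vertex-colors', { textureReady: true, frameMillis: 14 })); // -> 'vertex-colors'
```

The key design choice is that asset readiness alone never triggers the swap; the render loop has to confirm it can afford it.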

My second reflection is on the importance of Service Workers. A feature designed to make the web work offline can also be a powerful catalyst in helping the VR Web instantly load and become fully progressive. The features I think are key in the near term are the ability to implement some of the prediction and prefetching for things like glTF resources. As the Service Worker intercedes, it can fire off the requisite requests for all of the related textures and cache them for later. We can also build the progressive texture loading into the Service Worker and have it optimize across many different variables to deliver the best experience. It's basically like having the server of the future that we all want, but on the client and under our control.
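A sketch of that prefetching idea: when the Service Worker sees a glTF request, it derives the sibling resources the file references and warms the cache. The URL resolution below handles only same-directory relative URIs, and the Service Worker wiring is shown as comments because it only runs in a browser.

```javascript
// Derive the texture/buffer URLs a glTF file will need next.
// Simplified: resolves only same-directory relative URIs.
function relatedResources(gltfUrl, gltfJson) {
  const base = gltfUrl.slice(0, gltfUrl.lastIndexOf('/') + 1);
  const deps = [...(gltfJson.images || []), ...(gltfJson.buffers || [])];
  return deps.map((d) => base + d.uri);
}

// Browser-only wiring, shown for shape; this runs inside a service worker:
//
//   self.addEventListener('fetch', (event) => {
//     if (!event.request.url.endsWith('.gltf')) return;
//     event.respondWith(
//       fetch(event.request).then(async (response) => {
//         const json = await response.clone().json();
//         const cache = await caches.open('vr-prefetch');
//         cache.addAll(relatedResources(event.request.url, json));
//         return response;
//       })
//     );
//   });

console.log(relatedResources('https://example.com/scene/room.gltf', {
  images: [{ uri: 'wall.png' }],
  buffers: [{ uri: 'room.bin' }],
})); // -> ['https://example.com/scene/wall.png', 'https://example.com/scene/room.bin']
```

By the time the renderer asks for the textures, they are already local, which is most of the "instant load" story.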

Another feature of the Service Worker is understanding the entire world available to the experience and optimizing based on the current location and orientation information. This information could also fit into the future VR server, but we can test and validate the concepts in the Service Worker long before then.

My last reflection on the client is that there are some well-defined app types for which we could deliver the best possible experiences using common code available to everyone, perhaps augmented by capabilities in the browser. This is highly controversial, since everyone points out that while there is indeed a base experience, it is insufficient and customization is a must. I disagree, at least for now. It's like arguing against default transport controls in the HTML5 video player. Why? 99% of web developers will use them to provide the 20% of scenarios in the long tail of the web. Sure, there is a 1% that develops top sites and accounts for 80% of the experiences, and they'll surely want to add their own customization, but they are also in the best position to create the world-class optimization and delivery services needed to make that happen.

To this end, I think it's important that we build some best-in-class models for 360 photos and videos and optimize them into the browser core while VR is still standing up. These may only last for a couple of years, but these are the formative bootstrapping years, where we need these experiences to amaze and entice people to buy into a VR Web that doesn't quite exist yet.

Bonus Slides


I won't go into details on these. They are more conversation starters, and major areas where I have some internal documentation started on what we might propose from the Oculus point of view. I'll instead list them here with a sentence describing some basic thoughts.

  1. Web Application Manifests - If these are useful for describing web pages as applications they can be used to describe additional 3D meta-data as well.
  2. Texture/Image Optimization - Decoding image types, compressing textures on the device, etc... Maybe the realm of Web Assembly, but who knows.
  3. glTF - Server-enlightened data type with HTTP/2 server push, order optimized, LOD optimized and fully predicted based on position and orientation.
  4. Position Aware Cube Maps - Load only visible faces, with proper LOD, perhaps even data-uri encode for the initial scene load.
  5. Meta Tags/HTTP Headers for Content Hinting - Initially this was only position and orientation so that the server could optimize but has since grown.
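Bonus item 4 can be sketched with a coarse visibility test: from the center of a cube map with a roughly 90-degree field of view, the face directly behind the viewer can never contribute pixels, so a position- and orientation-aware server could skip sending it. The dot-product test below is deliberately crude and for illustration only.

```javascript
// Outward axis for each cube map face.
const FACES = {
  px: [1, 0, 0], nx: [-1, 0, 0],
  py: [0, 1, 0], ny: [0, -1, 0],
  pz: [0, 0, 1], nz: [0, 0, -1],
};

// Keep every face except those pointing away from the entire view; only
// the face directly opposite the view direction is provably skippable
// with this coarse test.
function visibleFaces(viewDir) {
  return Object.keys(FACES).filter((name) => {
    const f = FACES[name];
    const dot = f[0] * viewDir[0] + f[1] * viewDir[1] + f[2] * viewDir[2];
    return dot >= 0;
  });
}

// Looking down +x: five faces might contribute pixels; -x never can.
console.log(visibleFaces([1, 0, 0])); // -> ['px', 'py', 'ny', 'pz', 'nz']
```

A real implementation would test against the actual frustum and per-face LOD, but even this crude cut saves one face of bandwidth on the initial scene load.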


What's Next?


If you find things here interesting you can contact me now at justrog@oculus.com and we can talk about what opportunities might exist. I'll also be reaching out to you as I find and discover experts. I like to constantly grow my understanding of the state of the art and the challenges that exist. It's my primary reason for joining Oculus: so that I could focus on VR problems all day, every day!

I have 3 more talks that I delivered at the conference, each being prepared as a blog post similar to this one. I'll finish the perf series first, and then I'll end with my rather odd talk on Browser UX in VR. Why was it odd? Stick around and I'll tell you all about it at the beginning of that post.

Saturday, October 8, 2016

Progressive Enhancement for the VR Web

Modern VR developers are wizards of the platform. Andrew Mo commented that they are the pioneers, the ones who survived dysentery on the Oregon Trail. It was meant as a joke, but every well timed joke carries more weight when it reflects a bit of the reality and gravity of the situation. Modern VR developers really are thriving in the ecosystem against all odds in tinier markets than their Mobile app and gaming counterparts while meeting a performance curve that requires deep knowledge of every limitation in the computing platform.

John Carmack, in his keynote, said that the challenge present in Mobile VR development is like dropping everyone a level. The AAA developers become indie devs, the indie devs are hobbyists and the hobbyists have just failed.

If VR is dominated by these early pioneers then where does the web fit in? These VR pioneering teams aren't web engineers. They don't know JavaScript. While WebGL seems familiar due to years of working with OpenGL, the language, performance and build/packaging/deployment characteristics are all quite different from those of a VR application in the store. Many new skills have to be employed to be successful on the VR Web.

There is a shining light in the dark here. Most people when they hear about WebVR immediately think about games or experiences coded entirely in web technologies, in JavaScript and WebGL. It’s a natural tendency to think of the final form a technology will take or even just draw parallels with what VR is today. Since today, VR is dominated by completely 3D immersive experiences, always inside of a headset, it can be hard to imagine another, smaller step that we could take.

Let's start imagining. What does a smaller step look like? How do we progressively evolve and enhance the web rather than assuming that we have to take the same major leaps that the VR pioneers have made to date? How do we reduce our risk and increase the chance of reward? How do we increase our target market size so that it greatly exceeds the constraint of people with consumer grade VR already available?

VR Augmentations and Metadata

Our first goal has to be that existing websites continue to serve billions of users. We need to progressively update sites to have some VR functionality, but not make that a requirement for interaction. Just like a favicon can be used by a site to slightly customize the browsing experience and make their site stand out in history, favorites or the bookmark bar, a VR ready site could supply a model, photosphere, 360 video or even an entire scene. This extra content would be hidden from the vast majority of users, but would be sniffed out by the next generation of VR ready browsers and then used to improve the user experience.

One of the most compelling options is to have your website provide an environment that can be rendered around your page. In this way you can set up a simple scene, decide where your content gets projected, and the browser handles the rest through open web standards such as glTF. This isn't even a stretch of the imagination, as a recent partnership between OTOY and Samsung is working on the right standards. I was able to sync up with Jules at OC3, and I have to say there is a pretty big future in this technology. I'm happy to add it to the list of simple things existing website developers can do without having to spend years learning and working in VR and 3D graphics. Stick a meta or link tag in your page's head, or push it down as an HTTP header (this is why meta+http-equiv is probably the best approach here), and you'll get more mileage out of users with a VR headset.

It doesn't stop here though. While this changes the environment your page runs in, it doesn't allow you to really customize the iconography of your site the way a simple, 3D model would be able to. Another glTF opportunity is in delivering favicon models that a browser can use to represent sites in collections like the tab list, most recently visited sites and favorites. A beautifully rendered and potentially animated 3D model could go a long way to putting your website on the mantle of everyone's future VR home space.

I think there is more value to be had in the Web Application Manifest specification too. For instance, why can't we specify all of our screenshots, videos and icons for an installable store page? A VR browser would then be able to offer an install capability for your website that looks beautiful and rivals any existing app store. Or, if you prefer the app store, you can specify the correct linkage and "prefer" your native experience. The browser in this case would redirect to the platform store, install your native application, and off you go. I see this as exceptionally valuable for an existing VR-centric developer who wants to augment their discovery through the web.
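A sketch of what such a manifest could look like. screenshots, related_applications and prefer_related_applications are real Web App Manifest members; the vr_model entry is a hypothetical extension for the 3D favicon idea, and a real manifest would ship as a standalone JSON file rather than the script object used here for illustration.

```javascript
// Web App Manifest sketch; real members plus one hypothetical extension.
const manifest = {
  name: 'My VR Experience',
  icons: [{ src: 'icon-192.png', sizes: '192x192', type: 'image/png' }],
  // Real member: store-page style imagery for an install prompt.
  screenshots: [{ src: 'store-shot-1.png', sizes: '1280x720' }],
  // Real members: hand users off to a native build when one exists.
  related_applications: [{ platform: 'oculus', id: 'com.example.myvr' }],
  prefer_related_applications: true,
  // Hypothetical extension: a glTF model a VR browser could use in place
  // of a flat favicon.
  vr_model: 'favicon.gltf',
};

console.log(manifest.prefer_related_applications); // -> true
```

With prefer_related_applications set, a VR browser could do exactly the redirect-to-store flow described above.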

Immersive Hybrid Experiences

Our next goal is to start incrementally nudging our users into the VR space. While augmentations work for existing VR users and are ways to provide more VR like experiences on the existing web, we can do even better. Instead of a VR only experience we can build hybrid, WebGL+WebVR applications that are available to anyone with a modern browser.

How does this work? Well, we start with the commonality. Everyone has a browser today capable of some basic WebGL. This means any experiences you build can be presented to that user in 3D through the canvas tag in a kind of magic window. You can even go full screen and deliver something pretty immersive.

To improve this further, we can build input abstraction libraries that work across a large set of devices: touchpad, touch, mouse, keyboard, device orientation, gamepad, and a list that continues to grow each day. By having a unified model for handling each type of input, you can create a great experience that the user can navigate with mouse and keyboard or by spinning around in their chair and looking through the window of their mobile phone. Many of these experiences start bordering on the realism and power of VR without yet delivering VR.
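One way to sketch that unified model: normalize every device into the same small "look delta" update so the render loop never cares about the source. The device wiring is browser-only and shown as comments; the scale constant k and the orientation conversion are left as assumptions.

```javascript
// A single look controller that every input source funnels through.
function makeLookController() {
  const state = { yaw: 0, pitch: 0 };
  return {
    state,
    // Every source calls this one update, in radians.
    applyDelta(dYaw, dPitch) {
      state.yaw += dYaw;
      // Clamp pitch so no input device can flip the camera upside down.
      state.pitch = Math.max(-Math.PI / 2, Math.min(Math.PI / 2, state.pitch + dPitch));
    },
  };
}

// Browser wiring (illustrative):
//   canvas.onmousemove = (e) => look.applyDelta(e.movementX * k, e.movementY * k);
//   window.ondeviceorientation = (e) => { /* convert alpha/beta/gamma to deltas */ };

const look = makeLookController();
look.applyDelta(0.5, -3.0); // a huge downward swipe still clamps
console.log(look.state.pitch === -Math.PI / 2); // -> true
```

The render loop reads state.yaw/state.pitch each frame and never branches on input type, which is what keeps the mouse, touch and device-motion paths interchangeable.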

The last thing we do to nail this hybrid experience is detect the presence of WebVR. We've landed a property in the spec on the navigator object, vrEnabled. It will return true if we think there is a chance the user could use the device in VR. While I think there are some usability issues with such a simple property, which will probably result in some browser UX for turning VR on and off, it is a great start.
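A sketch of capability detection built around that property. The decision logic is separated from the navigator probe so it can be exercised anywhere; since vrEnabled may evolve before browsers ship it, treat the probe as shape rather than a guaranteed API.

```javascript
// Pure decision: scale the experience up from static 2D, to a WebGL
// "magic window", to full VR, based on detected capabilities.
function chooseMode(capabilities) {
  if (capabilities.vrEnabled) return 'vr';
  if (capabilities.webgl) return 'magic-window';
  return 'static-2d';
}

// Browser probe (illustrative; vrEnabled is the spec property described
// above and may differ in shipped browsers):
//
//   const caps = {
//     vrEnabled: !!navigator.vrEnabled,
//     webgl: !!document.createElement('canvas').getContext('webgl'),
//   };
//   startExperience(chooseMode(caps));

console.log(chooseMode({ vrEnabled: false, webgl: true })); // -> 'magic-window'
console.log(chooseMode({ vrEnabled: true, webgl: true })); // -> 'vr'
```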

This is the next level of write once, run anywhere, but it's built on the concept of progressive enhancement. Don't limit your user reach; instead, scale your experience up as you detect the capabilities required. I've recently stated that this is a fundamental belief in our design of WebVR, and I truly believe in maintaining this approach as long as there are users who can benefit from these choices.

I wanted to give an example of one of these experiences, so here you can see a 360 tour we've linked from our Oculus WebVR Developer portal. There will be many more examples to come, but this one will run in any browser and progressively enable features as it detects them. My favorite is simply going there on your phone and looking around using device motion.

I can't speak highly enough of the value in building experiences like this. For this reason, we at Oculus and Facebook will be releasing numerous options for building these experiences in minutes to hours rather than days or more. Both content generation and viewing need to be made easier. We need great photosphere export from pretty much any application (any developer should be able to emit and share a photosphere from anywhere in their game/application/experience), and we need to optimize how we transmit and display photospheres, with potential improvements to streaming them in and simple-to-use libraries. It has to be just as easy to extend this to 360 video; even a simple looping 360 video could provide your user with AMAZING bits of animation and value. Finally, we need to extend all of this to be interactive with text and buttons. You can go further if you have the skills to do so, but basic libraries such as React VR will allow anyone to create great experiences with the above characteristics.

Getting out of the Way

Once we've extended existing websites, gotten some simple libraries in place and people start to build VR content, the final step is to get the hell out of the way. This is easier said than done. There is a large contingent of web technology naysayers who have pointed out certain flaws in the web platform that make it non-ideal for VR. I'll give some examples so it is clear that I do listen and try to resolve these issues.

  1. JavaScript is too slow. To which I respond: maybe, when it comes to new experiences. JavaScript is a tuned language. It is tuned by browser vendors based on the content out there on the web today. Tomorrow, they could tune it differently. There are so many knobs in the run-time. C++ is no different: there are knobs you can tune when you pipe your code into cl.exe, and they WILL result in differences in the run-time behavior and speed of your application. The browser is running in a chosen configuration to maximize the potential for a set of sites determined by usage.
  2. GC sucks. To which I reply: yes, the general model of GC present in your average web runtime does indeed suck. However, this is, again, only one configuration of a GC. Most GCs are heuristic in nature, so we can tune those heuristics to work amazingly well for VR or hybrid scenarios. I won't go into details, but let's just say I have a list of work ready for when I hire my GC expert onto the Carmel team ;-)
  3. Binding overhead is too high. To which I respond, "Why?" There is no requirement that there be any overhead. An enlightened JIT with appropriate binding meta-data can do quite a bit here. While there are some restrictions in WebGL binding that slow things down by design, I have high hopes that we can arrive at some solutions to fix even that.

That list is not exhaustive. When you have full control over the environment and aren't part of a multi-tenant, collaborative runtime like a web browser, you can be insanely clever. But insanely clever is only useful for the top 1% of applications once we enable anyone with a browser to build for VR. We need to get out of the way and make the common cases for the remaining 99% of the content not suck. That's our space, that's our opportunity, that's our goal.

Beyond the standard arguments against VR on the web, I think there are more practical issues staring us in the face. The biggest one is the absence of full VR browsers. The first fully featured VR browser is going to be a deep integration of the web platform, OS, device and shell. There is simply too much going on today for a web application to run with zero hiccups, and an overall lack of measurement and feedback between the various sub-systems. Tuning everything from the async event loop, to thread priorities, to which sub-systems are running when in and out of VR is a very important set of next steps. We then need to combine these enhancements with UX and input models that give people a level of trust in their environment, to the extent that they could interact with a merchant in virtual space, whatever form that might take.

Right now we are experiencing the power of VR shells that plug into basic browser navigation models and do simple content analysis/extraction. This is an early days approach that only achieves a small fraction of the final vision.

For the graphics stack we need some basics like HTML-to-Texture support in order to bring the full power of the 2D web into our 3D environments. I've been referring to 2D UX as having a "density of information" that can't be easily achieved in 3D environments. We need this in order for VR experiences to enable deep productivity and learning. Think about how often, in our "real" 3D world, you break out your phone, laptop, book and other 2D materials to quickly digest information. I don't think VR is any different in this regard. John Carmack, during his keynote, also noted that there is huge value in 2D UX because we've had so many years of experience developing it. I believe this to be true, and I think HTML-to-Texture support will broaden the use cases for WebVR dramatically.

We also need to enable full access to the pipeline. More and more I'm hearing about the necessity of compute shaders and access to GL extensions like multi-view. Even native is insufficient for delivering super high quality VR, which is driving advances in hardware, OS and software. The web needs to provide access to these capabilities quickly and safely. This may mean restrictions on how we compose and host content, but we need to experiment and find those restrictions now rather than holding the web back as a core VR graphics platform.

To close out, note how this section details yet more progressive enhancement. Each of the new capabilities and improvements we make will still take time to filter out to the ecosystem. This is no different in the native world where extensions like multi-view which have been defined for a couple of years are still not uniformly distributed. So producing experiences that reach the largest possible audience and gradually increase capability based on the device and browser features you detect is going to be key.

Over the next few months I'll be providing some tutorials talking about strategies and libraries you can use to enable VR progressive enhancement for your existing sites. You can also check out more of our demos and sign up for more information about our libraries like React VR at the Oculus Web VR Developer Portal.

Sunday, September 18, 2016

WebVR - 95% Web and 5% VR - That's a GOOD thing

When a new technology comes out there is rarely a singular influence. Technologies today are highly connected, almost always derivative and rarely kept private. These separate influences are good for the development of a diverse technology that is general enough to solve large scale problems and powerful enough to create experiences that are truly unique. I think WebVR is one of these truly diverse technologies, but only when you take full advantage of everything we've already developed and influence the many new technologies yet to be developed for the web.

The marriage of Web and VR tends to be very one sided. Most of the bits and pieces you need to build a modern VR application already exist in the web platform. Compare and contrast this with VR technology stacks, where people are still building their own event loops, threading models, memory models, etc... The VR space is the wild-wild west, and this means there is a lot of room for cowboys, experimentation and wheel reinvention, and that's just the tip of the iceberg.

Leading the way in this round are the entertainment and gaming applications. This is where you tend to get highly specific experiences (tell me the last time the UX on your media player was considered innovative and didn't look like a 1990's VCR) or uniquely creative experiences that scratch your gaming itch. In this world nothing is uniform and everything is a new assault on your senses. A small and very influential group of people love this space and they become our pioneering group. They start to build and design the future while the calm majority sits back and waits for the technology to come to them.

The Web is this calm majority. A highly stable, yet quickly moving substrate of APIs and technologies, spread across billions of devices, with majority stakes already placed into many excellent and well established design principles and practices. Basically, you can get stuff done on the web. It's a productive place. It's usable by the masses, but configurable to the needs of the experts. It can do almost anything. Almost. What it can't do is blast you in the amazeballs with pixels of light that transport you to a far away world. Yet.

Given the existing influences there are two ways to get to the goal. The first is to design it all again and make people move to the VR space. A sort of manifest destiny approach to technology. The second is to embrace the existing web for all of its great capabilities and even many of its terrible ones. Evolve the technology and let everyone adopt at their own pace.

The latter has a much better chance of working. Not because it's the best approach for VR, but because it allows us to retain as much of the power of the existing web as we can without resetting the entire ecosystem. Think back just 5 years and look at the rapid adoption of new and pervasive technologies on the web to understand its power and importance. Here are a couple that I was around to watch blossom.
  • Canvas - Yeah, it was basically just entering adoption 5 years ago and is now a premier API for 2D bitmapped graphics creation.
  • WebGL - Allowing for truly customized graphical experiences in 2D, 3D and image processing, and it's the basis for WebVR (without WebGL there is no WebVR).
  • WebAudio - Giving full access to sound effects and mixing for the first time on the web.
  • Web Workers - Halfway between threads and tasks, giving access to extra cores.
  • Media Capture - Camera, microphone and video access.
  • WebRTC - Adaptive media transfer for person-to-person interactions.
  • Fetch - A real networking primitive for highly configurable requests.
  • Service Workers - A brilliant empowerment of the network stack without breaking the existing networking expectations of the long tail web.
All of these web technologies were driven by a collective set of requirements from a highly diverse group of interested parties. Each technology did not have to try and incubate and grow on its own. It could instead live side by side with many others. When used in isolation a technology like Canvas would have lost to the many, much better solutions in the market already. But because it was augmentative to your average web page, it had a place. 

Those other, better, non-web technologies live in isolation. They don't understand JavaScript, garbage collection, browser event loops, browser render loops and how to peacefully co-exist in a broad technology stack. Maybe they are faster and you can do more 2D rendering stuff in them than you can in Canvas, but those incremental differences aren't as powerful as being deployed to billions of devices, over the network, compiled from text to the screen and then printed (yeah, printing is a stretch, but bear with me!) alongside any arbitrary piece of HTML content.

The capabilities of the web are now increasing at a rate faster than any other platform. This means its ability to provide value is leaps and bounds beyond most other systems. When compared on a single axis the web will likely lose. You can always build a better, more specific widget catering to a specific use case, but the power in the web is its Swiss Army Knife approach. We don't buy a Swiss Army Knife to be the best at any given thing, we instead buy it to be good enough at many things and hopefully fill in missing gaps in our tool chest. It's an astoundingly effective convenience, and it takes convenience to enable modern developers to achieve the content depth and content breadth that is needed to appease the masses.

Looking forward there are both VR-specific technologies and general web technologies that will make VR a success. Personally, some of the most interesting are being driven not with VR in mind, but instead by extending the web to new levels of capability that help it exceed app frameworks and runtimes and put it on par with existing operating systems. A few critical-to-VR, web inspired technologies on my short list are...
  • Service Workers - VR applications are resource intensive and the first line of defense is going to be prioritization of resource downloads, caching and working around the inefficiencies that exist in legacy file formats.
  • WebGL - New performance improvements to WebGL and extensions like multi-view or stereo rendering are likely to land with or without WebVR being a consumer since they have general applications outside of VR as well.
  • HTML to Texture - If this existed today, it would likely be the most used API in WebGL applications. The pre-existing power we have in DOM, HTML/CSS layout and all of our existing content is something we don't just want to use; we NEED to use it. We have to fill time in VR, and new 3D assets alone are just not compelling enough.
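The Service Worker point above can be made concrete with a cache-first strategy for large, immutable 3D assets. This is a sketch: the `.glb` asset extension and the cache name are assumptions for illustration, and the cache-lookup logic is factored into a plain function so it can be exercised outside a real service worker.

```javascript
// Cache-first lookup: return the cached response if present,
// otherwise fetch over the network, store a copy, and return it.
async function cacheFirst(cache, request, fetchFn) {
  const hit = await cache.match(request);
  if (hit) return hit;                 // large asset already on disk
  const resp = await fetchFn(request); // first visit: go to the network
  cache.put(request, resp.clone());    // keep a copy for next time
  return resp;
}

// In a service worker this would be wired up roughly as:
//   self.addEventListener('fetch', e => {
//     if (e.request.url.endsWith('.glb'))      // hypothetical asset pattern
//       e.respondWith(caches.open('vr-assets-v1')
//         .then(c => cacheFirst(c, e.request, fetch)));
//   });
```

Versioning the cache name (`vr-assets-v1`) gives you a way to evict stale geometry wholesale on the next deploy.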
VR will reinvent the web, but it won't happen in a chaotic upswell and massive conversion to fully 3D experiences overnight. It can't. There is too much value in our existing base. VR will reinvent the web in subtle ways, such as spearheading better graphics APIs, faster JavaScript runtimes tuned to our frame based patterns and parameterized GC behavior that will avoid frame jank. VR will push hard on solutions to the HTML to Texture problem so we can bring our web content into our 3D worlds. VR will redefine what security means when it comes to both new and existing APIs. Most importantly, VR will define a small subset of new APIs (arguably the 5% in my title is too high) that are critically important to the medium itself. Those APIs will work seamlessly with the existing web platform, enabling our iterative transformation from the existing 2D information packed web to a mixed 3D/2D model allowing developers to pick the presentation mechanisms of their choice.

I hope that VR won't become a parallel stack to the web. If it does then I think we've failed hard. There is no need to reinvent all of the past 20 years of productivity and capability increases we've gotten from the web. There is no reason to relive all of the same mistakes and relearn all of the best practices. More importantly, there is no reason that the existing web can't continue to evolve and get better as a collaborative partner, or more accurately an elder sibling, allowing WebVR to shine in the ways it is best designed for rather than spending endless cycles trying to replace all of the web experiences we already have with VR versions of the same.

Sunday, January 24, 2016

Browser VR Experiences – Let’s not VRML!

Virtual Reality is a hot topic right now, and so there is huge interest in APIs and standards related to it across a broad range of programming languages and platforms. One of the most interesting platforms right now, due to its pervasiveness, is the web platform, or Browser. Every major desktop and mobile platform now has access to a great Browser that is performant, highly interoperable, with thousands of APIs and some general capabilities for composing the web platform into existing applications. The large set of developers who program for this platform are always itching for new and relevant technologies to be able to differentiate themselves and create rich new experiences. This means the race is on: how long before the web platform has a set of standards and specifications for helping guide the evolution of VR? Whatever is produced will be the software equivalent of the Oculus Rift and Gear VR. It’ll put VR capability in the hands of all developers and consumers as soon as the devices hit the market, leading to pervasive VR.

But it’s not something you want to rush. Let’s not VRML! To be clear, VRML (spec) was never really VR, but rather the standard 3D projection onto a 2D plane we all experience today every time we play games. VRML, in other words, was way ahead of its time in more ways than one. As a technology it sort of replaced the existing HTML experience with a 3D experience through markup that more closely resembled SVG. It had concepts like link navigation that were necessary to allow switching views either within or between different scenes. It was like another top level document, and so you could even switch back and forth between HTML and VRML quite easily. Since the experience wasn’t true VR, this wasn’t terrible. It wasn’t jarring for the user as they moved between 2D and 3D content.

Now let’s fast forward. The HTML world we live in has evolved immensely. By itself, HTML can render 3D content using WebGL. Since this maps to our current state of the art for game development it is a much better experience than say a limited markup language. It is more complex for sure, but various libraries intend to make it more approachable with Unity, three.js, Babylon.js and A-Frame being some really strong contenders. We also have true VR devices now and that means stereoscopic rendering and projection to the user. We’ve added rich media in the form of Audio and Video tags. The world is much more complicated and so any standard or specification has to now navigate existing Browser UI, user experience, the existence of new APIs and understand how developers want to code to the platform.

What I hope to present in the rest of the article is mostly a set of experiences that I see exist today. What are some of the challenges we face, both in user interface and API design, to make those approachable? What is the line between Browser and Web Page, and how do we bridge that gap in a way that enables developers AND users to have great experiences? Finally, how do we integrate the future 3D web without losing the power we already have established in the existing 2D web? These questions keep me up at night (literally, I was up last night until 3AM because I hadn’t written this article yet ;-) and so I hope in the coming weeks and months to put as many of them to rest as possible.

Presentation Experiences for Browsers


Browsers currently have to deal with a number of different presentation formats. We then provide information about the current presentation (screen size, orientation, etc…) to the page using a number of technologies so that the content within can adapt. The overall experience is a combination of the Browser UX or Browser Chrome (not to be confused with Chrome the browser) and the content.

Screens and Phones


Recently trends have gone towards the browser being very light on the Chrome, with thin edges, no status bars and minimal title areas. This is mainly due to the fact that users want the majority of the browser’s space to be dedicated to the content they are viewing. They still want critical UX elements like tabs, since those have become a de-facto standard for organizing your browsing experience. Finally, over time the average number of tabs a user is comfortable managing has increased. That can give you some basic design criteria for building a good Browser UX.

For the screens on which we present, the trends have also changed. The average page visible area when I started working on IE back in 2005 was 800x600. It grew to 1024x768. Right now I imagine it to be more fragmented; I haven’t looked in a long time. During that golden age the device pixel to CSS pixel ratio was 1:1. This meant that every pixel you put on the screen was rendered as a physical pixel as well. With high DPI screens this is no longer true. Phones now have 4k screens, but couldn’t possibly render 4k worth of CSS pixels. Instead we simply use more than one screen pixel to represent the same CSS pixel, but the effective screen real-estate is pretty much the same (actually less, since the number of CSS pixels on a phone screen might only be 400x800).
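That device-pixel vs. CSS-pixel relationship is exposed to pages through `window.devicePixelRatio`. A small sketch; the window object is passed in as a parameter so the helper can be exercised with a stub rather than a real browser:

```javascript
// Convert the screen's CSS pixel dimensions into physical device
// pixels. A "4k" phone with devicePixelRatio around 4 may expose
// only ~400x800 CSS pixels to layout.
function physicalResolution(win) {
  const dpr = win.devicePixelRatio || 1; // 1:1 on classic desktop screens
  return {
    cssWidth: win.screen.width,
    cssHeight: win.screen.height,
    deviceWidth: Math.round(win.screen.width * dpr),
    deviceHeight: Math.round(win.screen.height * dpr),
  };
}

// In a page: physicalResolution(window)
```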

For desktop the average screen size is increasing as is the number of effective CSS pixels you can take advantage of. Very recently desktops have been balancing increased screen size with high DPI. A 28” 4K display is an example. 96DPI 4K is available at around 42” which is starting to get too big for usable desktop (neck strain, head tilt, etc…). For phones the number of device pixels (DPI) is increasing, screen size is approximately the same as are effective CSS pixels. With new casting technologies your phone could also be your desktop so you have to be prepared for that. I recommend reading the MSDN article on Windows 10 Continuum since it shows the API shape for how developers might deal with these types of devices.

Virtual Reality Presentations and Devices


I’ll start with the existing types of VR devices that we have. There are two basic forms. The first form is the integrated screen and device. This is demonstrated by a Gear VR or Google Cardboard. The user experience of these devices is kind of odd because you have to navigate content designed with and without VR in mind. The second form is the attached peripheral. This is the Oculus Rift and many other emerging contenders. Instead of using the same screen as the device for displaying 2D content, they have a secondary head mounted display (HMD).

Integrated Devices – GearVR and Google Cardboard


Let’s think through a common workflow. You are in your mail app, not 3D, so you are currently not in stereoscopic mode. If you put the device into your VR headset at this time, your brain would be super confused. Each eye would see a different half of the screen and it couldn’t do anything to combine this information into a visual. Some people would get sick, others would feel uncomfortable and probably 1 in half a billion people would be able to independently read each screen (Gecko People!).

Instead you click on a link and go to your web browser. All of a sudden, your screen shows two deformed images side by side. They look similar, but slightly different. Again, your mind can’t really do much since the deformation is too extreme for the image to look much like the original. It’s like looking at the world through coke bottles. This workflow is proving to be very confusing and uncomfortable.

You do put the device into a headset though and you are immersed in brilliant 3D. You spend the average 30 seconds on the web page and then you navigate back to your mail app and yet again you are confronted with sickness. You eventually realize that limitations of your devices and manage to work around these transitions, but you are still surprised EVERY time some content throws you into VR mode and you wish for a better experience.

Peripheral Devices – Oculus Rift, Morpheus, Vive


Again, we can think through a workflow here. You spend most of your day staring at your screen, navigating through your 2D applications, quite productive and happy. You launch a mail link and you are confronted with a blank page. It turns out this page has detected your Oculus and decided to use it. You probably notice your CPU fan noise just increased to the level of a Boeing 747 jet engine.

You quickly don your Rift and now the experience is obvious. You don’t quite like this application so you decide to exit. Again, you are now confronted with the Oculus shell and not any form of familiar Browser UX. You take off the Oculus and decide to get back to real work.

Alternately you could have clicked on some form of navigation primitive in the virtual space. Your desktop browser is going crazy navigating links, but you don’t see this occurring. Clearly the experience between the peripheral and the page wasn’t thought through very clearly. You wonder to yourself if this is the fault of the browser, the web page author or even the VR device itself.

Integrating Devices into the Web Experience


Adapting for Virtual Reality experiences requires that we understand the previously detailed experiences. That is, what an experience might look like today if we implement the wrong set of APIs, or no APIs at all. We certainly need to take into account multiple types of devices and, most importantly, the different types of presentation they are capable of.

The browser, up to now, has only had to deal with rendering to a 2D surface, a single version of our image. No matter how complicated the end user’s setup was, this was abstracted from the Browser. They could be spanning us across multiple displays, or on a curved display or even changing the aspect ratio. To the Browser this doesn’t matter since it doesn’t change the geometry of our outputs.

But why does the geometry change? Why does a change in geometry matter? To answer these questions first look at your standard 2D browsing experience. In the 3D world in which you live, if replicated using existing 3D technologies, your computer monitor would be a quad. 2 triangles, oriented in space to form a square. Your camera would be each of your eyes and your eyes would be focused at a very specific location on the quad, the area of focus. While you are working the quad remains in a very narrow portion of your FoV. While you can see things to your left, your right, you can probably see your feet even, the monitor is not positioned in those locations.

Changing the position of the monitor, increasing its size, placing it nearer or further away all become uncomfortable operations to you, the user. Why? Well, the content is optimized for its default presentation. That of being visible in a square, placed in the center of our FoV. The geometry, in 3D space is that of a projected quad, at some depth, which makes the text and images the right size for us to comfortably read everything. I’m going to try and coin terms for each of these default experiences.

VR Experience #1 – Standard Web Browser, 2D Content Billboard Projection

Basically the same experience you get on your laptop should be the same experience you get on a VR device. This default experience will be fairly comfortable. First, you won’t be rotating your head up/down or left/right to try and navigate content. The same navigation primitives you use today will suffice. This will make the overall experience very accessible as well since it will limit the number of new things a user has to master before finding the space usable.

The Browser UX would probably take on the form of a phone browser. It would be lightweight and get out of your way, focusing mostly on the content. The content developers will be very happy. They spent millions on designing their experience and you’ve basically maintained that experience well.

Proposal #1 – The basic VR experience is NOT a standard and is not part of WebVR

This experience is how the browser allows the user to stay inside of the headset. It has nothing to do with APIs available to the page. It’s about the entire browser experience from soup to nuts and is therefore something to be tweaked by each vendor.

If you are not familiar with “Theater Mode”, then press F11 in your browser right now! Did you see how this provided a browser based full-screen experience? This is separate from the full-screen APIs. For the same reason that theater mode is a browser provided experience I believe that WebVR (the API set) should not intrude here. A browser, should be able to provide a VR experience for browsing that is independent of the content.

VR Experience #2 – Web Browser Features in 3D Space

My next experience is about the web browser having non-billboard 3D content. This is where the experience wants to utilize 3D space rather than simply maintain the status quo of a 2D screen in front of your eyes while the headset is on.

The goal here would be to achieve minority report like user interfaces for things like tabs, favorites, history and other content such as the recent news rivers present in Microsoft Edge and other browsers. This experience now only makes sense if the user is ready to be in a VR mode. They should show some intent for this experience because as we’ve seen before, they may need to prepare their device. They may need to either put on their peripheral HMD or they may need to drop their phone into their HMD harness. Either way, they are likely now deciding that they don’t want to continuously jump between a non-3D and a 3D experience. This is where our VR Experience #1 shines as well. When the browser does navigate to 2D content it has to do an amazing job of showing it in 3D space.

Proposal #2 – A Browser should have a VR Immersive Mode

Proposal #3 – A VR Immersive Mode is again the realm of the Browser not Standards APIs

With the browser in immersive mode, you can now display your tabs page as a 3D sphere of tiles and the user is likely to be happy. They can use head rotations to navigate around as well as other inputs such as gamepad, mouse or keyboard (or even more obscure capabilities like hand gestures).

Still, this experience is completely outside of the API set necessary for a standards based API. “Only intrude where necessary” is always one of my goals in standards work. We find more and more reasons to intrude as time goes on, but that’s fine.

Theoretical Work-flows for VR Experiences #1 and #2


We’ve just turned basic VR into a button. When a browser detects that it can provide a VR-like experience it shows intent in the UI. Further, you can always go into a VR mode, since any display can output a stereoscopic rendering that can then be consumed by an increasingly varied number of approaches. You could simply render to a 24” monitor, stereoscopically, mount a largish moving box to it to blank out light, have a sufficiently long flume attached to the user’s head with some corrective lenses, and that is just as much a VR experience as, say, Google Cardboard.

If your browser is set up to use a peripheral it moves your entire browsing experience there using our 2D billboard approach. When it finds opportunities to use the VR space for 3D content it asks the user for permissions to assault their senses and then goes into that mode. Perhaps for browser experiences it does this automatically, but for page experiences it does so with user consent.

If the browser is not set up for a peripheral but rather an integrated device then it begins rendering its entire visual set stereoscopically.

Link navigation works great. You go from Tabs (3D) to Facebook (2D quad properly positioned for optimal viewing) to a WebGL site with VR capabilities (2D initially with a pop out to 3D, then back to 2D as you navigate away) and then you ask to pop out of VR mode and away you go.

You can start to see inspirations for additional VR experiences. Let’s complete an examination of Browser built-in experiences.

Browser VR Media Experiences


There is a lot of contention over which VR experiences are going to be the most popular. Right now, media is king. It delivers the highest quality immersive 3D experiences because it relies on the realism provided by, well, real film. It breaks down only due to focal point issues, where the mind expects to refocus. This is overcome by amazing directing that forces your eye to follow the focal points intended by the director. But let’s assume that 3D video and 3D images are going to be very important.

The problem here is that there are no standards for 3D video and images. Further, there may not be a great standard for some time. Also, most 3D media starts out as much less than 3D and is then composed into 3D. You can, for instance, take a pano and some pictures and stitch them into a great 3D media experience. However, there is no media format that does that for you that a browser can load. For that you are going to need actual WebVR. I’ll define this experience now, but will talk about it later when we get more into the WebVR applicable arenas. I will say that I think Carmack’s concept of VR script, converted to be based on a subset of JavaScript with a minimal API, might be an even better solution than WebVR.

VR Experience #3 – Non-3D Media Composed into 3D Presentation

But this leads us to some other, more realistic experiences. Just as a video can have a full-screen button, if the browser can detect the presence of 3D video, then we should also be able to go into VR mode on just that video. This integrates 2D to 3D very well. Some people have expressed further interest in having the 3D video break out and display in front of the page. This is much more challenging. 3D experiences are designed to be immersive, not to be an island of 3D within your focused FoV.

VR Experience #4 – 3D Videos get a VR Mode Button

We don’t have to stop at videos. If we can also get a basic 3D image format, we could also do 3D imaging. There will be more than one type of this sort of image though. First, you’ll have a basic stereoscopic quad that is presented more on a 2D billboard. Second you’ll have more immersive images taken with new cameras that capture fisheye views of the world.

I think that for the first experience we could get away with rendering a left/right version of the image in-line with the page. I haven’t done this and so I don’t know if the user experience is good, but I could imagine it not being terrible.

VR Experience #5 – Stereoscopic Inline Image Rendering

For the second experience we probably want to again take over the peripheral vision and break out of the billboard in front of the user’s face. The user can rotate and tilt their head and be fully immersed in the scene.

VR Experience #6 – Immersive Fisheye Image Rendering

Notice how so far none of the media experiences we’ve described, barring #3, require any APIs in the web page yet. They do require, potentially, some image formats. Maybe some new attributes, etc… If that is the case, then those APIs should be defined by some sort of standards specification. Most likely VR extensions to HTML5 (whereas I’d argue WebVR is more like WebGL and is thus its own specification, not directly tied to the HTML5 spec).

All of the above experiences are either accomplished by the VR mode of the browser, inline or through a VR breakout button. Just as a very large set of the full-screen experiences you enjoy today are browser provided, with no additional work having to be done by the page author, I argue that many VR experiences will fall into the same category. That will make VR much more consistent and approachable for end users and developers alike. Because next we are about to talk about the hard experiences, and yes, VR is hard.

Author Provided VR Experiences


You have users running their browser in VR mode and now you want to provide some VR light-up experiences. As an author, the browser will be in 1 of many states, but the most likely state is this, “The user does NOT HAVE any form of VR.” The next most likely option is, “The user has a Phone but cannot use, want or does not have VR.”

There are also some web truths. The number of WebGL experiences is very, very low on the web today. At least relative to the number of sessions, navigations, pages, etc… The number of WebGL experiences that would look good in VR is lower still. Finally, the added cost of providing WebVR like experiences will be a barrier to entry for many.

My goal is not to make VR a gimmick, but to make it an institution of the web itself. My goal is for WebVR to provide an experience that drives VR adoption rather than another reason for naysayers to claim that VR is not yet ready for the consumer.

For the author to be able to provide a WebVR experience, I think they need to be able to detect, first, a VR capable browser and, second, that the WebVR APIs needed to submit WebGL content to a device actually exist. Currently the WebVR specification focuses on methods for enumerating devices. So we can build in-page experiences, but there isn’t a browser that has a VR mode that itself supports a VR experience. My prior experiences, especially VR Experience #1, are prerequisites.

Proposal #4 – Browsers Commit to VR Experiences before VR APIs

To the extent that I can influence this, I find this to be one of the more important points of this article. With this in place an author will be able to detect that the browser has its own VR experience and whether or not it is enabled.

VR Experience #7 – Page Authors can Detect VR Capable Browsers

That doesn’t get us very far. Maybe the author would try to use our other built-in experiences, but they probably want more. This is where WebVR starts for me. If you haven’t read the WebVR spec you should. You can also read my previous article talking about its limitations in its current form (which is being actively worked on, so I expect to have a much more positive article covering the specification in the future).

A TL;DR version of current capabilities:
  1. Enumerate and Retrieve VR Devices
  2. VR Devices are outputs (HMDs) or inputs (Position Sensors)
  3. Retrieve Stereoscopic Rendering Properties (FoV, eye positions, etc...)
  4. Commit Visuals to the Display
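Those four capabilities map onto the WebVR 1.x API roughly as follows. This is a sketch against the 1.0/1.1 drafts, whose names changed between revisions; `nav` is passed in rather than using the global `navigator` so the helper can run outside a browser.

```javascript
// Steps 1-3: enumerate displays and pull the per-eye rendering
// properties needed to size the WebGL drawing buffer.
async function describeVRDisplay(nav) {
  if (typeof nav.getVRDisplays !== 'function') return null; // no WebVR API
  const displays = await nav.getVRDisplays();               // 1. enumerate devices
  if (displays.length === 0) return null;
  const d = displays[0];
  const left = d.getEyeParameters('left');                  // 3. stereo render properties
  const right = d.getEyeParameters('right');
  return {
    name: d.displayName,
    // The canvas must be wide enough for both eyes side by side.
    renderWidth: left.renderWidth + right.renderWidth,
    renderHeight: Math.max(left.renderHeight, right.renderHeight),
  };
}

// Step 4 (committing visuals) happens in the frame loop, roughly:
//   display.requestAnimationFrame(onFrame); // runs at the HMD's refresh rate
//   display.getFrameData(frameData);        // view/projection matrices per eye
//   ...render both eyes...
//   display.submitFrame();                  // commit to the display
```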
However, do we really need this information if we are in a Gear VR or Google Cardboard scenario? Mostly not. So we need some lighter weight APIs for the more common scenarios. Since Gear VR and Google Cardboard are the most prominent, it seems targeting them first would be a good idea.

The experience here is that they want to use the entire screen of the device. They want to full screen a Canvas element that is displaying some WebGL content which is being stereoscopically rendered. They can effectively do this today, but they do it at the expense of the user. Since the user doesn’t have a Browser controllable way of being in or out of VR mode, the author can simply display some stereoscopic, full-screen content and destroy the end user’s experience. Further, the browser has no proper way to overlay their own protected UX.
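The "effectively do this today" path is the Fullscreen API, with the vendor prefixes that were still common in 2016. A sketch; the element is passed in so a stub can stand in for a real canvas:

```javascript
// Request fullscreen on a canvas, falling back through the
// vendor-prefixed variants that shipped before the unprefixed API.
function requestCanvasFullscreen(el) {
  const fn = el.requestFullscreen
    || el.webkitRequestFullscreen
    || el.mozRequestFullScreen
    || el.msRequestFullscreen;
  if (!fn) return false; // no fullscreen support at all
  fn.call(el);
  return true;
}
```

From here the page renders its side-by-side stereo pair into the fullscreened canvas, which is exactly the uncooperative takeover described above: the browser has no say and no way to overlay its own UX.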

VR Experience #8 – Cooperative Full-Screen Stereoscopic Rendering for Integrated VR Devices

I think the solution here is really, really easy. So easy that we can overthink it. I’d prefer a solution for these devices that doesn’t require me to enumerate devices (unless I want the position sensor for Gear VR for instance) and doesn’t have me requesting unnecessary rendering properties (unless we want to provide some defaults supplied by the user).

Proposal #5 – Build Simple, Separate APIs for Integrated VR Device Scenarios

Once we get away from Gear VR and start getting into Oculus, my simple experiences break down and no longer work. You are probably wondering why an Oculus is so much different from, say, a Gear VR. They are both VR experiences, so why can’t they use the same solutions? The answer is simple: the approaches to VR are way different, as are their target audiences and experiences.

Gear VR is great at 3D movies and simple 3D experiences, but since it uses your phone, and only the power of your phone, the complexity of the scenes can’t be at the same level as on an Oculus. Refresh rates tend to suffer on phones (many phones ran at 50hz rather than 60hz just to save a bit more battery), while the Oculus is going to ship at 90hz and hopefully reach the 120hz range in the next year or two. You can start to see how it would be hard to keep up with the Oculus if you only had the power of a phone at your disposal.

The Oculus is a high-end consumer device focused on achieving very high frame rates using powerful desktop-class computing and GPU hardware. People were surprised, in fact, at the amount of power it needs, and many, I’m sure, decided not to pre-order when they found their machines weren’t up to the challenge ;-) The Oculus is trying to show you the best of VR with 90hz+ high-quality rendering, a ton of SDK features to help developers achieve this quality (time warping and view bypass), and it presents itself as a separate device so it doesn’t have the limitations of, say, your current display.

For this experience, we don’t know what the final API will look like. Oculus will continue getting better and changing their own APIs. These will in turn inform the APIs needed in WebVR to control all of the things that can be configured in the Oculus. Since VR is so new, it is hard to find even a couple of things that are similar across all devices. Contrast this with WebGL, which is based on OpenGL ES, where the three primary GPU providers (Intel, nVidia and ATI) are all willing to build things that plug into the standard. We don’t often have many device-specific differences to contend with in WebGL to provide a unified experience. Matching an API to WebVR will not be easy.

We do have some basic ideas though. So I’ll build the experiences off of that.

VR Experience #9 – WebGL Presentation to a Dedicated VR Device at Native Refresh Rates

With integrated devices we output to the screen; with dedicated VR devices we have to ship textures over to the device afterward. For this reason, we need an API which takes this into account. You could just ship over the texture from a Canvas, but that turns out to be non-ideal: you probably want to ship several textures. You also want to know that the device is dedicated, and whether or not it supports other features such as time warping and deformation.

It may also be that the device needs separate textures for the HUD texture versus the 3D scene texture.

I believe that WebGL provides the primitives for this already, so the goal is to make sure that we can combine these primitives with those of the WebVR device in a way that makes sense. We also need to make sure that the dedicated device can have appropriate features turned on and off so that the author can do the bits they want to do. For instance, turning off time warp and deformation may allow the author to experiment with other deformation methods on their own texture prior to committing the scene to the device.
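To make the idea concrete, here is one hypothetical shape for such an API. The layer shape (a source plus normalized per-eye texture bounds) follows the WebVR drafts of the time; the separate HUD layer and the commented-out feature toggle are illustrative only, not a real spec:

```javascript
// Fill in the WebVR draft defaults for a layer's eye bounds:
// left eye samples the left half of the texture, right eye the right half.
function withDefaultBounds(layer) {
  return Object.assign({
    leftBounds:  [0.0, 0.0, 0.5, 1.0],
    rightBounds: [0.5, 0.0, 0.5, 1.0]
  }, layer);
}

async function presentToDevice(display, sceneCanvas, hudCanvas) {
  // Hypothetical: separate layers for the 3D scene texture and the
  // HUD texture, so the device can composite them independently.
  const layers = [
    withDefaultBounds({ source: sceneCanvas }),
    withDefaultBounds({ source: hudCanvas })
  ];
  // Hypothetical toggle: opt out of device-side time warp/deformation
  // so the author can experiment with their own.
  // display.setFeature('timewarp', false);
  await display.requestPresent(layers);
}
```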

Since dedicated devices have their own refresh rate, it isn’t sufficient to use the 60hz requestAnimationFrame cadence to render your scene. You’ll have to rely on the device to fill in at least 30 of your 90 frames every second using time warp or by simply repeating frames. At 60hz people can get simulation sickness and so 60hz isn’t the greatest target. We should make sure the API can support the native refresh rate of the device.
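The frame-rate arithmetic above can be made concrete with a couple of tiny helpers (the names are mine):

```javascript
// Per-frame time budget for a given refresh rate, in milliseconds.
function frameBudgetMs(hz) {
  return 1000 / hz;
}

// Frames per second the device must fill in (via time warp or frame
// repeats) when the page only produces appHz frames for a deviceHz display.
function framesFilledByDevice(appHz, deviceHz) {
  return Math.max(0, deviceHz - appHz);
}

// At 90hz the budget is ~11.1ms per frame, and a 60hz page-driven loop
// leaves 30 of every 90 frames for the device to fill in. This is why the
// API needs a device-rate rendering callback rather than the window's
// 60hz requestAnimationFrame cadence.
```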

Proposal #6 – Build Device Oriented APIs for Dedicated Devices into WebVR

Proposals 5 and 6 are where I hope we spend most of our time when designing WebVR. We could spend our time figuring out many more complex scenarios, but the technology is just too young for that. We could also spend time trying to standardize the browser experiences, but that isn’t in our interest either. There is still a ton of work in the author-provided experiences, though, and that is where the most complicated APIs will exist.

Conclusions


It turns out there are a lot of VR experiences to focus on. One might think this is a simple space with only a single solution, but the reality seems to be far more nuanced. While these are my opinions, I did think about them for quite some time (I started really thinking about VR in 2014). I focused mostly on the user experiences and what my expectations are. As a browser vendor, I get an opportunity to perhaps impact the space more than others. But I will also be one of the very early adopters, browsing the web with early builds and implementations of these features.

I think the balance between what we leave for the browser vendors to innovate on and what ends up in the API under the control of the author is an important one for us to get right. If we can deliver a strong default browsing experience for VR, that will be compounded by good VR content created by authors. If VR turns out to be a dumping ground of bad experiences, then even if the browser leads the way we are likely to see slow uptake and high dissatisfaction from our users.

I’ve written this because I find it is more than a conversation can carry. I can maybe fit a couple of these experiences or concepts into a half-hour or hour-long meeting, but never all of them. This is my foundation, and I hope it can help you build your own foundational understanding of the subject matter as well. I hope this starts some conversations, some arguments, or maybe even a company or two. As always, you can catch me on Twitter @JustRogDigiTec or leave comments.

I’m going to treat this like an open article as well. If there are glaring mistakes or corrections I’ll probably supply edits over time. I’m sure I’m not covering all experiences and I’m sure 6 months from now there will be some other presentation/device that doesn’t quite fit into the two existing categories I’ve defined. I also intend to add some illustrations, but I’m terrible at them so they look great in my head and terrible on paper. Once I’ve corrected that I’ll upload them ;-)