Thursday, November 10, 2016

W3C VR Workshop - Building a WebVR Content Pipeline

On October 19th and 20th the W3C held its very first VR Workshop. It was our first chance to look back at what we had accomplished with the early revisions of the WebVR specification, which has been working its way through a W3C Community Group, while also looking forward to the potentially large set of missing, future standards we have yet to write.

During the workshop I presented on a few topics, but mostly on accessibility of the Web in VR and performance. In the following few posts I'll be taking one of my lightning talks and expanding on it, line by line. I considered recording the lightning talks instead, taking my time and hitting all of the points, but it seems better to get this into writing than to force everyone to put up with me 5-10 minutes at a time through a YouTube video!

My first performance talk was on the WebVR Content Pipeline. This is an area I'm supremely passionate about because it is wide open right now, with so much potential for improvement. If we look at the complicated, multi-tool build pipelines that exist in the games industry and use them as an indication of what is to come, that is a glimpse into where my thinking is headed. If you want to see the original slides you can find them here. Otherwise, continue reading; I'll cover everything from them anyway.

The WebVR Content Pipeline Challenge


Not an official "fix this" challenge, but rather a challenging area to be fixed. I started this slide by reflecting on the foreign nature of build pipelines on the web. Web content has traditionally been a cobbled-together mess of documents, scripts and graphics resources, all copy-deployed to a remote site. In just the past few years, maybe 5 at most, the concept of using transpilers, optimizers and resource packers has become almost commonplace for web developers. Whereas from 2000-2010 you might have been rated on your CSS mastery and skill set, from 2010 onward we started talking more about toolchains. This is good news, because it means some of the infrastructure is already present to enable our future build environments for WebVR.

My next reflection was on the complicated nature of VR content. You have a lot of interrelated components, linked together or even connected programmatically through code dependencies. This comes across as meshes, textures, skins, shaders, animations and many other concepts. We also have a lot of experience in this area, coming from the games industry, so we know what kinds of tools are required. Unfortunately, even in games, the solutions to many of the content pipeline issues are highly specific to a given programming language, run-time or underlying graphics API. Hardware requirements make them even more custom.

My final reflection was on the graphic below. This is a content pipeline example where you start with assets on the left and they get processed into various formats before they finally get loaded on the client. The top of the graph represents what a common VR or game pipeline might look like, with many rounds of optimization and packaging. On the bottom of the graph we see what an initial web solution would be (or rather what it would lack). The difference in overall efficiency and sophistication is pretty clear. The web will need some more tools, distribution mechanisms and packaging formats if it wants to transmit VR efficiently while retaining all of the amazing properties of deploying to the web.


Developer - Build, Optimize, Package


The graphic shows three major stages in the pipeline. First we start with a developer who is trying to create an experience. What steps in the pipeline can they complete before sending things on to the server? How much optimization or knowledge of the target device can be had at this level? Remember, deploying to the web is not the same as deploying to an application store where you know almost exactly what your target device will be capable of. While fragmentation in the phone market does mean some progressive enhancement might be necessary, the web effectively guarantees it.

My first reflection is that existing build technologies for the web are heavily focused on solving problems with large 2D websites. These tools care mostly about the resources we currently see scaling faster than the rest: script and images. Some of the leading tools in this space are webpack and browserify. Since some set of tools do exist, it means that plugins are a potential short-term solution.
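
As a sketch of what that short-term plugin approach could look like, here is a hypothetical webpack configuration that treats 3D assets as first-class build inputs. file-loader is a real loader; routing glTF and texture files through it this way is my assumption, and the exact configuration shape varies between webpack versions.

```javascript
// Hypothetical: pipe 3D assets through the same bundler as scripts/images.
module.exports = {
  entry: './src/scene.js',
  output: { filename: 'bundle.js' },
  module: {
    rules: [
      // Emit meshes and buffers as hashed files the scene can reference.
      { test: /\.(gltf|glb|bin)$/, use: 'file-loader' },
      // Textures could instead go through an optimizing loader here.
      { test: /\.(png|jpg|ktx)$/, use: 'file-loader' },
    ],
  },
};
```

A real plugin would go further, rewriting the URIs inside the glTF to point at the emitted, hashed files.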

My second reflection on this slide was that the game industry solution of packaging was also likely not the right solution. Packaging breaks two principles of the web that we like. The first is that there is no installation required. Experiences are transient as you navigate from site to site. Even though these sites are presenting you a full application, they aren't asking you for permission to install and take up permanent space on your disk. Instead the browser cache manages them. If they want more capability, they have to ask for it. This might come in the form of specialized device access or the ability to store more persistent data using IndexedDB or Service Workers. The second principle that packaging breaks is iterative development. We like our ability to change something and have it immediately available when we hit F5.

My third reflection is around leveraging existing tools. There are many tools for 3D content creation, optimization and transformation. Most of these tools need to be repackaged with the web in mind. Maybe they need to support more web-accessible formats, or they need to come with a library that allows them to be used efficiently. In the future I'll be talking about SDF fonts and how to optimize them for the web. You may be surprised to find that traditional solutions for game engines aren't nearly as good a fit for the web as they are for the traditional packing model.

Another option is for these tools to have proper export options for the web. Embedded meta-data that is invisible to your user but still consumes their bandwidth has been a recent focus of web influencers like Eric Lawrence. Adobe has supported export for the web for years, but often people wouldn't use the option and would instead ship an image that was 99% meta-data. Many 3D creation tools have never had the web in mind, so they often emit this extra data as well, or export in uncompressed or unoptimized formats closer to raw. Upgrading these tools to target WebP, JPEG and PNG as image output formats, or any of the common texture compression formats supported by WebGL, would be a great start.

My final reflection for this slide was on glTF. I think that this format could be the much needed common ground for sharing and transferring 3D content with all of its dependencies. Its well-structured format means that many tools could use it as both an import and export target. Optimization tools will find it easy to consume and rewrite. Finally, on the client side, the format is JavaScript friendly, so you can transform, explore and render it however you want. I'll be keeping a close eye on this format and contributing to the GitHub repository from time to time. I encourage you to check it out.
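
To show how JavaScript-friendly the format is, here is a minimal sketch that parses a glTF file and counts its top-level sections. The property names follow the glTF schema; summarize itself is a hypothetical helper, written to tolerate both array-shaped and id-keyed sections.

```javascript
// Count entries in the major sections of a glTF-like JSON document.
function summarize(gltf) {
  const count = section => {
    const v = gltf[section];
    if (!v) return 0;
    // glTF sections may be arrays or objects keyed by id.
    return Array.isArray(v) ? v.length : Object.keys(v).length;
  };
  return {
    meshes: count('meshes'),
    materials: count('materials'),
    textures: count('textures'),
    animations: count('animations'),
  };
}

// In a page: fetch('model.gltf').then(r => r.json()).then(summarize);
```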

Server - CDNs, CORS, Beyond URLs


Our second stage in the build pipeline is the server. While you can precompile everything possible and then deploy it, there will be cases where this simply isn't feasible. We should rely on the server to ingest and optimize content as well, perhaps based on usage and access patterns. It's also a tiered approach for the developer, who might be working on the least powerful machine in their architecture. Having the server offload the optimization work means that iteration can happen much more quickly.

For the server, my first reflection is that we need smarter servers and CDNs that offload many types of optimization from the developer's machine and build environment into the cloud, where they belong. As an example, most developers don't produce 15 different streaming formats for their video today and upload each individually. We instead rely on our video distribution servers to cut apart the video, optimize, compress, resample and otherwise deliver the right content to the right place without us having to think about the myriad devices connecting to us. For VR, the equivalent would be the creation of power-of-2 textures, precomputing high-quality mipmaps or even doing more obscure optimizations based on the requesting device.
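
As a taste of the texture work such a server would do, here is a minimal sketch of the sizing arithmetic behind power-of-2 snapping and mipmap counts. No real image processing happens here; the helpers are just the math a resizing service would apply.

```javascript
// Snap a texture dimension to the nearest power of two.
function nearestPowerOfTwo(n) {
  return Math.pow(2, Math.round(Math.log2(n)));
}

// Number of mipmap levels for a square power-of-2 texture
// (full size down to 1x1).
function mipLevels(size) {
  return Math.floor(Math.log2(size)) + 1;
}
```

For example, a 600px source image would be resampled to 512px, and a full mip chain for it holds 10 levels.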

For device-specific tuning we can look towards compression. Each device has a set of texture compression extensions that it supports, and not all devices support every possible compression type. Computing and delivering these via the server can allow for current and future device adaptation without the developer having to think about redeploying their entire project with new formats.
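
A client (or a server acting on a capability hint the client sends up) can probe for these formats like so. The extension names are the real WebGL identifiers; the preference order is my assumption and would be tuned per project.

```javascript
// Probe a WebGL context for compressed-texture support, falling back to
// uncompressed RGBA when nothing better is available.
function pickTextureFormat(gl) {
  const candidates = [
    ['WEBGL_compressed_texture_s3tc', 's3tc'],   // common on desktop
    ['WEBGL_compressed_texture_etc1', 'etc1'],   // common on Android
    ['WEBGL_compressed_texture_pvrtc', 'pvrtc'], // common on iOS
    ['WEBGL_compressed_texture_astc', 'astc'],   // newer mobile GPUs
  ];
  for (const [ext, name] of candidates) {
    if (gl.getExtension(ext)) return name;
  }
  return 'uncompressed';
}
```

The chosen name could then be sent as a request header so the server delivers textures already compressed for that device.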

The second reflection is on open worlds. For these we need high-quality, highly available content. There is a large amount of content that will be pretty common and uniform. While you could generate all possible cubes, spherical maps and other common shapes on the client, you could also just make them widely available in a common interchange format, served over CORS.

For those not familiar, CORS stands for Cross-Origin Resource Sharing and is a way to get access to content on other domains and be able to inspect it fully. Consider an image hosted on a server that happens to contain your password; it would not be served via CORS. While you could retrieve that image and display it in the browser, you would not be able to read its pixels or use it with WebGL. On the other hand, if you had another image that was a simple brick texture, you might want to use it in WebGL on any site. For this you would return the resource with CORS headers from the server, allowing anyone to request and use the texture without worry of information leakage.
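
The WebGL side of this is small; the key detail is setting crossOrigin before src, so the image request is made with CORS and the resulting pixels stay readable. The ImageCtor parameter stands in for the browser's Image constructor purely so the helper can be exercised outside a page.

```javascript
// Create an image element configured for a CORS-enabled texture fetch.
function makeCorsImage(ImageCtor, url, onload) {
  const img = new ImageCtor();
  img.crossOrigin = 'anonymous'; // must be set before src is assigned
  img.onload = onload;
  img.src = url;
  return img;
}

// In a page: makeCorsImage(Image, 'https://cdn.example/brick.png', uploadToGL);
```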

I expect a huge repository of millions or even billions of objects to become available in this way within the next few years. If you are working on such a thing, reach out to me and let's talk!

My last real reflection is that accessing content via URL is inefficient. It doesn't allow the same resources to be distributed from different locations with the caching we need to make the VR web work. We need efficient interchange of similar and reusable resources across large numbers of sites, without putting the burden on a single CDN to deliver them just so caching works as it does today in browsers. There are some interesting standards proposed that could solve this, or maybe we need a new one.

Not even a reflection, but rather a note: HTTP/2 pushing content down the pipe as browsers request specific resources will be key. Enlightening the stack to glTF, for instance, to allow for server push of any external textures would be a huge win. But this comes with a lot of need for hinting: am I fetching the glTF to know the cost of an object, or to render it? I will try to dedicate a future post to HTTP/2, content analysis on the server and the things we might build to make a future VR serving stack world- and position-aware. I don't think these concepts are new; they are probably in heavy use in some modern MMORPG games. If you happen to be an expert at some company and want to have a conversation on the subject, I would love to pick your brain!

Client - Progressive Enhancement


The last stage in our pipeline is the client. This is where progressive enhancement becomes our bread-and-butter technique. Our target environment is underpowered, and even at the high end of that spectrum we are talking about LTE-enabled mobile phones. Think Gear VR, Daydream and even Cardboard. Listening to Clay Bavor talk about the future, it is clear Google is still behind Cardboard; there are millions of units out there already, with many more millions to come. Many people have their first and only experiences of VR on Cardboard.

Installation of content is no longer a crutch we can rely on. The VR Web is a no-install, no-long-download, no-wait environment. Nate Mitchell alluded to this in his OC3 talk when he announced that Oculus would be focusing on some key projects to help move the VR Web forward. I still consider Oculus a team that delivers the pinnacle of VR, so taking on the challenge of adapting the best VR experiences possible to this rather harsh set of mobile and web requirements is pretty epic. That is what the rest of this slide covers.

My first reflection after noting the requirements is on progressive texture loading, with fallback all the way down to vertex colors when textures aren't available yet. The goal of VR is to get people into an immersive environment as fast as possible; a long loading screen breaks this immersion. The power of the web is the ability to go from site to site without breaking your immersion, unlike today when navigating between applications (or when you have to download a new application to serve a purpose you just discovered you have). We also aspire to have VR-to-VR jumps work like they do in sci-fi literature or in the movies: a beautifully transparent experience as you switch from world to world.

We can achieve this with good design, and since the VR Web is just starting we have the opportunity to design it right. My only contribution for now: load your geometry with simple vertex colors, follow up with lightweight textures and finally, once the full texture is loaded and ready, upgrade to the highest quality experience. But don't block the user or significantly lower your framerate to jump to that highest quality if doing so would be disruptive. This will require some great libraries and quite a bit of experimentation to find all of the best practices. Expect more from me on these subjects in future articles as well.
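
That load-then-upgrade strategy can be sketched as a tiny quality ladder, stepping up only when the next asset is ready and the frame budget allows. The level names and the budget check are assumptions, not a library API.

```javascript
// Quality levels from cheapest to best, per the strategy above.
const LADDER = ['vertex-colors', 'low-res-texture', 'full-texture'];

// Decide whether to step up a level this frame.
function nextQuality(current, assetReady, frameMs, budgetMs) {
  const i = LADDER.indexOf(current);
  if (i < 0 || i === LADDER.length - 1) return current; // already at best
  if (!assetReady) return current;        // next level not downloaded yet
  if (frameMs > budgetMs) return current; // upgrading now would be disruptive
  return LADDER[i + 1];
}
```

Called once per frame (or on an idle callback), this keeps the user inside the experience the whole time instead of staring at a loading screen.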

My second reflection is on the importance of Service Workers. A feature designed to take the web offline can also be a powerful catalyst in helping the VR Web load instantly and become fully progressive. The key near-term capability is implementing prediction and prefetching for things like glTF resources. As the Service Worker intercepts a request it can fire off the requisite requests for all of the related textures and cache them for later. We can also build progressive texture loading into the Service Worker and have it optimize across many different variables to deliver the best experience. It's basically like having the server of the future that we all want, but on the client and under our control.
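
As a sketch of that prefetching idea: a pure helper derives external texture URLs from a glTF-like JSON document (the images shape follows the glTF schema), and a fetch handler warms the cache with them whenever a .gltf file is requested. The cache name and the URL test are assumptions.

```javascript
// Resolve a glTF file's external (non-data-URI) texture references.
function textureUrls(gltfUrl, gltf) {
  const base = gltfUrl.slice(0, gltfUrl.lastIndexOf('/') + 1);
  const images = gltf.images || {};
  const list = Array.isArray(images) ? images : Object.values(images);
  return list
    .filter(i => i.uri && !i.uri.startsWith('data:'))
    .map(i => base + i.uri);
}

// Inside the Service Worker (browser only, hence the guard):
if (typeof self !== 'undefined' && 'caches' in self) {
  self.addEventListener('fetch', event => {
    if (!event.request.url.endsWith('.gltf')) return;
    event.respondWith(
      fetch(event.request).then(resp => {
        // Parse a clone so the page still receives the original body,
        // then warm the cache with every texture the model references.
        resp.clone().json().then(gltf =>
          caches.open('gltf-textures').then(cache =>
            cache.addAll(textureUrls(event.request.url, gltf))));
        return resp;
      })
    );
  });
}
```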

Another feature of the Service Worker is understanding the entire world available to the experience and optimizing based on the current location and orientation information. This information could also fit into the future VR server, but we can test and validate the concepts in the Service Worker long before then.

My last reflection on the client is that there are some well-defined app types for which we could deliver the best possible experiences using common code available to everyone, perhaps augmented by capabilities in the browser. This is highly controversial, since everyone points out that while there is indeed a base experience, it is insufficient and customization is a must. I disagree, at least for now. It's like arguing against default transport controls in the HTML5 video player. Why? 99% of web developers will use them to serve the 20% of scenarios in the long tail of the web. Sure, there is a 1% that develops top sites and accounts for 80% of the experiences, and they'll surely want to add their own customization, but they are also in the best position to create the world-class optimization and delivery services needed to make that happen.

To this end, I think it's important that we build some best-of-class models for 360 photos and videos and optimize them into the browser core while VR is still standing up. They may only last for a couple of years, but these are the formative, bootstrapping years where we need these experiences to amaze and entice people to buy into a VR Web that doesn't quite exist yet.

Bonus Slides


I won't go into detail on these. They are more conversation starters and major areas where I have some internal documentation started on what we might propose from the Oculus point of view. Instead I'll list them here with a sentence describing some basic thoughts.

  1. Web Application Manifests - If these are useful for describing web pages as applications, they can be used to describe additional 3D meta-data as well.
  2. Texture/Image Optimization - Decoding image types, compressing textures on the device, etc... Maybe the realm of WebAssembly, but who knows.
  3. glTF - Server-enlightened data type with HTTP/2 server push, order optimized, LOD optimized and fully predicted based on position and orientation.
  4. Position Aware Cube Maps - Load only visible faces, with proper LOD, perhaps even data-uri encoded for the initial scene load.
  5. Meta Tags/HTTP Headers for Content Hinting - Initially this was only position and orientation so that the server could optimize, but it has since grown.


What's Next?


If you find things here interesting you can contact me at justrog@oculus.com and we can talk about what opportunities might exist. I'll also be reaching out to you as I find and discover experts. I like to constantly grow my understanding of the state of the art and the challenges that exist. It's my primary reason for joining Oculus: so I could focus on VR problems all day, every day!

I have 3 more talks that I delivered at the conference, each being prepared as a blog post similar to this one. I'll finish the perf series first and then I'll end with my rather odd talk on Browser UX in VR. Why was it odd? Stick around and I'll tell you all about it at the beginning of that post.

Saturday, October 8, 2016

Progressive Enhancement for the VR Web

Modern VR developers are wizards of the platform. Andrew Mo commented that they are the pioneers, the ones who survived dysentery on the Oregon Trail. It was meant as a joke, but every well-timed joke carries more weight when it reflects a bit of the reality and gravity of the situation. Modern VR developers really are thriving in the ecosystem against all odds, in tinier markets than their mobile app and gaming counterparts, while meeting a performance curve that requires deep knowledge of every limitation of the computing platform.

John Carmack, in his keynote, said that the challenge present in Mobile VR development is like dropping everyone a level. The AAA developers become indie devs, the indie devs are hobbyists and the hobbyists have just failed.

If VR is dominated by these early pioneers then where does the web fit in? These VR pioneering teams aren't web engineers. They don't know JavaScript. While WebGL seems familiar due to years of working with OpenGL, the language, performance and build/packaging/deployment characteristics are all quite different from that of a VR application in the store. Many new skills have to be employed to be successful in the VR Web.

There is a shining light in the dark here. Most people when they hear about WebVR immediately think about games or experiences coded entirely in web technologies, in JavaScript and WebGL. It’s a natural tendency to think of the final form a technology will take or even just draw parallels with what VR is today. Since today, VR is dominated by completely 3D immersive experiences, always inside of a headset, it can be hard to imagine another, smaller step that we could take.

Let's start imagining. What does a smaller step look like? How do we progressively evolve and enhance the web rather than assuming that we have to take the same major leaps that the VR pioneers have made to date? How do we reduce our risk and increase the chance of reward? How do we increase our target market size so that it greatly exceeds the set of people who already have consumer-grade VR available?

VR Augmentations and Metadata

Our first goal has to be that existing websites continue to serve billions of users. We need to progressively update sites to have some VR functionality, but not make that a requirement for interaction. Just like a favicon can be used by a site to slightly customize the browsing experience and make their site stand out in history, favorites or the bookmark bar, a VR ready site could supply a model, photosphere, 360 video or even an entire scene. This extra content would be hidden from the vast majority of users, but would be sniffed out by the next generation of VR ready browsers and then used to improve the user experience.

One of the most compelling options is to have your website provide an environment that can be rendered around your page. In this way you set up a simple scene, decide where your content gets projected, and the browser handles the rest through open web standards such as glTF. This isn't even a stretch of the imagination, as a recent partnership between OTOY and Samsung is working on the right standards. I was able to sync up with Jules at OC3 and I have to say, there is a pretty big future in this technology, and I'm happy to add it to the list of simple things existing website developers can do without having to spend years learning and working in VR and 3D graphics. Stick a meta or link tag in your head, or push it down as an HTTP header (this is why meta+http-equiv is probably the best approach here), and you'll get more mileage out of users with a VR headset.
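
To make the idea concrete, here is a purely hypothetical sketch: no rel value like this is standardized today. A VR-ready browser might sniff an environment scene from a head tag such as <link rel="vr-environment" href="office.gltf">, while every other browser silently ignores it.

```javascript
// Hypothetical: look up a VR environment declared in the document head.
// The "vr-environment" rel value is invented for illustration.
function findEnvironment(doc) {
  const link = doc.querySelector('link[rel="vr-environment"]');
  return link ? link.getAttribute('href') : null;
}
```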

It doesn't stop here though. While this changes the environment your page runs in, it doesn't allow you to really customize the iconography of your site the way a simple, 3D model would be able to. Another glTF opportunity is in delivering favicon models that a browser can use to represent sites in collections like the tab list, most recently visited sites and favorites. A beautifully rendered and potentially animated 3D model could go a long way to putting your website on the mantle of everyone's future VR home space.

I think there is more value to be had in the Web Application Manifest specification too. For instance, why can't we specify all of our screenshots, videos and icons for an installable store page? A VR Browser would now be able to offer an install capability for your website that looks beautiful and rivals any existing app-store. Or if you like the app-store then you can specify the correct linkage and "prefer" your native experience. The browser in this case would redirect to the platform store, install your native application and off you go. I see this as exceptionally valuable for an existing VR centric developer who wants to augment their discovery through the web.

Immersive Hybrid Experiences

Our next goal is to start incrementally nudging our users into the VR space. While augmentations work for existing VR users and are ways to provide more VR like experiences on the existing web, we can do even better. Instead of a VR only experience we can build hybrid, WebGL+WebVR applications that are available to anyone with a modern browser.

How does this work? Well, we start with the commonality. Everyone has a browser today capable of some basic WebGL. This means any experiences you build can be presented to that user in 3D through the canvas tag in a kind of magic window. You can even go full screen and deliver something pretty immersive.

To improve this further, we can abstract input libraries that work across a large set of devices. Touchpad, touch, mouse, keyboard, device orientation, gamepad and the list continues to grow each day. By having a unified model for handling each type of input you can have a great experience that the user can navigate with mouse and keyboard or spin around in their chair and see through the window of their mobile phone. Many of these experiences start bordering on the realism and power of VR without yet delivering VR.
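
A unified input model like the one above can reduce every device to the same yaw/pitch "look" delta. This is a sketch, not a library API; the scale factors are assumptions you would tune per device.

```javascript
// Mouse movement to a look delta (radians); 0.002 is an assumed sensitivity.
function lookDeltaFromMouse(e) {
  return { yaw: e.movementX * 0.002, pitch: e.movementY * 0.002 };
}

// Device-orientation change to a look delta; alpha/beta are in degrees.
function lookDeltaFromOrientation(prev, e) {
  return {
    yaw: (e.alpha - prev.alpha) * Math.PI / 180,
    pitch: (e.beta - prev.beta) * Math.PI / 180,
  };
}

// Apply any source's delta to the camera, clamping pitch to avoid flips.
function applyLook(camera, d) {
  camera.yaw += d.yaw;
  camera.pitch = Math.max(-Math.PI / 2, Math.min(Math.PI / 2, camera.pitch + d.pitch));
  return camera;
}
```

Touch drags, gamepad sticks and so on would each get their own small adapter feeding the same applyLook.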

The last thing we do to nail this hybrid experience is detect the presence of WebVR. We've landed a property in the spec on the navigator object, vrEnabled. This will return true if we think there is a chance the user could use the device in VR. While I think there are some usability issues with such a simple property, which will likely result in some browser UX for turning VR on and off, this is a great start.
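
A hedged sketch of that detection step: vrEnabled is the property discussed above, and getVRDisplays appeared in early WebVR drafts, so both checks are defensive since neither is guaranteed to exist in any given browser.

```javascript
// Classify a navigator-like object's VR capability for progressive
// enhancement: full VR, possible VR, or the plain 2D/WebGL path.
function vrSupport(nav) {
  if (!nav) return 'none';
  if (nav.vrEnabled) return 'vr';                            // proposed boolean
  if (typeof nav.getVRDisplays === 'function') return 'maybe-vr';
  return 'none';
}

// In a page: const mode = vrSupport(navigator);
```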

This is the next level of write once, run anywhere, but it's built on the concept of progressive enhancement. Don't limit your user reach; instead scale your experience up as you detect the capabilities required. I've recently stated that this is a fundamental belief in our design of WebVR, and I truly believe in maintaining this approach as long as there are users who can benefit from these choices.

I wanted to give an example of one of these experiences, so here you can see a 360 tour we've linked from our Oculus WebVR Developer portal. There will be many more examples to come, but this one will run in any browser and progressively enable features as it detects them. My favorite is simply going there on your phone and looking around using device motion.

I can't speak highly enough of the value in building experiences like this. For this reason, we at Oculus and Facebook will be releasing numerous options for building these experiences in minutes to hours rather than days or more. Both content generation and viewing need to be made easier. We need great photosphere export from pretty much any application (any developer should be able to emit and share a photosphere from anywhere in their game/application/experience), and we need to optimize how we transmit and display photospheres, with potential improvements to streaming them in and simple-to-use libraries. It has to be just as easy to extend this to 360 video; even a simple looping 360 video could provide your user with AMAZING bits of animation and value. Finally, we need to extend this to make it interactive with text and buttons. You can go further if you have the skills to do so, but basic libraries such as React VR will allow anyone to create great experiences with the above characteristics.

Getting out of the Way

Once we've extended existing websites, gotten some simple libraries in place and people start to build VR content, the final step is to get the hell out of the way. This is easier said than done. There is a large contingent of web technology naysayers who have pointed out certain flaws in the web platform that make it non-ideal for VR. I'll give some examples so it is clear that I do listen and try to resolve those issues.

  1. JavaScript is too slow. To which I respond, maybe when it comes to new experiences. JavaScript is a tuned language. It is tuned by browser vendors based on the content out there on the web today. Tomorrow, they could tune it differently. There are so many knobs in the run-time. C++ is no different. There are knobs you can tune when you pipe your code into cl.exe and it WILL result in differences in the run-time behavior and speed of your application. The browser is running in a chosen configuration to maximize the potential for a set of sites determined by usage.
  2. GC sucks. To which I reply, yes, the general model of GC that is present in your average web runtime does indeed suck. However, this is again, only one such configuration of a GC. Most GC's are heuristic in nature and so we can tune those heuristics to work amazingly well for VR or hybrid scenarios. I won't go into details, but let's just say I have a list of work ready for when I hire my GC expert into the Carmel team ;-)
  3. Binding overhead is too high. To which I respond, "Why?" There is no requirement that there be any overhead. An enlightened JIT with appropriate binding meta-data can do quite a bit here. While there are some restrictions in WebGL binding that slow things down by-design I have high hopes that we can arrive at some solutions to fix even that.

That list is not exhaustive. When you have full control over the environment and aren't part of a multi-tenant, collaborative runtime like a web browser, then you can be insanely clever. But insanely clever is only useful for the top 1% of applications once we enable anyone with a browser to build for VR. We need to get out of the way and make the common cases for the remaining 99% of the content not suck. That’s our space, that's our opportunity, that’s our goal.

Beyond the standard arguments against VR and the web, I think there are more practical issues staring us in the face. The biggest one is the prevalence of full VR browsers. The first fully featured VR browser is going to be a deep integration of the web platform, OS, device and shell. There is simply too much going on today for a web application to be able to run with zero hiccups, and an overall lack of measurement and feedback between the various sub-systems. Tuning everything from the async event loop, to thread priorities, to which sub-systems are running when in and out of VR is a very important set of next steps. We then need to combine these enhancements with UX and input models that give people enough trust in their environment that they could interact with a merchant in virtual space, whatever form that might take.

Right now we are experiencing the power of VR shells that plug into basic browser navigation models and do simple content analysis/extraction. This is an early days approach that only achieves a small fraction of the final vision.

For the graphics stack we need some basics like HTML-to-Texture support in order to bring the full power of the 2D web into our 3D environments. I've been referring to 2D UX as having a "density of information" that can't be easily achieved in 3D environments. We need this in order for VR experiences to enable deep productivity and learning. Think about how often, in our "real" 3D world, you break out your phone, laptop, book and other 2D-format materials to quickly digest information. I don't think VR is any different in this regard. John Carmack, during his keynote, also noted that there is huge value in 2D UX because we've had so many years of experience developing it. I believe this to be true and think that HTML-to-Texture support will broaden the use cases for WebVR dramatically.

We also need to enable full access to the pipeline. More and more I'm hearing about the necessity for compute shaders and access to GL extensions like multi-view. Even native is insufficient for delivering super-high-quality VR, leading to advances in hardware, OS and software. The web needs to provide access to these capabilities quickly and safely. This may mean restrictions on how we compose and host content, but we need to experiment and find those restrictions now rather than holding the web back as a core VR graphics platform.

To close out, note how this section details yet more progressive enhancement. Each of the new capabilities and improvements we make will still take time to filter out to the ecosystem. This is no different from the native world, where extensions like multi-view, which have been defined for a couple of years, are still not uniformly distributed. So producing experiences that reach the largest possible audience and gradually increase capability based on the device and browser features you detect is going to be key.

Over the next few months I'll be providing some tutorials talking about strategies and libraries you can use to enable VR progressive enhancement for your existing sites. You can also check out more of our demos and sign up for more information about our libraries like React VR at the Oculus Web VR Developer Portal.

Sunday, September 18, 2016

WebVR - 95% Web and 5% VR - That's a GOOD thing

When a new technology comes out there is rarely a singular influence. Technologies today are highly connected, almost always derivative and rarely kept private. These separate influences are good for the development of a diverse technology that is general enough to solve large scale problems and powerful enough to create experiences that are truly unique. I think WebVR is one of these truly diverse technologies, but only when you take full advantage of everything we've already developed and influence the many new technologies yet to be developed for the web.

The marriage of Web and VR tends to be very one sided. Most of the bits and pieces you need to build a modern VR application already exist in the web platform. Compare and contrast this with VR technology stacks, where people are still building their own event loops, threading models, memory models, etc... The VR space is the wild, wild west, and that means there is a lot of room for cowboys, experimentation and wheel reinvention factories, and that's just the tip of the iceberg.

Leading the way in this round are the entertainment and gaming applications. This is where you tend to get highly specific experiences (tell me the last time the UX on your media player was considered innovative and didn't look like a 1990's VCR) or uniquely creative experiences that tickle your gaming itch. In this world nothing is uniform and everything is a new assault on your senses. A small and very influential group of people love this space and they become our pioneering group. They start to build and design the future while the calm majority sits back and waits for the technology to come to them.

The Web is this calm majority. A highly stable, yet quickly moving substrate of APIs and technologies, spread across billions of devices, with majority stakes already placed into many excellent and well established design principles and practices. Basically, you can get stuff done on the web. It's a productive place. It's usable by the masses, but configurable to the needs of the experts. It can do almost anything. Almost. What it can't do is blast you in the amazeballs with pixels of light that transport you to a faraway world. Yet.

Given the existing influences there are two ways to get to the goal. The first is to design it all again and make people move to the VR space. A sort of manifest destiny approach to technology. The second is to embrace the existing web for all of its great capabilities and even many of its terrible ones. Evolve the technology and let everyone adopt at their own pace.

The latter has a much better chance of working. Not because it's the best approach for VR, but because it allows us to retain as much of the power of the existing web as we can without resetting the entire ecosystem. Think back just 5 years and look at the rapid adoption of new and pervasive technologies on the web to understand its power and importance. Here are a few that I was around to watch blossom.
  • Canvas - Yeah, it was basically just entering adoption 5 years ago and is now a premier API for 2D bitmapped graphics creation.
  • WebGL - Allowing for truly customized graphical experiences in 2D, 3D and image processing, and it's the basis for WebVR (without WebGL there is no WebVR).
  • WebAudio - Giving full access to sound effects and mixing for the first time on the web.
  • Web Workers - Halfway between threads and tasks, giving access to extra cores
  • Media Capture - Camera, Microphone, Video
  • WebRTC - Adaptive media transfer for person to person interactions
  • Fetch - A real networking primitive for highly configurable requests
  • Service Workers - A brilliant empowerment of the network stack without breaking existing networking expectations of the long tail web
All of these web technologies were driven by a collective set of requirements from a highly diverse group of interested parties. Each technology did not have to try to incubate and grow on its own. It could instead live side by side with many others. When used in isolation, a technology like Canvas would have lost to the many, much better solutions already in the market. But because it was augmentative to your average web page, it had a place.

Those other, better, non-web technologies live in isolation. They don't understand JavaScript, garbage collection, browser event loops, browser render loops and how to peacefully co-exist in a broad technology stack. Maybe they are faster and you can do more 2D rendering in them than you can in Canvas, but those incremental differences aren't as powerful as being deployed to billions of devices, over the network, compiled from text to the screen and then printed (yeah, printing is a stretch, but bear with me!) alongside any arbitrary piece of HTML content.

The capabilities of the web are now increasing at a rate faster than any other platform. This means its ability to provide value is leaps and bounds beyond most other systems. When compared on a single axis the web will likely lose. You can always build a better, more specific widget catering to a specific use case, but the power of the web is its Swiss Army Knife approach. We don't buy a Swiss Army Knife to be the best at any given thing; we buy it to be good enough at many things and hopefully fill in missing gaps in our tool chest. It's an astoundingly effective convenience, and it takes convenience to enable modern developers to achieve the content depth and content breadth that is needed to appease the masses.

Looking forward, there are both VR specific technologies and general web technologies that will make VR a success. Personally, some of the most interesting are being driven not with VR in mind, but instead by extending the web to new levels of capability that help it exceed app frameworks and runtimes and put it on par with existing operating systems. A few critical-to-VR, web inspired technologies on my short list are...
  • Service Workers - VR applications are resource intensive and the first line of defense is going to be prioritization of resource downloads, caching and working around the inefficiencies that exist in legacy file formats.
  • WebGL - New performance improvements to WebGL and extensions like multi-view or stereo rendering are likely to land with or without WebVR being a consumer since they have general applications outside of VR as well.
  • HTML to Texture - If this existed today, it would likely be the most used API in WebGL applications. The pre-existing power we have in the DOM, HTML/CSS layout and all of our existing content is something we don't just want to use, we NEED to use. We have to fill time in VR, and new 3D assets alone are just not compelling enough.
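To make the Service Workers bullet a little more concrete, here is a small sketch of the kind of download prioritization a worker could apply; the asset categories and their ranking are assumptions for illustration, not an established scheme:

```javascript
// Order pending asset requests so the resources that block first render
// come first. A real service worker could apply an ordering like this
// when deciding what to fetch and store in the Cache API ahead of time.
const PRIORITY = { geometry: 0, texture: 1, audio: 2, analytics: 3 };

function rank(request) {
  return PRIORITY[request.kind] === undefined ? 9 : PRIORITY[request.kind];
}

function prioritizeAssets(requests) {
  return requests
    .slice() // leave the caller's array untouched
    .sort((a, b) => rank(a) - rank(b));
}
```

Legacy file formats don't carry this kind of priority metadata, which is exactly why the application layer has to supply it.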
VR will reinvent the web, but it won't happen in a chaotic upswell and massive conversion to fully 3D experiences overnight. It can't. There is too much value in our existing base. VR will reinvent the web in subtle ways, such as spearheading better graphics APIs, faster JavaScript runtimes tuned to our frame based patterns and parameterized GC behavior that will avoid frame jank. VR will push hard on solutions to the HTML to Texture problem so we can bring our web content into our 3D worlds. VR will redefine what security means when it comes to both new and existing APIs. Most importantly, VR will define a small subset of new APIs (arguably 5% is too high given my title) that are critically important to the medium itself. Those APIs will work seamlessly with the existing web platform, enabling our iterative transformation from the existing 2D, information packed web to a mixed 3D/2D model allowing developers to pick the presentation mechanisms of their choice.

I hope that VR won't become a parallel stack to the web. If it does, then I think we will have failed hard. There is no need to reinvent all of the past 20 years of productivity and capability increases we've gotten from the web. There is no reason to relive all of the same mistakes and relearn all of the best practices. More importantly, there is no reason that the existing web can't continue to evolve and get better as a collaborative partner, or more accurately an elder sibling, allowing WebVR to shine where it is best designed rather than spending endless cycles trying to replace all of the web experiences we already have with VR versions of the same.

Sunday, August 7, 2016

How I Used the Career Triforce to Change my Job

When I was leaving Microsoft, on my last day, I got a lot of questions about why I was leaving. I was a linchpin of sorts on the team. I was positioned very well to get impactful work. Highly networked. Very happy with my day to day. Reporting to one of the best managers/mentors on the planet.

I was on a career trajectory at Microsoft that was almost unreal. Averaging better than a promotion every 2 years with no slowdown even as I crossed bands, eventually ending up at the top of the Principal ladder. Compensation was great when compared relatively across other MS employees (I will not be discussing this further). Everything seemed to be going amazingly well.

What did I say to everyone? Well, I said there are three areas (these are not my own, but taken from a book on career advice that I found exceptionally relevant) on which you should judge your current career. You should start by looking at your Job. Do you love it? Are you able to make an impact? Are you passionate about what you are doing? All of my answers here would be positive. The first part of my TriForce was complete.

Next you look at your Manager. This is the singular individual who has the most control over your happiness and your career path in most companies. Ask yourself questions like: are you aligned with your boss? Does your boss support you when you are about to fail? Does your boss accentuate your good qualities and help you improve on your bad qualities? Can your boss act as your manager, your friend, a leader and a mentor? Well, #FML, it turns out I had just found the second part of my TriForce.

Lastly, I said, you look at your Team. For me, at my level, this meant looking at my immediate team, the entire Edge WPT team and finally up to the Windows organization as a whole. Those are the scales at which I had impact at Microsoft. When looking at all levels of the team you ask questions like: do I like working with these people? Are the politics manageable or are they over the top? Does the team exercise trust? Does the team exercise transparency? As I worked from my local team up to Windows, the third component of my TriForce started to crack a little bit; maybe it had a little less luster.

However, when I consider the most stress I faced while making my decision to change jobs, it came down to the people. I loved the people and I felt like we had created an almost extended-family-like support system for one another. I wasn't concerned about my projects that wouldn't get done if I left. Instead I was worried about the people I worked with on a daily basis, whom I could see growing and becoming amazing engineers in their own right. I was worried there wouldn't be enough people left infusing positive energy into the team on a daily basis to keep morale up. I was worried that I was failing my team by leaving. That's when you realize, yeah, you have a great team. There may be some scuffs on that TriForce shard, but it's still shining just as brightly as the other two. My TriForce was complete.

My Answer

Okay, so if I already had the TriForce what kind of answer could I give everyone then? Why was I leaving? This is when I learned something that I had learned earlier in my career, but it took another 11 and a half years to discover it again. Once you've built a TriForce there isn't as much exponential growth in your future and mostly you just end up making incremental improvements. You spend more time doing the things you know, rather than learning new things. Your awesomeness starts to atrophy. You rarely feel the stress of a complicated and new situation. You rarely push your boundaries.

That isn't to say there aren't still moments like that. There certainly are. They just aren't as frequent, and so growth tends to become linear and plateaus more often.

You also don't know if you have the skills to build another TriForce. I spend a lot of time mentoring and I often reach out for new mentees. My dream is that they too can achieve their TriForce and that I'm an enabler for that. I provide experience and strategies for working with difficult situations and to figure out why some aspect of their career is not shining or working well with the rest. Are my recommendations good? Do I have enough experience to offer the types of career advice that they need? If I put myself in their shoes, with their knowledge, and took on their risk would I be able to replicate my experience?

That is an important question for me. Doing something once can be dumb luck. It doesn't mean you can make it happen again. It means it happened, and perhaps it had something to do with you. But perhaps you are unaware of the actual forces of nature that brought it into being, and it turns out it had nothing to do with you. That is a scary thought. Am I successful because of me? Or am I successful because of a random set of circumstances that I only manipulated superficially?

This led me to my answer to the team, paraphrasing a bit I finally said, "When you make a career change you should look at your job, boss and team. If they are all great then you are probably on the right track. When I look at myself, I have a TriForce in these three areas. Everything is amazing. So I had to use other measures to figure out my future. Specifically to follow my passions in VR and to see if I can build my second TriForce."

Maybe everyone thinks that is bullshit and will point to other factors in my decision making. I had a lot. Compensation, family, location and friends were all additional complications. However, I can say that after tons of cross comparison Excel tables, almost everything zeroed out between Oculus and Microsoft. I was left with only a very real and pressing question, one that Brendan Iribe asked me during my process. Do you want to think about VR all day, every day? That was his pitch to me. An offer to work on a technology that would change the future with all of my insight and passion. And when my answer to that simpler question is, "Fuck Yeah!" you can see how my explanation to my former team was given in honesty.

Passion

Passion isn't on the TriForce, but it is part of how you feel about your Job, how you are supported by your Boss (does he let you run with your wacky ideas?) and how your Team adapts to a changing society and marketplace. That makes it an integral component of all of them. When you are passionate you'll find that you can't sleep because you are still solving problems. You spring out of bed every morning to rush to work. You let everyone know what you are working on and why they should care. You see clearly how what you are doing is going to change the future, improve lives, connect you more closely to your friends and family and make the world a better place for everyone to live.

When I saw the opportunity to lend my passion and devote all of my ability to launching the VR revolution, I couldn't pass it up. VR has the potential to change the way we think about education, jobs and entertainment. It literally allows us to redefine space itself and transform a living room into an anything room. I didn't jump ship to VR in the beginning because my expertise wasn't needed yet. But now is the time to scale and build platforms for VR that extend to millions. This is where I thrive as a developer. This is where the web thrives as a platform for scale and accessibility. This is the time to deeply invest my time and effort and build my second career TriForce. With news like the HTC VR alliance offering 10 billion in VC capital for the development of VR content and experiences, I think I'm in good company thinking this way.

My Final Advice

Most people in their careers, I find, are working on some aspect of building their TriForce, probably for the first time. I know because I mentor some amazing developers, and almost always they have some sort of hang-up in one of these areas and haven't yet figured out how to completely self-diagnose when things are going wrong.

For this reason I think evaluating your job, boss and team is a great way to figure out two things. First, do you need to improve something in your current career in order to elevate yourself to the next level? You may find that your job sucks for some reason, but it is within your control to make it not suck. You should do that. The easiest thing to change is yourself.

Second, if you evaluate these and find there are things outside of your control that you don't see changing, then you can use that to figure out how you are going to change your career. Not everything is within your control, and oftentimes your happiness or passion requires an environmental change. Perhaps your current job would get you there, but over a longer time period than you would like. I always recommend being open and honest during this period just in case you've misread your situation. If you have, then making your situation apparent to your team can sometimes result in the change you were going to switch jobs for.

If you are sitting on your TriForce, though, and you are happy with all three, you shouldn't close your eyes to the opportunities that might present themselves. Maintain your marketability and interview skills. From time to time, reach out and do an interview or two and see what else is available, both in terms of unique job roles and life changing compensation. When an opportunity comes along and you do have to make the big decision, know that it will be stressful. Then calm down, evaluate everything objectively, and if it looks like another opportunity to build your next TriForce then perhaps you should go for it!

Monday, August 1, 2016

Why we need high quality, native streaming on mobile devices

I've gotten to spend a good portion of my day figuring out how to do something I think is pretty basic. I want to stream both my screen and my camera at the same time. I want to do it while I'm walking around. I want to layer the two streams together. Since I'm on an awesome phone I'd like to swap between front and rear facing cameras. I'd like to swap between the screen being the focus of the stream and whatever camera I have chosen.

Why is this so friggin hard? It appears there is no device that can accomplish this, and instead we have to cobble together a bunch of technologies to make it work. Pick even one of the requirements above and things fall apart, either at the device or in the app store, once you start exploring software options.

iOS Options

On iOS you can do amazing things with video and overlays using software like Live: Air Solo. Thank god for this application, because I thought I was going to have no solution for quickly switching between my cameras to capture the rapid action of Pokemon GO rare chasing. But iOS has limitations that prevent on-device screen capture, so streaming the screen will be impossible without an external device. There are lots of external options, but once I use them I'm tethered to some sort of Mac/PC device, and now we are talking a lot of weight, power, etc... for a sufficiently long streaming experience.

iOS has further limitations in that you can't stream the video camera from a background application. You can register as a VoIP app but then you can only do audio apparently. Even Facetime cuts off the video as soon as you multi-task.

Android Options

On Android you can capture the screen pretty easily, but the available applications are a mess. I found not a single camera capture application with the same quality as the iOS Live: Air Solo app. Most of them failed to even start recording and would instead just crash. Most would then fail to function properly after being restarted from the crash. I finally landed on an Android application, called Bitstream, that I have almost configured to work properly for Twitch. It also has screen capture support, but once I get everything cranking and I'm reviewing my stream I see a lot of hiccups and glitches, so something isn't working quite right. It's hard to debug these issues due to the stream delays as well.

In the end there is no single device that meets my requirements (arguably an NVIDIA Shield should, with its native Twitch streaming capability). I may be able to use Bitstream and switch between sources, but I won't be able to overlay them. That is the closest to a complete solution available without going the laptop option.

Laptop Options?

What would it take to go the laptop option? I need a laptop with really long battery life, and I probably need an external battery pack for it as well. I found the Mikegyver series of batteries that work with the Surface Pro series of tablets. This would be nice. Once you have the laptop going you can power your phone off the laptop, tether the laptop to the phone, mirror the screen onto the laptop and use something like OBS to do all of the compositing. You'll probably still end up using your laptop's cameras in this configuration instead of your phone's.

There is another option for laptops, BlueStacks, but does it work with a mobile game like Pokemon GO, and can I walk around outside? It seems to use Fake GPS and other hacks to get working. That isn't interesting. I want to be mobile, I want to be legit and I want to play the game for real.

Using a laptop is both expensive and heavy, but they appear to be the only quality options.

Why No Native Application?

I really have to ask why, though. Is this such a new phenomenon that nobody has thought about how to build hardware and software that makes it possible? Streaming your video to an ingestion server has been around for at least a couple of years now, and you'd think the built-in applications would allow this. Streaming video from a background app also seems like a basic capability. Finally, so does streaming your screen so you can show it to others and help them.

I think it's time for vendors to build in an application with basic RTMP support and the ability to broadcast from the screens, cameras and microphones present on the device. It's an amazingly useful way to share your experiences, provide help and support and otherwise express yourself with your friends. Third parties aren't doing a great job, so it's a huge hole in the application ecosystem, and streaming is being dominated by the desktop market when all of the interesting stuff is happening outside.

There are some inroads from app makers. A lot of services are building applications for their dedicated endpoints. Periscope from Twitter is actually quite a powerful application, but it doesn't interact with other services. Meerkat and UStream also seemed interesting, but were closed. At the end of the day I want to send it to the ingestion servers of my choosing and record for a duration of my choosing. Many of the services have limitations in all of these areas.

I'm going to keep running down applications and solutions to this problem. I'm not willing to give up just yet and I think there is a solution hidden somewhere. If you have ideas leave them in the comments.