Tuesday, December 29, 2015

Progress Towards a Fully Tested Web API Surface Area

Back in September I was amazed by the lack of comprehensive testing for the Web API surface area, and as a result I proposed something akin to a basic surface-level API suite that could be employed to make sure every API had coverage. Since then a lot of progress has been made: we've collected a bunch of additional data, and we've come up with facilities for better ensuring such a suite is complete.

So let's start with the data again and figure out what we missed last time!

Moar Data and MOAR Test Suites

Adding more test suites is hopefully going to improve coverage in some way. It may not improve your API surface area coverage, but it may improve the deep usage of a given API. We originally pulled data from the following sources:
  1. Top 10k sites - This gives us a baseline of what the web believes is important.
  2. The EdgeHTML Regression Test Suite - By far the most comprehensive suite available at the time, this tested ~2500 API entry points well. It did hit more APIs, but we excluded tests which only enumerated and executed DOM dynamically.
  3. WebDriver Enabled Test Suites - At the time, we had somewhere between 18 and 20 different suites provided by the web community at large. Together these hit ~2200 APIs.
  4. CSS 2.1 Test Suite - Mostly not an OM test, so it only hit ~70 APIs.
Since then we've added or improved the following sources:
  1. Top 100k sites - Not much changed by adding sites.
  2. Web API Telemetry in EdgeHTML - This gave us a much larger set of APIs used by the web. It grew into the 3k+ range!! Even so, only about 50% of the APIs we export are used by the web, which makes for a very large, unused surface area.
  3. DOM TS - An internal test suite built during IE 9 to stand up more Standards based testing. This suite has comprehensive depth on some APIs not tested by our other measures.
  4. WPT (Web Platform Tests) - We found that the full WPT might not be being run under our harnesses, so we targeted it explicitly. Unfortunately, it didn't provide additional coverage over the other suites we were already running. It did end up becoming part of a longer term solution to web testing as a whole.
And thanks to one of our data scientists, Eric Olson, we have a nice Venn Diagram that demonstrates the intersection of many of these test suites. Note that I'm not including the split-out WPT tests here. If there is enough interest, I can see whether we can produce a different Venn Diagram that includes more components, or rework this one and swap out an existing pivot.


Since the diagram is so well commented already, I won't go into too much detail, but I'll point out some key data points. The EdgeHTML DRTs have a lot of coverage not present in any public suite. That coverage is either vendor prefixed, MS specific, or something we need to get into a public test suite. Getting it there likely requires some work on our side first, such as converting the tests to test-harness.js, but we are very likely to contribute some things back to the WPT suite in the future. Merry Christmas!?!

We next found that the DOM TS had enough coverage that we would keep it alive. A little bit of data science here made the difference between deleting the suite and spending the development resources to bring it back and make it part of our Protractor runs (Protractor is our WebDriver enabled harness for running public and private test suites that follow the test-harness.js pattern).

The final observation is that there are still thousands of untested APIs even after we've added in all of the coverage we can throw together. This further reinforced the need for our Web API test suite, and it pushed us to dedicate resources over the past few months to getting it up and running.
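
The arithmetic behind these observations is plain set algebra. Here is a minimal sketch in Python, assuming each data source can be reduced to a set of API names. The suite names and API names are made-up placeholders, not the real data, and this is not the tooling that produced the diagram.

```python
# Hypothetical data: every name below is a placeholder for illustration.
exported = {
    "Node.appendChild",
    "Element.getAttribute",
    "Document.createElement",
    "Window.msHypothetical",
    "Range.getClientRects",
}

# API names each suite was observed to call (assumed input format).
coverage = {
    "private_drts": {"Node.appendChild", "Element.getAttribute", "Window.msHypothetical"},
    "public_suites": {"Node.appendChild", "Document.createElement"},
}


def untested(exported, coverage):
    """Return exported APIs that no suite touches at all."""
    tested = set().union(*coverage.values())
    return exported - tested


def only_in(suite, coverage):
    """Return APIs covered by the named suite and by no other suite."""
    others = set().union(
        *(apis for name, apis in coverage.items() if name != suite)
    )
    return coverage[suite] - others


print(sorted(untested(exported, coverage)))
print(sorted(only_in("private_drts", coverage)))
```

The first result is the gap a surface-level API suite would need to fill; the second is the private-only coverage that would be a candidate for contributing to a public suite.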

WPT - Web Platform Test Suite

In my original article I had left out specific discussions of the WPT. While this was a joint effort amongst browsers, the layout of the suite and many aspects of its maintenance were questionable. At the time, for instance, there were tons of open issues and many pending pull requests, and updates were infrequent. More recently there appears to be a lot of new activity, though, so maybe this deserves to be revisited as one of the core suites.

The WPT is generally classified as suite based testing. It is designed to be as comprehensive as possible. It is organized by specification, which arguably means nothing to web developers, but does mean something to browser vendors. For this reason, much of the ad-hoc and suite based testing present in the DRTs could slot right in if upgraded to test-harness.js. I'm hopeful that sometime after our next release we can accompany it with an update for WPT that includes many of our private tests, so that everyone can take advantage of the collateral we've built up over the years.

Enhancing the WPT with this backlog of tests, and potentially increasing coverage by up to ~800 APIs, will be a great improvement I think. I'm also super happy to see so many recent commits from Mozilla and so many merge requests making it back into the suite!

Web API Suite

We still need to fix the API gap though and so for the past couple of months we've (mostly the work of Jesse Mohrland, I take no credit here) been working on a design which could take our type system information and automatically generate some set of tests. This has been an excellent process because we've now started to understand where more automatically generated tests can be created and that we can do much more than we originally thought without manual input. We've also discovered where the manual input would be required. Let me walk through some of our basic findings.

Instances are a real pain when it comes to the web API suite. We have about 500-600 types that we need to generate instances of. Some types can be instantiated in many different ways, and the resulting instances can behave differently as well. Creating different elements will certainly result in different tagName values, for example, even though the elements may be of the same type. Since we are an API suite we don't want to force each element to have its own suite of tests. Instead we focus on the DOM type: we test one instance generically and then run some other set of tests on all instances.
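
One way to picture the split between "one instance per type" and "all instances" is the sketch below. It is only an illustration under assumptions: the recipe list, its format, and the function names are hypothetical and are not taken from the actual generator.

```python
from collections import defaultdict

# Hypothetical instance recipes: (DOM type, JavaScript expression that
# creates an instance). Two recipes share the HTMLElement type but
# produce elements with different tagName values.
RECIPES = [
    ("HTMLElement", "document.createElement('section')"),
    ("HTMLElement", "document.createElement('article')"),
    ("HTMLDivElement", "document.createElement('div')"),
    ("Text", "document.createTextNode('x')"),
]


def group_by_type(recipes):
    """Collect creation expressions under the DOM type they produce."""
    grouped = defaultdict(list)
    for dom_type, expr in recipes:
        grouped[dom_type].append(expr)
    return dict(grouped)


def plan(recipes):
    """Pick one representative per type for the generic API tests and
    keep every recipe for the lighter per-instance tests."""
    grouped = group_by_type(recipes)
    generic = {dom_type: exprs[0] for dom_type, exprs in grouped.items()}
    per_instance = [
        (dom_type, expr) for dom_type, exprs in grouped.items() for expr in exprs
    ]
    return generic, per_instance


generic, per_instance = plan(RECIPES)
print(generic)
print(len(per_instance))
```

Choosing the first recipe as the representative is an arbitrary design choice in this sketch; any stable rule would do, as long as regeneration picks the same instance each time.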

We are not doing the web any service by only having EdgeHTML based APIs in our list. Since our dataset is our type system description, we had to find a way to add unimplemented stuff to our list. This was fairly trivial, but hasn't yet been patched into the primary type system. This has so many benefits though. Enough that I'll enumerate them in a list ;-)

  1. We can have a test score that represents even the things we are missing. So instead of only having tests for things that exist, we have a score against things we haven't implemented yet. This is really key towards having a test suite that is useful not just to EdgeHTML but also to other vendors.
  2. True TDD (Test Driven Development) can ensue. By having a small ready-made basic suite of tests for any new APIs that we add, the developer can check in with higher confidence. The earlier you have tests available the higher quality your feature generally ends up being.
  3. This feeds into our other data collection. Since our type system has a representation of the DOM we don't support, we can also enable things like our crawler based Web API telemetry to gather details on sites that support APIs we don't yet implement.
  4. We can track status on APIs and suites within our data by annotating what things we are or are not working on. This can further be used to export to sites like status.modern.ie. We don't currently do this, nor do we have any immediate plans to change how that works, but it would be possible.
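
To make the first benefit concrete, here is a small sketch of how a score changes once unimplemented APIs count against it. The record format, API names, and pass counts are all hypothetical and chosen only for illustration.

```python
# Hypothetical per-API results; "implemented" would come from the type
# system annotations described above (assumed field names).
apis = [
    {"name": "Node.appendChild", "implemented": True, "passed": 2, "total": 2},
    {"name": "Element.getAttribute", "implemented": True, "passed": 1, "total": 2},
    {"name": "Hypothetical.newThing", "implemented": False, "passed": 0, "total": 2},
]


def score(apis, include_unimplemented=True):
    """Return the fraction of passing tests, optionally counting tests
    for APIs that are not implemented yet."""
    pool = apis if include_unimplemented else [a for a in apis if a["implemented"]]
    total = sum(a["total"] for a in pool)
    passed = sum(a["passed"] for a in pool)
    return passed / total if total else 0.0


print(score(apis))                               # counts the missing API
print(score(apis, include_unimplemented=False))  # only what exists today
```

With this sample data the score is 0.5 when the missing API is counted and 0.75 when it is ignored, which is the difference between a score that flatters one engine and one that other vendors can compare against.
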
Many of these benefits are about getting your data closer to the source. Data that is used to build the product is always going to be higher quality than, say, data that is disconnected from it. Think about documentation, for instance, which is built and shipped out of a content management system. If there isn't a data feed from the product to the CMS, then you end up with out-of-date articles for features from multiple releases prior, invalid documentation pages that aren't tracking the latest and greatest, missing documentation for new APIs, and lingering documentation for dead APIs.

Another learning is that we want the suite to be auto-generated for as many things as possible. Initial plans had us sucking in the tests themselves, gleaning user generated content out of them, regenerating and putting back the user generated content (think custom tests written by the user). The more we looked at this, the more we wanted to avoid such an approach. For the foreseeable future we want to stop at the point where our data doesn't allow us to continue auto-generation. And when that happens, we'll update the data further and continue regenerating.
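
The "generate as far as the data allows, then stop" idea can be sketched as follows. This is an illustration, not the actual generator: the record format is hypothetical, and the emitted JavaScript uses only the `test` and `assert_true` functions from the WPT harness.

```python
# Hypothetical type-system records. "instance" is None when the data does
# not yet say how to create an instance of the type.
TYPES = [
    {"type": "Text", "instance": "document.createTextNode('x')",
     "members": ["data", "splitText"]},
    {"type": "HypotheticalThing", "instance": None, "members": ["doIt"]},
]


def generate(types):
    """Emit one existence test per member as JavaScript source. Types
    without enough data are skipped and reported, not hand-patched."""
    tests, skipped = [], []
    for t in types:
        if t["instance"] is None:
            skipped.append(t["type"])
            continue
        for member in t["members"]:
            label = f"{t['type']}.{member}"
            tests.append(
                "test(function() {\n"
                f"  var obj = {t['instance']};\n"
                f"  assert_true('{member}' in obj, '{label} should exist');\n"
                f"}}, '{label} exists');"
            )
    return tests, skipped


tests, skipped = generate(TYPES)
print(tests[0])
print("needs more data:", skipped)
```

Because the output depends only on the data, regenerating is safe to repeat: there is no hand-written content inside the generated files to preserve, and the skipped list shows where the data needs updating next.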

That left us with pretty much a completed suite. As of now, we have a smallish suite with around 16k tests (only a couple of tests per API for now) that is able to run using test-harness.js and thus it will execute within our Protractor harness. It can trivially then be run by anyone else through WebDriver. While I still think we have a few months to bake on this guy I'm also hoping to release it publicly within the next year.

Next Steps

We are going to continue building this suite. It will be much more auto-generated than originally planned. Its goal will be to test the thousands of APIs which go untested today, even by more comprehensive suites such as WPT. It should also test many more thousands of unimplemented APIs (unimplemented by us, at least) and some APIs which are only present in specific device modes (for example, WebKitPoint in Phone emulation mode). I'll report back on the effort as we make progress and also hope to announce a future date for the suite to go public. That, for me, will be an exciting day when all of this work is made real.

Also, look out for WPT updates coming in from some of the EdgeHTML developers. While our larger test suite may not get the resources to push to WPT until after our next release, I'm still hopeful that some of our smaller suites can be submitted earlier than that. One can always dream ;-)
