Sunday, May 10, 2015

Constructors by Construction in JavaScript, MSHTML and EdgeHTML

Today we are going to be talking about Constructors and how they work in both Internet Explorer and Microsoft Edge. This is part 2 in a series about how the browser type system is constructed, the primitives that make it up and how you can use this to your advantage when integrating deeply with the browser. If you have not read Part 1: Type System, then you probably want to take a quick foray. These articles are designed to be read in order unless you are a JavaScript grand master (black belts, go back and read ;-)

The History of Constructors


It turns out that Constructors in JavaScript are older than dirt and even Constructors in the browser have been around for a very long time. The first 2 supported constructors in IE were the Image and Option constructors. While they did allow you to create instances of the <img> and <option> elements, they weren't REALLY constructors. Here are some of the many infractions they suffered:
  1. They were "object" and not "function" types. This is because they were themselves DOM objects retrieved by property getter lookup when queried by name.
  2. They had a strange create method on them. This was for languages which didn't support the new operator through the DISPATCH_CONSTRUCT functionality of IDispatchEx::InvokeEx.
  3. They didn't have prototype properties that allowed the basic round-trip of constructor -> prototype -> constructor, etc...
  4. There were literally 2 of them for the longest time, then 2 more when XMLHttpRequest and XDomainRequest were added. By the time we added more, FastDOM existed and so did a new method for doing constructors that was "proper" ;-)
So that was pretty much the world in IE 5 and 7 modes. When we introduced 8 we introduced MDP or Mutable DOM Prototypes (more information in Part 1 of this blog series) and this introduced a slightly more advanced implementation of constructors.

First and foremost we added names for all of the constructor types for at least the leaf types. This resulted in a little over 100 new constructor objects being added to the window namespace. Still, interrogating an IE 8 type system would be very foreign since the hierarchy represented OUR C++ hierarchy in the code more than it represented anything in a W3C spec. It was a solid attempt to provide value to web developers though and it was pretty cool to interact with the beginnings of a type system.

Just like with our COM implementation of constructors, the new system had some flaws. Here is another list of infractions observed in the IE 8 system:
  1. They were still "object" types, which means you still couldn't see them as functions. You also couldn't use call/apply on them. This may not strike you as odd, but in the same technology we added "native methods" which did support call and apply so it was odd that a constructor did not receive the love.
  2. All of the new constructors added the "create" method, which mapped to DISPID_VALUE behind the scenes. However, since this was still COM based, it meant that property gets would return the name of the constructor, "[object Element]" for instance, and invokes would try the object's factory if one existed.
  3. The prototype properties pointed at an actual object this time around, but the constructor pointed back at a DIFFERENT object than the constructor instance on window. This meant that Element.prototype.constructor !== Element... But Element.prototype.constructor === Element.prototype.constructor.
  4. We didn't yet support new features like the "name" property on the constructors.
So IE 8 was, well, confusing and probably the features were only used by a handful of sites. In IE 9 and FastDOM we wanted to build a fully integrated type system combined with the new Chakra engine that was fully ES 5 compatible. It is here that we'll start describing in more detail how the browser is able to achieve this goal and the types of objects and APIs used to create a seamless JavaScript + Browser DOM experience. To close this down though, even IE 9 had some problems which I'll briefly list and then we'll describe in more detail:
  1. In FastDOM we had a choice of being a "function" or an "object" when making our constructors. The trade-off was memory. A "function" required that we fully initialize the instance with its prototype and any constants. An "object" let us defer this processing until first access. For this reason, we chose to make only those constructors which had a non-empty constructor function of type "function", and the rest of type "object".
  2. We in some cases still had "create" functions for legacy purposes. But they were implemented as static methods, a feature of WebIDL.
  3. We tolerated the use of call/apply on some constructors which seemed correct at the time. We still have such a tolerance while other browsers throw.

Constructor Basics


A constructor object is generally a field on the Global instance. It should be proper-cased. It should be derived from the Function prototype and therefore be of type "function" when queried with the typeof operator. These requirements mean that it implicitly gets call/apply/bind from Function. It will also get properties for "name" and "length". The name property should reflect the name of the field on the global, so window.Node.name === "Node", though type system modifications could break this semantic easily. The "length" property should report the number of required arguments. For constructors this should normally be 0, as traditionally all arguments have been optional.

To make this a bit more clear, I wanted to use JavaScript itself to make a mirror of the Node constructor which we'll call ANode. There aren't any complexities here like inheritance, etc... but those aren't actually that complex so this should cover all of the basics.
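As a minimal sketch of what such a mirror might look like (ANode is a hypothetical name, and the throwing body is my assumption about how a non-constructable interface would behave), the basics described above can be checked directly:

```javascript
// Hypothetical mirror of the Node constructor; ANode is not a real browser type.
function ANode() {
    // Assumption: this interface is not directly constructable, so calling it throws.
    throw new TypeError("Illegal constructor");
}

// A function declaration gets a fresh prototype object whose constructor
// property already points back at the function, so the round-trip holds.
var roundTrip = ANode.prototype.constructor === ANode;

// call, apply and bind are inherited from Function.prototype.
var hasCallApplyBind =
    typeof ANode.call === "function" &&
    typeof ANode.apply === "function" &&
    typeof ANode.bind === "function";

console.log(typeof ANode);   // "function"
console.log(ANode.name);     // "ANode"
console.log(ANode.length);   // 0
console.log(roundTrip, hasCallApplyBind);
```

Nothing here is browser-specific: a plain function already satisfies the "function" type, name, length and round-trip requirements, which is why native JavaScript functions make a good basis for constructors.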



Also important is to understand what the WebIDL might look like for such an interface. It's very simple, and we'll build on this to add more features shortly.



Basically, the interface keyword specifies that we are building a concrete prototype object. The name in our case is ANode. This construct by itself tells us to build a constructor object, a prototype object with a default super (Object is the default super class), point the constructor and prototype at one another, and finally define a constant and a method.

Check out the definition of the constant itself. The const keyword in this context supplies some additional configuration information for us to follow. The property must be a) configurable: false and b) writable: false. You can't mess with these properties ;-)

Also, since we specified static for the create method, it goes on the constructor instead of the prototype object. Normally, we'd put any attributes or methods directly on the prototype, but not in this case.

There is one additional piece of information that I've left out, but is important. When you define constants, they must be written to both the constructor and the prototype. Our example ANode does not do this so it is technically incomplete, but I've left that out for brevity.
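To make the constant and static rules concrete, here is a hedged sketch in plain JavaScript. The defineConstant helper is hypothetical, the constant is borrowed from Node for illustration, and enumerable: true is my assumption; the source only requires configurable: false and writable: false. Unlike the ANode described above, this version writes the constant to both the constructor and the prototype.

```javascript
// Hypothetical mirror type; not a real browser interface.
function ANode() {
    throw new TypeError("Illegal constructor");
}

// Define a WebIDL-style constant: not writable, not configurable.
// enumerable: true is an assumption made for this sketch.
function defineConstant(target, name, value) {
    Object.defineProperty(target, name, {
        value: value,
        writable: false,
        configurable: false,
        enumerable: true
    });
}

// Constants are written to both the constructor and the prototype.
defineConstant(ANode, "ELEMENT_NODE", 1);
defineConstant(ANode.prototype, "ELEMENT_NODE", 1);

// A static method lives on the constructor, not on the prototype.
// This create is a stand-in that just builds an object with the right prototype.
ANode.create = function create() {
    return Object.create(ANode.prototype);
};

var node = ANode.create();
console.log(node.ELEMENT_NODE);            // 1, found via the prototype
console.log(ANode.ELEMENT_NODE);           // 1, found on the constructor
console.log("create" in ANode.prototype);  // false
```

Because the constant is non-configurable, it cannot be deleted or redefined later, which is what "you can't mess with these properties" amounts to in practice.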

Specifying Constructability

There are two keywords in WebIDL that allow for a constructor function to be specified by a type. The first is the Constructor keyword and the second is the NamedConstructor keyword. The first says, use my current interface name, while the second says, use my specified name. We'll build one of each in the next WebIDL snippet: namely, the Image constructor and XMLHttpRequest.


These annotations simply mean that we should expect a constructor function to be available for these types. This is how the IE 9 type system would pick between a constructor being an "object" or "function" type. If you didn't have a constructor, then we could get away with you being an "object" type instead and thus get some memory efficiency by deferring construction. This is demonstrated by Figure 1.

Figure 1: Type Configuration for Constructors

Notice how the NamedConstructor resulted in an object and a function that both point to the same prototype. Both constructors represent different ways of referring to objects of that type. The HTMLImageElement is the primary interface and so it gets a complete prototype -> constructor round-trip in the type system. The Image constructor is just an alias or name used in order to build instances of the HTMLImageElement type. The prototype also controls the final object. We don't build Image objects from the Image constructor; we return a new instance of the HTMLImageElement type instead.

As of EdgeHTML the "object" constructor difference has been eliminated. In EdgeHTML the HTMLImageElement is also a function. However, it implicitly binds its [[Code]] property to a default constructor function which simply throws. All browsers have this functionality now, so if you tried to execute code such as, new HTMLImageElement(), you would get an exception.
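The relationship between a primary interface and its named constructor can be approximated in script. This is only a sketch with hypothetical mirror names (AHTMLImageElement, AImage); the real types are backed by native code, and here both are functions, as in the EdgeHTML arrangement.

```javascript
// Hypothetical mirrors; the real types are backed by native code.
function AHTMLImageElement() {
    // Primary interface without a Constructor annotation: the default
    // constructor function simply throws.
    throw new TypeError("Illegal constructor");
}

// Named constructor: an alias that builds instances of the primary interface.
function AImage(width, height) {
    var img = Object.create(AHTMLImageElement.prototype);
    if (width !== undefined) { img.width = width; }
    if (height !== undefined) { img.height = height; }
    // Returning an object from a constructor replaces the implicit "this".
    return img;
}

// Both constructors point at the same prototype, but only the primary
// interface gets the prototype -> constructor round-trip.
AImage.prototype = AHTMLImageElement.prototype;

var img = new AImage(10, 20);
console.log(img instanceof AHTMLImageElement);         // true
console.log(img.constructor === AHTMLImageElement);    // true
console.log(AImage.prototype.constructor === AImage);  // false

var threw = false;
try { new AHTMLImageElement(); } catch (e) { threw = e instanceof TypeError; }
console.log(threw);                                    // true
```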

Browser vs JavaScript Constructors


A browser constructor is different from a standard JavaScript constructor in that the implicitly created and passed in "this" pointer is simply discarded. When you build a JavaScript constructor you set up new properties on the passed in "this" and it turns out that the passed in "this" already has its prototype set up and everything is cool. That would even work for the Image case for the Browser since its prototype property also points to the HTMLImageElementPrototype. Alas, things aren't that simple.

Turns out a generic object with a prototype property set up is still NOT an image element. Let's say for now that CImageElement, a C++ type in the backend, actually contains the code to be a full on image with rendering, ability to be put into a tree, etc... So we need one of those instead.

The browser resolves this issue by simply using the constructor to determine the scope of the script engine. Image was a function created by some script engine, that script engine in turn was created by some browsing context, so we get to the browsing context and create an image of the appropriate type.

Consequently this means if you screw up the type system, it doesn't matter. We'll still create the right type. For instance, if you null out Image.prototype, we can still create a proper image, because behind the scenes the browser has a complete and static view of the type mapping. You can still insert prototypes and do other tricks, but you can't fool the browser into creating the wrong type. At least not yet, we'll get to that in the "futures" section ;-).
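A rough script-only analogy, assuming a closure can stand in for the browser's private type mapping (createNative and AImage are hypothetical names, and the closure plays the role of the browsing context that the constructor leads back to):

```javascript
// Hypothetical stand-in for the browser's internal, static type mapping.
// Script outside the closure cannot reach imagePrototype to tamper with it.
var createNative = (function () {
    // Plays the role of HTMLImageElementPrototype.
    var imagePrototype = { tagName: "IMG" };
    return function createNative() {
        // Stand-in for allocating a CImageElement and wrapping it for script.
        return Object.create(imagePrototype);
    };
})();

function AImage() {
    // The implicit "this" is ignored; the returned object replaces it.
    return createNative();
}

var before = new AImage();

// Break the script-visible type system.
AImage.prototype = null;

var after = new AImage();
console.log(Object.getPrototypeOf(after) === Object.getPrototypeOf(before)); // true
console.log(after.tagName);                                                  // "IMG"
```

The object that comes back never depended on AImage.prototype in the first place, so nulling it out changes nothing about what gets created.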

Wrap-up and Apologies


I'm going to wrap this up. It's already getting very long and I feel like event constructors could be covered in their own right. They are a huge feature and I would do them a disservice by trying to describe them in 3-4 paragraphs. So look forward to event constructors in a future installment and I apologize if this is the only reason you dropped by.

So to wrap up how constructors work in general, you've seen that as a browser we implement them as either native JavaScript objects or as native JavaScript functions depending on the version of the browser and whether or not they have their own [[Code]] property set. For all of these cases we obey the Constructor and NamedConstructor properties to figure out who gets a [[Code]] property and who does not.

Because they are native, working with constructors and prototypes is extremely efficient. In Part 1 I showed that constants are entirely described in JavaScript (the constructor simply points fields at native JavaScript values). Further, the name, length, etc... are all inherited from the built-in Function object. We'll cover prototypes more in Part 3 along with inheritance, but they too are completely, 100% native, JavaScript objects. It turns out the only things we don't implement natively as script objects are the native bindings, as you can see from the various green boxes spread throughout the diagrams, and also the gold boxes (again referring to Part 1) which describe the native instance of an object which is shadowed by its JavaScript instance.

You also got to see a rather cool feature called static. This was originally not in the WebIDL specification, but was inspired by IE's create methods. We needed a solution to implement legacy compatibility in IE 9 and thus were born static methods on constructors. Static can also be applied to properties (attributes in WebIDL) but you'll be hard pressed to find a spec using such a complicated behavior today since usually constants are sufficient for this purpose. That is starting to get into Futures again, so let's just jump in!

Futures


We've been evolving WebIDL and the capabilities of the type system, very, very rapidly. This leads to some interesting potential for future specifications to take advantage of, or even for improvements to existing specifications.

Event constructors were one such evolution. Once we described dictionaries as a way to describe the parameters of a function in a very loose, sparse way, we could then add a constructor which took such a dictionary and now events are super easy to create. It also beat the 27 parameter behemoth functions that were being created for initPointerEvent. It also allowed us, as browser developers, to begin adding new properties to existing events, without having to constantly evolve the init* methods to contain the new parameters. Basically this was becoming a mess, and constructors+dictionaries immediately rectified the mess.
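The constructor+dictionary pattern can be sketched in a few lines. APointerEvent and the members in defaults are hypothetical and chosen for illustration; they are not the real PointerEvent member list.

```javascript
// Hypothetical mirror of a dictionary-based event constructor.
// The member names and defaults below are illustrative only.
var defaults = { bubbles: false, cancelable: false, pointerId: 0, pressure: 0 };

function APointerEvent(type, eventInitDict) {
    var init = eventInitDict || {};
    this.type = String(type);
    // Copy each known member, falling back to its default when absent.
    Object.keys(defaults).forEach(function (key) {
        this[key] = (init[key] !== undefined) ? init[key] : defaults[key];
    }, this);
}

// Callers name only what they care about, in any order.
var ev = new APointerEvent("pointerdown", { pressure: 0.5, bubbles: true });
console.log(ev.bubbles, ev.cancelable, ev.pointerId, ev.pressure);
// true false 0 0.5
```

Adding a new property in this model means adding one entry to defaults; existing callers keep working unchanged, which is the contrast with a positional init* method that has to grow another parameter each time.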

Another evolution will be static methods and properties. This allows the use of an interface as a namespace on which to hang things. Namespaces or modules are something very familiar to JavaScript developers. We use them all the time to create hierarchies and scopes of isolation. The browser, however, doesn't really use such a model. Instead, we rely on instances to create the hierarchy and our constructors flatten the hierarchy right back onto the global anyway. An example would be things like window.navigator, where effectively all of the capabilities there are static capabilities. There is nothing truly instance based, yet, we use the instance to namespace the properties and methods available there. Shouldn't it be Gamepad.getGamepads() and not window.navigator.getGamepads()? Which one makes more sense to you?

Okay, so we can namespace and module and scope and all of that. What else? How about element constructors? Right now everything goes through document.createElement and then once you get an instance back you have to operate on said instance. IE used to have a strange feature where you could pass createElement a full markup string, attributes included, and we would return an instance with those attributes set. That was actually kind of cool and in fact, when we removed that, we broke a lot of sites that relied on it. Imagine if we combined dictionaries with constructors like we did for events? This might allow for some neat scenarios such as creating an element with all of its attributes in one shot. I'm not saying it will happen or even that it's a fully baked good idea, but it's worth thinking about.


I hope you've enjoyed this deep dive into constructors. If I've missed anything or you have a nagging question, feel free to leave me comments or ping me on Twitter @JustrogDigiTec.
