Thursday, June 25, 2015

Introduction to Browser Internals

V8 and WebKit are now common terms in web development. But can we connect the dots between them?

How do browsers render HTML, CSS and JavaScript, and what other tasks does a
browser perform? Once we understand browser internals in depth, we can make
better decisions as web developers.

Gecko and WebKit are different rendering engines. Gecko-based browsers use
SpiderMonkey as their JavaScript engine, while WebKit-based ones use either
JavaScriptCore or V8. A rendering engine is responsible for displaying content
such as HTML, images and PDFs on the page.

The most widely used browsers and their rendering engines are:

Internet Explorer - Trident
Firefox - Gecko
Safari - WebKit
Chrome and Opera (from version 15) - Blink (derived from WebKit)

The following diagram shows the high-level structure of a browser:

Rendering process:

1. Generation of the DOM tree - The rendering engine takes HTML from the
networking layer, parses it and converts it into a tree.
2. Generation of the render tree - The rendering engine takes internal and
external CSS and combines it with the visual HTML elements to form another
tree. The render tree also contains color and dimension information.
3. Layout of the render tree - This step assigns coordinates to each node, so
the render tree then contains position information as well.
4. Painting - Finally, the render tree is traversed and each node is painted
using the UI backend layer.
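
To make step 2 more concrete, here is a minimal Python sketch of how a render tree might be derived from a DOM tree and a set of styles. The names DomNode, RenderNode and build_render_tree are hypothetical, styles are looked up by tag name only, and treating display: none as "not visual" is an assumption of this sketch rather than a description of any real engine's code.

```python
from dataclasses import dataclass, field


@dataclass
class DomNode:
    tag: str
    children: list = field(default_factory=list)


@dataclass
class RenderNode:
    tag: str
    style: dict
    children: list = field(default_factory=list)


def build_render_tree(node, styles):
    """Combine a DOM node with its style and skip nodes that are not visual."""
    style = styles.get(node.tag, {})
    if style.get("display") == "none":
        return None  # non-visual nodes do not appear in the render tree
    render_node = RenderNode(node.tag, style)
    for child in node.children:
        child_render = build_render_tree(child, styles)
        if child_render is not None:
            render_node.children.append(child_render)
    return render_node


dom = DomNode("html", [
    DomNode("head", [DomNode("title")]),
    DomNode("body", [DomNode("p"), DomNode("script")]),
])
styles = {
    "head": {"display": "none"},
    "script": {"display": "none"},
    "p": {"color": "black"},
}
render_tree = build_render_tree(dom, styles)
```

In this example the render tree keeps html, body and p, while head and script are left out because they produce nothing to draw.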

Process of parsing: The output of parsing is a parse tree. Parsing involves two steps:

1. Lexical Analysis - Converts the input into tokens (the symbols of the language's vocabulary).

Example of vocabulary:

INTEGER: 0|[1-9][0-9]*

2. Syntax Analysis - Checks the tokens against the syntax rules; if a rule is
matched, a corresponding node is added to the parse tree.
An example of a syntax rule is a term followed by an operation followed by
another term (term operation term).

Example of syntax:

expression :=  term  operation  term
operation :=  PLUS | MINUS
term := INTEGER | expression
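
As a rough illustration of the two steps, here is a minimal Python sketch of a lexer and a parser for the toy grammar above. The function names tokenize and parse are hypothetical. Because the grammar as written is left-recursive, the sketch assumes that a chain such as 2 + 3 - 1 is read left to right, as term (operation term)*. It is hand-written for readability and is not the output of a parser generator.

```python
import re

# One pattern per vocabulary symbol; INTEGER matches the rule 0|[1-9][0-9]*
TOKEN_RE = re.compile(
    r"\s*(?:(?P<INTEGER>0|[1-9][0-9]*)|(?P<PLUS>\+)|(?P<MINUS>-))"
)


def tokenize(text):
    """Lexical analysis: turn the input string into (kind, value) tokens."""
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append((match.lastgroup, match.group(match.lastgroup)))
        pos = match.end()
    return tokens


def parse(tokens):
    """Syntax analysis: build a parse tree for term (operation term)*."""
    pos = 0

    def term():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos][0] != "INTEGER":
            raise SyntaxError("expected INTEGER")
        node = ("INTEGER", int(tokens[pos][1]))
        pos += 1
        return node

    tree = term()
    while pos < len(tokens):
        kind = tokens[pos][0]
        if kind not in ("PLUS", "MINUS"):
            raise SyntaxError("expected PLUS or MINUS")
        pos += 1
        # The expression built so far becomes the left-hand term
        tree = ("expression", tree, kind, term())
    return tree


tree = parse(tokenize("2 + 3 - 1"))
```

An input such as "2 +" passes lexical analysis but fails syntax analysis, because no rule matches an operation that is not followed by a term.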

After parsing, the parse tree is often passed to a compiler, which is
responsible for converting it into machine code.

WebKit uses two well-known parser generators: Flex for creating a lexer and Bison
for creating a parser.

The language in the example above is simple because its grammar can be fully defined by rules like these. Such a language is called a context-free language, and there are parser generators that can take a grammar like the one above and generate a parser for you.

HTML, however, is not a context-free language: in many cases the page still renders even though its HTML syntax is incorrect. The HTML parser tolerates invalid syntax and recovers from it so that the page can be rendered. This makes the HTML parser much more complex, since there are so many cases to handle.

HTML has no context-free grammar. Earlier versions of HTML were formally defined by a DTD (Document Type Definition), but a DTD is not a context-free grammar; HTML5 instead specifies the parsing algorithm itself. The output tree is the DOM, which has its own specification. The DOM contains the document, body and other nodes in a tree structure, as shown below:

The HTML5 parser cannot use conventional top-down or bottom-up parsing because of its forgiving nature. It uses a custom parser with two stages: tokenization and tree construction. Tokenization takes characters from the network and converts them into tokens, which are then passed to the tree construction stage.

1. Tokenization algorithm:

<div>Hello World</div>

  • The initial state is the "data state".
  • When < is encountered, the tokenizer moves to the "tag open state", which marks the start of a tag.
  • When a character from a-z is encountered, it moves to the "tag name state".
  • When > is encountered, it moves back to the "data state".
  • It then captures the data "Hello World".
  • When < is encountered again, it moves to the "tag open state"; the / that follows marks this as a closing tag.
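
The walk-through above can be sketched as a small state machine in Python. This is a simplified illustration, not the full HTML5 tokenizer: the function name tokenize_html and the token names StartTag, EndTag and Character are assumptions of this sketch, only the three states mentioned above are modelled, and attributes, comments and malformed input are not handled.

```python
def tokenize_html(text):
    """Tiny state machine covering only the states described above."""
    state = "data"
    tokens = []
    buffer = ""
    is_end_tag = False
    for ch in text:
        if state == "data":
            if ch == "<":
                if buffer:
                    tokens.append(("Character", buffer))
                buffer = ""
                state = "tag open"
            else:
                buffer += ch
        elif state == "tag open":
            if ch == "/":
                is_end_tag = True  # "</" marks a closing tag
            elif ch.isalpha():
                buffer = ch.lower()
                state = "tag name"
        elif state == "tag name":
            if ch == ">":
                kind = "EndTag" if is_end_tag else "StartTag"
                tokens.append((kind, buffer))
                buffer = ""
                is_end_tag = False
                state = "data"
            else:
                buffer += ch.lower()
    if state == "data" and buffer:
        tokens.append(("Character", buffer))  # trailing text, if any
    return tokens


tokens = tokenize_html("<div>Hello World</div>")
```

For the example input this produces a start tag token for div, a character token holding "Hello World", and an end tag token for div.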

2. Tree construction algorithm: Receives tokens from the tokenization step and creates the DOM.

The tree constructor moves through the following insertion modes:

  • initial
  • before html
  • before head
  • in head
  • after head
  • in body
  • after body
  • after after body

If the head tag is not present in the HTML, it is created implicitly and the parser moves on to the body ("in body" mode).
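
Here is a minimal Python sketch of how the insertion modes can create missing elements. It is an illustration under several assumptions: the Node class and the build_tree and outline functions are hypothetical, tokens are (kind, value) pairs with the kinds StartTag, EndTag and Character, head content is not modelled, and the "after body" and "after after body" modes are left out.

```python
class Node:
    def __init__(self, name, text=None):
        self.name = name
        self.text = text
        self.children = []


def outline(node):
    """Return the tree as nested (name, children) tuples for inspection."""
    return (node.name, [outline(child) for child in node.children])


def build_tree(tokens):
    document = Node("#document")
    stack = [document]  # stack of open elements
    mode = "initial"

    def open_element(name):
        node = Node(name)
        stack[-1].children.append(node)
        stack.append(node)

    for kind, value in tokens:
        while True:  # a token may be reprocessed in the next mode
            if mode == "initial":
                mode = "before html"  # doctype handling is omitted
            elif mode == "before html":
                open_element("html")  # created even if the tag is missing
                mode = "before head"
                if (kind, value) == ("StartTag", "html"):
                    break
            elif mode == "before head":
                open_element("head")  # created even if the tag is missing
                mode = "in head"
                if (kind, value) == ("StartTag", "head"):
                    break
            elif mode == "in head":
                stack.pop()  # head content is not modelled: close head
                mode = "after head"
                if (kind, value) == ("EndTag", "head"):
                    break
            elif mode == "after head":
                open_element("body")
                mode = "in body"
                if (kind, value) == ("StartTag", "body"):
                    break
            else:  # "in body"
                if kind == "StartTag":
                    open_element(value)
                elif kind == "EndTag":
                    if value not in ("html", "body") and stack[-1].name == value:
                        stack.pop()
                else:
                    stack[-1].children.append(Node("#text", value))
                break
    return document


dom = build_tree([
    ("StartTag", "div"),
    ("Character", "Hello World"),
    ("EndTag", "div"),
])
```

Although the input contains only a div, the resulting tree has html, head and body elements, with the div placed inside body.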

The HTML parser never raises an error. It handles the following scenarios, among many others:

  • Both <br> and </br> are accepted; </br> is treated as <br>.
  • If a table is nested inside another table but not inside a td, the parser places the two tables as siblings.
  • In the case of nested forms, the inner form is ignored.
  • The parser allows at most 20 nested elements of the same type.
  • The html and body tags are closed implicitly if they are left open.

CSS Parser: Unlike HTML, CSS is a context-free language. Both its lexical grammar and its syntax grammar are defined. WebKit uses the Flex and Bison parser generators to parse CSS in a bottom-up fashion. A CSS style sheet is parsed into a StyleSheet object, which has the following structure:
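
A minimal sketch of such an object in Python, assuming a style sheet is a list of rules and each rule holds its selectors and its declarations. The Rule class and parse_stylesheet function are hypothetical, and the sketch uses a regular expression on flat rules instead of a Flex and Bison generated bottom-up parser, so it does not handle comments, at-rules or nested blocks.

```python
import re
from dataclasses import dataclass


@dataclass
class Rule:
    selectors: list
    declarations: dict


# Matches "selector text { declaration text }" for flat, non-nested rules
RULE_RE = re.compile(r"([^{}]+)\{([^{}]*)\}")


def parse_stylesheet(css):
    """Parse simple CSS text into a list of Rule objects."""
    rules = []
    for selector_text, body in RULE_RE.findall(css):
        selectors = [s.strip() for s in selector_text.split(",") if s.strip()]
        declarations = {}
        for declaration in body.split(";"):
            if ":" in declaration:
                prop, value = declaration.split(":", 1)
                declarations[prop.strip()] = value.strip()
        rules.append(Rule(selectors, declarations))
    return rules


stylesheet = parse_stylesheet("h1, h2 { color: red; margin: 0 } p { font-size: 12px; }")
```

Here the first rule ends up with the selectors h1 and h2 and two declarations, and the second rule with the selector p and one declaration.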

Layout Process: This step decides the coordinates, that is, where each element will be rendered.
1. The parent determines its own width.
2. The parent's height depends on the heights of its children.
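
The two rules can be sketched in Python as a recursive pass: width flows down from parent to child, and height is added up on the way back. The Box class, its content_height field and the layout function are hypothetical, and the sketch assumes simple block layout in which children are stacked vertically and take the full width of their parent.

```python
from dataclasses import dataclass, field


@dataclass
class Box:
    content_height: int = 0  # height of the box's own content, e.g. a text line
    children: list = field(default_factory=list)
    x: int = 0
    y: int = 0
    width: int = 0
    height: int = 0


def layout(box, x, y, available_width):
    """Assign coordinates and sizes to a box and all of its descendants."""
    box.x, box.y = x, y
    box.width = available_width  # 1. the width is determined first
    child_y = y
    for child in box.children:
        layout(child, x, child_y, box.width)
        child_y += child.height  # children are stacked vertically
    # 2. the height is known only after the children have been laid out
    box.height = box.content_height + (child_y - y)


first, second = Box(content_height=20), Box(content_height=30)
page = Box(children=[first, second])
layout(page, x=0, y=0, available_width=800)
```

After the call, both children are 800 wide, the second child starts at y = 20, and the page is 50 high.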

Painting: CSS2 defines the painting order as follows:
  1. background color
  2. background image
  3. border
  4. children
  5. outline
The boxes are kept in a stack whose order is determined by z-index: boxes at the back are painted first and boxes at the front last, so where boxes overlap, the foremost one is what remains visible.
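
The order above can be sketched in Python as follows. The paint function, the PAINT_STEPS list and the dictionary layout of a box are hypothetical, and instead of drawing anything the sketch records each step in a log. It assumes that children are painted from the lowest z-index to the highest and that boxes with the same z-index keep their document order.

```python
PAINT_STEPS = ["background color", "background image", "border", "children", "outline"]


def paint(box, log):
    """Record the paint steps for a box and its children in order."""
    for step in PAINT_STEPS:
        if step == "children":
            # sorted() is stable, so equal z-index values keep document order
            ordered = sorted(box["children"], key=lambda c: c.get("z_index", 0))
            for child in ordered:
                paint(child, log)
        else:
            log.append(f"{box['name']}: {step}")


root = {"name": "root", "children": [
    {"name": "a", "z_index": 1, "children": []},
    {"name": "b", "z_index": 0, "children": []},
    {"name": "c", "z_index": 1, "children": []},
]}
log = []
paint(root, log)
```

In this example b is painted before a and c because it has the lowest z-index, and the outline of root is painted last.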
