Save the (DOM) trees! Why direct DOM manipulation is not a bad idea

A generic, and so non-binary, unsorted, some labels duplicated, arbitrary diagram of a tree.
A generic, and so non-binary, unsorted, some labels duplicated, arbitrary diagram of a tree. CC BY-SA 4.0

Direct DOM manipulation has gotten a bad reputation in the last decade of web development. From Ruby on Rails to React, the DOM was seen as something to gloriously destroy and re-render from the server or even from the browser. Never mind that the browser already exerted a lot of effort parsing HTML and constructing this tree! Mind-numbingly complex HTML string regular expression tests and manipulations had to deal with low-level details of the HTML syntax to insert, delete and change elements, sometimes on every keystroke! Contrasting to that, functions like createElement, remove and insertBefore from the DOM world were largely unknown and unused, except perhaps in jQuery.

Processing of HTML is destructive: The original DOM is destroyed and garbage collected with a certain time delay. Attached event handlers are detached and garbage collected. A completely new DOM is created from parsing new HTML set via .innerHTML =. Event listeners will have to be re-attached from the user-land (this is no issue when using on* HTML attributes, but this has disadvantages as well).

It doesn’t have to be this way. Do not eliminate, but manipulate!

Save the (DOM) trees!

sanitize-dom crawls a DOM sub-tree (beginning from a given node, all the way down to its ancestral leaves) and filters and manipulates it non-destructively. This is very efficient: The browser doesn’t have to re-render everything; it only re-renders what has been changed (sound familiar from React?).

The benefits of direct DOM manipulation:

  • Nodes stay alive.
  • References to nodes (i.e. stored in a Map or WeakMap) stay alive.
  • Already attached event handlers stay alive.
  • The browser doesn’t have to re-render entire sections of a page; thus no flickering, no scroll jumping, no big CPU spikes.
  • CPU cycles for repeatedly parsing and dumping of HTML are eliminated.

sanitize-dom

I just released version 4 of my sanitize-dom library which is my tool of choice to sanitize user-generated content. As mentioned in this post, my project takes a different approach: It doesn’t work with HTML, it operates only on DOM nodes.

sanitize-doms further advantages:

  • No dependencies.
  • Small footprint (only about 7 kB minimized).
  • Faster than other HTML sanitizers because there is no HTML parsing and serialization.

Check out sanitize-dom here on Github!

Vagrant hangs with message “Waiting for domain to get an IP address…”

The problem may be (it was in my case) that there are firewall rules preventing NAT firewall rules for the virbr0 network device created by Vagrant via libvirt, which may look like the following (excerpt of iptables -L -n):

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
ACCEPT     all  --  0.0.0.0/0            192.168.121.0/24     ctstate RELATED,ESTABLISHED
ACCEPT     all  --  192.168.121.0/24     0.0.0.0/0

A possible fix is to disable those restrictive rules (for example, clear all iptables rules before starting the Vagrant machine).


Wake-on-LAN (WoL) in Linux is disabled by default. How to enable it without ethtool.

Accomplishing the simple often takes a day or two.

According to countless tutorials on the internet, in order for a working WoL, you need to fulfill the following preconditions (where point 4 below is rarely mentioned [that’s why this article exists]):

  1. An ethernet hardware/card supporting WoL
  2. Enabling WoL in the BIOS
  3. Enabling the WoL feature on the ethernet card (usually using the user space tool ethtool)
  4. Enabling the WoL feature in the Linux kernel (and this was the major pitfall for me)

Support of WoL can be checked for a given ethernet card in the following way:

ethtool  | grep Wake-on

According to my own experiments on my local hardware, WoL seems to always be enabled by default:

Supports Wake-on: pumbg
Wake-on: g

Because it already prints Wake-on: g, signalling that WoL is enabled (for Magic Packets), one is tempted to not run ethtool -s <card_label> wol g to achieve the same.

However, this ethtool command does at least one other thing, being equivalent to the following:

echo enabled > /sys/class/net/<card_label>/device/power/wakeup

And the main pitfall is: After a reboot, cat /sys/class/net/<card_label>/device/power/wakeup will return disabled.

That is why attempts to make WoL work without ethtool may fail. The feature will stay disabled in the Linux kernel, which will completely shut off the ethernet card in most powersave and power-off states. (I verified this observing the LEDs of a connected ethernet switch).

Conclusion

On modern Linux systems, given supporting hardware, it seems to be enough to write the string enabled to /sys/class/net/<card_label>/device/power/wakeup to make WoL work. ethtool doesn’t seem to be required any more for that, since Ethernet cards seem to already be configured properly by default. The Ethernet card will stay powered in most power-save and power-off states, so that it can receive Wake-on-LAN packets.

Resources

https://www.kernel.org/doc/html/v4.14/driver-api/pm/devices.html
https://www.kernel.org/doc/Documentation/power/states.txt