A Radix Story
About routing and searching using radix trees.
Published on February 27, 2018
Go Radix Tree Router SearchAbout 10 minutes of reading.
A Radix Story
Yes, All You Need is Standard “net/http”
There are, out there in the wild, an abundance of routers that a developer can use. In my opinion, one thing that they all share is the lack of separation of concerns. By separation of concerns I mean mixing handlers with routes and conditions, putting on top of that some cool optimized algorithm for looking up a route (or maybe I should say inspecting a string…). If you don’t need everything those packages provide and you feel you should get rid of those training wheels, read on.
A Small Detour
I should probably write a separate article on the subject of what I consider to be a good Go developer, but here is short version of my opinion.
As you all can observe, the whole variety of packages are competing with each other on the matters of allocations and speed. From my point of view this hides a powerful truth about Go language and the developers who use it.
One of the main reasons for switching languages and becoming a Golang developer has to do with speed execution. For this reason and this reason only, I think that we can tell a good developer from a mediocre one, just by looking at his abilities to use the tools that he has and create new ones specialized for the problem that needs a solution.
Long story short, I strongly believe that a good Golang developer should have a wide enough knowhow and apply it whenever needed by adapting a generic solution to a particular case - for the sake of simplicity, speed and allocations.
What We Are Trying To Solve?
The problem is really simple : let’s separate the concerns in such manner that we can adapt a sufficient small piece of code to our needs, whatever those needs might be.
In the case of a router, http.Request gives us access to the requested path. So, if we are able to process that path in a way that we need, we might not need a router at all. In the end we should be able to write a well anchored set of business logic rules without making the effort to adapt our mental model to a set of rules and conventions imposed by the authors of a package.
To give you a simple example, most routers won’t allow you to have a otherwise
route : match this route with this handler - and so on, but otherwise, put all other requests on this handler. This comes handy when you need to cover a dynamic defined route, like serving a request based on the slug of a title or the slug of category (title slug being the otherwise
handler).
Let’s try to forget about what a route really is, by splitting it into three main properties : a http method (GET/POST/PUT/etc.), a path string and a function which needs to be called if that path string looks like something particular.
From now on, we’re going to deal with that string only. Leaving the theory alone, the applications of a radix tree are constructing a tree of pieces of strings which hold a reference to an object (fancy called associative arrays with keys).
Let’s get coding:
package radix
type (
Tree struct {
root Node
}
Edge struct {
label string
parent *Node
child *Node
}
Edges []Edge
leafNode struct {
key string
value interface{}
}
Node struct {
leaf *leafNode
edges Edges
}
)
func (n *Node) isLeaf() bool {
return n.leaf != nil && len(n.edges) == 0
}
func (n *Node) createNode(edgeKey, leafKey string, value interface{}) {
leafNode := &Node{
leaf: &leafNode{
key: leafKey,
value: value,
},
}
newEdge := Edge{
label: edgeKey,
parent: n,
child: leafNode,
}
n.edges = append(n.edges, newEdge)
}
func (n *Node) createNodeWithEdges(newKey string, edgeKey string) *Node {
if n.isLeaf() {
//node is leaf node could not split, return nil
return nil
}
for idx, edge := range n.edges {
if edge.label == edgeKey {
// backup for split
oldNode := edge.child
// create a new node
newNode := &Node{}
// replace current edge with a new one
n.edges[idx] = Edge{
label: newKey,
parent: n,
child: newNode,
}
// connect to original node
remainKey := strings.TrimPrefix(edgeKey, newKey)
newEdge := Edge{
label: remainKey,
parent: newNode,
child: oldNode,
}
// append the edges with that new edge
newNode.edges = append(newNode.edges, newEdge)
return newNode
}
}
return nil
}
// it's recursive
func (t *Tree) insert(target *Node, edgeKey string, leafKey string, value interface{}) {
// we've reached leaf
if target.isLeaf() {
if leafKey == target.leaf.key {
// the same leaf key, update value
// if overwriting values is by convention forbidden, should panic
target.leaf.value = value
} else {
// insert leaf key value as new child node of target
// original leaf node, became another leaf of target
target.createNode(edgeKey, leafKey, value)
// we have a convention here, regarding empty strings
target.createNode("", target.leaf.key, target.leaf.value)
target.leaf = nil
}
return
}
// second case, checking edges
for _, edge := range target.edges {
compare, found := longestPrefix(edgeKey, edge.label)
if found {
if compare == edge.label {
// trim edge.label from new key
nextTargetKey := strings.TrimPrefix(edgeKey, edge.label)
// recurse
t.insert(edge.child, nextTargetKey, leafKey, value)
} else {
// using compare to create new node and separate two edges
splitNode := target.createNodeWithEdges(compare, edge.label)
if splitNode == nil {
panic("Unexpected error on creating new node and separating edges")
}
splitContainKey := strings.TrimPrefix(edgeKey, compare)
splitNode.createNode(splitContainKey, leafKey, value)
}
return
}
}
// new edge with new leaf key on leaf node
target.createNode(edgeKey, leafKey, value)
}
func (t *Tree) Insert(what string, value interface{}) {
//leaf key and edge key are the same
t.insert(&t.root, what, what, value)
}
func (t *Tree) search(where *Node, what string) (interface{}, bool) {
if where.isLeaf() {
return where.leaf.value, true
}
for _, edge := range where.edges {
if compare, found := longestPrefix(what, edge.label); found {
nextSearchKey := strings.TrimPrefix(what, compare)
return t.search(edge.child, nextSearchKey)
}
}
return nil, false
}
func (t *Tree) Search(what string) (interface{}, bool) {
return t.search(&t.root, what)
}
func longestPrefix(s1, s2 string) (string, bool) {
found := false
for i := 0; i < len(s1) && i < len(s2); i++ {
if s1[i] != s2[i] {
result := s1[:i]
return result, found
}
found = true
}
if len(s1) > len(s2) {
return s2, found
} else if len(s1) == len(s2) {
// special case : "" is not a subset of ""
return s1, s1 == s2
}
return s1, found
}
The code is pretty straight forward, however I’ve placed some comments here and there, just in case.
As you can see inside the test, I’ve tested the search against some HTTP headers, just in case someone might find it useful to discard any unwanted headers inside a middleware. Later on, I’ll probably benchmark the classical method of collecting HTTP headers against this method. However, this is not the point here.
Usually, we, developers need to match URLs like /posts/{tag}/{id:[0-9]+}
in the HTTP path and while we’re at it also extract those meaningful arguments. Things seem even more complicated when the string contains query values, like /posts/{tag}?filter=date&author=3
.
The Star Lookup
Let’s make the following convention : we don’t care (at the moment) how an argument look like. For us, everything that is an argument is a star (*
), leaving the decision of converting and validation to the next business logic level (which might be as well the handler itself).
Given the above convention, the path /posts/{tag}/{id:[0-9]+}
would translate to /posts/*/*
, ok?
Also, in order to capture the query values /posts/{id:[0-9]+}?filter=date
would translate to /posts/*?*
.
Here’s the code :
func (e Edge) isStar() bool {
return strings.HasPrefix(e.label, "*") || strings.HasPrefix(e.label, "/*") || strings.HasPrefix(e.label, "?*")
}
func (t *Tree) starSearch(where *Node, what string) (interface{}, bool) {
if where.isLeaf() {
return where.leaf.value, true
}
for _, edge := range where.edges {
if edge.isStar() {
// search key is empty and we're on the "/*" means we're looking for the last sibling of the edge
if what == "" && strings.HasPrefix(edge.label, "/*") {
continue
}
// different kind of star... ("*" or "?*")
// remove the common parts
remove, found := longestPrefix(what, edge.label)
if found {
what = strings.TrimPrefix(what, remove)
}
// split by slashes so we can build a new key
parts := strings.Split(what, "/")
switch len(parts) {
case 1:
// ok, we had one piece
// looking for the question mark - might be handy to give up on this for speed
index := strings.Index(what, "?")
if index > 0 {
//TODO : collect star key after question mark
fmt.Println("Collect after question mark ", what[index:])
// lookup question marks - down in the tree
return t.starSearch(edge.child, what[index:])
}
// don't have a question mark, but we have a star (continue) - looking for the last sibling edge
if strings.HasPrefix(edge.label, "?*") && remove != "?" {
continue
}
//TODO : collect star key
fmt.Println("Collect Path Part", what)
// we have a star, no question mark - looking for the node leaf
return t.starSearch(edge.child, "")
default:
//TODO : collect star key part of the path
fmt.Println("Collect Path Part", parts[0])
// building a new key with the parts that we have
what = strings.Join(parts[1:], "/")
return t.starSearch(edge.child, what)
}
}
if compare, found := longestPrefix(what, edge.label); found {
nextSearchKey := strings.TrimPrefix(what, compare)
return t.starSearch(edge.child, nextSearchKey)
}
}
return nil, false
}
func (t *Tree) StarSearch(what string) (interface{}, bool) {
// TODO : use a mutex here if you are using this concurrent
return t.starSearch(&t.root, what)
}
Remember, you have to adapt this to your specific needs and for this reason I haven’t imposed a way of capturing path parts that match star (*
). I’ve left comments where you can do that.
In the above starSearch function there is a catch : if the edges slice is not sorted, so the last sibling is the star (*
), the search will not work. To sort those slices, we need the following:
func (e Edges) Len() int {
return len(e)
}
func (e Edges) Less(i, j int) bool {
return e[i].label > e[j].label
}
func (e Edges) Swap(i, j int) {
e[i], e[j] = e[j], e[i]
}
func (e Edges) Sort() {
sort.Sort(e)
}
The functions createNode and createNodeWithEdges will have the following additional piece of code:
if useSortEdges {
// we're always sorting in reverse, so stars are last siblings
newNode.edges.Sort()
}
Recommendations
First, don’t forget to use mutex for search - I’ve left a comment there - if you are using it in a concurrent way.
Secondly, the problem of defining one route is that you can’t do this /*/*
and expect it to work only in one route. In order to make it work you need an exception string
(see TestStarOneRoute) and that exception string to look like */something-never-matched
.
Regarding leafNode struct value property, which is typed interface{}
in the above code - you should do whatever you like with it. To give you an obvious example:
leafNode struct {
key string
value http.HandlerFunc
}
Last, but not least, you can alter things like adding a flag to the tree to mark it as dirty, then process all the edges in the tree and mark them as isStar true/false
so isStar() bool
function can be replaced like that. Or you could rewrite some parts to reduce allocations.
Other usages
Besides request routing, radix trees can be used in building ACLs (storing capabilities/permissions in the value
property), spell checkers, auto-complete suggestions and in memory database.
Code
The complete code (tests included) is here.
Related
Interview Questions for Go Developer Position - Part II
Measuring And Classifying Go Developer Knowledge
Published on December 7, 2018
Go Developer InterviewAbout 3 minutes of reading.
Changing Perspective
Changing Perspective Might Help You Understand
Published on November 20, 2018
Go Channels Grouping MethodsAbout 7 minutes of reading.
Interview Questions for Go Developer Position
Measuring And Classifying Go Developer Knowledge
Published on November 18, 2018
Go Developer InterviewAbout 7 minutes of reading.