Mirror Mirror on ...
Of Mice (Unsafe) and Men (Reflect)
Published on March 10, 2018
Go Advanced Reflect Unsafe
About 12 minutes of reading.
TL;DR
While mentoring, I encouraged my pupils to break things so that they understand how they work. Using the reflect package seems easy, but understanding its mechanics is hard. So this week, following my own advice, I tried to create my own reflect package. Here is what I learned.
Playing with Fire
Most of the articles I’ve read on the subject give (more or less) the following advice: “if you find yourself doing this in a real program, stop immediately and seek help. You are doing something wrong. You’ve been warned!” Now, wait a minute, mister. That is hypocrisy!
If you take a look at the importers of reflect, you will easily find that using fmt implies you are using reflect. Are the “unsafe” features of Golang only for the developers who build the language itself? Looking at the importers of unsafe tells you otherwise.
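A small way to see fmt leaning on reflection: it can print the names and values of unexported fields it was never told about. This is only a sketch; the secret type and the describe helper are made up for illustration.

```go
package main

import "fmt"

// secret has only unexported fields; fmt can still discover their names
// and values because it inspects the value through reflection.
type secret struct {
	name string
	age  uint
}

// describe formats v with %+v, which prints field names next to values.
func describe(v interface{}) string {
	return fmt.Sprintf("%+v", v)
}

func main() {
	fmt.Println(describe(secret{name: "John", age: 40}))
}
```

Nothing in this program imports reflect directly, yet the field names show up in the output.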
unsafe - after that we reflect
I quote from the documentation: “unsafe.Pointer type allows a program to defeat the type system and read and write arbitrary memory. It should be used with extreme care”.
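The smallest illustration of "defeating the type system" is probably reading the bits of a float64 as if they were a uint64. A minimal sketch, with float64Bits as a made-up helper name (the standard library offers math.Float64bits for real use):

```go
package main

import (
	"fmt"
	"unsafe"
)

// float64Bits reinterprets the memory of f as a uint64.
// No numeric conversion happens: the same 8 bytes are read with another type.
func float64Bits(f float64) uint64 {
	return *(*uint64)(unsafe.Pointer(&f))
}

func main() {
	fmt.Printf("%#x\n", float64Bits(1.0))
}
```

Both types have the same size and no internal structure, which is why this particular reinterpretation is harmless.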
Let’s say you have a type John which you are trying to convert to a type Ivan. The documentation states that Ivan must be no larger than John and that the two must share an equivalent memory layout.
Let’s code:
func TestIsJohnIvan(t *testing.T) {
	type John struct {
		Name   string
		Age    uint
		Powers uint
	}
	type Ivan struct {
		givenName  string // yes, you can use private fields
		_          uint
		ThirdField uint
	}
	john := John{Name: "John", Age: 40, Powers: 3}
	ivan := *(*Ivan)(unsafe.Pointer(&john))
	t.Logf("John as Ivan : GivenName %v ThirdField %d", ivan.givenName, ivan.ThirdField)
}
The first observation is that this conversion lets you reach private fields. Secondly, as long as you keep the same number of fields and their types, you can omit properties by naming them _. If you break that second rule, you get unexpected results, as below:
type ShortIvan struct {
	Age uint
}
smallIvan := *(*ShortIvan)(unsafe.Pointer(&john))
t.Logf("Small Ivan (just age) : %v", smallIvan.Age)
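You would expect the age to be 40, but it’s not: it’s 5717318 (the exact value varies). Why? Because ShortIvan.Age sits at offset 0, which is where John keeps Name, so the uint is built from the first word of the Name property, that is, the pointer to the string’s bytes. The correct way to get a smaller Ivan is to skip over the name property with a blank field (observe that the third property is omitted too):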
type ShortIvan struct {
	_   string
	Age uint
}
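To see why the blank string field matters, the offsets can be compared directly. A minimal sketch, assuming the John type from the test above; the helpers sameAgeOffset and ageViaShortIvan are made up for illustration:

```go
package main

import (
	"fmt"
	"unsafe"
)

type John struct {
	Name   string
	Age    uint
	Powers uint
}

// ShortIvan skips the string header with a blank field so that
// its Age lines up with John.Age.
type ShortIvan struct {
	_   string
	Age uint
}

// sameAgeOffset reports whether Age sits at the same byte offset in both types.
func sameAgeOffset() bool {
	return unsafe.Offsetof(John{}.Age) == unsafe.Offsetof(ShortIvan{}.Age)
}

// ageViaShortIvan reinterprets a John as a ShortIvan and returns the Age it sees.
func ageViaShortIvan(j John) uint {
	return (*ShortIvan)(unsafe.Pointer(&j)).Age
}

func main() {
	john := John{Name: "John", Age: 40, Powers: 3}
	fmt.Println("size of a string header:", unsafe.Sizeof(john.Name))
	fmt.Println("offset of John.Age:", unsafe.Offsetof(john.Age))
	fmt.Println("offsets match:", sameAgeOffset())
	fmt.Println("age via ShortIvan:", ageViaShortIvan(john))
}
```

The offset of John.Age equals the size of the string header in front of it, which is exactly the gap the blank field reproduces.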
What if you violate the first rule, which states that the target type must not be larger than the source:
type UpgradedIvan struct {
	//_ string // adding this at the beginning crashes
	Name    string
	Age     uint
	Powers  uint
	Address string // will get filled with the Name
	Guns    uint   // will get filled with Age
	//Say string // adding yet another one will crash : "bad pointer in frame"
	//Data []byte // same adding this or more
	AFloat float32 // adding a different type seems safe
}
chuckNorris := *(*UpgradedIvan)(unsafe.Pointer(&john))
t.Logf("Chuck Ivan : %v", chuckNorris)
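Well, it works, but with side effects: Address gets filled with the same value as Name, Guns with Age, and AFloat gets a value of 4e-45 (which is what the bits of the integer 3, the Powers value, look like when read as a float32). The extra fields are read from whatever memory happens to lie past the end of john, so the result depends on the stack layout of that particular run. So this is the unsafe territory that a developer should never touch. As long as we’re respecting the rules, it’s safe to play unsafe.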
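One way to stay on the safe side of the size rule is to check it before converting. The following is only a sketch: castIfFits is a hypothetical helper (it needs generics, so Go 1.18 or later), and it checks size alone, so keeping an equivalent field layout is still up to the caller.

```go
package main

import (
	"fmt"
	"unsafe"
)

type John struct {
	Name   string
	Age    uint
	Powers uint
}

type ShortIvan struct {
	_   string
	Age uint
}

type UpgradedIvan struct {
	Name    string
	Age     uint
	Powers  uint
	Address string
	Guns    uint
}

// castIfFits (hypothetical helper) reinterprets *From as *To only when
// To is no larger than From. It does not verify the field layout.
func castIfFits[From, To any](from *From) (*To, bool) {
	var to To
	if unsafe.Sizeof(to) > unsafe.Sizeof(*from) {
		return nil, false
	}
	return (*To)(unsafe.Pointer(from)), true
}

func main() {
	john := John{Name: "John", Age: 40, Powers: 3}

	if short, ok := castIfFits[John, ShortIvan](&john); ok {
		fmt.Println("ShortIvan age:", short.Age)
	}
	if _, ok := castIfFits[John, UpgradedIvan](&john); !ok {
		fmt.Println("UpgradedIvan is larger than John, conversion refused")
	}
}
```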
Also, upgrading John seems better (simpler) by using embedding:
type EmbeddedJohn struct {
	John
	Address string // stays empty here: nothing is read from stray memory
	Guns    uint   // stays zero
}
// convert John to EmbeddedJohn
upgradedIvan := EmbeddedJohn{John: john}
t.Logf("Upgraded Ivan : %v", upgradedIvan)
Surely, the code below is dangerous if it is misused. It speaks for itself:
func TestAlteredPeople(t *testing.T) {
	type John struct {
		Name    string
		Age     int
		Altered bool
	}
	john := John{Name: "John", Age: 30, Altered: false}
	ptrToJohn := unsafe.Pointer(&john)
	ptrToName := (*string)(unsafe.Pointer(uintptr(ptrToJohn) + unsafe.Offsetof(john.Name)))
	ptrToAge := (*int)(unsafe.Pointer(uintptr(ptrToJohn) + unsafe.Offsetof(john.Age)))
	ptrToAltered := (*bool)(unsafe.Pointer(uintptr(ptrToJohn) + unsafe.Offsetof(john.Altered)))
	*ptrToName = "Chuck"
	*ptrToAge = 100000
	*ptrToAltered = true
	t.Logf("Now John is %v", john)
}
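On Go 1.17 and later, the same pointer arithmetic can be written with unsafe.Add, which avoids the round trip through uintptr. A minimal sketch of the test above in that style; the alter function is a made-up name:

```go
package main

import (
	"fmt"
	"unsafe"
)

type John struct {
	Name    string
	Age     int
	Altered bool
}

// alter rewrites every field of j through raw offsets computed with unsafe.Add.
func alter(j *John) {
	base := unsafe.Pointer(j)
	*(*string)(unsafe.Add(base, unsafe.Offsetof(j.Name))) = "Chuck"
	*(*int)(unsafe.Add(base, unsafe.Offsetof(j.Age))) = 100000
	*(*bool)(unsafe.Add(base, unsafe.Offsetof(j.Altered))) = true
}

func main() {
	john := John{Name: "John", Age: 30}
	alter(&john)
	fmt.Printf("Now John is %+v\n", john)
}
```

The danger is unchanged: nothing stops a wrong offset or a wrong target type from corrupting neighbouring memory.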
Unsafe conclusions
The unsafe package serves the Go compiler rather than the Go runtime: it provides facilities for low-level programming, including operations that violate the type system.
I would never use the above method of conversion, but the investigation was needed because of what’s about to be described regarding reflect. One more experiment first, dumping the raw memory behind a struct:
type Point struct {
	x, y int
}

func Extract(ptr unsafe.Pointer, size uintptr) []byte {
	out := make([]byte, size)
	for i := range out {
		out[i] = *((*byte)(unsafe.Pointer(uintptr(ptr) + uintptr(i))))
	}
	return out
}

func TestExtract(t *testing.T) {
	p := Point{3, 4}
	mem := Extract(unsafe.Pointer(&p), unsafe.Sizeof(p))
	t.Logf("What's the Point? %v", mem)
}
Yes, you can extract the content of the memory, but what’s the point? Well, a friend of mine (with the same name and the same passion for Golang) might find this a useful way to hide sensitive data, by reversing the extraction and carefully filling the memory with secrets which come from somewhere else.
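The reverse direction could look like the sketch below. Fill and roundTrip are hypothetical names, and unsafe.Slice requires Go 1.17 or later. The sketch assumes the target type holds no Go pointers (Point is just two ints), otherwise overwriting its bytes would confuse the garbage collector.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Point struct {
	x, y int
}

// Extract copies size bytes starting at ptr into a fresh slice.
func Extract(ptr unsafe.Pointer, size uintptr) []byte {
	out := make([]byte, size)
	copy(out, unsafe.Slice((*byte)(ptr), size))
	return out
}

// Fill (hypothetical counterpart of Extract) copies data over the memory at ptr.
// The caller must guarantee that at least len(data) bytes are writable there.
func Fill(ptr unsafe.Pointer, data []byte) {
	copy(unsafe.Slice((*byte)(ptr), len(data)), data)
}

// roundTrip dumps p to bytes and rebuilds a second Point from them.
func roundTrip(p Point) Point {
	mem := Extract(unsafe.Pointer(&p), unsafe.Sizeof(p))
	var q Point
	Fill(unsafe.Pointer(&q), mem)
	return q
}

func main() {
	fmt.Println(roundTrip(Point{3, 4}))
}
```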
reflect
If you kept in mind that unsafe is about the compiler and not the runtime, here is the proof:
func TestInTheBeginning(tt *testing.T) {
	type r struct {
		sz  uintptr
		dt  uintptr
		_   uint32
		f   uint8
		_   uint8
		_   uint8
		knd uint8
		_   *struct{}
		c   *byte
		str int32
		w   int32
	}
	type e struct {
		abracadabra *r
	}
	// the *testing.T is named tt so that it does not clash with the t function below
	t := func(p interface{}) *r {
		return (*(*e)(unsafe.Pointer(&p))).abracadabra
	}
	p := Point{3, 4}
	v := t(&p)
	tt.Logf("After looking in the mirror : %v %v %v %v %v %v", v.sz, v.dt, v.f, v.knd, v.str, v.w)
}
Once you run the above test, you will get the properties filled in with some values which seem pure magic. But there must be an explanation.
We didn’t import the reflect package. Also, the code is deliberately unreadable, thus proving there is no magic convention such as structs named in a certain way or properties having particular names.
So, what happened? Well, these data structures (the e and r types) mirror structures known to the compiler, which does its job, and at runtime we’re getting those results. To reinforce that truth, if we replace the t function with its body, v := (*(*e)(unsafe.Pointer(&p))).abracadabra, it won’t work anymore. Even more, if we change the parameter type of the t function from interface{} to *Point, it will not work as expected. In both cases nothing gets converted to interface{}, and it is that conversion which makes the compiler build the pair of type pointer and data pointer that e mirrors.
If you look in the reflect package, you will see that the rtype struct looks exactly the same as our r struct, even if the properties are named differently. The same goes for emptyInterface and our e struct, despite the fact that we are not declaring the word property (remember omitting properties in the unsafe example above?).
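The two words of an empty interface can be pulled apart the same way. The sketch below relies on how the gc toolchain lays out interface{} values and type descriptors, which is an implementation detail and not a language guarantee; eface, words, pointsAt, sizeFromDescriptor and sizeMatches are all made-up names.

```go
package main

import (
	"fmt"
	"unsafe"
)

type Point struct {
	x, y int
}

// eface mirrors the assumed two-word layout of an interface{} value.
type eface struct {
	typ  unsafe.Pointer // pointer to the type descriptor
	data unsafe.Pointer // pointer to the value, or the pointer itself for pointer types
}

// words splits an interface value into its type word and data word.
func words(i interface{}) eface {
	return *(*eface)(unsafe.Pointer(&i))
}

// pointsAt reports whether the data word of an interface holding p is p itself.
func pointsAt(p *Point) bool {
	return words(p).data == unsafe.Pointer(p)
}

// sizeFromDescriptor reads the first word of the type descriptor, assumed
// here to be the size field, like sz in the r struct above.
func sizeFromDescriptor(i interface{}) uintptr {
	return *(*uintptr)(words(i).typ)
}

// sizeMatches compares that first word with unsafe.Sizeof for a Point.
func sizeMatches() bool {
	return sizeFromDescriptor(Point{}) == unsafe.Sizeof(Point{})
}

func main() {
	p := &Point{3, 4}
	fmt.Println("data word is the pointer itself:", pointsAt(p))
	fmt.Println("same type, same descriptor:", words(p).typ == words(&Point{}).typ)
	fmt.Println("descriptor size matches Sizeof:", sizeMatches())
}
```

The conversion to interface{} happens at the call to words, which is the step that disappears when the function body is inlined by hand.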
Building your own reflection package
Can you build your own reflect package? So far my conclusion is yes, you can. At least for reading and writing the properties of structs it’s quite easy.
However, I’ve encountered some problems that I want to present here. First, the (long but minimal) code (mostly copy-pasted from reflect):
import (
	"testing"
	"unsafe" // also for linkname
)

const (
	Invalid Kind = iota
	Bool
	Int
	Int8
	Int16
	Int32
	Int64
	Uint
	Uint8
	Uint16
	Uint32
	Uint64
	Uintptr
	Float32
	Float64
	Complex64
	Complex128
	Array
	Chan
	Func
	Interface
	Map
	Ptr
	Slice
	String
	Struct
	UnsafePointer
)

const (
	tflagUncommon  tflag = 1 << 0
	tflagExtraStar tflag = 1 << 1
)

const (
	kindMask = (1 << 5) - 1
)
type (
	Kind    uint
	nameOff int32
	typeOff int32
	textOff int32
	tflag   uint8

	name struct {
		bytes *byte
	}

	uncommonType struct {
		pkgPath nameOff
		mcount  uint16
		_       uint16
		moff    uint32
		_       uint32
	}

	rtype struct {
		size       uintptr
		ptrdata    uintptr
		hash       uint32
		tflag      tflag
		align      uint8
		fieldAlign uint8
		kind       uint8
		alg        *typeAlg
		gcdata     *byte
		str        nameOff
		ptrToThis  typeOff
	}

	typeAlg struct {
		hash  func(unsafe.Pointer, uintptr) uintptr
		equal func(unsafe.Pointer, unsafe.Pointer) bool
	}

	method struct {
		name nameOff
		mtyp typeOff
		ifn  textOff
		tfn  textOff
	}

	structField struct {
		name       name
		typ        *rtype
		offsetAnon uintptr
	}

	structType struct {
		rtype   `reflect:"struct"`
		pkgPath name
		fields  []structField
	}

	emptyInterface struct {
		typ  *rtype
		word unsafe.Pointer
	}

	stringHeader struct {
		Data unsafe.Pointer
		Len  int
	}

	ptrType struct {
		rtype `reflect:"ptr"`
		elem  *rtype // pointer element (pointed at) type
	}
)
func resolveReflectName(n name) nameOff {
	return nameOff(addReflectOff(unsafe.Pointer(n.bytes)))
}

func add(p unsafe.Pointer, x uintptr) unsafe.Pointer {
	return unsafe.Pointer(uintptr(p) + x)
}

func fnv1(x uint32, list ...byte) uint32 {
	for _, b := range list {
		x = x*16777619 ^ uint32(b)
	}
	return x
}

func rtypeOff(section unsafe.Pointer, off int32) *rtype {
	return (*rtype)(add(section, uintptr(off)))
}

func typesByString(s string) []*rtype {
	sections, offset := typelinks()
	var ret []*rtype
	for offsI, offs := range offset {
		section := sections[offsI]
		i, j := 0, len(offs)
		for i < j {
			h := i + (j-i)/2
			if !(rtypeOff(section, offs[h]).String() >= s) {
				i = h + 1
			} else {
				j = h
			}
		}
		for j := i; j < len(offs); j++ {
			typ := rtypeOff(section, offs[j])
			if typ.String() != s {
				break
			}
			ret = append(ret, typ)
		}
	}
	return ret
}
func newName(n, tag string, exported bool) name {
	if len(n) > 1<<16-1 {
		panic("reflect.nameFrom: name too long: " + n)
	}
	if len(tag) > 1<<16-1 {
		panic("reflect.nameFrom: tag too long: " + tag)
	}
	var bits byte
	l := 1 + 2 + len(n)
	if exported {
		bits |= 1 << 0
	}
	if len(tag) > 0 {
		l += 2 + len(tag)
		bits |= 1 << 1
	}
	b := make([]byte, l)
	b[0] = bits
	b[1] = uint8(len(n) >> 8)
	b[2] = uint8(len(n))
	copy(b[3:], n)
	if len(tag) > 0 {
		tb := b[3+len(n):]
		tb[0] = uint8(len(tag) >> 8)
		tb[1] = uint8(len(tag))
		copy(tb[2:], tag)
	}
	return name{bytes: &b[0]}
}

func (n name) isExported() bool {
	return (*n.bytes)&(1<<0) != 0
}

func (n name) name() (s string) {
	if n.bytes == nil {
		panic("no name")
	}
	b := (*[4]byte)(unsafe.Pointer(n.bytes))
	hdr := (*stringHeader)(unsafe.Pointer(&s))
	hdr.Data = unsafe.Pointer(&b[3])
	hdr.Len = int(b[1])<<8 | int(b[2])
	return s
}

func (t *rtype) nameOff(off nameOff) name {
	return name{(*byte)(resolveNameOff(unsafe.Pointer(t), int32(off)))}
}

func (t *rtype) typeOff(off typeOff) *rtype {
	return (*rtype)(resolveTypeOff(unsafe.Pointer(t), int32(off)))
}

func (t *rtype) Kind() Kind { return Kind(t.kind & kindMask) }

func (t *rtype) String() string {
	s := t.nameOff(t.str).name()
	if t.tflag&tflagExtraStar != 0 {
		return s[1:]
	}
	return s
}

func (t *uncommonType) methods() []method {
	if t.mcount == 0 {
		panic("zero methods")
	}
	return (*[1 << 16]method)(add(unsafe.Pointer(t), uintptr(t.moff)))[:t.mcount:t.mcount]
}
func (t *rtype) uncommon() *uncommonType {
	if t.tflag&tflagUncommon == 0 {
		return nil
	}
	if t.Kind() != Struct && t.Kind() != Ptr {
		panic("not struct or pointer")
	}
	ptrToT := unsafe.Pointer(t)
	switch t.Kind() {
	case Struct:
		type u struct {
			structType
			u uncommonType
		}
		return &(*u)(ptrToT).u
	case Ptr:
		type u struct {
			ptrType
			u uncommonType
		}
		return &(*u)(ptrToT).u
	default:
		type u struct {
			rtype
			u uncommonType
		}
		return &(*u)(ptrToT).u
	}
}

func (t *rtype) exportedMethods() []method {
	ut := t.uncommon()
	if ut == nil {
		return nil
	}
	allMethods := ut.methods()
	allExported := true
	for _, method := range allMethods {
		name := t.nameOff(method.name)
		if !name.isExported() {
			allExported = false
			break
		}
	}
	var methods []method
	if allExported {
		methods = allMethods
	} else {
		methods = make([]method, 0, len(allMethods))
		for _, m := range allMethods {
			name := t.nameOff(m.name)
			if name.isExported() {
				methods = append(methods, m)
			}
		}
		methods = methods[:len(methods):len(methods)]
	}
	return methods
}

func (t *rtype) ptrTo() *rtype {
	if t.ptrToThis != 0 {
		return t.typeOff(t.ptrToThis)
	}
	s := "*" + t.String()
	for _, tt := range typesByString(s) {
		p := (*ptrType)(unsafe.Pointer(tt))
		if p.elem != t {
			continue
		}
		return &p.rtype
	}
	var iptr interface{} = (*unsafe.Pointer)(nil)
	prototype := *(**ptrType)(unsafe.Pointer(&iptr))
	pp := *prototype
	pp.str = resolveReflectName(newName(s, "", false))
	pp.ptrToThis = 0
	pp.hash = fnv1(t.hash, '*')
	pp.elem = t
	return &pp.rtype
}

func TypeOf(i interface{}) *rtype {
	return (*(*emptyInterface)(unsafe.Pointer(&i))).typ.ptrTo()
}
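The byte layout that newName writes and name() reads back can be sketched in safe Go. This follows the encoding shown in the code above (one flag byte, a two-byte big-endian length, then the name bytes); later Go releases may encode names differently. encodeName and decodeName are made-up names, and tags are left out to keep the sketch short.

```go
package main

import "fmt"

// encodeName lays out a name as: flag byte, big-endian uint16 length, name bytes.
// Bit 0 of the flag byte marks an exported name.
func encodeName(n string, exported bool) []byte {
	if len(n) > 1<<16-1 {
		panic("name too long")
	}
	b := make([]byte, 3+len(n))
	if exported {
		b[0] |= 1 << 0
	}
	b[1] = uint8(len(n) >> 8)
	b[2] = uint8(len(n))
	copy(b[3:], n)
	return b
}

// decodeName reverses encodeName with plain slicing instead of unsafe.
func decodeName(b []byte) (n string, exported bool) {
	length := int(b[1])<<8 | int(b[2])
	return string(b[3 : 3+length]), b[0]&(1<<0) != 0
}

func main() {
	b := encodeName("Dist", true)
	fmt.Println(b)
	fmt.Println(decodeName(b))
}
```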
Of course, to run the tests, we have to create an empty.s file in the same folder and add the linkname directives for the following functions:
//go:linkname resolveTypeOff runtime.resolveTypeOff
func resolveTypeOff(rtype unsafe.Pointer, off int32) unsafe.Pointer
//go:linkname resolveNameOff runtime.resolveNameOff
func resolveNameOff(ptrInModule unsafe.Pointer, off int32) unsafe.Pointer
//go:linkname typelinks reflect.typelinks
func typelinks() (sections []unsafe.Pointer, offset [][]int32)
//go:linkname addReflectOff reflect.addReflectOff
func addReflectOff(ptr unsafe.Pointer) int32
On the Point struct declared above, we’re adding the following methods:
func (p Point) AnotherMethod(scale int) int {
	return -1
}

func (p Point) Dist(scale int) int {
	return p.x*p.x*scale + p.y*p.y*scale
}

func (p Point) NoArgs() {
	println("NoArgs called.")
}

func (p Point) TotalDist(points ...Point) int {
	tot := 0
	for _, q := range points {
		dx := q.x - p.x
		dy := q.y - p.y
		tot += dx*dx + dy*dy
	}
	return tot
}

func (p Point) NoArgsButReturn() string {
	return "something"
}
And finally, the test:
func TestMethod(t *testing.T) {
	p := Point{3, 4}
	pType := TypeOf(p)
	t.Logf("%v", pType)
	methods := pType.exportedMethods()
	for idx, method := range methods {
		name := pType.nameOff(method.name)
		typ := pType.typeOff(method.mtyp)
		t.Logf("%d : Method %q %v %v %v\n", idx, name.name(), typ, method.tfn, method.ifn)
	}
}
When we run this test, we’re going to see that the method signatures are reported differently from what we’ve declared. This means we are not doing something that the reflect package does.
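For comparison, this is what the standard reflect package reports for the same Point methods. A minimal sketch; methodNames is a made-up helper, and NoArgs is given an empty body here for brevity.

```go
package main

import (
	"fmt"
	"reflect"
)

type Point struct {
	x, y int
}

func (p Point) AnotherMethod(scale int) int { return -1 }
func (p Point) Dist(scale int) int          { return p.x*p.x*scale + p.y*p.y*scale }
func (p Point) NoArgs()                     {}
func (p Point) NoArgsButReturn() string     { return "something" }
func (p Point) TotalDist(points ...Point) int {
	tot := 0
	for _, q := range points {
		dx, dy := q.x-p.x, q.y-p.y
		tot += dx*dx + dy*dy
	}
	return tot
}

// methodNames lists the exported methods of v as seen by the standard reflect package.
func methodNames(v interface{}) []string {
	typ := reflect.TypeOf(v)
	names := make([]string, 0, typ.NumMethod())
	for i := 0; i < typ.NumMethod(); i++ {
		names = append(names, typ.Method(i).Name)
	}
	return names
}

func main() {
	typ := reflect.TypeOf(Point{})
	for i := 0; i < typ.NumMethod(); i++ {
		m := typ.Method(i)
		// m.Type is the signature with the receiver as first argument.
		fmt.Printf("%d : Method %q %v\n", i, m.Name, m.Type)
	}
	fmt.Println(methodNames(Point{}))
}
```

reflect lists methods sorted by name, so the order differs from the declaration order.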
Our version of the TypeOf function doesn’t return an interface; also, that interface is built by calling the ptrTo() method of the rtype. However, even with that code added, the problem still doesn’t get fixed.
Adding the following code fixes the test (the signatures are correct).
type dummy struct{}
func (d dummy) A() {}
var _ = reflect.TypeOf(dummy{}).Method(0)
It seems that the function func addReflectOff(ptr unsafe.Pointer) int32, which is implemented in the runtime package, gets called from the reflect package, which creates reflectOffs structs for later lookups. We need to force the compiler to allow us to use the same functions as reflect does. Since we’re not using reflect anywhere, dead code removal does not allow us to initialize properly, so we need to force it.
Indeed, we’re importing reflect to write our own reflect, but we’re not using it for anything other than a dumb init.
In the larger version (my own version of reflect), all Value.Call() tests were failing with a segmentation fault, and for a reason (method types were zero), even though the code was the same as in the reflect package. For this reason I’ve presented you with this small test and its conclusions.
Conclusion
It took me four days to learn the internals and modify the reflect package for my needs, but in the end I’ve done it, and later I will probably integrate it into the reflector package.
Probably the lack of documentation made things harder to understand and follow. Probably some things are never meant to be that public, due to some sort of programming language politics. Who knows, but mostly, who cares?
I encourage you to take my advice and break things so you can learn how they work and how other developers solved problems that you cannot think about while just reading the code.