Map is not the fastest in Go


I wrote a bioinformatics package in Go that uses a function to check whether a letter (a byte) is a valid DNA/RNA/protein letter.

The easy way is to store the letters of the alphabet in a map and check whether a given letter exists in it. However, when I profiled the code with go tool pprof, I found that the map's lookup and hash functions (mapaccess2, memhash8, memhash) took a large share of the running time (see figure below).
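A minimal sketch of the map approach, assuming a small DNA alphabet. The names dnaLetters, letterSet and isValidMap are hypothetical, chosen for illustration and not taken from the package:

```go
package main

import "fmt"

// dnaLetters is a hypothetical alphabet used only for illustration.
var dnaLetters = []byte("ACGTNacgtn")

// letterSet holds every valid letter as a map key.
var letterSet = func() map[byte]struct{} {
	m := make(map[byte]struct{}, len(dnaLetters))
	for _, b := range dnaLetters {
		m[b] = struct{}{}
	}
	return m
}()

// isValidMap reports whether b is in the alphabet.
// Every call goes through the runtime's map lookup, which hashes the key.
func isValidMap(b byte) bool {
	_, ok := letterSet[b]
	return ok
}

func main() {
	fmt.Println(isValidMap('A'), isValidMap('X')) // true false
}
```

The lookup is a single line, but each call pays for hashing the key, which is the cost that shows up in the profile.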

Then I found a faster way: store the letters in a slice, placing each letter (byte) at index int(letter). To check a letter, just read slice[int(letter)]; a non-zero value means the letter is valid.
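The slice lookup described above can be sketched as follows. This assumes a 256-entry table so that every possible byte value is a valid index; buildTable, dnaTable and isValidSlice are hypothetical names, not the ones used in the package:

```go
package main

import "fmt"

// buildTable returns a 256-entry slice where table[int(letter)] holds the
// letter itself for every letter in alphabet, and 0 everywhere else.
func buildTable(alphabet []byte) []byte {
	table := make([]byte, 256) // one slot per possible byte value
	for _, b := range alphabet {
		table[int(b)] = b
	}
	return table
}

// dnaTable is built once from a hypothetical DNA alphabet.
var dnaTable = buildTable([]byte("ACGTNacgtn"))

// isValidSlice reports whether b is in the alphabet with one index operation.
func isValidSlice(b byte) bool {
	return dnaTable[int(b)] != 0
}

func main() {
	fmt.Println(isValidSlice('a'), isValidSlice('x')) // true false
}
```

One limitation of this sketch: because 0 marks an empty slot, the zero byte itself can never be treated as a valid letter. A []bool table would avoid that if it ever mattered.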

[Update 2016-06-02] Two switch versions were also tested. They were faster than the map version but still slower than the slice version. In addition, their speed depended on the number of case clauses in the switch: the larger the alphabet, the slower they ran.
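For comparison, one possible shape of a switch-based check is sketched below. The two variants actually benchmarked may have been written differently; isValidSwitch and its letter list are assumptions made for illustration:

```go
package main

import "fmt"

// isValidSwitch reports whether b is in a hypothetical DNA alphabet.
// A larger alphabet means more values listed in the case clause.
func isValidSwitch(b byte) bool {
	switch b {
	case 'A', 'C', 'G', 'T', 'N', 'a', 'c', 'g', 't', 'n':
		return true
	}
	return false
}

func main() {
	fmt.Println(isValidSwitch('G'), isValidSwitch('Z')) // true false
}
```

Unlike the slice version, the alphabet here is fixed at compile time, so supporting a new alphabet means writing another function.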

See the benchmark result:

Test                                                    Iterations  Time/operation
BenchmarkCheckLetterWithMap-4                           2000000000  0.18 ns/op
BenchmarkCheckLetterWithSwitch-4                        1000000000  0.02 ns/op
BenchmarkCheckLetterWithSwitchWithLargerAlphabetSize-4  1000000000  0.03 ns/op
BenchmarkCheckLetterWithSlice-4                         2000000000  0.01 ns/op

[Figure: CPU profile]

source code: checkLetter_test.go