This repository has been archived by the owner on Jan 2, 2019. It is now read-only.

intrinsic

Provides native Go SIMD intrinsics for the x86/amd64 platform.

  • SSE2 godoc reference
  • SSE3 godoc reference
  • SSSE3 godoc reference
  • SSE41 godoc reference
  • SSE42 godoc reference

Usage

package main

import (
    "fmt"

    "github.com/mengzhuo/intrinsic/sse2"
)

func main() {
    src := []float32{3.14, 2.17}
    dst := []float32{2.17, 3.15}
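    // The wrapper is named after the MAXSD instruction; judging by the
    // output below, the result is written back into src.
    sse2.MAXSDm64float32(src, dst)
    fmt.Print(src, dst) // [2.17 3.15] [2.17 3.15]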
}

Benchmarks

In the benchmarks below, the SSE2 intrinsics are roughly 6x faster than the "General" baseline implementations (about 2.6 ns/op versus about 15.6 ns/op).

BenchmarkPMINUBByte-4         	1000000000	         2.65 ns/op	       0 B/op	       0 allocs/op
BenchmarkGeneralPMINUBByte-4   	100000000	        15.8 ns/op	       0 B/op	       0 allocs/op
BenchmarkPAND-4               	1000000000	         2.61 ns/op	       0 B/op	       0 allocs/op
BenchmarkGeneralAND-4         	100000000	        15.4 ns/op	       0 B/op	       0 allocs/op
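
The README does not show the "General" baselines, so the following is only a minimal sketch of what such scalar loops might look like. The names generalPMINUB and generalAND are hypothetical and are not taken from this repository. PMINUB computes the unsigned byte-wise minimum and PAND the bitwise AND; a single SSE2 instruction handles 16 bytes at once, while a scalar loop handles one byte per iteration.

```go
package main

import "fmt"

// generalPMINUB is a hypothetical scalar baseline (not from this repository).
// It stores the unsigned byte-wise minimum of dst and src into dst, which is
// the operation the PMINUB instruction performs on 16 bytes at a time.
func generalPMINUB(dst, src []byte) {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	for i := 0; i < n; i++ {
		if src[i] < dst[i] {
			dst[i] = src[i]
		}
	}
}

// generalAND is a hypothetical scalar baseline for PAND: it stores the
// bitwise AND of dst and src into dst, one byte per iteration.
func generalAND(dst, src []byte) {
	n := len(dst)
	if len(src) < n {
		n = len(src)
	}
	for i := 0; i < n; i++ {
		dst[i] &= src[i]
	}
}

func main() {
	a := []byte{1, 200, 3, 40}
	b := []byte{9, 100, 3, 7}
	generalPMINUB(a, b)
	fmt.Println(a) // [1 100 3 7]

	x := []byte{0xF0, 0x0F}
	y := []byte{0xAA, 0xAA}
	generalAND(x, y)
	fmt.Println(x) // [160 10]
}
```

Loops like these are the natural comparison point for the intrinsic wrappers, since each iteration does the work of one SIMD lane.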

Development

All code in the subdirectories is generated by scanner.go; see the Makefile for details.

x86.csv and x86desc.csv come from a separate repository: https://github.com/mengzhuo/x86data

TODO

  • Resolve code generation for opcodes that take an immediate operand
  • SSE2 gen=80, total=141, ratio=56.74%
  • SSE3 gen=6, total=10, ratio=60.00%
  • SSSE3 gen=15, total=32, ratio=46.88%
  • SSE4_1 gen=26, total=49, ratio=53.06%
  • SSE4_2 gen=1, total=5, ratio=20.00%
  • AVX gen=66, total=378, ratio=17.46%
  • AVX2 gen=8, total=159, ratio=5.03%
  • FMA
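
The ratio in each TODO entry is the generated count divided by the total count, expressed as a percentage. As a small sketch (coverageLine is a hypothetical helper, not part of scanner.go), the entries above can be reproduced like this:

```go
package main

import "fmt"

// coverageLine is a hypothetical helper (not part of this repository) that
// formats one TODO entry: gen is the number of generated intrinsics and
// total is the number of instructions in that extension.
func coverageLine(name string, gen, total int) string {
	ratio := float64(gen) * 100 / float64(total)
	return fmt.Sprintf("%s gen=%d, total=%d, ratio=%.2f%%", name, gen, total, ratio)
}

func main() {
	fmt.Println(coverageLine("SSE2", 80, 141))  // SSE2 gen=80, total=141, ratio=56.74%
	fmt.Println(coverageLine("SSE4_2", 1, 5))   // SSE4_2 gen=1, total=5, ratio=20.00%
	fmt.Println(coverageLine("AVX2", 8, 159))   // AVX2 gen=8, total=159, ratio=5.03%
}
```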