Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

instructions: GFNI #335

Closed
klauspost opened this issue Nov 14, 2022 · 5 comments
Closed

instructions: GFNI #335

klauspost opened this issue Nov 14, 2022 · 5 comments
Labels
instructions Instruction database

Comments

@klauspost
Copy link
Contributor

klauspost commented Nov 14, 2022

I would like to use GFNI for my reedsolomon package and they have a variety of other interesting uses

It seems like the Go Assembler supports the instructions, but avo doesn't.

It seems like they are missing from https://github.com/Maratyszcza/Opcodes

I wouldn't mind sending a PR, but without an upstream change I don't see any "sustainable" way to add them.

@mmcloughlin
Copy link
Owner

Ah, yes this would be awesome, and should be supported by avo.

I wish the fix were simpler. As you can tell, I've been procrastinating on these kinds of things because fixing it is either nasty or a significant undertaking. Options are basically:

(1) Maintain a fork of Opcodes database or a collection of patch files. Potentially contribute these back to upstream.

(2) Add the instructions during the loading phase. That would be done here:

avo/internal/load/tables.go

Lines 146 to 178 in fa3cfb0

// extras is simply a list of extra instructions to add to the database.
var extras = []*inst.Instruction{
// MOVLQZX does not appear in either x86 CSV or Opcodes, but does appear in stdlib assembly.
//
// Reference: https://github.com/golang/go/blob/048c9164a0c5572df18325e377473e7893dbfb07/src/runtime/asm_amd64.s#L451-L453
//
// TEXT ·reflectcall(SB), NOSPLIT, $0-32
// MOVLQZX argsize+24(FP), CX
// DISPATCH(runtime·call32, 32)
//
// Reference: https://github.com/golang/go/blob/048c9164a0c5572df18325e377473e7893dbfb07/src/cmd/internal/obj/x86/asm6.go#L1217
//
// {AMOVLQZX, yml_rl, Px, opBytes{0x8b}},
//
// Reference: https://github.com/golang/go/blob/048c9164a0c5572df18325e377473e7893dbfb07/src/cmd/internal/obj/x86/asm6.go#L515-L517
//
// var yml_rl = []ytab{
// {Zm_r, 1, argList{Yml, Yrl}},
// }
//
{
Opcode: "MOVLQZX",
Summary: "Move with Zero-Extend",
Forms: []inst.Form{
{
Operands: []inst.Operand{
{Type: "m32", Action: inst.R},
{Type: "r64", Action: inst.W},
},
},
},
},
}

(3) Generate the instructions database from a better source. Intel XED would be the best starting point #23. I've been meaning to do this since the beginning, as you can see from the issue number. Intel XED is unwieldy so it's not a trivial project. There are other third-party databases that are more up-to-date and easier to use, such as asmjit https://github.com/asmjit/asmdb.

Since GFNI is only three instructions, maybe we can support it with option (1) or (2), and replace it with (3) when I finally get around to it.

@mmcloughlin mmcloughlin added the instructions Instruction database label Nov 27, 2022
@mmcloughlin mmcloughlin changed the title GFNI Instructions instructions: GFNI Nov 27, 2022
@klauspost
Copy link
Contributor Author

klauspost commented Nov 27, 2022

I am mostly leaning towards 2) I don't feel like forking the XML is making things that much easier. I missed the 'extras' when looking though the code, and it seems like a nice, clean way to extend functionality safely - at least for a limited set of instructions.

I will try it out and report back either here or with a PR.

klauspost added a commit to klauspost/avo that referenced this issue Nov 27, 2022
GFNI instructions added as 'extras'.

See mmcloughlin#335
@mmcloughlin
Copy link
Owner

They're defined in the Go optabs:

	{as: AVGF2P8AFFINEINVQB, ytab: _yvgf2p8affineinvqb, prefix: Pavx, op: opBytes{
		avxEscape | vex128 | vex66 | vex0F3A | vexW1, 0xCF,
		avxEscape | vex256 | vex66 | vex0F3A | vexW1, 0xCF,
		avxEscape | evex128 | evex66 | evex0F3A | evexW1, evexN16 | evexBcstN8 | evexZeroingEnabled, 0xCF,
		avxEscape | evex256 | evex66 | evex0F3A | evexW1, evexN32 | evexBcstN8 | evexZeroingEnabled, 0xCF,
		avxEscape | evex512 | evex66 | evex0F3A | evexW1, evexN64 | evexBcstN8 | evexZeroingEnabled, 0xCF,
	}},
	{as: AVGF2P8AFFINEQB, ytab: _yvgf2p8affineinvqb, prefix: Pavx, op: opBytes{
		avxEscape | vex128 | vex66 | vex0F3A | vexW1, 0xCE,
		avxEscape | vex256 | vex66 | vex0F3A | vexW1, 0xCE,
		avxEscape | evex128 | evex66 | evex0F3A | evexW1, evexN16 | evexBcstN8 | evexZeroingEnabled, 0xCE,
		avxEscape | evex256 | evex66 | evex0F3A | evexW1, evexN32 | evexBcstN8 | evexZeroingEnabled, 0xCE,
		avxEscape | evex512 | evex66 | evex0F3A | evexW1, evexN64 | evexBcstN8 | evexZeroingEnabled, 0xCE,
	}},
	{as: AVGF2P8MULB, ytab: _yvandnpd, prefix: Pavx, op: opBytes{
		avxEscape | vex128 | vex66 | vex0F38 | vexW0, 0xCF,
		avxEscape | vex256 | vex66 | vex0F38 | vexW0, 0xCF,
		avxEscape | evex128 | evex66 | evex0F38 | evexW0, evexN16 | evexZeroingEnabled, 0xCF,
		avxEscape | evex256 | evex66 | evex0F38 | evexW0, evexN32 | evexZeroingEnabled, 0xCF,
		avxEscape | evex512 | evex66 | evex0F38 | evexW0, evexN64 | evexZeroingEnabled, 0xCF,

https://github.com/golang/go/blob/go1.19.3/src/cmd/internal/obj/x86/avx_optabs.go#L2250-L2269

The operand tables are:

var _yvgf2p8affineinvqb = []ytab{
	{zcase: Zvex_i_rm_v_r, zoffset: 2, args: argList{Yu8, Yxm, Yxr, Yxr}},
	{zcase: Zvex_i_rm_v_r, zoffset: 2, args: argList{Yu8, Yym, Yyr, Yyr}},
	{zcase: Zevex_i_rm_v_r, zoffset: 0, args: argList{Yu8, YxmEvex, YxrEvex, YxrEvex}},
	{zcase: Zevex_i_rm_v_k_r, zoffset: 3, args: argList{Yu8, YxmEvex, YxrEvex, Yknot0, YxrEvex}},
	{zcase: Zevex_i_rm_v_r, zoffset: 0, args: argList{Yu8, YymEvex, YyrEvex, YyrEvex}},
	{zcase: Zevex_i_rm_v_k_r, zoffset: 3, args: argList{Yu8, YymEvex, YyrEvex, Yknot0, YyrEvex}},
	{zcase: Zevex_i_rm_v_r, zoffset: 0, args: argList{Yu8, Yzm, Yzr, Yzr}},
	{zcase: Zevex_i_rm_v_k_r, zoffset: 3, args: argList{Yu8, Yzm, Yzr, Yknot0, Yzr}},
}

https://github.com/golang/go/blob/go1.19.3/src/cmd/internal/obj/x86/avx_optabs.go#L483-L492

var _yvandnpd = []ytab{
	{zcase: Zvex_rm_v_r, zoffset: 2, args: argList{Yxm, Yxr, Yxr}},
	{zcase: Zvex_rm_v_r, zoffset: 2, args: argList{Yym, Yyr, Yyr}},
	{zcase: Zevex_rm_v_r, zoffset: 0, args: argList{YxmEvex, YxrEvex, YxrEvex}},
	{zcase: Zevex_rm_v_k_r, zoffset: 3, args: argList{YxmEvex, YxrEvex, Yknot0, YxrEvex}},
	{zcase: Zevex_rm_v_r, zoffset: 0, args: argList{YymEvex, YyrEvex, YyrEvex}},
	{zcase: Zevex_rm_v_k_r, zoffset: 3, args: argList{YymEvex, YyrEvex, Yknot0, YyrEvex}},
	{zcase: Zevex_rm_v_r, zoffset: 0, args: argList{Yzm, Yzr, Yzr}},
	{zcase: Zevex_rm_v_k_r, zoffset: 3, args: argList{Yzm, Yzr, Yknot0, Yzr}},
}

https://github.com/golang/go/blob/go1.19.3/src/cmd/internal/obj/x86/avx_optabs.go#L137-L146

@mmcloughlin
Copy link
Owner

@mmcloughlin
Copy link
Owner

mmcloughlin commented Nov 27, 2022

The other factor here is the way avo handles AVX-512 instructions. In particular, merging, zeroing, broadcast, etc result in the duplication of instruction forms with slight modifications.

avo/internal/load/load.go

Lines 377 to 393 in fa3cfb0

// Apply modification stages to produce final list of forms.
stages := []func(string, inst.Form) []inst.Form{
avx512rounding,
avx512sae,
avx512bcst,
avx512masking,
avx512zeroing,
}
forms := []inst.Form{form}
for _, stage := range stages {
var next []inst.Form
for _, f := range forms {
next = append(next, stage(opcode, f)...)
}
forms = next
}

I see you did this manually for broadcast forms, but I think zeroing forms might need to be handled too.

I think it will be more robust if I make a change so the AVX-512 post-processing happens after extras are added to the list. Then we can ensure they're handled the same way.

mmcloughlin pushed a commit that referenced this issue Nov 27, 2022
GFNI instructions added as 'extras'.

See #335
mmcloughlin added a commit that referenced this issue Nov 28, 2022
Supporting extra instructions not included in the Opcodes database is
currently a challenge. Short of migrating to an entirely different source
(such as #23), the options are either to patch the XML data file or to append
additional instructions at the loading phase.

An example of patching the XML was shown in the as-yet unlanded PR #234. This
shows the XML patching approach is unwieldy and requires more information than
we actually need (for example instruction form encodings).

In #335 we discussed the alternative of adding extra instructions during
loading. This has the advantage of using avo's simpler internal data
structure.

This PR prepares for using that approach by adding an `internal/opcodesextra`
package, intended to contain manually curated lists of extra instructions to
add to the instruction database during loading. At the moment, the only
instruction added here is the `MOVLQZX` instruction that's already handled
this way.

Updates #335 #234 #23
mmcloughlin pushed a commit that referenced this issue Nov 28, 2022
GFNI instructions added as 'extras'.

See #335
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
instructions Instruction database
Projects
None yet
Development

No branches or pull requests

2 participants