-
Notifications
You must be signed in to change notification settings - Fork 89
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
instructions: GFNI #335
Comments
Ah, yes this would be awesome, and should be supported by I wish the fix were simpler. As you can tell, I've been procrastinating on these kinds of things because fixing it is either nasty or a significant undertaking. Options are basically: (1) Maintain a fork of Opcodes database or a collection of patch files. Potentially contribute these back to upstream. (2) Add the instructions during the loading phase. That would be done here: Lines 146 to 178 in fa3cfb0
(3) Generate the instructions database from a better source. Intel XED would be the best starting point #23. I've been meaning to do this since the beginning, as you can see from the issue number. Intel XED is unwieldy so it's not a trivial project. There are other third-party databases that are more up-to-date and easier to use, such as Since GFNI is only three instructions, maybe we can support it with option (1) or (2), and replace it with (3) when I finally get around to it. |
I am mostly leaning towards 2) I don't feel like forking the XML is making things that much easier. I missed the 'extras' when looking though the code, and it seems like a nice, clean way to extend functionality safely - at least for a limited set of instructions. I will try it out and report back either here or with a PR. |
GFNI instructions added as 'extras'. See mmcloughlin#335
They're defined in the Go optabs: {as: AVGF2P8AFFINEINVQB, ytab: _yvgf2p8affineinvqb, prefix: Pavx, op: opBytes{
avxEscape | vex128 | vex66 | vex0F3A | vexW1, 0xCF,
avxEscape | vex256 | vex66 | vex0F3A | vexW1, 0xCF,
avxEscape | evex128 | evex66 | evex0F3A | evexW1, evexN16 | evexBcstN8 | evexZeroingEnabled, 0xCF,
avxEscape | evex256 | evex66 | evex0F3A | evexW1, evexN32 | evexBcstN8 | evexZeroingEnabled, 0xCF,
avxEscape | evex512 | evex66 | evex0F3A | evexW1, evexN64 | evexBcstN8 | evexZeroingEnabled, 0xCF,
}},
{as: AVGF2P8AFFINEQB, ytab: _yvgf2p8affineinvqb, prefix: Pavx, op: opBytes{
avxEscape | vex128 | vex66 | vex0F3A | vexW1, 0xCE,
avxEscape | vex256 | vex66 | vex0F3A | vexW1, 0xCE,
avxEscape | evex128 | evex66 | evex0F3A | evexW1, evexN16 | evexBcstN8 | evexZeroingEnabled, 0xCE,
avxEscape | evex256 | evex66 | evex0F3A | evexW1, evexN32 | evexBcstN8 | evexZeroingEnabled, 0xCE,
avxEscape | evex512 | evex66 | evex0F3A | evexW1, evexN64 | evexBcstN8 | evexZeroingEnabled, 0xCE,
}},
{as: AVGF2P8MULB, ytab: _yvandnpd, prefix: Pavx, op: opBytes{
avxEscape | vex128 | vex66 | vex0F38 | vexW0, 0xCF,
avxEscape | vex256 | vex66 | vex0F38 | vexW0, 0xCF,
avxEscape | evex128 | evex66 | evex0F38 | evexW0, evexN16 | evexZeroingEnabled, 0xCF,
avxEscape | evex256 | evex66 | evex0F38 | evexW0, evexN32 | evexZeroingEnabled, 0xCF,
avxEscape | evex512 | evex66 | evex0F38 | evexW0, evexN64 | evexZeroingEnabled, 0xCF, The operand tables are: var _yvgf2p8affineinvqb = []ytab{
{zcase: Zvex_i_rm_v_r, zoffset: 2, args: argList{Yu8, Yxm, Yxr, Yxr}},
{zcase: Zvex_i_rm_v_r, zoffset: 2, args: argList{Yu8, Yym, Yyr, Yyr}},
{zcase: Zevex_i_rm_v_r, zoffset: 0, args: argList{Yu8, YxmEvex, YxrEvex, YxrEvex}},
{zcase: Zevex_i_rm_v_k_r, zoffset: 3, args: argList{Yu8, YxmEvex, YxrEvex, Yknot0, YxrEvex}},
{zcase: Zevex_i_rm_v_r, zoffset: 0, args: argList{Yu8, YymEvex, YyrEvex, YyrEvex}},
{zcase: Zevex_i_rm_v_k_r, zoffset: 3, args: argList{Yu8, YymEvex, YyrEvex, Yknot0, YyrEvex}},
{zcase: Zevex_i_rm_v_r, zoffset: 0, args: argList{Yu8, Yzm, Yzr, Yzr}},
{zcase: Zevex_i_rm_v_k_r, zoffset: 3, args: argList{Yu8, Yzm, Yzr, Yknot0, Yzr}},
} var _yvandnpd = []ytab{
{zcase: Zvex_rm_v_r, zoffset: 2, args: argList{Yxm, Yxr, Yxr}},
{zcase: Zvex_rm_v_r, zoffset: 2, args: argList{Yym, Yyr, Yyr}},
{zcase: Zevex_rm_v_r, zoffset: 0, args: argList{YxmEvex, YxrEvex, YxrEvex}},
{zcase: Zevex_rm_v_k_r, zoffset: 3, args: argList{YxmEvex, YxrEvex, Yknot0, YxrEvex}},
{zcase: Zevex_rm_v_r, zoffset: 0, args: argList{YymEvex, YyrEvex, YyrEvex}},
{zcase: Zevex_rm_v_k_r, zoffset: 3, args: argList{YymEvex, YyrEvex, Yknot0, YyrEvex}},
{zcase: Zevex_rm_v_r, zoffset: 0, args: argList{Yzm, Yzr, Yzr}},
{zcase: Zevex_rm_v_k_r, zoffset: 3, args: argList{Yzm, Yzr, Yknot0, Yzr}},
} |
See also the auto-generated test cases: |
The other factor here is the way Lines 377 to 393 in fa3cfb0
I see you did this manually for broadcast forms, but I think zeroing forms might need to be handled too. I think it will be more robust if I make a change so the AVX-512 post-processing happens after extras are added to the list. Then we can ensure they're handled the same way. |
GFNI instructions added as 'extras'. See #335
Supporting extra instructions not included in the Opcodes database is currently a challenge. Short of migrating to an entirely different source (such as #23), the options are either to patch the XML data file or to append additional instructions at the loading phase. An example of patching the XML was shown in the as-yet unlanded PR #234. This shows the XML patching approach is unwieldy and requires more information than we actually need (for example instruction form encodings). In #335 we discussed the alternative of adding extra instructions during loading. This has the advantage of using avo's simpler internal data structure. This PR prepares for using that approach by adding an `internal/opcodesextra` package, intended to contain manually curated lists of extra instructions to add to the instruction database during loading. At the moment, the only instruction added here is the `MOVLQZX` instruction that's already handled this way. Updates #335 #234 #23
GFNI instructions added as 'extras'. See #335
I would like to use GFNI for my reedsolomon package and they have a variety of other interesting uses
It seems like the Go Assembler supports the instructions, but avo doesn't.
It seems like they are missing from https://github.com/Maratyszcza/Opcodes
I wouldn't mind sending a PR, but without an upstream change I don't see any "sustainable" way to add them.
The text was updated successfully, but these errors were encountered: