[X86] Support EGPR (R16-R31) for APX #67702

Merged
merged 4 commits into from
Oct 10, 2023
37 changes: 37 additions & 0 deletions llvm/lib/Target/X86/MCTargetDesc/X86BaseInfo.h
@@ -1208,6 +1208,43 @@ namespace X86II {
return false;
}

inline bool canUseApxExtendedReg(const MCInstrDesc &Desc) {
  uint64_t TSFlags = Desc.TSFlags;
  uint64_t Encoding = TSFlags & EncodingMask;
  // EVEX can always use egpr.
  if (Encoding == X86II::EVEX)
    return true;

  // To be conservative, egpr is not used for all pseudo instructions
  // because we are not sure what instruction it will become.
  // FIXME: Could we improve it in X86ExpandPseudo?
  if (isPseudo(TSFlags))
    return false;

  // MAP OB/TB in legacy encoding space can always use egpr except
  // XSAVE*/XRSTOR*.
  unsigned Opcode = Desc.Opcode;
  switch (Opcode) {
  default:
    break;
  case X86::XSAVE:
  case X86::XSAVE64:
  case X86::XSAVEOPT:
  case X86::XSAVEOPT64:
  case X86::XSAVEC:
  case X86::XSAVEC64:
  case X86::XSAVES:
  case X86::XSAVES64:
  case X86::XRSTOR:
  case X86::XRSTOR64:
  case X86::XRSTORS:
  case X86::XRSTORS64:
    return false;
  }
  uint64_t OpMap = TSFlags & X86II::OpMapMask;
  return !Encoding && (OpMap == X86II::OB || OpMap == X86II::TB);
}

/// \returns true if the MemoryOperand is a 32 extended (zmm16 or higher)
/// registers, e.g. zmm21, etc.
static inline bool is32ExtendedReg(unsigned RegNo) {
2 changes: 2 additions & 0 deletions llvm/lib/Target/X86/X86.td
@@ -331,6 +331,8 @@ def FeatureMOVDIRI : SubtargetFeature<"movdiri", "HasMOVDIRI", "true",
"Support movdiri instruction (direct store integer)">;
def FeatureMOVDIR64B : SubtargetFeature<"movdir64b", "HasMOVDIR64B", "true",
"Support movdir64b instruction (direct store 64 bytes)">;
def FeatureEGPR : SubtargetFeature<"egpr", "HasEGPR", "true",
"Support extended general purpose register">;

// Ivy Bridge and newer processors have enhanced REP MOVSB and STOSB (aka
// "string operations"). See "REP String Enhancement" in the Intel Software
31 changes: 31 additions & 0 deletions llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -92,6 +92,37 @@ X86InstrInfo::X86InstrInfo(X86Subtarget &STI)
Subtarget(STI), RI(STI.getTargetTriple()) {
}

const TargetRegisterClass *
X86InstrInfo::getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
                          const TargetRegisterInfo *TRI,
                          const MachineFunction &MF) const {
  auto *RC = TargetInstrInfo::getRegClass(MCID, OpNum, TRI, MF);
  // If the target does not have egpr, then r16-r31 will be reserved for all
  // instructions.
  if (!RC || !Subtarget.hasEGPR())
    return RC;

  if (X86II::canUseApxExtendedReg(MCID))
    return RC;

  switch (RC->getID()) {
  default:
    return RC;
  case X86::GR8RegClassID:
    return &X86::GR8_NOREX2RegClass;
  case X86::GR16RegClassID:
    return &X86::GR16_NOREX2RegClass;
  case X86::GR32RegClassID:
    return &X86::GR32_NOREX2RegClass;
  case X86::GR64RegClassID:
    return &X86::GR64_NOREX2RegClass;
  case X86::GR32_NOSPRegClassID:
    return &X86::GR32_NOREX2_NOSPRegClass;
  case X86::GR64_NOSPRegClassID:
    return &X86::GR64_NOREX2_NOSPRegClass;
  }
}

bool
X86InstrInfo::isCoalescableExtInstr(const MachineInstr &MI,
Register &SrcReg, Register &DstReg,
5 changes: 5 additions & 0 deletions llvm/lib/Target/X86/X86InstrInfo.h
@@ -150,6 +150,11 @@ class X86InstrInfo final : public X86GenInstrInfo {
public:
explicit X86InstrInfo(X86Subtarget &STI);

const TargetRegisterClass *
getRegClass(const MCInstrDesc &MCID, unsigned OpNum,
            const TargetRegisterInfo *TRI,
            const MachineFunction &MF) const override;

/// getRegisterInfo - TargetInstrInfo is a superset of MRegister info. As
/// such, whenever a client has an instance of instruction info, it should
/// always be able to get register info as well (through this method).
12 changes: 12 additions & 0 deletions llvm/lib/Target/X86/X86RegisterInfo.cpp
@@ -158,6 +158,10 @@ X86RegisterInfo::getLargestLegalSuperClass(const TargetRegisterClass *RC,
case X86::GR16RegClassID:
case X86::GR32RegClassID:
case X86::GR64RegClassID:
case X86::GR8_NOREX2RegClassID:
case X86::GR16_NOREX2RegClassID:
case X86::GR32_NOREX2RegClassID:
case X86::GR64_NOREX2RegClassID:
Comment on lines +161 to +164
Contributor

It's not clear to me when we need to distinguish X86::GR8_NOREX2RegClassID from X86::GR8RegClassID and when we don't. We have some other places (e.g., here) that use X86::GRxxRegClassID; don't they need to be updated with X86::GRxx_NOREX2RegClassID?
Besides, we also have some places that call createVirtualRegister(&X86::GRxxRegClass). Do they need to be updated too?

Contributor Author

@KanRobert, Oct 9, 2023

We have comments in X86BaseInfo.h about when we need to distinguish them. PREFETCH instructions are in map TB and they can use r16-r31, so X86::GRxxRegClassID does not need to be updated.

Yes, createVirtualRegister(&X86::GRxxRegClass) needs to be updated if the instruction cannot encode r16-r31. But I haven't found such a place so far.
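
For readers following the thread, the rule described in the X86BaseInfo.h comments can be sketched as a standalone model. This is only an illustration under stated assumptions: `Encoding`, `OpMap`, `InstrInfo`, and `canUseExtendedGPR` are hypothetical stand-ins for the TSFlags fields and for `canUseApxExtendedReg`, not LLVM types.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the X86II encoding and opcode-map fields.
enum class Encoding : std::uint8_t { Legacy, VEX, XOP, EVEX };
enum class OpMap : std::uint8_t { OB, TB, T8, TA };

// Hypothetical, simplified view of one instruction description.
struct InstrInfo {
  Encoding Enc;
  OpMap Map;
  bool IsPseudo;
  bool IsXSaveFamily; // models the XSAVE*/XRSTOR* opcodes listed in the patch
};

// Applies the checks in the same order as the patch:
// EVEX, then pseudo, then the XSAVE*/XRSTOR* exception, then legacy OB/TB.
inline bool canUseExtendedGPR(const InstrInfo &I) {
  if (I.Enc == Encoding::EVEX)
    return true;
  if (I.IsPseudo)
    return false;
  if (I.IsXSaveFamily)
    return false;
  return I.Enc == Encoding::Legacy &&
         (I.Map == OpMap::OB || I.Map == OpMap::TB);
}
```

In this model a legacy-encoded map-TB instruction such as PREFETCH is allowed to use r16-r31, while an XSAVE-family or pseudo instruction is not.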

Contributor

Will you update in this patch or a follow up?

Contributor

I didn't look through them one by one, but here seems to carry some risk, since it iterates over different instructions.

Contributor Author

So far there is no createVirtualRegister(&X86::GRxxRegClass) call for an instruction that cannot encode EGPR. If one appears in the future, we will use createVirtualRegister(&X86::GRxx_NOREX2RegClass) directly at that place.

Contributor Author

What's the risk?

    Register Reg = MRI->createVirtualRegister(
        TII->getRegClass(TII->get(DstOpcode), 0, MRI->getTargetRegisterInfo(),
                         *MBB->getParent()));
    MachineInstrBuilder Bld = BuildMI(*MBB, MI, DL, TII->get(DstOpcode), Reg);

The code calls getRegClass to get the register class and then build the machine instruction with the same opcode. It looks safe to me.

In X86, if I remember correctly, the pseudo instruction COPY becomes either a MOV or a KMOV. Both of them can encode r16-r31.
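
The argument above, that a register class obtained through getRegClass for an opcode is safe for an instruction built with that same opcode, can be sketched as a standalone model of the narrowing step. The names here (`RegClass`, `constrainForInstr`, and the `VR128` placeholder) are hypothetical and only mirror the switch in the patch; they are not LLVM APIs.

```cpp
// Hypothetical stand-ins for the register class IDs touched by the patch.
enum class RegClass {
  GR8, GR16, GR32, GR64, GR32_NOSP, GR64_NOSP,
  GR8_NOREX2, GR16_NOREX2, GR32_NOREX2, GR64_NOREX2,
  GR32_NOREX2_NOSP, GR64_NOREX2_NOSP,
  VR128 // placeholder for any class the switch leaves unchanged
};

// Models the overridden getRegClass: the class is narrowed only when the
// subtarget has EGPR and the instruction itself cannot encode r16-r31.
inline RegClass constrainForInstr(RegClass RC, bool HasEGPR,
                                  bool InstrCanEncodeEGPR) {
  if (!HasEGPR || InstrCanEncodeEGPR)
    return RC;
  switch (RC) {
  case RegClass::GR8:       return RegClass::GR8_NOREX2;
  case RegClass::GR16:      return RegClass::GR16_NOREX2;
  case RegClass::GR32:      return RegClass::GR32_NOREX2;
  case RegClass::GR64:      return RegClass::GR64_NOREX2;
  case RegClass::GR32_NOSP: return RegClass::GR32_NOREX2_NOSP;
  case RegClass::GR64_NOSP: return RegClass::GR64_NOREX2_NOSP;
  default:                  return RC;
  }
}
```

Because the class is derived from the opcode that is later passed to BuildMI, a virtual register created this way cannot end up in r16-r31 for an instruction that is unable to encode them, at least in this simplified model.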

case X86::RFP32RegClassID:
case X86::RFP64RegClassID:
case X86::RFP80RegClassID:
@@ -610,6 +614,14 @@ BitVector X86RegisterInfo::getReservedRegs(const MachineFunction &MF) const {
}
}

// Reserve the extended general purpose registers.
if (!Is64Bit || !MF.getSubtarget<X86Subtarget>().hasEGPR()) {
  for (unsigned n = 0; n != 16; ++n) {
    for (MCRegAliasIterator AI(X86::R16 + n, this, true); AI.isValid(); ++AI)
      Reserved.set(*AI);
  }
}

assert(checkAllSuperRegsMarked(Reserved,
{X86::SIL, X86::DIL, X86::BPL, X86::SPL,
X86::SIH, X86::DIH, X86::BPH, X86::SPH}));
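
The reservation condition in the getReservedRegs hunk above can be illustrated with a small standalone model. It assumes a flat numbering of 32 general purpose registers and ignores register aliases, which the real loop also reserves through MCRegAliasIterator; `NumGPRs` and `reservedExtendedGPRs` are hypothetical names used only for this sketch.

```cpp
#include <bitset>

// Assumption: registers are numbered 0-31 and indices 16-31 model r16-r31.
constexpr unsigned NumGPRs = 32;

// Marks r16-r31 as reserved unless the target is 64-bit and has EGPR,
// mirroring the condition `!Is64Bit || !hasEGPR()` in the patch.
inline std::bitset<NumGPRs> reservedExtendedGPRs(bool Is64Bit, bool HasEGPR) {
  std::bitset<NumGPRs> Reserved;
  if (!Is64Bit || !HasEGPR) {
    for (unsigned N = 16; N != NumGPRs; ++N)
      Reserved.set(N);
  }
  return Reserved;
}
```

Only the combination of 64-bit mode and the egpr feature leaves the extended registers available to the allocator in this model.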