Opening the elaborated design fails with error on SHARED_QPLL value #27

Open
tweinschenk opened this issue Nov 14, 2024 · 9 comments

Comments

@tweinschenk

Expected Behavior

It should be possible to open the elaborated design with port_sgmii_gtx_X0Yx included in the block design.

Current Behavior

Opening the elaborated design fails with the following error messages:

[IP_Flow 19-155] Failed to convert string value '0' to HDL value.
[IP_Flow 19-3814] Failed to get HDL value for model parameter 'SHARED_QPLL'.

Possible Solution

I suspect the problem lies either in how create_port_sgmii_gtx.tcl defines the SHARED_QPLL parameter or in how port_sgmii_gtx.vhd uses it, but I have not been able to find a fix yet. Maybe you can help me :)

Steps to Reproduce (for bugs)

  1. Add a SatCat5 SGMII PHY to the block design.
  2. Try to open the elaborated design.

Context

Maybe this is again an issue with the newer Vivado version?

Your Environment

  • SatCat5 version used: master branch, commit bf2a647
  • Platform and version:
Vivado v2022.2 (64-bit)
SW Build 3671981 on Fri Oct 14 04:59:54 MDT 2022
IP Build 3669848 on Fri Oct 14 08:30:02 MDT 2022
Tool Version Limit: 2022.10
@tweinschenk
Author

Removing the lines:

ipcore_add_param SHARED_QPLL string $shared_qpll\
    {Does this MGT use a shared QPLL resource?} false

from the file create_port_sgmii_gtx.tcl solves the problem for me. It's not a nice fix, but it works.

@tweinschenk
Author

After I fixed the previous problem, I encountered another one. If I add a SatCat5 SGMII PHY that includes the shared components, synthesis fails with the following errors:

ERROR: [Synth 8-485] no port 'gtrefclk_bufg_out' on instance [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/ipshared/2142/src/port_sgmii_gtx.vhd:425]
ERROR: [Synth 8-485] no port 'gtrefclk_bufg_out' on instance [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/ipshared/2142/src/port_sgmii_gtx.vhd:425]
ERROR: [Synth 8-485] no port 'gtrefclk_bufg_out' on instance [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/ipshared/2142/src/port_sgmii_gtx.vhd:96]
ERROR: [Synth 8-285] failed synthesizing module 'port_sgmii_gtx__parameterized0' [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/ipshared/2142/src/port_sgmii_gtx.vhd:86]
ERROR: [Synth 8-285] failed synthesizing module 'wrap_port_sgmii_gtx__parameterized2' [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/ipshared/ea3d/src/wrap_port_sgmii_gtx.vhd:73]
ERROR: [Synth 8-285] failed synthesizing module 'design_1_port_sgmii_gtx_X0Y4_0_0' [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/ip/design_1_port_sgmii_gtx_X0Y4_0_0_1/synth/design_1_port_sgmii_gtx_X0Y4_0_0.vhd:88]
ERROR: [Synth 8-285] failed synthesizing module 'design_1' [/home/localadmin/xxx/xxx.gen/sources_1/bd/design_1/synth/design_1.vhd:2090]
ERROR: [Synth 8-285] failed synthesizing module 'design_1_wrapper' [/home/localadmin/grape_backplane/grape_backplane.gen/sources_1/bd/design_1/hdl/design_1_wrapper.vhd:57]

Removing line 96 (gtrefclk_bufg_out : out std_logic;) from the file src/vhdl/xilinx/port_sgmii_gtx.vhd fixes the issue for me, but it would probably fail on Zynq devices, as the Xilinx IP has this port only for 7 series and Zynq devices.

@ooterness
Member

Unfortunately, we've only tested this core on Vivado 2019.1. Xilinx frequently changes the port definitions on their auto-generated IP-cores from version to version, in ways that make it difficult to use in this fashion. The error message above looks like exactly that situation.

(Aside: We need to distribute the gtrefclk_bufg_out signal in cases where there are multiple SGMII ports sharing a QPLL resource. However, if your use case only uses a single SGMII port, then removing that signal is safe.)

I would recommend trying the "sgmii_raw" variant. Feature-wise, it's nearly a drop-in replacement for "sgmii_gtx", but the lower-level cores tend to have a more stable API.

@tweinschenk
Author

tweinschenk commented Nov 19, 2024

Removing gtrefclk_bufg_out seems fine even with multiple SGMII ports in this case, because gtrefclk_out is still available. According to the AMD/Xilinx documentation "Shared Signals Connectivity", this is the only gtrefclk output for UltraScale+ devices that are not Zynq devices.

But I ran into issues with multiple ports because of the reuse of the IP (I ran the create_port script for every GTH location, and I now have an IP block for every location). I will share the error messages later on. One general question here: is it even a feasible approach to use the block design for this?

I then tried to use the "sgmii_raw" variant. I also needed to change the names of some ports in the VHDL because the xcau15p-sbvb484-1-i I am using has GTH transceivers and not GTY transceivers, but that was not a big problem. But now I have the issue that it needs more block RAM than this device has. Is the "sgmii_raw" variant really this memory hungry, or is this some kind of bug? I ran into this issue even with only one of these SGMII blocks. With the "sgmii_gtx" variant, 10 ports are no problem.

Thank you already for your help and also for this great open-source project :)

@ooterness
Member

Unfortunately, the way that AMD/Xilinx defines these IP cores puts us between a rock and a hard place.

For a user that's using all four lanes of a quad, our ideal use case is:

  • User instantiates four copies of the "port_sgmii_gtx" IP core. (Each containing one copy of the Xilinx IP.)
  • Set one of four instances to enable "shared logic", with the other three disabling shared logic.
  • Cross-link the required shared signals, such as gtrefclk_bufg_out.

The alternative would be to put all four SGMII ports in one IP-core, which seems to be the vendor preference. But to do that, we would need component definitions for every pairwise combination of device and port count (e.g., 7 series with 1 port, UltraScale with 1 port, 7 series with 2 ports, UltraScale with 2 ports, 7 series with 3 ports...). That is not sustainable, doubly so if the port list varies by Vivado version.

What we chose was a compromise using separate IP-cores, but it seems that we're relying on an unstable API. I'm not sure that a quick fix exists. All I can recommend is using Vivado 2019.1.

@tweinschenk
Author

Yes, I see the problem with the changing API. However, I think this is not related to the Vivado version, as the document I linked last time mentions differences between product families of the same generation, e.g. Zynq UltraScale+ vs. Kintex or Artix UltraScale+. This is a problem for the gtx version.

I also saw a problem with the lower-level raw version: the port names change with the transceiver generation, e.g. GTY vs. GTH.

Nevertheless, these are problems that would require a lot of work to fix properly, but that are easy to work around by hand for a specific chip.

My main problem is that I am not able to get multiple instances of "port_sgmii_*" to work in one block design, and I think the issue is somewhere in the creation process of the IP core package. For both types I ran into issues with the location constraints:

GTX:
[image]

RAW:
[image]

Additionally, for the raw variant, I think the variant without the shared parts is missing for the UltraScale version, which results in this error:

[image]

The IBUFDS_GTE4 is generated three times in this design because I added the raw port three times, even though two of them are set to not include the shared parts.

I may be able to solve the issue with the IBUFDS_GTE4 and open a pull request with the changes, but I am lost with the location constraints.

@ooterness
Member

We do have a workaround for the location constraints, but it's a bit ugly.

Option A is to ignore the warnings. Unfortunately, the Xilinx IP sets the LOC constraint on each GTHE4_CHANNEL instance, and there's no way to disable this behavior. Since that defaults to X0Y0 for every instance of the core, this results in a warning. However, Vivado will usually override this based on the pin-location constraint for the transceiver signals. (There's only one channel attached to each physical pin, after all.) In such cases, the warnings are a false alarm, and place-and-route figures things out correctly to generate a functional design.
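As an illustration of the pin-location constraint mentioned in Option A, an XDC fragment along the following lines ties the transceiver signals of one port to specific package pins. This is only a sketch: the port names (sgmii0_rxp, sgmii0_txp) and the pin names (PIN_RXP, PIN_TXP) are placeholders, not taken from SatCat5 or from any particular board, and must be replaced with the names in your own top-level design and schematic.

```tcl
# XDC sketch with assumed names: constrain the serial pins of one SGMII port.
# "sgmii0_rxp" / "sgmii0_txp" are hypothetical top-level port names.
# "PIN_RXP" / "PIN_TXP" are placeholders for real package pins on your device.
set_property PACKAGE_PIN PIN_RXP [get_ports sgmii0_rxp]
set_property PACKAGE_PIN PIN_TXP [get_ports sgmii0_txp]
```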

Option B is to create separate variants of the IP-core for each channel, so the LOC constraints are set correctly from the beginning. Unfortunately, I haven't found a way to do this with the user-configurable parameters; AFAICT it has to be set when you create the IP-core.

To allow this, there's a TCL variable sgmii_gtx_loc that you can set before sourcing "create_port_sgmii_gtx.tcl". If you don't set anything, it defaults to "X0Y0" as above. However, if you set a different value, it'll create a variant of the IP-core with the specified location constraint. If you know which lanes you need, it's easy enough to script this:

set sgmii_gtx_loc X0Y0
source create_port_sgmii_gtx.tcl
set sgmii_gtx_loc X0Y1
source create_port_sgmii_gtx.tcl
set sgmii_gtx_loc X0Y2
source create_port_sgmii_gtx.tcl
...

Then, you'll need to create one instance of each core variant in your block design.
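The repeated set/source pairs above can also be written as a loop. This is a sketch under two assumptions: that it is run from the directory containing create_port_sgmii_gtx.tcl, and that the four lanes listed are the ones you need (the lane list is illustrative only; adjust it to your device).

```tcl
# Build one IP-core variant per transceiver lane.
# The lane list is an assumption for illustration; edit it to match your design.
# Run this at the top level so sgmii_gtx_loc is visible to the sourced script.
foreach sgmii_gtx_loc {X0Y0 X0Y1 X0Y2 X0Y3} {
    source create_port_sgmii_gtx.tcl
}
```

Because foreach assigns the loop variable before each iteration, every pass sources the script with a different sgmii_gtx_loc value, just as in the unrolled version.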

I'm not sure what's going on with the IBUFDS_GTE4 error. Best guess: We've had problems in the past where the block-diagram tools automatically insert IBUF or similar primitives that aren't appropriate for clock signals. We've got an io_buffer_type attribute in "wrap_port_sgmii_raw.vhd" to try and prevent this, but it's possible that wasn't effective in this case.

@tweinschenk
Author

Actually, I already used your Option B and still got these critical warnings. That's what I meant by "I ran the create_port script for every GTH location and I have a IP block for every location now" in an earlier comment.

But maybe this is all running fine then and I have another problem. At least I was able to build a bitstream with multiple sgmii_gtx ports and the n-port ethernet switch. I also tried to debug the errvec_t output from the switch and I see errors regularly there:

[image]

Does this message indicate an error with "MII_TX"? I just found this in switch_aux and io_error_reporting, but I was not sure in which direction the offset goes in the messages.

@ooterness
Member

It seems that you are a step ahead of me in debugging the SGMII ports. ;)

The errvec_t signal is our legacy error-reporting scheme, superseded by the err_sw / err_switch signal. It reports exactly the same error events, but the new format is easier to decipher. (If you're curious, the conversion is performed by the swerr2vector function in "switch_types.vhd".)

In any case, bit 5 is equivalent to err_sw.mii_rx (MII_RX), which indicates a problem reported by the SGMII block itself. For clock-crossing reasons, every change in that signal indicates an error event. For "port_sgmii_gtx", there are two events that can trigger the underlying error:

  • An 8b/10b decode error (as indicated by status_disperr or status_badsymb)
  • An error reported by the Ethernet PHY (as indicated by gmii_rx_en + gmii_rx_er simultaneously)

To be clear, all of those alarms are sourced from the AMD/Xilinx IP-core, which suggests a configuration, clocking, or signal-integrity problem of some kind before it hits the SatCat5 logic. I'd start with the troubleshooting steps recommended in the PG047 user's guide.
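Since every change in bit 5 counts as one MII_RX error event, a captured trace of errvec_t values can be reduced to an event count by counting toggles of that bit. The following is a minimal sketch in plain Tcl, assuming the vector has been captured as a list of integers (for example, exported from a logic analyzer); the proc name and the sample values are made up for illustration and are not part of SatCat5.

```tcl
# Count error events on one bit of a sampled error vector.
# Each change (0->1 or 1->0) of the selected bit counts as one event.
proc count_toggle_events {samples bit} {
    set events 0
    set prev ""
    foreach s $samples {
        # Extract the selected bit from this sample.
        set cur [expr {($s >> $bit) & 1}]
        # The first sample only sets the reference value.
        if {$prev ne "" && $cur != $prev} {
            incr events
        }
        set prev $cur
    }
    return $events
}

# Hypothetical capture: bit 5 goes 0 -> 1 -> 1 -> 0 -> 0, i.e. two changes.
puts [count_toggle_events {0x00 0x20 0x20 0x00 0x01} 5]
```

Run with tclsh, the example prints 2 for this made-up trace. A steadily increasing count on real hardware would be consistent with the recurring decode or PHY errors described above.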

2 participants