Driver for Mellanox Connect-X 4/5/6 cards #1461

Merged
eugeneia merged 87 commits into snabbco:master on Jan 21, 2022

Conversation

eugeneia
Member

Finally a PR to include all the hard work by @capr @lukego @alexandergall and myself in the next release (TBA)!

This work was made possible by Mellanox releasing a public Programmer’s Reference Manual (PRM) for their Connect-X 4 cards!

This driver can operate Mellanox Connect-X 4 and 5 cards, and has been tested at SWITCH and in lwAFTR.

It supports RSS and MAC+VLAN switching (and combinations thereof). VLAN insertion and stripping are currently not supported (use apps.vlan instead).

There is a branch eugeneia/snabb@mellanox-2021...eugeneia:mellanox-2021-vlan-strip-insert that implements VLAN stripping and insertion in the driver, but I’m not sure whether the extra code is worth its weight or whether it’s cleaner to delegate to apps.vlan instead. Opinions welcome!

capr and others added 30 commits March 30, 2016 20:29
This function can be useful for resetting a device that has persistent
state, for example the firmware state on a Mellanox ConnectX-4 device.
This is useful because some differences that are subtle when comparing
source code are obvious when comparing hexdumps. If the card does not
respond to a command the way we expect then we can check what we are
doing differently to the Linux driver.
Commands are being successfully executed towards the card.

The full initialization procedure is not in place yet. Support for
commands that span multiple input/output pages needs to be implemented.

Current expected behavior when running the selftest is to successfully
execute the commands ENABLE_HCA, QUERY_ISSI, QUERY_PAGES, MANAGE_PAGES,
and then to fail in QUERY_HCA_CAP (likely because it has multipage
output).
The physical address of DMA memory can be determined at runtime (cheaply
and reliably) using memory.virtual_to_physical(). Now we do this
whenever we need a physical address rather than caching the value
returned by memory.dma_alloc().

Just means less state to keep track of in our data structures.
Now it is possible to request specific alignment for DMA memory.

This is practical. For example, Mellanox ConnectX-4 requires specific
alignments (e.g. 4KB).
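
As a rough illustration of what an alignment request involves, the following Python sketch rounds an address up to a power-of-two boundary and checks the result. It is not the Snabb implementation (which is written in Lua); the helper names `is_aligned` and `align_up` and the sample address are made up for this example.

```python
def is_aligned(addr: int, alignment: int) -> bool:
    """Return True if addr is a multiple of alignment (a power of two)."""
    return addr & (alignment - 1) == 0


def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (addr + alignment - 1) & ~(alignment - 1)


# Hypothetical example: find the first 4KB-aligned address at or above
# an arbitrary unaligned address.
aligned = align_up(0x1F000123, 4096)
```

An allocator that supports an alignment parameter can do this rounding itself, so callers no longer have to rely on an assertion that may fail.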
Alignment was already checked with an assertion but this would not
necessarily succeed.
Command inputs and outputs are now split into multiple chained mailbox
records that each hold up to 512 bytes of data. This is mandatory for
large messages.
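
The chaining idea can be sketched in Python as follows. This is only an illustration under stated assumptions: the 512-byte payload size comes from the commit message above, but the record layout (`block_number`, `data`, `next`) and the function names are hypothetical, and a list index stands in for whatever pointer the real mailbox format uses to link records.

```python
MAILBOX_DATA_SIZE = 512  # payload bytes per mailbox record (from the commit message)


def split_into_mailboxes(payload: bytes) -> list:
    """Split payload into chained records of up to 512 data bytes each.

    Each record is a dict with a hypothetical layout: its position in the
    chain, its zero-padded data, and the index of the next record (None
    for the last one).
    """
    chunks = [payload[i:i + MAILBOX_DATA_SIZE]
              for i in range(0, len(payload), MAILBOX_DATA_SIZE)]
    records = []
    for n, chunk in enumerate(chunks):
        records.append({
            "block_number": n,
            "data": chunk.ljust(MAILBOX_DATA_SIZE, b"\0"),
            "next": n + 1 if n + 1 < len(chunks) else None,
        })
    return records


def join_mailboxes(records: list, length: int) -> bytes:
    """Follow the chain from the first record and return `length` bytes."""
    data = b""
    index = 0 if records else None
    while index is not None:
        record = records[index]
        data += record["data"]
        index = record["next"]
    return data[:length]


# Example: a 1284-byte message needs three chained records.
payload = bytes(range(256)) * 5 + b"tail"
records = split_into_mailboxes(payload)
```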
Maybe more work needed to correctly interpret the result.
Partial implementation of the initialization procedure.
Complete debug messages have become a little overwhelming now that we
are allocating thousands of pages of memory for the adapter. Just for
the moment disabling the hexdumps is the more sensible default.

More fine-grained debug logging is likely needed.
Refactored the error checking to always be done when posting a command
to the command queue. Previously this was a manual step for each command
and that seems more error prone.
Now successfully:

- Providing boot memory to the adapter (6 pages)
- Querying adapter capabilities (current and maximum)
- Setting adapter capabilities (keep current)
- Providing init memory to the adapter (4232 pages !)

The output from the init sequence looks like this:

    TRACE   Read the initialization segment
    TRACE   Write the physical location of the command queues to the init segment.
    TRACE   Wait for the 'initializing' field to clear
    fw_rev                          14      12      1220
    cmd_interface_rev               5
    cmdq_phy_addr                   cdata<void *>: 0x1f000000
    log_cmdq_size                   5
    log_cmdq_stride                 6
    ready                           true
    nic_interface_supported         true
    internal_timer                  2.0108995831647e+14
    health_syndrome                 0
    Command: ENABLE_HCA
    Command: QUERY_ISSI
      cur_issi            =         0
      sup_issi            =
         01
    Command: QUERY_PAGES
    query_pages'boot'               6
    Command: MANAGE_PAGES
    Command: QUERY_HCA_CAP
    Command: QUERY_HCA_CAP
    Capabilities - current and (maximum):
      eth_net_offloads         = 0   (0)
      end_pad                  = 1   (1)
      cq_eq_remap              = 1   (1)
      device_frequency_mhz     = 275 (275)
      log_max_vlan_list        = 12  (12)
      log_min_stride_sz_rq     = 0   (0)
      log_max_klm_list_size    = 16  (16)
      log_max_rqt              = 0   (0)
      log_max_l2_table         = 16  (16)
      log_max_current_uc_list  = 10  (10)
      log_min_stride_sz_sq     = 0   (0)
      log_uar_page_sz          = 0   (8)
      log_max_wq_sz            = 0   (0)
      log_max_current_mc_list  = 14  (14)
      log_max_msg              = 30  (30)
      log_max_stride_sz_rq     = 0   (0)
      max_flow_counter         = 0   (0)
      log_max_eq_sz            = 22  (22)
      log_max_rqt_size         = 0   (0)
      basic_cyclic_rcv_wqe     = 0   (0)
      cache_line_128byte       = 0   (0)
      max_tc                   = 0   (0)
      cmdif_checksum           = 0   (3)
      driver_version           = 0   (0)
      log_max_tis              = 0   (0)
      port_type                = 1   (1)
      wq_signature             = 1   (1)
      log_max_tir              = 0   (0)
      max_indirection          = 4   (4)
      log_max_rq               = 0   (0)
      cq_resize                = 1   (1)
      cq_oi                    = 1   (1)
      cq_moderation            = 1   (1)
      log_max_pd               = 24  (24)
      log_max_mkey             = 24  (24)
      log_max_transport_domain = 0   (0)
      rc                       = 1   (1)
      num_ports                = 1   (1)
      bf                       = 1   (1)
      vport_counters           = 1   (1)
      log_max_eq               = 8   (8)
      pad_tx_eth_packet        = 0   (0)
      log_pg_sz                = 12  (12)
      uar_sz                   = 5   (5)
      cq_period_start_from_cqe = 1   (1)
      uc                       = 1   (1)
      log_max_mrw_sz           = 64  (64)
      log_max_cq               = 24  (24)
      vport_group_manager      = 1   (1)
      log_max_tis_per_sq       = 0   (0)
      start_pad                = 0   (0)
      log_max_cq_sz            = 22  (22)
      nic_flow_table           = 0   (0)
      scqe_break_moderation    = 1   (1)
      ud                       = 1   (1)
      log_max_sq               = 0   (0)
      cqe_version              = 0   (0)
      log_bf_reg_size          = 9   (9)
      sctr_data_cqe            = 1   (1)
      log_max_rmp              = 0   (0)
      cqe_version              = 0   (0)
      log_bf_reg_size          = 9   (9)
      sctr_data_cqe            = 1   (1)
      log_max_rmp              = 0   (0)
      log_max_stride_sz_sq     = 0   (0)
      imaicl                   = 0   (0)
      xrc                      = 1   (1)
    Command: SET_HCA_CAP
    Command: QUERY_PAGES
    query_pages'init'               4232
    Command: MANAGE_PAGES
    Command: INIT_HCA
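
To put some of the numbers in the trace above into perspective, here is a small Python calculation. It assumes that the `log_`-prefixed fields are base-2 logarithms and that the pages counted by QUERY_PAGES are 4KB each (matching `log_pg_sz = 12`); those interpretations are assumptions of this sketch rather than something the trace itself states.

```python
# Values copied from the trace above.
log_cmdq_size = 5
log_cmdq_stride = 6
log_pg_sz = 12
boot_pages = 6
init_pages = 4232

# Assumption: "log_" fields are base-2 logarithms.
cmdq_entries = 1 << log_cmdq_size        # command queue entries
cmdq_entry_bytes = 1 << log_cmdq_stride  # bytes per command queue entry
page_bytes = 1 << log_pg_sz              # bytes per page

# Assumption: QUERY_PAGES counts pages of page_bytes each.
boot_bytes = boot_pages * page_bytes
init_bytes = init_pages * page_bytes
init_mib = init_bytes / (1024 * 1024)

print(f"command queue: {cmdq_entries} entries x {cmdq_entry_bytes} bytes")
print(f"boot memory:   {boot_bytes} bytes")
print(f"init memory:   {init_bytes} bytes ({init_mib:.2f} MiB)")
```

Under these assumptions the 4232 init pages amount to roughly 16.5 MiB of memory handed to the adapter.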
Pulls in more of the initialization procedure. See especially commit
4575dc7.
Added QUERY_VPORT_STATE, MODIFY_VPORT_STATE, QUERY_NIC_VPORT_CONTEXT.

Note: I am not sure that these commands are actually needed since we are
not using SR-IOV. The PRM mandates using some VPORT commands but I don't
see them in the trace from the Linux mlx5 driver. So we may be able to
remove this code.
Completed Mellanox initialization sequence
Required argument for new code merged from master in v2016.06.

Request exclusive lock on the device.
This commit introduces a clean and working version of the device
initialization.
eugeneia and others added 12 commits August 30, 2021 14:37
…nsmit

This is a bug where physical addresses wider than 53 bits are truncated when the addresses of payloads are inserted into descriptors for DMA.

The fix here is to truncate after masking. Probably better would be to use lib.htonl instead of bswap(tonumber(...)) throughout the driver.
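
The effect can be reproduced outside the driver. In the Python sketch below, `float` stands in for a double-precision number, which represents integers exactly only up to 53 bits. The function names and the sample address are invented for illustration; this is not the driver's Lua code.

```python
MASK32 = 0xFFFFFFFF


def split_lossy(addr: int) -> tuple:
    """Convert the full 64-bit address to a double first, then mask.

    Addresses wider than 53 bits lose their low bits in the conversion.
    """
    as_double = float(addr)  # stand-in for a 64-bit integer to double conversion
    n = int(as_double)
    return (n >> 32) & MASK32, n & MASK32


def split_exact(addr: int) -> tuple:
    """Mask in the integer domain first, then convert each 32-bit half.

    A 32-bit value always fits exactly in a double, so nothing is lost.
    """
    hi = (addr >> 32) & MASK32
    lo = addr & MASK32
    return int(float(hi)), int(float(lo))


# Hypothetical address with bit 60 and bit 0 set (wider than 53 bits).
addr = (1 << 60) + 1
lossy = split_lossy(addr)
exact = split_exact(addr)
```

Here `lossy` has a low word of 0 while `exact` preserves the low word of 1, which is the difference between converting before masking and converting after masking.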
…ckets)

Merge remote-tracking branch 'alexandergall/mellanox' into mellanox-lwaftr

# Conflicts:
#	src/apps/mellanox/connectx.lua
# Conflicts:
#	src/apps/mellanox/connectx.lua
apps.connectx: use lib.macaddress instead of ptoi
@lukego
Member

lukego commented Nov 12, 2021

Great going Max :)

@eugeneia eugeneia changed the title from "Driver for Mellanox Connect-X 4 & 5 cards" to "Driver for Mellanox Connect-X 4/5/6 cards" on Dec 3, 2021
@eugeneia
Member Author

eugeneia commented Dec 3, 2021

Seems that the driver also works for at least some Connect-X 6 cards, which is cool.

We should extend the tests a bit more before merging, and I think a blocker that remains is proper support for app:stop(). Right now the implementation is a stub and doesn’t really work. Needs to:

  • transition queues from IDLE->DEAD and delete the objects
  • free remaining packets on queues
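
A minimal sketch of the two steps listed above, in Python rather than the driver's Lua. Everything here is hypothetical: the `Queue` class, its fields, the `free_packet` callback, and the order in which the steps run are assumptions made for illustration and do not reflect the actual driver objects or firmware commands.

```python
class Queue:
    """Hypothetical stand-in for a hardware queue object."""

    def __init__(self, name, packets):
        self.name = name
        self.state = "IDLE"
        self.packets = list(packets)  # packets still sitting on the queue
        self.deleted = False


def stop(queues, free_packet):
    """Tear down all queues: mark them dead, free leftovers, delete them."""
    for queue in queues:
        # Step 1: transition the queue so it is no longer processed.
        queue.state = "DEAD"
        # Step 2: free packets that were never transmitted or received.
        for packet in queue.packets:
            free_packet(packet)
        queue.packets.clear()
        # Finally, delete the (now empty) queue object.
        queue.deleted = True


# Example with two queues and a callback that records freed packets.
freed = []
queues = [Queue("rx0", ["p1", "p2"]), Queue("tx0", ["p3"])]
stop(queues, freed.append)
```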

@eugeneia
Member Author

eugeneia commented Dec 6, 2021

Added support for stop(), extended the tests a bit more, and added a README.

I noticed that local-loopback between queues is currently not implemented. I tend towards leaving that as a TBD. (Couldn’t quite figure out how to enable it from a quick survey of the PRM.)

@eugeneia eugeneia merged commit bcf7d3c into snabbco:master Jan 21, 2022
4 participants