Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add resource pool client chain element #179

Merged
merged 5 commits into from
Jul 29, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
111 changes: 111 additions & 0 deletions pkg/networkservice/common/resourcepool/client.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,111 @@
// Copyright (c) 2021 Nordix Foundation.
//
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at:
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package resourcepool

import (
"context"
"sync"

"github.com/golang/protobuf/ptypes/empty"
"github.com/networkservicemesh/api/pkg/api/networkservice"
"github.com/networkservicemesh/api/pkg/api/networkservice/mechanisms/common"
"github.com/networkservicemesh/sdk/pkg/networkservice/core/next"
"github.com/networkservicemesh/sdk/pkg/tools/log"
"github.com/pkg/errors"
"google.golang.org/grpc"

"github.com/networkservicemesh/sdk-sriov/pkg/sriov"
"github.com/networkservicemesh/sdk-sriov/pkg/sriov/config"
)

type resourcePoolClient struct {
resourcePool *resourcePoolConfig
}

// NewClient returns a new resource pool client chain element
func NewClient(
driverType sriov.DriverType,
resourceLock sync.Locker,
pciPool PCIPool,
resourcePool ResourcePool,
cfg *config.Config,
) networkservice.NetworkServiceClient {
return &resourcePoolClient{resourcePool: &resourcePoolConfig{
driverType: driverType,
resourceLock: resourceLock,
pciPool: pciPool,
resourcePool: resourcePool,
config: cfg,
selectedVFs: map[string]string{},
}}
}

func (i *resourcePoolClient) Request(ctx context.Context, request *networkservice.NetworkServiceRequest, opts ...grpc.CallOption) (*networkservice.Connection, error) {
logger := log.FromContext(ctx).WithField("resourcePoolClient", "Request")

oldPCIAddress := request.GetConnection().GetMechanism().GetParameters()[common.PCIAddressKey]
oldTokenID := request.GetConnection().GetMechanism().GetParameters()[TokenIDKey]

conn, err := next.Client(ctx).Request(ctx, request, opts...)
if err != nil {
return nil, err
}

tokenID, ok := conn.GetMechanism().GetParameters()[TokenIDKey]
if !ok {
logger.Infof("no token id present for endpoint connection %v", conn)
return conn, nil
}

err = assignVF(ctx, logger, conn, tokenID, i.resourcePool)
if err != nil {
_ = i.resourcePool.close(conn)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add here next.Close() in such case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

if _, closeErr := next.Client(ctx).Close(ctx, conn, opts...); closeErr != nil {
logger.Errorf("failed to close failed connection: %s %s", conn.GetId(), closeErr.Error())
}
return nil, err
}

Copy link

@Bolodya1997 Bolodya1997 Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: probably you can add here another next.Request() to share selected VF with NSE.

Copy link
Member Author

@pperiyasamy pperiyasamy Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bolodya1997 this seems to be good idea! how do you want to communicate selected VF to NSE in next.Request() ? via mechanism preference pci address parameter ? which mechanism preference to use here ?
or can we simply do request.Connection = conn before invoking next.Request() ?

Copy link

@Bolodya1997 Bolodya1997 Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I just have meant doing request.Connection = conn, but actually we probably also need here some condition to check if we need to update the remote side (we don't need it if there is no new data):

oldPCIAddress := request.GetConnection().GetMechanism().GetParameters[common.PCIAddressKey]

conn, err := next.Request() // <-- first Request
...
// Don't make second request if PCI address wasn't changed
if request.GetConnection().GetMechanism().GetParameters[common.PCIAddressKey] == oldPCIAddress {
    return conn, nil
}

request.Connection = conn.Clone()
if conn, err = next.Client(ctx).Request(ctx, request); err != nil {
    // Perform local cleanup in case of second Request failed
    _ = i.resourcePool.close(request.Connection)
}

return conn, err

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Bolodya1997 I'm bit confused with above code. on the first request endpoint would return token id in mechanism parameter, then resource pool client assigns VF for the given token id and also sets pci address (with new change) in the mechanism parameter and this is communicated into endpoint via another Request call. I just committed changes for it, just see if I'm missing any one your point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or do you meant endpoint returning pci address instead of token id ?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any hop sends refresh Requests - it is both need as a liveness proof and to update security tokens.
https://github.com/networkservicemesh/sdk/blob/main/pkg/networkservice/common/refresh/client.go
So any refresh Request will contain mechanism from the previous Request, so both SR-IOV token ID and PCI address will be already set. So there is no need to send an update Request in such case.
But here can potentially be a case, when for some strange reason Endpoint starts using another SR-IOV token ID. In such case even if it is a refresh Request, you may probably need to send renewed PCI address to the Endpoint.

So also ignoring Connection returned from the 2 Request can probably lead to some very strange bugs. What if something has changed on the Endpoint side and it needs to send back some of these changes and we are just ignoring them? I suppose that it would be better to return back the last returned Connection.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I understand it now, changed the code to address above concerns. please have a look.

// Don't make second request if PCI address, token id weren't changed
if conn.GetMechanism().GetParameters()[common.PCIAddressKey] == oldPCIAddress && oldTokenID == tokenID {
return conn, nil
}

// communicate assigned VF's pci address to endpoint by making another Request.
// this would also need subsequent chain elements to ignore handling of response
// for 2nd Request.
request.Connection = conn.Clone()
if conn, err = next.Client(ctx).Request(ctx, request); err != nil {
// Perform local cleanup in case of second Request failed
_ = i.resourcePool.close(request.Connection)
}

return conn, err
}

func (i *resourcePoolClient) Close(ctx context.Context, conn *networkservice.Connection, opts ...grpc.CallOption) (*empty.Empty, error) {
rv, err := next.Client(ctx).Close(ctx, conn, opts...)
closeErr := i.resourcePool.close(conn)

if err != nil && closeErr != nil {
return nil, errors.Wrapf(err, "failed to free VF: %v", closeErr)
}
if closeErr != nil {
return nil, errors.Wrap(closeErr, "failed to free VF")
}
return rv, err
}
142 changes: 142 additions & 0 deletions pkg/networkservice/common/resourcepool/common.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,142 @@
// Copyright (c) 2021 Nordix Foundation.
//
// SPDX-License-Identifier: Apache-2.0
//
// Licensed under the Apache License, Version 2.0 (the "License");
// you may not use this file except in compliance with the License.
// You may obtain a copy of the License at:
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing, software
// distributed under the License is distributed on an "AS IS" BASIS,
// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

package resourcepool

import (
"context"
"sync"

"github.com/networkservicemesh/api/pkg/api/networkservice"
"github.com/networkservicemesh/api/pkg/api/networkservice/mechanisms/common"
"github.com/networkservicemesh/api/pkg/api/networkservice/mechanisms/vfio"
"github.com/networkservicemesh/sdk-kernel/pkg/kernel/networkservice/vfconfig"
"github.com/networkservicemesh/sdk/pkg/tools/log"
"github.com/pkg/errors"

"github.com/networkservicemesh/sdk-sriov/pkg/sriov"
"github.com/networkservicemesh/sdk-sriov/pkg/sriov/config"
)

const (
// TokenIDKey is a token ID mechanism parameter key
TokenIDKey = "tokenID" // TODO: move to api
)

// PCIPool is a pci.Pool interface
type PCIPool interface {
GetPCIFunction(pciAddr string) (sriov.PCIFunction, error)
BindDriver(ctx context.Context, iommuGroup uint, driverType sriov.DriverType) error
}

// ResourcePool is a resource.Pool interface
type ResourcePool interface {
Select(tokenID string, driverType sriov.DriverType) (string, error)
Free(vfPCIAddr string) error
}

type resourcePoolConfig struct {
driverType sriov.DriverType
resourceLock sync.Locker
pciPool PCIPool
resourcePool ResourcePool
config *config.Config
selectedVFs map[string]string
}

func (s *resourcePoolConfig) selectVF(connID string, vfConfig *vfconfig.VFConfig, tokenID string) (vf sriov.PCIFunction, err error) {
vfPCIAddr, err := s.resourcePool.Select(tokenID, s.driverType)
if err != nil {
return nil, errors.Wrapf(err, "failed to select VF for: %v", s.driverType)
}
s.selectedVFs[connID] = vfPCIAddr

for pfPCIAddr, pfCfg := range s.config.PhysicalFunctions {
for i, vfCfg := range pfCfg.VirtualFunctions {
if vfCfg.Address != vfPCIAddr {
continue
}

pf, err := s.pciPool.GetPCIFunction(pfPCIAddr)
if err != nil {
return nil, errors.Wrapf(err, "failed to get PF: %v", pfPCIAddr)
}
vfConfig.PFInterfaceName, err = pf.GetNetInterfaceName()
if err != nil {
return nil, errors.Errorf("failed to get PF net interface name: %v", pfPCIAddr)
}

vf, err := s.pciPool.GetPCIFunction(vfPCIAddr)
if err != nil {
return nil, errors.Wrapf(err, "failed to get VF: %v", vfPCIAddr)
}

vfConfig.VFNum = i

return vf, err
}
}

return nil, errors.Errorf("no VF with selected PCI address exists: %v", s.selectedVFs[connID])
}

func (s *resourcePoolConfig) close(conn *networkservice.Connection) error {
vfPCIAddr, ok := s.selectedVFs[conn.GetId()]
if !ok {
return nil
}
delete(s.selectedVFs, conn.GetId())

s.resourceLock.Lock()
defer s.resourceLock.Unlock()

return s.resourcePool.Free(vfPCIAddr)
}

func assignVF(ctx context.Context, logger log.Logger, conn *networkservice.Connection, tokenID string, resourcePool *resourcePoolConfig) error {
Copy link

@Bolodya1997 Bolodya1997 Jul 29, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion: for our CI use-cases there were no need to add PCI address to mechanism here, but you can do it here for the mechanism(s) you need.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea, done!

vfConfig := vfconfig.Config(ctx)
resourcePool.resourceLock.Lock()
defer resourcePool.resourceLock.Unlock()

logger.Infof("trying to select VF for %v", resourcePool.driverType)
vf, err := resourcePool.selectVF(conn.GetId(), vfConfig, tokenID)
if err != nil {
return err
}
logger.Infof("selected VF: %+v", vf)

iommuGroup, err := vf.GetIOMMUGroup()
if err != nil {
return errors.Wrapf(err, "failed to get VF IOMMU group: %v", vf.GetPCIAddress())
}

if err = resourcePool.pciPool.BindDriver(ctx, iommuGroup, resourcePool.driverType); err != nil {
return err
}

switch resourcePool.driverType {
case sriov.KernelDriver:
vfConfig.VFInterfaceName, err = vf.GetNetInterfaceName()
if err != nil {
return errors.Wrapf(err, "failed to get VF net interface name: %v", vf.GetPCIAddress())
}
case sriov.VFIOPCIDriver:
vfio.ToMechanism(conn.GetMechanism()).SetIommuGroup(iommuGroup)
}
conn.GetMechanism().GetParameters()[common.PCIAddressKey] = vf.GetPCIAddress()

return nil
}
Loading