
Efficient NChooseKs in BO #156

Merged · 21 commits into main · Apr 24, 2023

Conversation

jduerholt (Contributor)

This PR adds the following things:

  • Hartmann benchmark function including nchooseks
  • efficient implementation of nchooseks
  • tidy up of ask for BO strategies

This PR depends on this PR in botorch: pytorch/botorch#1779. Tests will only able to run when the thing is merged in botorch,

@Osburg (Collaborator) commented Apr 6, 2023

I think get_nchoosek_constraints() does not work properly if you have a domain with more than one NChooseK constraint. The cause seems to be that c is not captured per iteration when you loop over all NChooseK constraints: the returned functions all behave according to the last NChooseK constraint, while the others are not represented. I haven't opened an issue yet, because I haven't seen any application of get_nchoosek_constraints() so far and wanted to provide a fix when adding the narrow Gaussian relaxation to doe. However, since you make use of this function here, I thought I'd mention it. Here is a minimal example where things go wrong.

from bofire.data_models.features.api import ContinuousInput, ContinuousOutput
from bofire.data_models.constraints.api import NChooseKConstraint
from bofire.data_models.domain.api import Domain
from bofire.utils.torch_tools import get_nchoosek_constraints
import torch

domain = Domain(
    input_features=[
        ContinuousInput(key="x1", bounds=[0,1]),
        ContinuousInput(key="x2", bounds=[0,1]),
        ContinuousInput(key="x3", bounds=[0,1]),
    ],
    output_features=[
        ContinuousOutput(key="y")
    ],
    constraints=[
        NChooseKConstraint(features=["x1","x2","x3"], min_count=0, max_count=1, none_also_valid=False),
        NChooseKConstraint(features=["x1","x2","x3"], min_count=0, max_count=2, none_also_valid=False)
    ]
)

nck_constraints = get_nchoosek_constraints(domain)
t = torch.tensor([[1, 0, 0], [1, 1, 0], [1, 1, 1]], dtype=torch.float64)

print(nck_constraints[0](t))
print(nck_constraints[1](t))

Since there is no implementation of __call__ and jacobian for NChooseK constraints yet, my suggestion would be to use these methods to evaluate NChooseK constraints via the narrow Gaussians. This should solve the above problem and could look similar to this.

from typing import List, Literal

import numpy as np
import pandas as pd
import torch
from pydantic import root_validator, validator

from bofire.data_models.constraints.constraint import Constraint, FeatureKeys


def narrow_gaussian(x, ell=1e-3):
    """Approximates the indicator function 1[x == 0] with a narrow Gaussian."""
    return torch.exp(-0.5 * (x / ell) ** 2)


class NChooseKConstraint(Constraint):
    """NChooseK constraint that defines how many ingredients are allowed in a formulation.

    Attributes:
        features (List[str]): List of feature keys to which the constraint applies.
        min_count (int): Minimal number of non-zero/active feature values.
        max_count (int): Maximum number of non-zero/active feature values.
        none_also_valid (bool): In case that min_count > 0,
            this flag decides if zero active features are also allowed.
    """

    type: Literal["NChooseKConstraint"] = "NChooseKConstraint"
    features: FeatureKeys
    min_count: int
    max_count: int
    none_also_valid: bool

    @validator("features")
    def validate_features_unique(cls, features: List[str]):
        """Validates that provided feature keys are unique."""
        if len(features) != len(set(features)):
            raise ValueError("features must be unique")
        return features

    @root_validator(pre=False, skip_on_failure=True)
    def validate_counts(cls, values):
        """Validates if the minimum and maximum of allowed features are smaller than the overall number of features."""
        features = values["features"]
        min_count = values["min_count"]
        max_count = values["max_count"]

        if min_count > len(features):
            raise ValueError("min_count must be <= # of features")
        if max_count > len(features):
            raise ValueError("max_count must be <= # of features")
        if min_count > max_count:
            raise ValueError("min_count must be <= max_count")

        return values

    # TODO: test
    def __call__(self, experiments: pd.DataFrame) -> pd.Series:
        """Smooth relaxation of the NChooseK constraint, counting the number of zeros in a
        candidate via a sum of narrow gaussians centered at zero.

        Args:
            experiments (pd.DataFrame): Data to evaluate the constraint on.

        Returns:
            pd.Series containing the constraint violation for each experiment (row in experiments argument).
        """
        indices = torch.tensor([i for i,name in enumerate(experiments.columns) if name in self.features], dtype=torch.int64)
        experiments_tensor = torch.tensor(experiments.to_numpy(), requires_grad=False)

        max_count_violation = torch.zeros(experiments_tensor.shape[0])
        min_count_violation = torch.zeros(experiments_tensor.shape[0])

        if self.max_count != len(self.features):
            max_count_violation = torch.relu(-1 * narrow_gaussian(x=experiments_tensor[..., indices]).sum(dim=-1) + (len(self.features) - self.max_count))  # type: ignore
            

        if self.min_count > 0:
            min_count_violation = torch.relu(narrow_gaussian(x=experiments_tensor[..., indices]).sum(dim=-1) - (len(self.features) - self.min_count))  # type: ignore

        return pd.Series(max_count_violation + min_count_violation)

    def is_fulfilled(self, experiments: pd.DataFrame, tol: float = 1e-6) -> pd.Series:
        """Check if the concurrency constraint is fulfilled for all the rows of the provided dataframe.

        Args:
            experiments (pd.DataFrame): Dataframe to evaluate constraint on.
            tol (float,optional): tolerance parameter. A constraint is considered as not fulfilled
                if the violation is larger than tol. Defaults to 1e-6.

        Returns:
            bool: True if fulfilled else False.
        """
        cols = self.features
        sums = (np.abs(experiments[cols]) > tol).sum(axis=1)

        lower = sums >= self.min_count
        upper = sums <= self.max_count

        if not self.none_also_valid:
            # return lower.all() and upper.all()
            return pd.Series(np.logical_and(lower, upper), index=experiments.index)
        else:
            none = sums == 0
            return pd.Series(
                np.logical_or(none, np.logical_and(lower, upper)),
                index=experiments.index,
            )

    def __str__(self):
        """Generate string representation of the constraint.

        Returns:
            str: string representation of the constraint.
        """
        res = (
            "of the features "
            + ", ".join(self.features)
            + f" between {self.min_count} and {self.max_count} must be used"
        )
        if self.none_also_valid:
            res += " (none is also ok)"
        return res

    # TODO: test
    def jacobian(self, experiments: pd.DataFrame) -> pd.DataFrame:
        """Jacobian of the smooth NChooseK relaxation, counting the number of zeros in a
        candidate via a sum of narrow gaussians centered at zero.

        Args:
            experiments (pd.DataFrame): Data to evaluate the constraint jacobian on.

        Returns:
            pd.DataFrame containing the constraint violation jacobian for each experiment (row in experiments argument).
        """
        indices = torch.tensor([i for i,name in enumerate(experiments.columns) if name in self.features], dtype=torch.int64)
        experiments_tensor = torch.tensor(experiments.to_numpy(), requires_grad=True)

        max_count_violation = torch.zeros(experiments_tensor.shape[0])
        min_count_violation = torch.zeros(experiments_tensor.shape[0])

        if self.max_count != len(self.features):
            max_count_violation = torch.relu(-1 * narrow_gaussian(x=experiments_tensor[..., indices]).sum(dim=-1) + (len(self.features) - self.max_count))  # type: ignore
            

        if self.min_count > 0:
            min_count_violation = torch.relu(narrow_gaussian(x=experiments_tensor[..., indices]).sum(dim=-1) - (len(self.features) - self.min_count))  # type: ignore

        violation = max_count_violation + min_count_violation

        violation.backward(torch.ones(experiments_tensor.shape[0]))

        jacobian = experiments_tensor.grad.detach().numpy()

        # the gradient has one column per column of `experiments`, not only per constrained feature
        return pd.DataFrame(jacobian, columns=[f"dg/{name}" for name in experiments.columns])
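To sanity-check the proposed __call__ logic outside the class machinery, here is a standalone sketch of the same computation (bofire's base classes and validators are omitted, and the helper name nchoosek_violation is hypothetical, not part of the PR):

```python
import pandas as pd
import torch


def narrow_gaussian(x, ell=1e-3):
    return torch.exp(-0.5 * (x / ell) ** 2)


def nchoosek_violation(experiments, features, min_count, max_count):
    # Standalone version of the proposed NChooseKConstraint.__call__:
    # count (approximate) zeros via narrow gaussians and apply ReLU to
    # the min_count and max_count terms separately before summing.
    idx = [i for i, c in enumerate(experiments.columns) if c in features]
    t = torch.tensor(experiments.to_numpy(), dtype=torch.float64)
    num_zeros = narrow_gaussian(t[..., idx]).sum(dim=-1)

    max_viol = torch.zeros(t.shape[0], dtype=torch.float64)
    min_viol = torch.zeros(t.shape[0], dtype=torch.float64)
    if max_count != len(features):
        max_viol = torch.relu(-num_zeros + (len(features) - max_count))
    if min_count > 0:
        min_viol = torch.relu(num_zeros - (len(features) - min_count))
    return pd.Series((max_viol + min_viol).numpy(), index=experiments.index)


experiments = pd.DataFrame({"x1": [1.0, 1.0], "x2": [0.0, 1.0], "x3": [0.0, 1.0]})
# at most 1 of x1..x3 may be nonzero: row 0 is feasible, row 1 violates by 2
print(nchoosek_violation(experiments, ["x1", "x2", "x3"], 0, 1))  # → [0.0, 2.0]
```

A positive value means the constraint is violated, matching the sign convention of the proposed __call__ above.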

@jduerholt (Contributor, Author)

Thanks for spotting this; honestly, I have no idea why it is not working as intended. What do you mean by c not being constant? It should not be constant.

@Osburg (Collaborator) commented Apr 6, 2023

I think I didn't explain well what I meant. The problem is that each of the functions refers to c, but c changes from iteration to iteration. At the end of the loop, the last value that c took remains, and all functions will access this latest value of c. Here is a simpler example where we get the same problem: say we want a list of functions, the first one constant 0, the second constant 1, and so on. If we do it like in get_nchoosek_constraints(), it looks like this:

funs = []
for i in range(2):
    # every lambda closes over the same variable i (late binding)
    funs.append(lambda x: i)

However, calling both functions

print(funs[0](1))
print(funs[1](2))

shows that both functions return 1: both access i when called, and since i = 1 after the loop, both return 1. We can even set i = 2, and now both functions will return 2!

i = 2

print(funs[0](1))
print(funs[1](2))

I haven't looked it up, but to my understanding this is because there is only one i shared by both functions. However, if we create a separate scope for each function, this does not happen, since every function has its "own" i. As I said, this is only my intuitive understanding, but at least this logic seems to work: if we define these functions inside the scopes of separate function calls, we do not have a problem. For example like this:

def get_fun(i):
    return lambda x: i

funs = []
for i in range(2):
    funs.append(get_fun(i))

print(funs[0](1))
print(funs[1](2))
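For reference, the usual Python idiom for this has the same effect as the factory function: bind the loop variable at definition time via a default argument, so each lambda captures its own copy of i (a small illustrative sketch, not code from the PR):

```python
# Default-argument values are evaluated when the lambda is defined,
# so each lambda gets its own snapshot of i instead of sharing it.
funs = [lambda x, i=i: i for i in range(2)]

print(funs[0](1))  # → 0
print(funs[1](2))  # → 1
```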

@jduerholt (Contributor, Author)

You are completely correct; I overlooked this when coding it up. Many thanks for spotting it! I just fixed it and added some tests for it.

Concerning your suggestion of implementing this narrow Gaussian approach directly in the NChooseKConstraint class: looks fine to me. Via this, you could then use the Jacobian in DoE. For botorch, this was not needed, as botorch creates the jacobian under the hood.

But I have at least one question: why do you need the relu?

Best,

Johannes

@Osburg (Collaborator) commented Apr 6, 2023

Sure, no problem! OK, then I will proceed as planned with the NChooseK constraint. The ReLU is needed to represent both the min_count and max_count constraints in one number. In the convention where a nonpositive number means the constraint is fulfilled (opposite sign to your convention, but consistent with the definitions in doe), we can set negative values to zero using ReLU. Then we get a positive number if one of the two constraints is violated and 0 otherwise. We can then define the total constraint violation by adding up the violations of the min_count and max_count constraints (we could not do this without the ReLU, because a positive contribution from one of the constraints could then cancel out with a negative contribution from the other).
This is at least what I planned, but I haven't written any tests so far.

Side note: for every NChooseK constraint that is not ill-posed, at most one of the two constraint evaluations will give you a positive number. So basically the ReLU just prevents unwanted behaviour due to cancellation of negative and positive terms when one of the min_count/max_count constraints is violated while the other one isn't.
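The cancellation issue described here can be seen numerically. An illustrative sketch, with hypothetical counts rather than code from the PR: three features of which exactly two must be active (min_count = max_count = 2), evaluated on a candidate that activates only one.

```python
import torch


def narrow_gaussian(x, ell=1e-3):
    return torch.exp(-0.5 * (x / ell) ** 2)


# candidate x = [1, 0, 0] activates only one of three features,
# but exactly two must be active (min_count = max_count = 2)
x = torch.tensor([1.0, 0.0, 0.0])
n, min_count, max_count = 3, 2, 2

num_zeros = narrow_gaussian(x).sum()    # ~2.0: two near-zero entries
raw_max = -num_zeros + (n - max_count)  # ~-1.0: max_count satisfied (negative)
raw_min = num_zeros - (n - min_count)   # ~+1.0: min_count violated (positive)

print(raw_max + raw_min)                          # → ~0.0, cancellation hides the violation
print(torch.relu(raw_max) + torch.relu(raw_min))  # → ~1.0, ReLU reports it
```

Without the ReLU, the two raw terms cancel and the infeasible candidate looks feasible; with the ReLU, the min_count violation survives.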

@jduerholt (Contributor, Author)

This makes perfect sense. Nice idea! I had to use the other convention, since botorch works with that one.

Have you already tried if ipopt is able to handle it properly?

@Osburg Osburg mentioned this pull request Apr 9, 2023
@jduerholt jduerholt merged commit e3fa241 into main Apr 24, 2023