Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SIP-99C] Proposal for model and business validation #25828

Closed
john-bodley opened this issue Nov 1, 2023 · 1 comment
Closed

[SIP-99C] Proposal for model and business validation #25828

john-bodley opened this issue Nov 1, 2023 · 1 comment
Labels
sip Superset Improvement Proposal

Comments

@john-bodley
Copy link
Member

john-bodley commented Nov 1, 2023

[SIP-99C] Proposal for model and business validation

This SIP is part of the [SIP-99] Proposal for correctly handling business logic series. Specifically, it proposes formalizing where and how model and business validation should occur.

Validation

There are typically four places where validation can occur:

  1. Database: via CHECK, FOREIGN KEY, or UNIQUE constraints, triggers, etc.
  2. SQLAlchemy ORM: via the validates() decorator
  3. DAOs: via the various validate_*() methods
  4. Commands: via the BaseCommand.validate() class method

These are ranked and thus preference should be given—per the "shift left" approach—for validation to occur at the lowest level possible to ensure consistency, adherence of the DRY principle, etc.

It's worth noting that validation at the database level is nuanced given how different databases handle NULL values from a distinct uniqueness perspective. A great example of this is ensuring that a dataset is unique—in terms of the (schema, table_name) tuple—where an explicit UNIQUE constraint is missing, given that (as mentioned),

The reason it does not physically exist is MySQL, PostgreSQL, etc. have a different interpretation of uniqueness when it comes to NULL which is problematic given the schema is optional.

i.e., in MySQL the (NULL, foo) and (NULL, foo) tuples are perceived as being different, yet from a dataset perspective we would perceive these as being the same object.

The main challenges with our current approach are:

  1. We tend to specify/enforce non-business validation logic at the DAO or Command level.
  2. We tend to "ask for permission" as opposed to "ask for forgiveness" which is both inefficient (due to the redundancy of first checking and then actioning) and—when combined with (1)—does not protect against a potential race condition.
  3. The validate() method tends to augment state—likely to address the inefficiency of (2)—which is an anti-pattern.

Examples

The following examples illustrate where the validation occurs at the wrong place:

The following examples illustrate where the validation checks could be violated:

  • The ChartDAO.add_favorite() method first checks whether the user has favorited said chart and if not creates the favorite as opposed to simply trying to favorite the chart and relying on a UNIQUE constraint to block (at the database level) the creation if it already exists.

The following examples illustrate where validation augments state:

  • The UpdateDatabaseCommand.validate() method assigns the self._model attribute after fetching the model so it can be leveraged during the run phase.
  • The CreateChartCommand.validate() method assigns the self._model attribute and augments the self._properities attribute.
  • The ChartWarmUpCacheCommand.validate() method reassigns the self._chart_or_id attribute if the variable is an ID. Post validation the attribute name is misleading given it only pertains to a Chart object—hence this confusing statement.

The following examples illustrate inconsistencies between where validation occurs:

Proposed Change

  • Validation should occur at the lowest possible (left most) level, given not all database operations originate at the DAO or Command level.
  • The DAO and Command should primarily be used to validate business logic, e.g., ownership, access controls, etc.
  • We should adopt a "ask for forgiveness" as opposed to "ask for permission" type approach by leveraging try/except blocks.
  • The BaseCommand.validate() method should be removed in favor of in-place validation, i.e., leveraging the “validate as you go” approach during the run() phase.

Examples

The following code illustrates how the model and DAO should be constructed, where it is evident that the model validation is handled at the database level and the DAO uses the “ask for forgiveness” approach*.

class FavStar(Model):
    __tablename__ = "favstar"
    __table_args__ = (UniqueConstraint("user_id", "class_name", "obj_id"),)

    id = Column(Integer, primary_key=True)
    user_id = Column(Integer, ForeignKey("ab_user.id"))
    class_name = Column(String(50))
    obj_id = Column(Integer)
class ChartDAO(BaseDAO[Slice]):
    def favorite(chart: Slice) -> None:
        try:
            db.session.begin_nested():
                db.session.add(
                    FavStar(
                        class_name=FavStarClassName.CHART,
                        obj_id=chart.id,
                        user_id=get_user_id(),
                    )
                )
        except IntegrityError:
            pass  # Already favorited.
  • The try/except approach is based on the fact that the code block will mostly succeed (which is the case in this example given one can really only favorite an object which is unfavorited). Additionally this approach removes the redundant check of first checking whether said item has already been favorited by the user.

The following code illustrates the use of a CHECK constraint in conjunction with the “ask for forgiveness” approach.

class Tag(Model):
    __tablename__ = "tag"
    __table_args__ = (CheckConstraint(func.regexp_like(name, "[^:,]")),)

    id = Column(Integer, primary_key=True)
    name = Column(String(250), unique=True)
    type = Column(Enum(TagType))
    description = Column(Text)
TagDAO(BaseDAO[Tag]):
    def create(object_type: ObjectType, object_id: int, name: str) -> None:
        try:
            with db.session.begin_nested():
                db.session.add(
                    TaggedObject(
                        object_id=object_id,
                        object_type=object_type,
                        tag=TagDAO.get_by_name(name, TagType.custom)
                    )
                )
        except IntegrityError as ex:
            raise DAOCreateFailedError() from ex

New or Changed Public Interfaces

None.

New Dependencies

None.

Migration Plan and Compatibility

  • A wiki page will be created to specify the rules which should be adhered to.
  • A series of PRs to address existing rule violations.
  • New PRs (where possible) should adhere to the rules.

Rejected Alternatives

None.

@john-bodley john-bodley changed the title SIP-99C] Proposal for model and business validation [SIP-99C] Proposal for model and business validation Nov 2, 2023
@john-bodley john-bodley added the sip Superset Improvement Proposal label Nov 2, 2023
@rusackas rusackas moved this to [RESULT][VOTE] Approved in SIPs (Superset Improvement Proposals) Dec 6, 2023
@rusackas
Copy link
Member

rusackas commented Dec 6, 2023

[VOTE] approved! 🎉

@rusackas rusackas closed this as completed Dec 6, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sip Superset Improvement Proposal
Projects
Status: [RESULT][VOTE] Approved
Development

No branches or pull requests

2 participants