Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unknown parameters, the CONSTRUCT phase, and "strict" class constructors #110

Open
sanko opened this issue Jul 27, 2024 · 6 comments
Open

Comments

@sanko
Copy link

sanko commented Jul 27, 2024

Why are classes always strict?

My use case for using perl's new class feature is wrapping the AT protocol which backs the Bluesky social network. I started the project last year purely to work with perl's new OOP system. In AT.pm, I'm passing decoded JSON directly to class constructors (often recursively in the ADJUST phase) so any public or private change they make to the backend ("lexicon" as they call them) potentially breaks my project. Production and live environment are one in the same for Bluesky so every commit they push with lexicon modifications could add new fields to a response. Often, the fields are part of features still under construction and essentially useless for clients but they break my project because perl classes choke on any unrecognized field.

I've spent a little time trying to figure out if and how I could mitigate the sky falling on my head randomly and quickly found the unfortunately named CONSTRUCT phase as described in the wiki. I also read about some of the potential issues discussed in #77 but I didn't see anything directly related to the lede I'm eventually getting back to. perlclass doesn't come with CONSTRUCT support yet so I set aside time today to emulate it, manually writing dozens of functions to filter out parameters before passing what's left to the correct class constructor the same way I was manually writing getters before :reader was included with 5.40.

It didn't take me more than a few minutes to realize that sifting through incoming parameters that way was... painfully tedious and, even worse, redundant. I'm already defining the list of expected parameters when I define the class so why would I do it a second time in my CONSTRUCT phase? Then it hit me: why are classes always strict?

If classes are the future of OOP in Perl, does it not make sense that the strict and warnings pragmas become class aware like feature? Since use v5.12; enables strict 'all' and thus so does use v5.38 which is required to use classes, wouldn't it make sense to have

use v5.42; # I'm in the future, of course
no strict 'class';

disable, well, "strict" classes? Turned off, class.c would warn instead of croak. To extend that, no warnings 'class'; would disable the warning as well. With both off, unrecognized (or 'unrecognised' as Paul is in the UK) parameters would be silently ignored.

@Ovid
Copy link
Collaborator

Ovid commented Jul 27, 2024

There are always the unfortunate cases where best practices become painful. In the early days of Perl 5, people would often ask how to make variable variable names instead of using a hash. They asked this so often that the explanation made it into perlfaq7.

So if you take the (naïve) Cache::LRU example I often use.

class Cache::LRU {
    use Hash::Ordered;
    field $cache = Hash::Ordered->new;
    field $max_size :param = 20;

    method set ( $key, $value ) {
        $cache->unshift( $key, $value );
        if ( $cache->keys >= $max_size ) {
            $cache->pop;
        }
    }
    method get($key) {
        return unless $cache->exists($key);
        my $value = $cache->get($key);
        $self->set( $key, $value );
        return $value;
    }
}

You notice that production is slow, and someone points out that there is tons of database traffic, mostly to a single table. LRU caches are often great when you have an uneven distribution of data requested, so when you contruct your cache, you triple the number of cache entries.

But you discover that performance hasn't improved. Maybe you have an even distribution of cache entries and this caching strategy is bad? Maybe you didn't have enough entries in the cache?

And then you see this in your code:

my $cache = Cache::LRU->new( max_entries => 60 );

You used the wrong param name, it's silently dropped to the floor, and you're none the wiser. For most OOP systems in Perl, silently dropping extra arguments is the default behavior. Even dumping an object out often won't tell you what's going on.

This can be an extremely hard bug to track down because it's silent. In your case, you want it to be a warning, so that will help, but warnings on "this can't work" are a code smell (not necessarily wrong, but a smell, nonetheless).

Thus, we optimize for the common case of failure. I've definitely hit this bug before, as have others, which is why modules like MooseX::StrictConstructor exist.

In your case, you understandably have a problem where an API is heavily under flux. I get that this is frustrating. In the future, a MOP will be added which can let you extend a class, but as we're experimental, we're not there yet.

Another possibility is to "grin and bear it", but since you're writing this, I can understand why you don't want to do that.

There's also another issue that I teach people when I'm explaining OOP: OOP is great if you have a well-defined problem space. It often breaks down horribly when you don't. Again, this is very common source of bugs. This is probably why many OOP languages I've worked with don't allow unknown arguments to the constructor.

However, the canonical way of passing extra, unknown args is via a payload (or whatever you want to call it). It might look like this:

#!/usr/bin/env perl

use v5.40.0;
use experimental 'class';

class Payload {
    use Carp;
    field $name    :param;
    field $payload :param = {};

    method get ($key) {
        unless (exists $payload->{$key}) {
            croak "Key $key does not exist in payload";
        }
        return $payload->{$key};
    }
}

With that, you have a get method (though it's poorly named).

You can run this:

my $payload = Payload->new(name => 'test', payload => {key => 'value'});
say $payload->get('key');
say $payload->get('non-existent-key');

When you get to that final line, the code croaks, but when you contruct the object, you can pass in anything you want.

I realize this is not a satisfying answer and the design team knew that not everyone would be happy with this (and I seem to recall that agreement was not uniform).

The above, I might add, is conceptually similar to how other languages do this. For example, in Python:

class Payload:
    def __init__(self, name, **kwargs):
        self.name = name
        self.__payload = kwargs

    def get(self, key):
        if key not in self.__payload:
            raise KeyError(f"Key: {key} not found in payload")
        return self.__payload.get(key)

(There's also the possibility of auto-generating code, but I've gone on long enough)

@HaraldJoerg
Copy link
Contributor

There's also another issue that I teach people when I'm explaining OOP: OOP is great if you have a well-defined problem space. It often breaks down horribly when you don't. Again, this is very common source of bugs. This is probably why many OOP languages I've worked with don't allow unknown arguments to the constructor.

I think this is rubbing the camel the wrong way. In my opinion Perl is the programming language of choice if you do not have a well-defined problem space. The fact that many OOP languages don't allow unknown arguments to the constructor underpins this: Perl OO from its crudest my $class = shift; bless { @_ },$class; to Moose allow and silently ignore unknown parameters, and if Perl is the only programming language which allows them, then this is a feature.

I am absolutely fine with MooseX::StrictConstructor as an add-on (and have used it frequently), and I am also ok with Corinna being strict per default. Yet I consider having a opt-in feature to disable these checks useful and consistent with Perl's history and strengths. The MOP of Object::Pad allows this for Object::Pad classes, and I hope we'll have one for Corinna, too.

I am aware that non-strictness is a source of bugs, but in this particular case it needs sloppy, sloppy, sloppy developers to run into that bug. Sloppy when copying the parameter name from the docs to the constructor code, sloppy when doing code reviews and sloppy when writing tests. Punishing all developers with strictness because sloppy, sloppy, sloppy developers are known to exist seems patronizing.

@sanko
Copy link
Author

sanko commented Jul 27, 2024

I am aware that non-strictness is a source of bugs, but in this particular case it needs sloppy, sloppy, sloppy developers to run into that bug.

And that sloppy dev would need to explicitly ask perl to allow them to be sloppy in the first place.

I agree, perl has a history of allowing you to do 'odd' things when you explicitly ask it to take the training wheels off. An example that came to mind when I was creating this issue was the redefined warnings category which is on by default.

sub alpha {'one thing'}
{
    #~ no warnings 'redefine';
    *alpha = sub {'another'};
}

This code produces an error even without use strict; or use warnings;. You need to explicitly turn the category off with no warnings 'redefine'; or no warnings; and unless you're just used to producing bad code, you're not going to write either of those by accident. A similar strict-by-default fatal and default warning for classes seemed to be in line with perl's history. 🤷🏽‍♂️

@sanko
Copy link
Author

sanko commented Jul 28, 2024

When you get to that final line, the code croaks, but when you contruct the object, you can pass in anything you want.

@Ovid, your reply was made first but I put off responding because I had to read it again with fresh eyes. I'm playing a tourist IRL right now, working from a phone, and wasn't sure where most of what you wrote was going or how this Payload scheme fit. It's basically the opposite of what I'm describing so maybe I should just provide a hypothetical code example instead.

This is what I want:

use v5.42.0;            # because we're still in the future
use experimental 'class';
no warnings 'class';    # or no strict 'class';

class Whatever {
    field $name : param;
    field $timestamp : param;
}

# ...later...
my $obj = Whatever->new(
    name      => 'Jake',
    timestamp => time,

    # ...okay so far but...
    all     => 'of',
    this    => 'is',
    ignored => 'by',
    the     => 'constructor'
);

I want to be able to ask perl to ignore unknown parameters passed to class constructors. I do not want to parse out unknown data to a special catch-all payload field. I do not want to access the data in unknown fields with a getter. I don't want data in unknown fields to be stored in the object at all. I want to be able to have perlclass ignore parameters I didn't explicitly tell it to look for.

I want the default behavior (a fatal error) to remain unchanged but am proposing the ability to make it non-fatal and potentially produce no warning at all. As @HaraldJoerg best described it, I would need to opt-in to this behavior. I would have to specifically ask perl to do this.

Additionally, if I opted in and later accidentally passed along max_entries instead of max_size as in your LRU example, the constructor would still throw a fatal error when the stash is sealed because it would be looking for the missing max_size parameter. Changing that behavior is not a part of what I'm proposing.

@HaraldJoerg
Copy link
Contributor

Perhaps this should indeed be done with the meta object protocol (which I hope will be available soon, mostly for debugging) - and then implemented as a CPAN module.

Here's a rewrite of your code in terms of Object::Pad which is already available:

use v5.34.0;
use Object::Pad;
no warnings; # The MOP of Object::Pad is experimental

class Whatever :strict(params) { # strictness is opt-in with Object::Pad
    apply Object::PadX::SloppyConstructor;
    field $name : param;
    field $timestamp : param;
}

# ...later...
my $obj = Whatever->sloppy_new(
    name      => 'Jake',
    timestamp => time,

    # ...okay so far but...
    all     => 'of',
    this    => 'is',
    ignored => 'by',
    the     => 'constructor'
);

if ($ENV{TESTING}  && (my %args = $obj->extra_args)) {
    warn "At your leisure, check these extra arguments:\n";
    warn (join "\n", map { qq('$_' => $args{$_}) } sort keys %args);
}

Changes:

  • I use Object::Pad instead of feature 'class'. Object::Pad is lax by default, but we can make it Corinna-like by adding :strict(params) to the class declaration.
  • I add a role (roles are also not yet available for feature 'class') named Object::PadX::SloppyConstructor;
  • I change the constructor from new() to sloppy_new() which is a method provided by the role. This is another opt-in layer: You can have both strict construction with new() and sloppy construction with sloppy_new(). The caller decides whether extra arguments should be ignored. I think this is better than having the class decide as in the case of no strict class.
  • The extra arguments can be made available for debugging or during testing by the role.
    The role itself isn't all that difficult:
use 5.034;
use warnings;
use feature 'signatures';
use Object::Pad 0.809;
use Object::Pad::MOP::Class;

role Object::PadX::SloppyConstructor {
    no warnings 'experimental';

    field %extra_args :reader;

    my sub _collect_params ($metaclass) {
        return
            map { $_->get_attribute_value('param') }
            grep { $_->has_attribute('param') }
            $metaclass->fields;
    }

    method set_extra_args (%args){
        %extra_args = %args;
        return $self;
    }

    method sloppy_new :common (%args) {
        my @params; # List of permitted parameters

        my $metaclass = Object::Pad::MOP::Class->for_class( $class );
        for my $metarole ($metaclass->all_roles) {
            push @params,_collect_params($metarole);
        }
        while ($metaclass) {
            push @params,_collect_params($metaclass);
            $metaclass = ($metaclass->superclasses)[0];
        }
        my %params =
            map { ($_ => delete $args{$_}) }
            grep { exists $args{$_} }
            @params;
        my $self = $class->new(%params)->set_extra_args(%args);
    }
};

1;

@sanko
Copy link
Author

sanko commented Jul 29, 2024

Neat bit of code. I'm sure someone out there would appreciate it being on CPAN.

I've been planning to raise this here for a few months and already understood that Object::Pad does precisely what I'm looking for right out of the box:

use v5.40;
use Object::Pad;

class Whatever {
    field $name : param;
    field $timestamp : param;
}

my $obj = Whatever->new(
    name      => 'Jake',
    timestamp => time,

    # ...okay so far but...
    all     => 'of',
    this    => 'is',
    ignored => 'by',
    the     => 'constructor'
);

I was hoping for the option of feature parity in CORE but I'll just use the best tool for the job instead. Thanks, @HaraldJoerg.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants