Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong similar code lint #6039

Closed
leonardo-m opened this issue Sep 14, 2020 · 13 comments · Fixed by #6086
Closed

Wrong similar code lint #6039

leonardo-m opened this issue Sep 14, 2020 · 13 comments · Fixed by #6086
Labels
A-lint Area: New lints

Comments

@leonardo-m
Copy link

This is a very common type of bugs that I'd like Clippy to catch (clippy::correctness), I think it's too much to ask rustc to do this:

struct Vec3 { x: f64, y: f64, z: f64 }

impl Eq for Vec3 {}

impl PartialEq for Vec3 {
    fn eq(&self, other: &Self) -> bool {
        self.x == other.y && self.y == other.y && self.z == other.z
    }
}

fn main() {}

For C++ similar to that Rust code PVS-Studio gives the error:

V1013 Suspicious subexpression a.x == b.y in a sequence of similar comparisons. vec3.h 182

I predict that if you implement a similar lint in a reasonable way then people that use Clippy regularly will suddenly find several new bugs in their code :-)

@leonardo-m leonardo-m added the A-lint Area: New lints label Sep 14, 2020
@Ryan1729
Copy link
Contributor

Ryan1729 commented Sep 24, 2020

I recently had a copy-paste mistake of of this form involving < comparisons instead of ==.
This leads me to two thoughts:

  1. This lint seems like it would be useful
  2. This lint could be generalized to something like the following
expr.ident_1a binop1 expr.ident_1b binop2 expr.ident_2a binop1 expr.ident_2b binop2 ... binop2 expr.ident_Na binop1 expr.ident_Nb

where expr is some expression, the each variable starting with indent_... is some identifier,
and, where binop1 and binop2 can be any two BinOpKinds where the precedence and/or parenthesis are such that binop1 happens first.

The lint would then trigger if one of the identifiers in the expr.ident a-b pairs don't match, given that there are at least two binop1 clauses, and all the rest of the a-b pairs do.

@leonardo-m
Copy link
Author

I recently had a copy-paste mistake of of this form involving

I read the PVS-Studio blog posts, and I think the lint I've shown here is among the few ones that catches more of the bugs that rustc/Clippy don't catch already.

@Ryan1729
Copy link
Contributor

Okay, I think I'll give implementing this one a go. For a name, I think suspicious_chained_operators fits the naming conventions.

One thing I'm unsure of is whether the lint should be warn or deny by default. This is pretty clearly a correctness lint, and all the other correctness lints are deny by default, so I guess I'll start with deny. On the other hand this lint does technically have the potential for false positives, however, I suspect that cases like that will be vanishingly small in practice. I cannot imagine a case where someone would actually want a.x < b.x && a.x < b.y for example.

@leonardo-m
Copy link
Author

Some examples where it could fire, they were real bugs in the code of very famous open source projects:

// ------------------------------------

struct cyapa_softc { int delta_x, delta_y, cap_resx, cap_resy; };

void CodeRevew(cyapa_softc *sc) {
  if (sc->delta_x > sc->cap_resx)
    sc->delta_x = sc->cap_resx;
  if (sc->delta_x < -sc->cap_resx)
    sc->delta_x = -sc->cap_resx;
  if (sc->delta_y > sc->cap_resx)
    sc->delta_y = sc->cap_resy;
  if (sc->delta_y < -sc->cap_resy)
     sc->delta_y = -sc->cap_resy;
}

// ------------------------------------

void SetDelta()
{
  if (Id == k_IA64)
    Delta = 16;
  else if (Id == k_ARM || Id == k_PPC || Id == k_PPC)    // <=
    Delta = 4;
  else if (Id == k_ARMT)
    Delta = 2;
  else
    Delta = 0;
}

// ------------------------------------

static void AddRenamePair(...., NRecursedType::EEnum type, ....)
{
  ....
  if (type == NRecursedType::kRecursed)
    val.AddAscii("-r");
  else if (type == NRecursedType::kRecursed)    // <=
    val.AddAscii("-r0");
  ....
}

// ------------------------------------

#define MEGASAS_MAX_SGE 128             /* Firmware limit */
....
static void megasas_scsi_realize(PCIDevice *dev, Error **errp)
{
  ....
  if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
    ....
  } else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
    ....
  }
  ....
}

// ------------------------------------

target_ulong helper_mftc0_cause(CPUMIPSState *env)
{
  ....
  CPUMIPSState *other = mips_cpu_map_tc(env, &other_tc);

  if (other_tc == other->current_tc) {
    tccause = other->CP0_Cause;
  } else {
    tccause = other->CP0_Cause;    // <=
  }
  ....
}

// ------------------------------------

point3D operator/(const point3D &p1, const point3D &p2)
{
  return point3D( p1.x/p2.x , p1.y/p2.y , p1.z/p1.z );
}

// ------------------------------------

class CGitStatusListCtrl :
  public CListCtrl
{
  ....
  CString m_Rev1;
  CString m_Rev2;
  ....
};

void CGitStatusListCtrl::OnContextMenuList(....)
{
  ....
  if( (!this->m_Rev1.IsEmpty()) || (!this->m_Rev1.IsEmpty()) )
  ....
}

// ------------------------------------

int ASN1_PRINTABLE_type(const unsigned char *s, int len)
{
  int c;
  int ia5 = 0;
  ....
  if (!(((c >= 'a') && (c <= 'z')) ||
        ((c >= 'A') && (c <= 'Z')) ||
        (c == ' ') ||                   // <=
        ((c >= '0') && (c <= '9')) ||
        (c == ' ') || (c == '\'') ||    // <=
        (c == '(') || (c == ')') ||
        (c == '+') || (c == ',') ||
        (c == '-') || (c == '.') ||
        (c == '/') || (c == ':') ||
        (c == '=') || (c == '?')))
    ia5 = 1;
  ....
}

// ------------------------------------

Vector3dF
Matrix3F::SolveEigenproblem(Matrix3F* eigenvectors) const
{
  // The matrix must be symmetric.
  const float epsilon = std::numeric_limits<float>::epsilon();
  if (std::abs(data_[M01] - data_[M10]) > epsilon ||
      std::abs(data_[M02] - data_[M02]) > epsilon ||
      std::abs(data_[M12] - data_[M21]) > epsilon) {
    NOTREACHED();
    return Vector3dF();
  }
  ....
}

// ------------------------------------

bool IsTextField(const FormFieldData& field) {
  return
    field.form_control_type == "text" ||
    field.form_control_type == "search" ||
    field.form_control_type == "tel" ||
    field.form_control_type == "url" ||
    field.form_control_type == "email" ||
    field.form_control_type == "text";
}

// ------------------------------------

int
key_parse(struct mbuf *m, struct socket *so)
{
  ....
  if ((m->m_flags & M_PKTHDR) == 0 ||
      m->m_pkthdr.len != m->m_pkthdr.len) { // <=
    ....
    goto senderror;
  }
  ....
}

// ------------------------------------

static int
qla_tx_tso(qla_host_t *ha, struct mbuf *mp, ....)
{
  ....
  if ((*tcp_opt != 0x01) || (*(tcp_opt + 1) != 0x01) ||
    (*(tcp_opt + 2) != 0x08) || (*(tcp_opt + 2) != 10)) { // <=
    return -1;
  }
  ....
}

// ------------------------------------

int dbg_check_nondata_nodes_order(....)
{
  ....
  sa = container_of(cur, struct ubifs_scan_node, list);
  sb = container_of(cur->next, struct ubifs_scan_node, list);

  if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE &&
      sa->type != UBIFS_XENT_NODE) {
    ubifs_err(c, "bad node type %d", sa->type);
    ubifs_dump_node(c, sa->node);
    return -EINVAL;
  }
  if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE &&
      sa->type != UBIFS_XENT_NODE) {
    ubifs_err(c, "bad node type %d", sb->type);
    ubifs_dump_node(c, sb->node);
    return -EINVAL;
  }
  ....
}

// ------------------------------------

static inline bool isAudioPlaybackRateEqual(
  const AudioPlaybackRate &pr1,
  const AudioPlaybackRate &pr2)
{
  return fabs(pr1.mSpeed - pr2.mSpeed) <
           AUDIO_TIMESTRETCH_SPEED_MIN_DELTA &&
         fabs(pr1.mPitch - pr2.mPitch) <
           AUDIO_TIMESTRETCH_PITCH_MIN_DELTA &&
         pr2.mStretchMode == pr2.mStretchMode &&
         pr2.mFallbackMode == pr2.mFallbackMode;
}

// ------------------------------------

@Ryan1729
Copy link
Contributor

struct cyapa_softc { int delta_x, delta_y, cap_resx, cap_resy; };

void CodeRevew(cyapa_softc *sc) {
  if (sc->delta_x > sc->cap_resx)
    sc->delta_x = sc->cap_resx;
  if (sc->delta_x < -sc->cap_resx)
    sc->delta_x = -sc->cap_resx;
  if (sc->delta_y > sc->cap_resx) // <=
    sc->delta_y = sc->cap_resy;
  if (sc->delta_y < -sc->cap_resy)
     sc->delta_y = -sc->cap_resy;
}

Since this is about if statements, I think this could be a separate lint. suspicious_grouped_if_conditions? It might make since for the same lint to handle matches as well.

void SetDelta()
{
  if (Id == k_IA64)
    Delta = 16;
  else if (Id == k_ARM || Id == k_PPC || Id == k_PPC)    // <=
    Delta = 4;
  else if (Id == k_ARMT)
    Delta = 2;
  else
    Delta = 0;
}

I think this would already be caught by the logic_bug lint.

static void AddRenamePair(...., NRecursedType::EEnum type, ....)
{
  ....
  if (type == NRecursedType::kRecursed)
    val.AddAscii("-r");
  else if (type == NRecursedType::kRecursed)    // <=
    val.AddAscii("-r0");
  ....
}

I think this one should probably be a different lint. And that lint should also catch things like this:

match variable {
    Some(x) if condition => {
        ....
    },
    Some(x) if condition => { // <=
        ....
    }
   ....
}
#define MEGASAS_MAX_SGE 128             /* Firmware limit */
....
static void megasas_scsi_realize(PCIDevice *dev, Error **errp)
{
  ....
  if (s->fw_sge >= MEGASAS_MAX_SGE - MFI_PASS_FRAME_SIZE) {
    ....
  } else if (s->fw_sge >= 128 - MFI_PASS_FRAME_SIZE) {
    ....
  }
  ....
}

I think this could be caught by the same lint as the previous one.

target_ulong helper_mftc0_cause(CPUMIPSState *env)
{
  ....
  CPUMIPSState *other = mips_cpu_map_tc(env, &other_tc);

  if (other_tc == other->current_tc) {
    tccause = other->CP0_Cause;
  } else {
    tccause = other->CP0_Cause;    // <=
  }
  ....
}

The if_same_then_else lint should handle this one.

point3D operator/(const point3D &p1, const point3D &p2)
{
  return point3D( p1.x/p2.x , p1.y/p2.y , p1.z/p1.z );
}

This would already be caught by the eq_op lint. On the other hand, there probably should be a separate lint that would catch something like:

  return point3D( p1.x/p2.x , p1.y/p2.y , p2.y/p1.y );

That is, apply the same checks this lint would, but to function arguments instead.

class CGitStatusListCtrl :
  public CListCtrl
{
  ....
  CString m_Rev1;
  CString m_Rev2;
  ....
};

void CGitStatusListCtrl::OnContextMenuList(....)
{
  ....
  if( (!this->m_Rev1.IsEmpty()) || (!this->m_Rev1.IsEmpty()) )
  ....
}

I think this would already be caught by the eq_op lint too.

int ASN1_PRINTABLE_type(const unsigned char *s, int len)
{
  int c;
  int ia5 = 0;
  ....
  if (!(((c >= 'a') && (c <= 'z')) ||
        ((c >= 'A') && (c <= 'Z')) ||
        (c == ' ') ||                   // <=
        ((c >= '0') && (c <= '9')) ||
        (c == ' ') || (c == '\'') ||    // <=
        (c == '(') || (c == ')') ||
        (c == '+') || (c == ',') ||
        (c == '-') || (c == '.') ||
        (c == '/') || (c == ':') ||
        (c == '=') || (c == '?')))
    ia5 = 1;
  ....
}

The logic_bug lint should catch this too.

Vector3dF
Matrix3F::SolveEigenproblem(Matrix3F* eigenvectors) const
{
  // The matrix must be symmetric.
  const float epsilon = std::numeric_limits<float>::epsilon();
  if (std::abs(data_[M01] - data_[M10]) > epsilon ||
      std::abs(data_[M02] - data_[M02]) > epsilon ||
      std::abs(data_[M12] - data_[M21]) > epsilon) {
    NOTREACHED();
    return Vector3dF();
  }
  ....
}

// ------------------------------------

bool IsTextField(const FormFieldData& field) {
  return
    field.form_control_type == "text" ||
    field.form_control_type == "search" ||
    field.form_control_type == "tel" ||
    field.form_control_type == "url" ||
    field.form_control_type == "email" ||
    field.form_control_type == "text";
}

// ------------------------------------

int
key_parse(struct mbuf *m, struct socket *so)
{
  ....
  if ((m->m_flags & M_PKTHDR) == 0 ||
      m->m_pkthdr.len != m->m_pkthdr.len) { // <=
    ....
    goto senderror;
  }
  ....
}

These three should all be caught by eq_op.

static int
qla_tx_tso(qla_host_t *ha, struct mbuf *mp, ....)
{
  ....
  if ((*tcp_opt != 0x01) || (*(tcp_opt + 1) != 0x01) ||
    (*(tcp_opt + 2) != 0x08) || (*(tcp_opt + 2) != 10)) { // <=
    return -1;
  }
  ....
}

Hmm. Is the issue that it should be (*(tcp_opt + 2) != 0x10) instead? If so, I'm not sure whether this should be part of this lint or not. Maybe a style lint that suggested not to mix numeric literal bases in a single boolean expression would be better?

int dbg_check_nondata_nodes_order(....)
{
  ....
  sa = container_of(cur, struct ubifs_scan_node, list);
  sb = container_of(cur->next, struct ubifs_scan_node, list);

  if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE &&
      sa->type != UBIFS_XENT_NODE) {
    ubifs_err(c, "bad node type %d", sa->type);
    ubifs_dump_node(c, sa->node);
    return -EINVAL;
  }
  if (sa->type != UBIFS_INO_NODE && sa->type != UBIFS_DENT_NODE &&
      sa->type != UBIFS_XENT_NODE) {
    ubifs_err(c, "bad node type %d", sb->type);
    ubifs_dump_node(c, sb->node);
    return -EINVAL;
  }
  ....
}

I think this should be a separate lint. And that lint should be careful of false positives. Something like the following may be intentional:

if condition {
     ....
}
....
condition = ....
....
if condition {
     ....
}

Finally:

static inline bool isAudioPlaybackRateEqual(
  const AudioPlaybackRate &pr1,
  const AudioPlaybackRate &pr2)
{
  return fabs(pr1.mSpeed - pr2.mSpeed) <
           AUDIO_TIMESTRETCH_SPEED_MIN_DELTA &&
         fabs(pr1.mPitch - pr2.mPitch) <
           AUDIO_TIMESTRETCH_PITCH_MIN_DELTA &&
         pr2.mStretchMode == pr2.mStretchMode &&
         pr2.mFallbackMode == pr2.mFallbackMode;
}

I think this one should be caught by the eq_op lint, but it did inspire me to add the following test case to this one:

fn inside_larger_boolean_expression(s1: &S, s2: &S) -> bool {
    // There's no `s1.c`
    s1.a > 0 && s1.b > 0 && s1.d == s2.c && s1.d == s2.d
}

@leonardo-m
Copy link
Author

leonardo-m commented Sep 28, 2020

Hmm. Is the issue that it should be (*(tcp_opt + 2) != 0x10) instead? If so, I'm not sure whether this should be part of this lint or not. Maybe a style lint that suggested not to mix numeric literal bases in a single boolean expression would be better?

It's code like:

if x != 1 || x != 2 {
    println!("*");
}

It's always true :-) So it could hide a bug.

@Ryan1729
Copy link
Contributor

It's code like:

if x != 1 || x != 2 {
    println!("*");
}

It's always true :-) So it could hide a bug.

Aha! So the (*(tcp_opt + 2) != 10)) was perhaps meant to be (*(tcp_opt + 3) != 10)) then.

This kind of bug was discussed in #853, and it was suggested that it could be rolled into logic_bug, but that hasn't happened yet.

@euclio
Copy link
Contributor

euclio commented Oct 1, 2020

For the case in the OP, it would be worthwhile to suggest #[derive(PartialEq, Eq)] as well.

@Ryan1729
Copy link
Contributor

Ryan1729 commented Oct 1, 2020

For the case in the OP, it would be worthwhile to suggest #[derive(PartialEq, Eq)] as well.

For floats in particular that would run into the gotcha that floats are not Eq because of NaN. Also relevant for floats and PartialEq is this suggested lint, (not implemented as of this writing).

But this suggestion would make sense for structs/enumerated composed only of Eq types. Alternatively it could only suggest PartialEq. However it should also be a separate lint, so that if someone implements the equality correctly, they still get the derive suggestion. Given two separate lints, it might be fine to leave it as is, at the cost of the user needing to make two changes.

Given that skipping one of the changes is strongly desired, I think a reasonable implementation plan would be to implement the derive lint separately and then later upgrade this lint, (potentially by sharing code with the derive one.)

@flip1995
Copy link
Member

flip1995 commented Oct 5, 2020

Since this is about if statements, I think this could be a separate lint. suspicious_grouped_if_conditions? It might make since for the same lint to handle matches as well.

Where this patter occurs doesn't really matter, so I wouldn't split this up.

I think this one should probably be a different lint. And that lint should also catch things like this:

Such a lint would be awesome, IMO. But definitely a different lint.

About your remarks referring to other lints: They're all correct, IMO

Given that skipping one of the changes is strongly desired, I think a reasonable implementation plan would be to implement the derive lint separately and then later upgrade this lint, (potentially by sharing code with the derive one.)

Yes this isn't really related to this issue.

@Ryan1729
Copy link
Contributor

Ryan1729 commented Oct 5, 2020

Since this is about if statements, I think this could be a separate lint. suspicious_grouped_if_conditions? It might make since for the same lint to handle matches as well.

Where this patter occurs doesn't really matter, so I wouldn't split this up.

Fair enough. I suppose I can't think of a reason why someone would want to disable one the lint for only operators or only for if statements. I'm not sure what to call the lint though if it doesn't only pertain to operators. Although this issue probably occurs most often with conditionals that is not the only place it occurs, so a name that only mentions conditionals doesn't seem appropriate.

For example, I have this test in my draft PR for this lint:

fn non_boolean_operators(s1: &S, s2: &S) -> i32 {
    // There's no `s2.c`
    s1.a * s2.a + s1.b * s2.b + s1.c * s2.b + s1.d * s2.d
}

Does suspicious_grouped_operations fit? Or maybe just suspicious_groupings?

@flip1995
Copy link
Member

flip1995 commented Oct 9, 2020

suspicious_operation_groupings sounds best to me (as a non-native english speaker).

Do you think this lint should also apply when there is no grouping but only one expression, e.g. just self.a == other.b? I think we need context, that there is at least one ... self.x <op> other.x ... in the grouping before or after ... self.y <op> other.z .... For example the Cross product will have multiple self.x * other.y without any self.x * other.x and would be a FP if we lint without context of other operations.

@Ryan1729
Copy link
Contributor

suspicious_operation_groupings sounds best to me (as a non-native english speaker).

Sounds fine to me, as a native English speaker.

Do you think this lint should also apply when there is no grouping but only one expression, e.g. just self.a == other.b? I think we need context, that there is at least one ... self.x <op> other.x ... in the grouping before or after ... self.y <op> other.z .... For example the Cross product will have multiple self.x * other.y without any self.x * other.x and would be a FP if we lint without context of other operations.

I agree that we shouldn't lint single expressions without more context. In the course of writing my (currently in-progress) PR for this lint, I realized that I should skip empty lists of pairs of operations, and when that happened I also realized that single pairs would be likely false positives: see here. There's a few more early-out cases in there too, but I'm not sure at the moment whether they would prevent a correct implementation of the cross product from being linted. I'll add the cross product as a test case, to make sure that it doesn't false positive.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-lint Area: New lints
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants