Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: introduce varchar<L> type #82

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

dshaaban01
Copy link
Collaborator

@dshaaban01 dshaaban01 commented Feb 6, 2025

This PR introduces the varchar compound type, which represents a variable-length Unicode string of up to L characters.

@dshaaban01 dshaaban01 changed the title feat: introduce varchar<L> type feat: introduce varchar<L> type Feb 6, 2025
} else if (auto varCharType = dyn_cast<VarCharType>(literalType)) {
auto varChar =
std::make_unique<::substrait::proto::Expression_Literal_VarChar>();
varChar->set_length(mlir::cast<StringAttr>(value).size());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take the length from the type, not from the value.

@dshaaban01 dshaaban01 force-pushed the dalia/compound-types/varchar branch from 10fc673 to b225641 Compare February 24, 2025 15:11
@dshaaban01
Copy link
Collaborator Author

dshaaban01 commented Feb 24, 2025

@ingomueller-net I made some thoughts about the creation of a VarCharAttr with the goal of implementing a mechanism that verifies that the length of the string is at most L characters.

This is what I came up with:

  • If we have a string "hello" (5 characters), but declare it as type VarChar<6> (can be at most 6 characters, then this means that the same string "hello" declared as type VarChar<5> is different. So I thought it would make sense to add the VarChar type as parameter to the VarChar attribute type. I didn't know any other way of verifying that the length of the string e.g. "hello" is at most L e.g. 6 characters.
  • Furthermore, I guess its not statically typed anymore so I have it as a TypeInterferableAttrInterface.

Let me know if this is overkill or if I should add this (or elements of it).

def Substrait_VarCharAttr
    : Substrait_Attr<"VarChar", "var_char", [TypeInferableAttrInterface]> {
  let summary = "Substrait varchar type";
  let description = [{
    This type represents a substrait varchar attribute type, namely a unicode
    string of at most L characters.
  }];
  let parameters = (ins "StringAttr":$value, "VarCharType":$type);
  let assemblyFormat = [{ `<` $value `,` $type `>` }];
  let genVerifyDecl = 1;
}
LogicalResult mlir::substrait::VarCharAttr::verify(
  llvm::function_ref<mlir::InFlightDiagnostic()> emitError, StringAttr value, VarCharType type) {
  if (static_cast<int32_t>(value.getValue().size()) > type.getLength())
    return emitError() << "value length exceeds the maximum length of " << type.getLength();
  return success();
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants