
Pretty print function for dense tensors #350

Closed
tgkolda opened this issue Nov 21, 2024 · 12 comments · Fixed by #360
Labels
doing Actively being worked on

Comments

@tgkolda
Collaborator

tgkolda commented Nov 21, 2024

Would it be possible to add a pretty print function for dense tensors? Here is what I have in mind:

import numpy as np
import pyttb as ttb

def pretty_print_tensor(X, fmt="10.4f", name="Slice"):
    if not isinstance(X, ttb.tensor):
        raise ValueError("Input must be a pyttb tensor")

    # Get the shape of the tensor
    shape = X.shape

    if name == "Slice":
        print("Tensor is of shape " + " x ".join(map(str, shape)))
    else:
        print(f"{name} is a tensor of shape " + " x ".join(map(str, shape)))


    # Iterate over all possible slices (in Fortran order)
    for index in np.ndindex(shape[2:][::-1]): # Skip the first two dimensions and reverse the order
        index = index[::-1] # Reverse the order back to the original
        # Construct the slice indices
        slice_indices = (slice(None), slice(None)) + index
        slice_data = X[slice_indices]
        print(f"{name}(:, :, {', '.join(map(str, index))}) =")
        array = slice_data.data
        for row in array:
            print(" ".join(f"{val:{fmt}}" for val in row))
        

So pretty_print_tensor(X, fmt="2d", name="X") produces output like this:

X is a tensor of shape 3 x 3 x 2
X(:, :, 0) =
 3  9  1
 8  2  1
 4  3  9
X(:, :, 1) =
 6  9  5
 5  6  4
 1  4  1

Or pretty_print_tensor(X, fmt="5.1f") for a different tensor produces:

Tensor is of shape 3 x 4 x 3 x 2
Slice(:, :, 0, 0) =
  1.0   7.0   5.0   5.0
  8.0   9.0   1.0   7.0
  4.0   5.0   3.0   8.0
Slice(:, :, 1, 0) =
  4.0   9.0   9.0   9.0
  1.0   2.0   1.0   3.0
  3.0   5.0   6.0   5.0
Slice(:, :, 2, 0) =
  9.0   7.0   2.0   5.0
  2.0   7.0   5.0   4.0
  5.0   5.0   4.0   8.0
Slice(:, :, 0, 1) =
  7.0   3.0   6.0   7.0
  3.0   5.0   4.0   4.0
  6.0   4.0   5.0   9.0
Slice(:, :, 1, 1) =
  2.0   4.0   4.0   7.0
  7.0   6.0   1.0   5.0
  4.0   5.0   1.0   7.0
Slice(:, :, 2, 1) =
  9.0   3.0   9.0   5.0
  7.0   6.0   9.0   8.0
  1.0   3.0   8.0   2.0
@tgkolda
Collaborator Author

tgkolda commented Nov 21, 2024

Maybe it's a function in the tensor class, in which case the call might be X.pretty_print(...)

@ntjohnson1
Collaborator

I have a PR up for printing the frontal slices: #354. IIRC, MATLAB has a global setting to control the printing precision, so my bias would be to use something similar here as well. I haven't checked whether our implementation is hooked up correctly so that changing the NumPy printing format propagates to us.
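For reference, the NumPy mechanism alluded to above is np.set_printoptions (or the np.printoptions context manager for scoped changes), which controls array formatting globally. Whether pyttb inherits it depends on whether its __str__ delegates to NumPy's formatter; this sketch only shows the NumPy side:

```python
import numpy as np

arr = np.array([1.23456789, 2.34567891])

# Temporarily override the global precision; any repr that delegates to
# NumPy's array formatter would pick this setting up.
with np.printoptions(precision=2):
    formatted = np.array2string(arr)

print(formatted)  # -> [1.23 2.35]
```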

@tgkolda
Collaborator Author

tgkolda commented Dec 12, 2024

Thanks, Nick. I would still like this functionality as an option. The numpy-style printing is painful.

@ntjohnson1
Collaborator

Once we have control of the print format and the frontal slices, the primary remaining pain points are just the brackets and a custom name display, correct? Or is there some additional functionality that I'm overlooking?

@tgkolda
Collaborator Author

tgkolda commented Dec 12, 2024

So, the pain points are:

  • Displaying the name
  • Controlling the print format
  • Displaying the size as 5 x 4 x 3 rather than (5,4,3)
  • For 1D tensors, displaying the size as 5 rather than (5,).

FWIW, I don't think we necessarily want to change the universal print formatting. I can imagine scenarios where you just want different formats for different purposes in the same file.

Here's an updated version that handles 1D and 2D tensors more intelligently and changes the default name to Values.

import numpy as np
import pyttb as ttb

def pretty_print_tensor(X, fmt="10.4f", name="Values"):
    if not isinstance(X, ttb.tensor):
        raise ValueError("Input must be a pyttb tensor")

    # Get the shape of the tensor
    shape = X.shape

    if name == "Values":
        print("Tensor is of shape " + " x ".join(map(str, shape)))
    else:
        print(f"{name} is a tensor of shape " + " x ".join(map(str, shape)))

    # Check if the tensor is 1D
    if len(shape) == 1:
        print(f"{name}(:) =")
        for val in X.data:
            print(f"{val:{fmt}}")
        return

    # Check if the tensor is 2D
    if len(shape) == 2:
        print(f"{name}(:,:) =")
        array = X.data
        for row in array:
            print(" ".join(f"{val:{fmt}}" for val in row))
        return

    # Iterate over all possible slices (in Fortran order)
    for index in np.ndindex(shape[2:][::-1]): # Skip the first two dimensions and reverse the order
        index = index[::-1] # Reverse the order back to the original
        # Construct the slice indices
        slice_indices = (slice(None), slice(None)) + index
        slice_data = X[slice_indices]
        print(f"{name}(:, :, {', '.join(map(str, index))}) =")
        array = slice_data.data
        for row in array:
            print(" ".join(f"{val:{fmt}}" for val in row))        

@ntjohnson1
Collaborator

I believe all of those pain points are decisions aligned with NumPy conventions. Since NumPy is the de facto scientific library for Python, it feels like deviating from it would make it harder for a Python user to pick things up.

I believe pyttb is trying to capture the functionality of the MATLAB tensor toolbox but the aesthetics are not an objective. Here it feels like we are deviating from Python/Numpy standard approaches to better match MATLAB aesthetics.

I'll defer to @dmdunla since he is the one allowed to approve incoming PRs

@dmdunla
Collaborator

dmdunla commented Dec 13, 2024

For MATLAB users accustomed to the TTB for MATLAB, Tammy's suggestion makes sense. However, as Nick points out, this moves away from Python/NumPy conventions and thus would be distracting to people accustomed to pythonic standards.

Here are my suggestions:

  1. Align Tammy's code to TTB for MATLAB as much as is possible (using the variable name in the output has already been determined as impossible at this point).
  2. Name the methods that produce the TTB for MATLAB-like output matlab_print. I believe this is both more descriptive and more objective than pretty_print, and including matlab in the name of the function describes the intent (e.g., print the tensor like it's printed in MATLAB).
  3. For the default printing of data classes, continue with our attempts to be faithful to TTB for MATLAB while incorporating numpy-like conventions (e.g., tensor of shape (2, 2, 2) vs. tensor of shape 2 x 2 x 2).
  4. For the default printing, use the names of the data class data members when printing. For example, tensor has data, ktensor has weights and factor_matrices, and ttensor has core and factor_matrices. I'm not a fan of using the name U for factor matrices in general, and I'd like to avoid introduction of aliases to underlying data members in the __str__ and __repr__ methods.
  5. Use frontal slices for now in all data classes. NumPy does not output based on layout/order at this point, so aligning with TTB for MATLAB, which outputs frontal slices, makes the most sense at this point. However, if we eventually support other layout/order options (e.g., "C"), we should revisit this choice to make sure that frontal slices make sense for all layouts.

I do not think there is a single solution that will align with both MATLAB and Python conventions, so the above are an attempt at some compromise.

This is my take ATM, and I'm open to more discussion.

@ntjohnson1
Collaborator

SG. If we are accommodating this, then I think it makes sense to create pyttb.matlab_support, under which we can put matlab_print as a function. Each data class can have a private method _matlab_print that this public-facing function calls. That way we have some separation for the weird/non-standard/custom things required. If the support doesn't continue to grow, then it's not a crazy amount of indirection.
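A minimal sketch of that dispatch shape (all names here are illustrative, not pyttb's actual API): a public matlab_print function in a support module delegates to a private _matlab_print hook on each data class, keeping the MATLAB compatibility layer out of the core classes' public surface.

```python
class FakeTensor:
    """Stand-in for a pyttb data class that implements the private hook."""

    def __init__(self, shape):
        self.shape = shape

    def _matlab_print(self, fmt="10.4f", name="Values"):
        # A real class would format its data here; the stand-in only
        # reports the shape in MATLAB style.
        return f"{name} is a tensor of shape " + " x ".join(map(str, self.shape))


def matlab_print(obj, fmt="10.4f", name="Values"):
    """Public entry point, as it might live in a matlab_support module."""
    if not hasattr(obj, "_matlab_print"):
        raise ValueError(f"matlab_print is unsupported for {type(obj)}")
    print(obj._matlab_print(fmt=fmt, name=name))


matlab_print(FakeTensor((3, 3, 2)), name="X")  # -> X is a tensor of shape 3 x 3 x 2
```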

@dmdunla
Copy link
Collaborator

dmdunla commented Dec 13, 2024

Great idea. I like the separation of matlab_support into a distinct space in pyttb.

@ntjohnson1 ntjohnson1 added the doing Actively being worked on label Dec 14, 2024
@ntjohnson1
Collaborator

Ok I added matlab_print in the direction I proposed above. The linked PR has an example of the output format.

@tgkolda
Collaborator Author

tgkolda commented Dec 16, 2024

This all sounds great. Thanks!

One naive question about the broader setup: why not just have matlab_print be a function for each object? I'm not sure what having pyttb.matlab_support buys. Is it just matlab_print(X) versus X.matlab_print()?

@ntjohnson1
Collaborator

Functionally it is the same. But it makes it more obvious that there is a standard way and a non-standard way. The non-standard way requires importing something else, which makes it explicit that this is a compatibility approach rather than something core to pyttb. In Python, underscore methods are internal components, so while a user could type print(a_tensor._matlab_str()), it's fairly clear they are doing something non-standard.

In my mind, the breakdown of users is:

  1. python users who know MATLAB ttb (they will be familiar with Numpy print styles already and should use the standard approach)
  2. New users to any variant of the tensor toolbox, who should have no reason to use a compatibility tool
  3. People with extensive use of MATLAB ttb who aren't as familiar with python (so the differences are a major barrier for adoption)

Ideally, no one in the first two categories needs or wants to use MATLAB support, and over time the users in the last category become users in the first.
