Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v9 Access to text formatters from custom formatters #263

Closed
webstech opened this issue Dec 5, 2022 · 19 comments
Closed

v9 Access to text formatters from custom formatters #263

webstech opened this issue Dec 5, 2022 · 19 comments

Comments

@webstech
Copy link
Contributor

webstech commented Dec 5, 2022

Congrats on the new release. Thanks for your work on this.

I am working on updating the typescript typings. One feature of the typings was access to the formatter functions. They were exported here. I don't use the functions but they are part of the test program:

import * as formatters from 'html-to-text/lib/formatter';
...
        textFormatter: (elem, walk, builder, options) => {
            formatters.heading(elem, walk, builder, options);
        }

I am assuming this was part of the test because it was needed and I can still imagine a use case for it.

The old documentation does not mention being able to access the formatters. At least I do not recall that it did. I am opening this issue in case someone is using the formatters.

Questions:

  1. Were the formatters intended to be accessible in the previous versions?
  2. If the answer is yes, can they be re-introduced or the doc updated to reflect they are no longer available (and a suggested workaround)?

The current typescript typings work with the new release except for the formatters and removed deprecated options.

@KillyMXI
Copy link
Member

KillyMXI commented Dec 5, 2022

Hmm. I never considered the old lib/formatter.js as an entry point. Built-in formatter functions were only meant to be specified in the config by name.
That's why I felt safe to move them around this time.

I'm curious what use cases you see for them being accessible from client code?

@webstech
Copy link
Contributor Author

webstech commented Dec 5, 2022

I'm curious what use cases you see for them being accessible from client code?

The example in the typing test driver is unrealistic. There could be other cases where the wrapper is checking the node name or data (emit an emoji before all headers starting with 'new' and then call the header formatter). Are there users? I don't know. Maybe @CarsonF can comment. He may have added the formatter support.

@webstech
Copy link
Contributor Author

webstech commented Dec 5, 2022

Is it possible the formatters are accessible at builder.options.formatters['some-formatter']? That could be added to the typing api if it is determined to be needed.

@KillyMXI
Copy link
Member

KillyMXI commented Dec 5, 2022

That's how it works internally:

const format = options.formatters[tagDefinition.format];
format(elem, walk, builder, tagDefinition.options || {});

At that point it contains all built-in and custom formatters.

There could be other cases where the wrapper is checking the node name or data (emit an emoji before all headers starting with 'new' and then call the header formatter).

There was a demand for recursive formatters (#154) (Albeit it's not as significant now, after the introduction of selectors). I haven't got to think about it yet, but perhaps we might have everything for it already in place.
I will need to add tests and documentation before I call it officially supported.

@webstech
Copy link
Contributor Author

webstech commented Dec 5, 2022

As a POC, mocking up BlockTextBuilder with:

  options: {
    formatters: Record<string, FormatCallback>;
  };

allowed this to work:

builder.options.formatters['heading'](elem, walk, builder, options);

I know this is passing the wrong options. If options are passed, do they replace the existing options or get merged? I guess merge would be consistent with how a selector's options are handled now.

@CarsonF
Copy link

CarsonF commented Dec 5, 2022

I'm curious what use cases you see for them being accessible from client code?

Yeah I think the goal was that formatters could wrap another formatter to patch functionality.
Here's an example for myself:
https://github.com/SeedCompany/nestjs-email/blob/5150f0298f89a3542fc3bfd7333009db21119745/src/email.service.ts#L139-L146

@KillyMXI
Copy link
Member

KillyMXI commented Dec 5, 2022

builder.options is literally the Options (HtmlToTextOptions in DT) object after initial processing (I do the merging of formatters and selectors but nothing that won't fit the same type).

From this comes naturally that builder.options.formatters['heading'] is FormatCallback and it accepts FormatOptions as the fourth argument.

  • builder.options is the same object that exist throughout the conversion of one or more documents (initial processing of HtmlToTextOptions and building the selectors tree is expensive (htmlToText/convert functions much slower than fromString #265));
  • when you call builder.options.formatters['whatever'](elem, walk, builder, formatOptions) - it can be a custom formatter as well. And there is no level of indirection that would do the magic for you. Each formatter takes FormatOptions from input argument, doesn't look it up from somewhere else. It would be your' custom formatter responsibility to construct new FormatOptions object based on its' own formatOptions input argument and its' intent when calling another formatter. (Side note: there is a danger of breaking things by modifying formatOptions input argument directly, but I don't think there is much I can do or willing to do to prevent it, other than making it less and less necessary to resort to custom formatters.)

The example from nestjs-email is now also solvable with selectors:

{ selectors: [
    { selector: 'table', format: 'dataTable' },
    { selector: 'table[role="presentation"]', format: 'block' }, // or 'inline'
] }

I'm trying to come up with a no-nonsense example before I add tests and documentation for recursive formatters.

@KillyMXI
Copy link
Member

KillyMXI commented Dec 5, 2022

(To think about it now, maybe I can deep freeze the HtmlToTextOptions object. It would've been nice if it could give some performance boost, but it doesn't seem to be the case.)

@webstech
Copy link
Contributor Author

webstech commented Dec 6, 2022

  • It would be your' custom formatter responsibility to construct new FormatOptions object based on its' own formatOptions input argument and its' intent when calling another formatter.

Okay, but what if I wanted a merge? The FormatOptions can only be obtained via builder.picker.pick1()? eg, I want to call the heading formatter with an options of 'uppercase: false'. I want to merge this with the rest of the h3 selector format options but do not have access to them.

I am just trying to think about what someone wanting to do recursion might want. I am perfectly fine with doing nothing, especially since the known user of this functionality may no longer need it. The call in the POC looks like a hack.

@KillyMXI
Copy link
Member

KillyMXI commented Dec 6, 2022

  formatters: {
    'foo': function (elem, walk, builder, formatOptions) {
      if (elem...) {
        walk(elem.children, builder);
      } else {
        builder.options.formatters['bar'](elem, walk, builder, { ...formatOptions, a: 42 });
      }
    }
  },
  selectors: [
    {
      selector: 'div',
      format: 'foo', // can call 'bar' inside
      options: { ... } // any format options meaningful for your custom 'foo' formatter with its internals,
                       // defined here for 'div' selector specifically
    }
  ]

Since you as the author of the foo formatter are aware that it can call bar formatter inside, you can decide what formatOptions you want to specify so it would cover all you needs, including what to pass down to bar.

formatOptions are really defined per selector, not per formatter.
All formatOptions instances for the same selector are merged during the initial processing.

There are really no connections between different selectors, nothing to merge with from within the formatter.
And there are no options for bar formatter somewhere. There might be options for different selectors that have format: 'bar' as well, but they are not connected in any way.

It might be confusing ¯\_(ツ)_/¯

@webstech
Copy link
Contributor Author

webstech commented Dec 6, 2022

It might be confusing

Yes it can be. That is why I chose h3, a well-known selector in my example for merging. No ambiguity as to whether it exists but the format options are not available. Not asking for changes here. Maybe this will help someone in the future who is considering calling other formatters.

I will finish up the typescript v9 support once the tests and doc are available. I can run pre-release tests if necessary.

Thanks for your responsiveness to this.

@webstech
Copy link
Contributor Author

webstech commented Dec 6, 2022

I will finish up the typescript v9 support once the tests and doc are available

Also have to wait for another typing to be updated. 🙄

@KillyMXI
Copy link
Member

KillyMXI commented Dec 6, 2022

I might be able to make it closer in behavior to cascade style sheets at some point in the future.
But there is a conflict of interests - solving (querying and merging) styles for each node would be expensive operation. I have some compromise in mind. But all of that would barely make any difference for writing custom formatters.
For now, I think I've reached a reasonable balance between demands in flexibility and performance and can focus on other things first.

Will try to do tests and docs asap.

@KillyMXI
Copy link
Member

KillyMXI commented Dec 6, 2022

done

@webstech
Copy link
Contributor Author

webstech commented Dec 7, 2022

Thanks for getting the updates done. The test code didn't quite have full coverage so I am reviewing that now. Should get the typings done this week.

@webstech
Copy link
Contributor Author

webstech commented Dec 9, 2022

The use of options.formatters is working and you have made it officially supported so I will close this. Thanks for your work.

@webstech webstech closed this as completed Dec 9, 2022
@KillyMXI
Copy link
Member

KillyMXI commented Dec 9, 2022

Thank you for QA

@webstech
Copy link
Contributor Author

Should get the typings done this week.

PR 63576 has been opened.

@webstech
Copy link
Contributor Author

webstech commented Jan 4, 2023

the goal was that formatters could wrap another formatter to patch functionality

@CarsonF the typing updates for v9 may break your code but could you please take a look at them? @KillyMXI suggested a simple fix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants