Creating empty value for attribute #552

Open
joffremota opened this issue Jun 5, 2024 · 8 comments · Fixed by #554

@joffremota

1. Description

I've got the following value:
<a download href=\"/downloads/arquivo.zip\">Download do Arquivo</a>

When I pass this into the following method in order to add the target="_blank" attribute, I get this:
<a download="" href=\"/downloads/arquivo.zip\" target=\"_blank\">Download do Arquivo</a>

How can I prevent the library from adding the empty value to download?

Here is the method.

        private string UpdateAnchorTagsWithTargetBlank(string html)
        {
            var doc = new HtmlDocument();

            doc.LoadHtml(html);
            var anchorNodes = doc.DocumentNode.SelectNodes("//a[@href]");
            if (anchorNodes != null)
            {
                foreach (var node in anchorNodes)
                {
                    if (node.GetAttributeValue("target", "") != "_blank")
                        node.SetAttributeValue("target", "_blank");
                }
            }

            return doc.DocumentNode.OuterHtml;
        }

2. Exception

Not applicable

3. Fiddle or Project

Not applicable

4. Any further technical details

  • HtmlAgilityPack (1.11.40)
  • SDK Version: 6.0.404
@elgonzo
Contributor

elgonzo commented Jun 5, 2024

It's not so obvious, but to get the desired behavior you have to configure HtmlDocument.GlobalAttributeValueQuote to use AttributeValueQuote.Initial, i.e.:

var doc = new HtmlDocument()
{
    GlobalAttributeValueQuote = AttributeValueQuote.Initial
};

(This could also be done after loading an HTML document.)
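
A minimal sketch of how this setting might be applied to the method from the original post, here set after LoadHtml as mentioned above. The class name AnchorRewriter is hypothetical and only there to make the snippet self-contained; the rest comes from the code already posted in this thread. How the newly added attribute is quoted may depend on the HtmlAgilityPack version in use.

```csharp
using HtmlAgilityPack;

// Hypothetical helper class, for illustration only.
public static class AnchorRewriter
{
    public static string UpdateAnchorTagsWithTargetBlank(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Ask the serializer to keep each attribute's original quoting, so a
        // valueless attribute such as `download` is not rewritten as download="".
        doc.GlobalAttributeValueQuote = AttributeValueQuote.Initial;

        var anchorNodes = doc.DocumentNode.SelectNodes("//a[@href]");
        if (anchorNodes != null)
        {
            foreach (var node in anchorNodes)
            {
                if (node.GetAttributeValue("target", "") != "_blank")
                    node.SetAttributeValue("target", "_blank");
            }
        }

        return doc.DocumentNode.OuterHtml;
    }
}
```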



EDIT: I just noticed that the newly created target attribute will have single-quotes when setting up the HtmlDocument instance with AttributeValueQuote.Initial and it's impossible to change this by fiddling with the HtmlAttribute's QuoteType property. Dang! If you can't tolerate single quotes, my suggested solution isn't acceptable, unfortunately.

The problem is the internal field HtmlAttribute.InternalQuoteType being left untouched for newly created attributes and therefore initialized with the default value (which is SingleQuote). Either the cause is the untouched HtmlAttribute.InternalQuoteType field itself or this if expression is borked:

if (quoteType == AttributeValueQuote.Initial && !(att._isFromParse && !att._hasEqual && string.IsNullOrEmpty(att.XmlValue)))
{
    quoteType = att.InternalQuoteType;
}
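
To make the reasoning concrete, here is a simplified, standalone model of that condition. It is not HtmlAgilityPack code: the Quote enum, the Attr class and the ResolveQuote method are hypothetical stand-ins whose names only echo the snippet above, and the enum's member order is an assumption chosen so that the default value is SingleQuote, as described above. Whatever the library does after this `if` is not modeled.

```csharp
using System;

// Hypothetical stand-in for the library's enum; SingleQuote comes first so it is the default (0).
public enum Quote { SingleQuote, DoubleQuote, Initial }

// Hypothetical stand-in for an attribute; member names echo the quoted condition.
public class Attr
{
    public bool IsFromParse;
    public bool HasEqual;
    public string Value = "";
    public Quote InternalQuoteType; // never assigned for new attributes, so it stays SingleQuote
}

public static class QuoteModel
{
    // Mirrors only the quoted `if` expression.
    public static Quote ResolveQuote(Quote quoteType, Attr att)
    {
        bool parsedWithoutValue =
            att.IsFromParse && !att.HasEqual && string.IsNullOrEmpty(att.Value);

        if (quoteType == Quote.Initial && !parsedWithoutValue)
        {
            quoteType = att.InternalQuoteType;
        }
        return quoteType;
    }

    public static void Demo()
    {
        // Parsed `download` with no "=": the branch is skipped and Initial is kept.
        var download = new Attr { IsFromParse = true, HasEqual = false };
        Console.WriteLine(ResolveQuote(Quote.Initial, download)); // prints "Initial"

        // Attribute created in code (like `target`): falls back to the untouched default.
        var target = new Attr { IsFromParse = false, HasEqual = true, Value = "_blank" };
        Console.WriteLine(ResolveQuote(Quote.Initial, target)); // prints "SingleQuote"
    }
}
```

In this model, an attribute created in code always takes the branch and therefore picks up the never-assigned default, which matches the single-quote symptom described above.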

@JonathanMagnan JonathanMagnan self-assigned this Jun 6, 2024
@JonathanMagnan
Member

Thank you @elgonzo ,

Indeed, to keep the attribute as it is, your proposed solution is perfect: doc.GlobalAttributeValueQuote = AttributeValueQuote.Initial;

As for the SingleQuote problem, I guess the only way at this moment is to use reflection to set the value to DoubleQuote.

Such as:

using System.Linq;      // needed for Single()
using HtmlAgilityPack;

var html = "<a download href=\"/downloads/arquivo.zip\">Download do Arquivo</a>";
var doc = new HtmlDocument();
// Keep the original quoting so `download` is not serialized as download=""
doc.GlobalAttributeValueQuote = AttributeValueQuote.Initial;
doc.LoadHtml(html);

var anchorNodes = doc.DocumentNode.SelectNodes("//a[@href]");
if (anchorNodes != null)
{
	foreach (var node in anchorNodes)
	{
		if (node.GetAttributeValue("target", "") != "_blank")
		{
			node.SetAttributeValue("target", "_blank");
			var targetAttribute = node.GetAttributes("target").Single();
			// Workaround: InternalQuoteType is not public, so set it through reflection
			var internalQuoteTypeProperty = typeof(HtmlAgilityPack.HtmlAttribute).GetProperty("InternalQuoteType", System.Reflection.BindingFlags.Public | System.Reflection.BindingFlags.NonPublic | System.Reflection.BindingFlags.Instance);
			internalQuoteTypeProperty.SetValue(targetAttribute, AttributeValueQuote.DoubleQuote);
		}
	}
}

var outputHtml = doc.DocumentNode.OuterHtml;

Best Regards,

Jon

@POFerro
Contributor

POFerro commented Jun 21, 2024

Hi @JonathanMagnan ,

I just submitted a PR proposing a fix for this issue. Could you please take a look? I am facing the same problem and need it fixed in my system :).

Thanks in advance and best regards
POFerro

@JonathanMagnan
Member

Thank you @POFerro for your PR.

I will try to look at it very soon.

Best Regards,

Jon

@POFerro
Contributor

POFerro commented Jul 16, 2024

Hi @JonathanMagnan ,

Any news? :)

@JonathanMagnan
Member

Hello @POFerro ,

Sorry for the delay. I didn't say it, but I have been on vacation since June 25 (a few days after your PR).

I'm returning tomorrow, so I will look at it next week and merge it if it is accepted.

Best Regards,

Jon

JonathanMagnan added a commit that referenced this issue Jul 31, 2024
Review Attribute QuoteType Behavior vs InternalQuoteType fixes #552, #516
@JonathanMagnan
Member

Hello @POFerro ,

Thank you again for your pull request. It has been merged and released in v1.11.62.

Honestly, I'm always afraid of side effects for other developers, since download is no longer written with double quotes, but I guess we will see in the following weeks whether people report this new behavior as an issue.

@joffremota , could you confirm it indeed fixed your issue as well? It seems to work flawlessly on my side.

Best Regards,

Jon

@POFerro
Contributor

POFerro commented Aug 9, 2024

Hi @JonathanMagnan

Thanks for accepting the PR.
I already tested it in my case and it works like a charm.

Thanks and best regards ;)
POFerro
