fix: stardict qstring modification #1708

Merged (9 commits), Jul 25, 2024
21 changes: 10 additions & 11 deletions src/common/htmlescape.cc
@@ -130,26 +130,25 @@ string escapeForJavaScript( string const & str )
   return result;
 }

-QString stripHtml( QString & tmp )
+QString stripHtml( QString tmp )
 {
-  tmp.replace(
-    QRegularExpression(
-      "<(?:\\s*/?(?:div|h[1-6r]|q|p(?![alr])|br|li(?![ns])|td|blockquote|[uo]l|pre|d[dl]|nav|address))[^>]{0,}>",
-      QRegularExpression::CaseInsensitiveOption ),
-    " " );
-  tmp.replace( QRegularExpression( "<[^>]*>" ), " " );
+  static QRegularExpression htmlRegex(
+    "<(?:\\s*/?(?:div|h[1-6r]|q|p(?![alr])|br|li(?![ns])|td|blockquote|[uo]l|pre|d[dl]|nav|address))[^>]{0,}>",
+    QRegularExpression::CaseInsensitiveOption );
+  tmp.replace( htmlRegex, " " );
+  static QRegularExpression htmlElementRegex( "<[^>]*>" );
+  tmp.replace( htmlElementRegex, " " );
   return tmp;
 }

-QString unescape( QString const & str, HtmlOption option )
+QString unescape( QString str, HtmlOption option )
 {
   // Does it contain HTML? If it does, we need to strip it
   if ( str.contains( '<' ) || str.contains( '&' ) ) {
-    QString tmp = str;
     if ( option == HtmlOption::Strip ) {
-      stripHtml( tmp );
+      str = stripHtml( str );
     }
-    return QTextDocumentFragment::fromHtml( tmp.trimmed() ).toPlainText();
+    return QTextDocumentFragment::fromHtml( str.trimmed() ).toPlainText();
   }
   return str;
 }
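
The two ideas in this hunk (take the string by value so the caller's copy is never modified, and build each regular expression once as a function-local static) can be sketched without Qt. This is only an illustration using std::regex and std::string in place of QRegularExpression and QString; stripTags and checkStripTags are hypothetical names, not part of the PR.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical stand-in for stripHtml(): the argument is taken by value,
// so the function works on its own copy and the caller's string is untouched.
std::string stripTags( std::string text )
{
  // A function-local static is constructed once, on the first call
  // (initialization is thread-safe since C++11), not on every call.
  static const std::regex tagRegex( "<[^>]*>" );
  return std::regex_replace( text, tagRegex, " " );
}

// Small self-check: the result has the tags replaced by spaces and the
// original string is unchanged.
void checkStripTags()
{
  const std::string original = "<p>hello</p>";
  const std::string stripped = stripTags( original );
  assert( stripped == " hello " );
  assert( original == "<p>hello</p>" );
}
```

With a by-value parameter the caller has to use the return value; a call that ignores the result has no effect, which is why unescape() now assigns the result back to str.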
4 changes: 2 additions & 2 deletions src/common/htmlescape.hh
@@ -25,9 +25,9 @@ string preformat( string const &, bool baseRightToLeft = false );

 // Escapes the given string to be included in JavaScript.
 string escapeForJavaScript( string const & );
-QString stripHtml( QString & tmp );
+QString stripHtml( QString tmp );
 // Replace html entities
-QString unescape( QString const & str, HtmlOption option = HtmlOption::Strip );
+QString unescape( QString str, HtmlOption option = HtmlOption::Strip );

 QString fromHtmlEscaped( QString const & str );
 string unescapeUtf8( string const & str, HtmlOption option = HtmlOption::Strip );
21 changes: 10 additions & 11 deletions src/dict/stardict.cc
@@ -525,29 +525,28 @@ string StardictDictionary::handleResource( char type, char const * resource, siz

         QString src = match.captured( 2 );

-        if ( src.indexOf( "://" ) >= 0 )
+        if ( src.indexOf( "://" ) >= 0 ) {
           articleNewText += match.captured();
-
+        }
         else {
-          std::string href = "\"gdau://" + getId() + "/" + src.toUtf8().data() + "\"";
-          QString newTag = QString::fromUtf8(
-            ( addAudioLink( href, getId() ) + "<span class=\"sdict_h_wav\"><a href=" + href + ">" ).c_str() );
-          newTag += match.captured( 4 );
-          if ( match.captured( 4 ).indexOf( "<img " ) < 0 )
-
+          std::string href = "\"gdau://" + getId() + "/" + src.toUtf8().data() + "\"";
+          std::string newTag = addAudioLink( href, getId() ) + "<span class=\"sdict_h_wav\"><a href=" + href + ">";
+          newTag += match.captured( 4 ).toUtf8().constData();
+          if ( match.captured( 4 ).indexOf( "<img " ) < 0 ) {
             newTag += R"( <img src="qrc:///icons/playsound.png" border="0" alt="Play">)";
+          }
           newTag += "</a></span>";
-
-          articleNewText += newTag;
+          articleNewText += QString::fromStdString( newTag );
         }
       }
       if ( pos ) {
         articleNewText += articleText.mid( pos );
         articleText = articleNewText;
         articleNewText.clear();
       }

-      return articleText.toStdString();
+      auto text = articleText.toUtf8();
+      return text.data();
     }
@shenlebantongying (Collaborator), Jul 25, 2024:

The difference is pretty subtle.

  • In the new code, std::string(toUtf8().data()) scans for the null byte again, so the result ends at the first null byte.
  • In the old code, toStdString() amounts to std::string(toUtf8().data(), toUtf8().size()), which uses articleText's actual length.

Maybe there should be a comment clarifying that articleText may contain a null byte and that we want to cut the text off there?
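
The difference between the two constructions can be shown with the standard library alone. This is a minimal sketch; the buffer contents and the names fromPointerOnly, fromPointerAndSize, and checkEmbeddedNull are made up for illustration and do not appear in the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Pointer-only construction: std::string scans for the first null byte,
// so anything after an embedded null is dropped.
std::string fromPointerOnly( const char * data )
{
  return std::string( data );
}

// Pointer-plus-size construction: the explicit length is used,
// so embedded null bytes are kept.
std::string fromPointerAndSize( const char * data, std::size_t size )
{
  return std::string( data, size );
}

void checkEmbeddedNull()
{
  // Hypothetical buffer: five payload bytes with a null in the middle,
  // followed by the usual terminating null.
  const char raw[] = { 'a', 'b', '\0', 'c', 'd', '\0' };
  assert( fromPointerOnly( raw ).size() == 2 );       // stops at the first null
  assert( fromPointerAndSize( raw, 5 ).size() == 5 ); // keeps all five bytes
}
```

If the text never contains a null byte, both forms produce the same string.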

@xiaoyifang (Owner, Author), Jul 25, 2024:

The original code was
return (articleText.toUtf8().data()); I changed it to return articleText.toStdString(); but the crash was still reported on this line.
So I split it into two lines.
My guess is that articleText.toUtf8() creates a temporary. When it is chained with .data(), the temporary sometimes no longer exists by the time it is used, and the crash follows.

In real-world testing, the crash is more likely to happen in multithreaded cases.
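
For reference, here is a small sketch of how temporary lifetime works in standard C++ for the chained and two-step forms. It says nothing about the actual cause of the crash reported here. makeBuffer is a hypothetical stand-in for QString::toUtf8(), and std::string stands in for QByteArray.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for QString::toUtf8(): returns a buffer by value.
std::string makeBuffer()
{
  return "article text";
}

// Chained form: the temporary returned by makeBuffer() lives until the end
// of the full-expression, so the returned std::string is copied from it
// before the temporary is destroyed.
std::string chained()
{
  return makeBuffer().c_str();
}

// Two-step form, as in the new code: a named local owns the buffer
// until the function returns.
std::string twoStep()
{
  auto text = makeBuffer();
  return text.c_str();
}

// The pattern that does dangle (comment only, never executed):
//   const char * p = makeBuffer().c_str(); // temporary destroyed at the ';'
//   use( p );                              // reads freed memory

void checkLifetimeForms()
{
  assert( chained() == "article text" );
  assert( chained() == twoStep() );
}
```

Under these rules both forms return the same string, so the two-step version is mainly a readability and debugging aid unless the pointer escapes the expression.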

@xiaoyifang (Owner, Author):

  • in the old code, toStdString's std::string(toUtf8().data(),toUtf8().size()) will use the articleText's actual length

The old code was return (articleText.toUtf8().data()); toStdString() was my first attempt.

@xiaoyifang (Owner, Author), Jul 25, 2024:

I can't be sure either.

But the combination of Qt, MSVC, and QString seems more likely to have issues, as https://bugreports.qt.io/browse/QTBUG-63274 implies.

Maybe it is some link option in CMake, such as /MD or /MT?

Collaborator:

This seems to be the correct way to set /MD or /MT:

set_property(TARGET foo PROPERTY
  MSVC_RUNTIME_LIBRARY "MultiThreaded$<$<CONFIG:Debug>:Debug>DLL")

https://github.com/qt/qtbase/blob/02cb165ef8050230b477358e4136e9f0acd83eb6/cmake/QtPublicTargetHelpers.cmake#L395-L397

However, CMake's documentation says:

If the property is not set, then CMake uses the default value MultiThreaded$<$<CONFIG:Debug>:Debug>DLL to select a MSVC runtime library.

So, I suppose it is set correctly by CMake already.

https://cmake.org/cmake/help/latest/prop_tgt/MSVC_RUNTIME_LIBRARY.html

Collaborator:

No, cmake_minimum_required sets the policy version, and we currently set it aggressively high 😅.

https://cmake.org/cmake/help/latest/command/cmake_minimum_required.html#policy-settings

cmake_minimum_required(VERSION 3.25) # ubuntu 23.04 Fedora 36

@shenlebantongying (Collaborator), Jul 25, 2024:

I think searching the build.ninja file in the build folder can confirm whether /MD or /MT is set.

On my machine, /MDd is automatically set.
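
Besides searching build.ninja, the choice can also be checked from inside the code, since MSVC predefines _DLL for /MD and /MDd and _DEBUG for /MDd and /MTd. The following is a rough sketch under that assumption; msvcRuntimeFlag and checkRuntimeFlag are hypothetical helpers, not something in the project, and _DEBUG can also be defined by hand, so treat the result as a hint only.

```cpp
#include <cassert>
#include <string>

// Reports which MSVC runtime-library flag this translation unit appears to
// have been compiled with, based on the compiler's predefined macros.
std::string msvcRuntimeFlag()
{
#if !defined( _MSC_VER )
  return "not MSVC";
#elif defined( _DLL ) && defined( _DEBUG )
  return "/MDd"; // debug DLL runtime
#elif defined( _DLL )
  return "/MD"; // release DLL runtime
#elif defined( _DEBUG )
  return "/MTd"; // debug static runtime
#else
  return "/MT"; // release static runtime
#endif
}

void checkRuntimeFlag()
{
  const std::string flag = msvcRuntimeFlag();
  // Either one of the four /M flags or the non-MSVC marker.
  assert( flag == "not MSVC" || flag.front() == '/' );
}
```

Logging this value once at startup would show whether a given build matches what CMake's default is expected to produce.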

     case 'm': // Pure meaning, usually means preformatted text
       return "<div class=\"sdct_m\">" + Html::preformat( string( resource, size ), isToLanguageRTL() ) + "</div>";