Add section linking for the search result #5829

Merged
merged 70 commits into from
Jul 12, 2019
Changes from 12 commits
Commits
ee1ba1a
add sections field
dojutsu-user Jun 19, 2019
4b05f8a
index each section as separate document in ES
dojutsu-user Jun 21, 2019
79d2459
Merge branch 'master' into search-section-linking
dojutsu-user Jun 21, 2019
54ceb5c
few refactoring
dojutsu-user Jun 21, 2019
b11e357
revert all
dojutsu-user Jun 24, 2019
5b81471
Merge branch 'master' into search-section-linking
dojutsu-user Jun 24, 2019
762a79d
update document mapping (nested fields)
dojutsu-user Jun 24, 2019
7a61dbd
format text
dojutsu-user Jun 24, 2019
644565b
get results from inner_hits
dojutsu-user Jun 25, 2019
fa51a1c
Merge branch 'master' into search-section-linking
dojutsu-user Jun 25, 2019
0bc6be5
correct the none
dojutsu-user Jun 25, 2019
0139993
add field for PageDocument.
dojutsu-user Jun 26, 2019
53a02e8
remove domain_index settings
dojutsu-user Jun 26, 2019
11ba9e7
Merge branch 'htmlfile-sphinx-domain-integration' into search-section…
dojutsu-user Jun 26, 2019
6207f4e
remove SphinxDomainDocument and DomainSearch
dojutsu-user Jun 26, 2019
a251a98
generate correct query
dojutsu-user Jun 26, 2019
b6847b9
remove boosting and allsearch
dojutsu-user Jun 26, 2019
af2d69f
remove allsearch import
dojutsu-user Jun 26, 2019
32d0bed
recursively remove newline characters from highlight dict
dojutsu-user Jun 26, 2019
d472f29
Merge branch 'master' into search-section-linking
dojutsu-user Jun 27, 2019
878343d
lint fix
dojutsu-user Jun 27, 2019
f98d91c
Merge branch 'master' into search-section-linking
dojutsu-user Jun 28, 2019
7c1c641
set number_of_fragments to 1
dojutsu-user Jun 28, 2019
fd8e8f7
use nested facet
dojutsu-user Jun 28, 2019
f6221ec
get sorted results
dojutsu-user Jul 2, 2019
8840606
Merge branch 'master' into search-section-linking
dojutsu-user Jul 2, 2019
60e229c
fix search.html
dojutsu-user Jul 2, 2019
3835e2e
remove unused imports and add logging
dojutsu-user Jul 2, 2019
ae5033c
show more data on domain objects
dojutsu-user Jul 3, 2019
28e7cbf
fix main site search
dojutsu-user Jul 3, 2019
1e2a40b
mark as safe and change log to debug
dojutsu-user Jul 3, 2019
7b7a3c9
add transpiled files -- js
dojutsu-user Jul 3, 2019
3931bc0
remove log
dojutsu-user Jul 3, 2019
84a2494
small improvements in template
dojutsu-user Jul 3, 2019
5cae508
change variable name
dojutsu-user Jul 3, 2019
adb74ed
fix template
dojutsu-user Jul 4, 2019
d500d98
fix lint
dojutsu-user Jul 4, 2019
9461d4f
use python datatypes
dojutsu-user Jul 4, 2019
75dcc2f
remove highlight url param from sections and domains
dojutsu-user Jul 4, 2019
ea36138
fix clashing css classes
dojutsu-user Jul 4, 2019
451c0f4
Merge branch 'master' into search-section-linking
dojutsu-user Jul 8, 2019
0817d43
use underscore.js template
dojutsu-user Jul 8, 2019
5305458
add _ with variables
dojutsu-user Jul 8, 2019
68cb7af
add comment in template
dojutsu-user Jul 8, 2019
d62bf3e
use .iterator()
dojutsu-user Jul 8, 2019
ed16e56
show multiple results per section, if present
dojutsu-user Jul 9, 2019
0ed64f7
fix sphinx indexing
dojutsu-user Jul 9, 2019
f988302
don't index '-' value of domain.display_name
dojutsu-user Jul 9, 2019
429b3e9
fix eslint
dojutsu-user Jul 9, 2019
897e09f
Merge branch 'master' into search-section-linking
dojutsu-user Jul 9, 2019
aeaba6f
reduce complexity in search.js
dojutsu-user Jul 10, 2019
6f9b2bc
refactor tasks.py file
dojutsu-user Jul 10, 2019
6135cde
fix logic in search.views
dojutsu-user Jul 10, 2019
d3566ac
make 100 a constant
dojutsu-user Jul 10, 2019
992c72e
Add checkbox for searching in current section
dojutsu-user Jul 10, 2019
f0babf1
remove checkbox code for now
dojutsu-user Jul 10, 2019
4527839
Merge branch 'master' into search-section-linking
dojutsu-user Jul 10, 2019
7e75d7e
fix test_imported_file
dojutsu-user Jul 10, 2019
1e6721d
fix test_search_json_parsing
dojutsu-user Jul 10, 2019
2a4c070
fix test_search_json_parsing
dojutsu-user Jul 10, 2019
4beec39
update test_search_json_parsing
dojutsu-user Jul 10, 2019
01346a0
Merge branch 'master' into search-section-linking
dojutsu-user Jul 11, 2019
91282de
refactor parse_json and its test
dojutsu-user Jul 11, 2019
cfe8f5b
write initial tests
dojutsu-user Jul 11, 2019
7e99f6a
make 100 as constant
dojutsu-user Jul 11, 2019
b7ce777
fix lint
dojutsu-user Jul 12, 2019
6701a4e
add test for domains and filter by version and project
dojutsu-user Jul 12, 2019
cee24ed
revert changes to python_environments.py
dojutsu-user Jul 12, 2019
685f6db
remove tests from this pr
dojutsu-user Jul 12, 2019
d7edeee
update template to make 100 as constant
dojutsu-user Jul 12, 2019
18 changes: 6 additions & 12 deletions readthedocs/core/static-src/core/js/doc-embed/search.js
@@ -5,6 +5,7 @@
var rtddata = require('./rtd-data');
var xss = require('xss/lib/index');
var MAX_RESULT_PER_SECTION = 3;
var MAX_SUBSTRING_LIMIT = 100;


/*
@@ -97,7 +98,7 @@ function attach_elastic_search_query(data) {
var section = inner_hits[j];
var section_subtitle = section._source.title;
var section_subtitle_link = link + "#" + section._source.id;
var section_content = [section._source.content.substring(0, 100) + " ..."];
var section_content = [section._source.content.substring(0, MAX_SUBSTRING_LIMIT) + " ..."];

if (section.highlight) {
if (section.highlight["sections.title"]) {
@@ -136,6 +137,7 @@ function attach_elastic_search_query(data) {
var domain_subtitle = domain._source.role_name;
var domain_subtitle_link = link + "#" + domain._source.anchor;
var domain_content = "";
var domain_name = domain._source.name;

if (
typeof domain._source.display_name === "string" &&
@@ -144,23 +146,15 @@ function attach_elastic_search_query(data) {
domain_subtitle = "(" + domain._source.role_name + ") " + domain._source.display_name;
}

// preparing domain_content
// domain_content = type_display --
domain_content = domain._source.type_display + " -- ";
if (domain.highlight) {
if (domain.highlight["domains.name"]) {
// domain_content = type_display -- name
domain_content += xss(domain.highlight["domains.name"][0]);
} else {
// domain_content = type_display -- name
domain_content += domain._source.name;
domain_name = xss(domain.highlight["domains.name"][0]);
}
} else {
// domain_content = type_display -- name
domain_content += domain._source.name;
}

// domain_content = type_display -- name -- in doc_display
domain_content += " -- in " + domain._source.doc_display;
domain_content = domain._source.type_display + " -- " + domain_name + " -- in " + domain._source.doc_display;

contents.append(
$u.template(
2 changes: 1 addition & 1 deletion readthedocs/core/static/core/js/readthedocs-doc-embed.js

Large diffs are not rendered by default.

41 changes: 30 additions & 11 deletions readthedocs/projects/tasks.py
@@ -1278,14 +1278,25 @@ def fileify(version_pk, commit, build):
}
)
try:
_manage_imported_files(version, path, commit, build)
changed_files = _create_imported_files(version, path, commit, build)
except Exception:
changed_files = set()
log.exception('Failed during ImportedFile creation')

try:
_create_intersphinx_data(version, path, commit, build)
except Exception:
log.exception('Failed during SphinxDomain creation')

try:
_sync_imported_files(version, build, changed_files)
except Exception:
log.exception('Failed during ImportedFile syncing')


def _create_intersphinx_data(version, path, commit, build):
"""
Update intersphinx data for this version.
Create intersphinx data for this version.

:param version: Version instance
:param path: Path to search
@@ -1401,14 +1412,16 @@ def clean_build(version_pk):
return True


def _manage_imported_files(version, path, commit, build):
def _create_imported_files(version, path, commit, build):
"""
Update imported files for version.
Create imported files for version.

:param version: Version instance
:param path: Path to search
:param commit: Commit that updated path
:param build: Build id
:returns: paths of changed files
:rtype: set
"""

changed_files = set()
@@ -1458,16 +1471,22 @@ def _manage_imported_files(version, path, commit, build):
build=build,
)

# create SphinxDomain objects
try:
_create_intersphinx_data(version, path, commit, build)
except Exception:
log.exception('Failed during SphinxDomain objects creation')
return changed_files


def _sync_imported_files(version, build, changed_files):
"""
Sync/Update/Delete ImportedFiles objects of this version.

:param version: Version instance
:param build: Build id
:param changed_files: path of changed files
"""

# Index new HTMLFiles to elasticsearch
# Index new HTMLFiles to ElasticSearch
index_new_files(model=HTMLFile, version=version, build=build)

# Remove old HTMLFiles from elasticsearch
# Remove old HTMLFiles from ElasticSearch
remove_indexed_files(
model=HTMLFile,
version=version,
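
The restructured `fileify` above runs three independent steps, each inside its own `try`/`except`, and falls back to an empty `changed_files` set when file creation fails. A minimal, self-contained sketch of that control flow follows. The three callables are hypothetical stand-ins, not the real `readthedocs.projects.tasks` helpers, and `run_steps` is a name invented for this illustration.

```python
import logging

log = logging.getLogger(__name__)


def run_steps(create_files, create_domains, sync_files):
    """Run three build steps so that one failure does not block the next.

    The callables are hypothetical stand-ins for ``_create_imported_files``,
    ``_create_intersphinx_data`` and ``_sync_imported_files``; only the
    control flow mirrors the diff.
    """
    try:
        changed_files = create_files()
    except Exception:
        # Fall back to "nothing changed" so the sync step can still run.
        changed_files = set()
        log.exception('Failed during ImportedFile creation')

    try:
        create_domains()
    except Exception:
        log.exception('Failed during SphinxDomain creation')

    try:
        sync_files(changed_files)
    except Exception:
        log.exception('Failed during ImportedFile syncing')

    return changed_files


def failing_step():
    """Simulate a creation step that raises."""
    raise RuntimeError('simulated failure')


synced = []
result = run_steps(failing_step, lambda: None, synced.append)
```

In this sketch the sync step still receives an (empty) set even though the first step raised, which is presumably why the diff assigns `changed_files = set()` inside the first `except` block.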
19 changes: 12 additions & 7 deletions readthedocs/rtd_tests/tests/test_imported_file.py
@@ -4,7 +4,7 @@
from django.test import TestCase

from readthedocs.projects.models import ImportedFile, Project
from readthedocs.projects.tasks import _manage_imported_files
from readthedocs.projects.tasks import _create_imported_files, _sync_imported_files


base_dir = os.path.dirname(os.path.dirname(__file__))
@@ -16,21 +16,26 @@ class ImportedFileTests(TestCase):
def setUp(self):
self.project = Project.objects.get(slug='pip')
self.version = self.project.versions.first()

def _manage_imported_files(self, version, path, commit, build):
"""Helper function for the tests to create and sync ImportedFiles."""
_create_imported_files(version, path, commit, build)
_sync_imported_files(version, build, set())

def test_properly_created(self):
test_dir = os.path.join(base_dir, 'files')
self.assertEqual(ImportedFile.objects.count(), 0)
_manage_imported_files(self.version, test_dir, 'commit01', 1)
self._manage_imported_files(self.version, test_dir, 'commit01', 1)
self.assertEqual(ImportedFile.objects.count(), 3)
_manage_imported_files(self.version, test_dir, 'commit01', 2)
self._manage_imported_files(self.version, test_dir, 'commit01', 2)
self.assertEqual(ImportedFile.objects.count(), 3)

def test_update_commit(self):
test_dir = os.path.join(base_dir, 'files')
self.assertEqual(ImportedFile.objects.count(), 0)
_manage_imported_files(self.version, test_dir, 'commit01', 1)
self._manage_imported_files(self.version, test_dir, 'commit01', 1)
self.assertEqual(ImportedFile.objects.first().commit, 'commit01')
_manage_imported_files(self.version, test_dir, 'commit02', 2)
self._manage_imported_files(self.version, test_dir, 'commit02', 2)
self.assertEqual(ImportedFile.objects.first().commit, 'commit02')

def test_update_content(self):
@@ -40,13 +45,13 @@ def test_update_content(self):
with open(os.path.join(test_dir, 'test.html'), 'w+') as f:
f.write('Woo')

_manage_imported_files(self.version, test_dir, 'commit01', 1)
self._manage_imported_files(self.version, test_dir, 'commit01', 1)
self.assertEqual(ImportedFile.objects.get(name='test.html').md5, 'c7532f22a052d716f7b2310fb52ad981')

with open(os.path.join(test_dir, 'test.html'), 'w+') as f:
f.write('Something Else')

_manage_imported_files(self.version, test_dir, 'commit02', 2)
self._manage_imported_files(self.version, test_dir, 'commit02', 2)
self.assertNotEqual(ImportedFile.objects.get(name='test.html').md5, 'c7532f22a052d716f7b2310fb52ad981')

self.assertEqual(ImportedFile.objects.count(), 3)
8 changes: 5 additions & 3 deletions readthedocs/rtd_tests/tests/test_search_json_parsing.py
@@ -17,7 +17,9 @@ def test_h2_parsing(self):
'files/api.fjson',
),
)
self.assertEqual(data['path'], 'api')
self.assertEqual(data['sections'][1]['id'], 'a-basic-api-client-using-slumber')
# Only capture h2's after the first section
for obj in data['sections'][1:]:
self.assertEqual(obj['content'][:5], '\n<h2>')
self.assertTrue(data['sections'][1]['content'].startswith(
'You can use Slumber'
))
self.assertEqual(data['title'], 'Read the Docs Public API')
58 changes: 28 additions & 30 deletions readthedocs/search/views.py
@@ -107,36 +107,34 @@ def elastic_search(request, project_slug=None):
if results:

# sorting inner_hits (if present)
try:
for result in results:

inner_hits = result.meta.inner_hits
sections = inner_hits.sections or []
domains = inner_hits.domains or []
all_results = itertools.chain(sections, domains)

sorted_results = (
{
'type': hit._nested.field,

# here _source term is not used because
# django gives error if the names of the
# variables start with underscore
'source': hit._source.to_dict(),

'highlight': utils._remove_newlines_from_dict(
hit.highlight.to_dict()
),
}
for hit in sorted(all_results, key=utils._get_hit_score, reverse=True)
)

result.meta.inner_hits = sorted_results

except Exception as e:
# if the control comes in this block,
# that implies that there was a PageSearch
pass
if user_input.type == 'file':

try:
for result in results:
inner_hits = result.meta.inner_hits
sections = inner_hits.sections or []
domains = inner_hits.domains or []
all_results = itertools.chain(sections, domains)

sorted_results = (
{
'type': hit._nested.field,

# here _source term is not used because
# django gives error if the names of the
# variables start with underscore
'source': hit._source.to_dict(),

'highlight': utils._remove_newlines_from_dict(
hit.highlight.to_dict()
),
}
for hit in sorted(all_results, key=utils._get_hit_score, reverse=True)
)

result.meta.inner_hits = sorted_results
except Exception:
log.exception('Error while sorting the results (inner_hits).')

log.debug('Search results: %s', pformat(results.to_dict()))
log.debug('Search facets: %s', pformat(results.facets.to_dict()))
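
The view code above merges the `sections` and `domains` inner hits, sorts them by score in descending order, and strips newlines from each highlight dict. The sketch below reproduces that idea with plain dictionaries instead of elasticsearch-dsl hit objects. `get_hit_score` and `remove_newlines` are hypothetical re-creations: the bodies of `utils._get_hit_score` and `utils._remove_newlines_from_dict` are not shown in this diff, and the `score`, `type`, `source` and `highlight` keys are assumptions made for the example.

```python
import itertools


def remove_newlines(value):
    """Recursively replace newlines in strings nested inside dicts and lists."""
    if isinstance(value, dict):
        return {key: remove_newlines(item) for key, item in value.items()}
    if isinstance(value, list):
        return [remove_newlines(item) for item in value]
    if isinstance(value, str):
        return value.replace('\n', ' ')
    return value


def get_hit_score(hit):
    """Sort key: the relevance score of one inner hit (assumed key name)."""
    return hit['score']


def sort_inner_hits(sections, domains):
    """Merge section and domain hits and order them by score, best first."""
    all_hits = itertools.chain(sections or [], domains or [])
    return [
        {
            'type': hit['type'],
            'source': hit['source'],
            'highlight': remove_newlines(hit.get('highlight', {})),
        }
        for hit in sorted(all_hits, key=get_hit_score, reverse=True)
    ]


sections = [
    {
        'type': 'sections',
        'score': 1.5,
        'source': {'title': 'Install'},
        'highlight': {'sections.content': ['pip\ninstall']},
    },
]
domains = [
    {'type': 'domains', 'score': 2.0, 'source': {'name': 'foo.bar'}},
]
merged = sort_inner_hits(sections, domains)
```

The sketch returns a list rather than a generator so the merged hits can be iterated more than once; the diff itself assigns a generator expression to `result.meta.inner_hits`.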
4 changes: 3 additions & 1 deletion readthedocs/templates/search/elastic_search.html
@@ -25,6 +25,8 @@

{% block content %}

{% trans "100" as MAX_SUBSTRING_LIMIT %}
Member

I don't think we need trans here. I think we can use with instead: https://docs.djangoproject.com/en/2.2/ref/templates/builtins/#with


<div class="navigable">
<ul>
<h5>{% trans 'Object Type' %}</h5>
@@ -226,7 +228,7 @@ <h3>
{% endfor %}
{% endwith %}
{% else %}
{{ inner_hit.source.content|slice:"100" }} ...
{{ inner_hit.source.content|slice:MAX_SUBSTRING_LIMIT }} ...
{% endif %}
{% endif %}
{% endfor %}
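
Both the template's `slice:MAX_SUBSTRING_LIMIT` and the `substring(0, MAX_SUBSTRING_LIMIT)` call in `search.js` cut section content down to a fixed-length preview and then append " ...". A rough Python equivalent, assuming a hypothetical helper named `preview`, could look like this:

```python
MAX_SUBSTRING_LIMIT = 100


def preview(content, limit=MAX_SUBSTRING_LIMIT):
    """Return the first ``limit`` characters of ``content`` followed by ' ...'.

    The suffix is appended unconditionally, as in the diff, so content that
    is already shorter than the limit still ends with ' ...'.
    """
    return content[:limit] + ' ...'


short = preview('You can use Slumber')
long_preview = preview('x' * 250)
```

Keeping the limit in one named constant per layer means the preview length can be changed without hunting for a literal `100`.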