[Sharepoint server] crawl the root site #2141

Merged
merged 4 commits into from
Feb 14, 2024

Conversation

sjors101
Contributor

@sjors101 sjors101 commented Feb 9, 2024

Closes #2140

Support crawling and indexing the top-level (root) site collection of an on-premise Sharepoint Server. The existing input field site_collections can now also name the top-level (root) site by using /. Example input: foo, bar, /. Tested against Sharepoint 2019 (16.0.0.10337: 1).
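
A minimal sketch of how an input such as foo, bar, / could be turned into server-relative paths. This is not the code from this PR: the helper names `parse_site_collections` and `build_site_collection_paths`, and the `sites` managed-path prefix, are assumptions made for illustration only.

```python
ROOT_MARKER = "/"  # value in site_collections that stands for the root site


def parse_site_collections(raw):
    """Split a comma-separated input such as 'foo, bar, /' into names."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def build_site_collection_paths(site_collections, managed_path="sites"):
    """Map configured names to server-relative URL paths.

    Hypothetical helper: managed_path="sites" is an assumption about the
    SharePoint deployment, not something stated in this PR.
    """
    paths = []
    for name in site_collections:
        if name == ROOT_MARKER:
            # The root site collection lives directly under the host.
            paths.append("/")
        else:
            paths.append(f"/{managed_path}/{name.strip('/')}/")
    return paths


paths = build_site_collection_paths(parse_site_collections("foo, bar, /"))
print(paths)
```

Treating / as an explicit marker keeps the existing field backwards compatible: configurations that list only named collections behave as before.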

Checklists

Pre-Review Checklist

  • this PR has a meaningful title
  • this PR links to all relevant github issues that it fixes or partially addresses
  • if there is no GH issue, please create it. Each PR should have a link to an issue
  • this PR has a thorough description
  • Covered the changes with automated tests
  • Tested the changes locally
  • Added a label for each target release version (example: v7.13.2, v7.14.0, v8.0.0)

Release Note

[Added by Artem]
It's now possible to crawl the root site collection with the Sharepoint Server connector. Previously, only sub-sites of the root collection could be crawled.
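
To illustrate the difference between the root collection and a named one, here is a hedged sketch of how a REST endpoint URL might be composed in each case. The function name `webs_endpoint`, the host `https://sp.example.com`, and the `/sites/foo/` path are hypothetical; `_api/web/webs` is the SharePoint REST resource for a site's sub-sites, but the connector's actual requests may differ.

```python
from urllib.parse import urljoin


def webs_endpoint(host_url, collection_path):
    """Build the sub-sites endpoint for a site collection path.

    collection_path is server-relative, e.g. "/" for the root site
    collection or "/sites/foo/" for a named one (both assumed here).
    """
    base = host_url.rstrip("/") + "/" + collection_path.strip("/")
    # urljoin needs a trailing slash so the last path segment is kept.
    base = base.rstrip("/") + "/"
    return urljoin(base, "_api/web/webs")


print(webs_endpoint("https://sp.example.com", "/"))
print(webs_endpoint("https://sp.example.com", "/sites/foo/"))
```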

Member

@artem-shelkovnikov artem-shelkovnikov left a comment

Thank you for your contribution, @sjors101!

The change looks good to me. We'll test it on our side against a real Sharepoint Server instance and get back to you!

@@ -92,7 +92,12 @@ def __init__(self, configuration):
        self.certificate = self.configuration["ssl_ca"]
        self.ssl_enabled = self.configuration["ssl_enabled"]
        self.retry_count = self.configuration["retry_count"]
        self.site_collections = self.configuration["site_collections"]

        self.site_collections_path = []
Member

Suggested change
- self.site_collections_path = []
+ self.site_collection_paths = []

I'd probably rename it like this to indicate that it's an array

@artem-shelkovnikov
Member

buildkite test this

@artem-shelkovnikov
Member

Tested the change, all looks good, thank you for your contribution, @sjors101!

@artem-shelkovnikov
Member

buildkite test this

@artem-shelkovnikov
Member

buildkite test this

@artem-shelkovnikov artem-shelkovnikov merged commit edbfe63 into elastic:main Feb 14, 2024
2 checks passed
@seanstory
Member

@sjors101 be sure to consider joining our contributor program! You can get credit for both the bug you filed and this PR.

As a side note, we'd like to congratulate you, as the first non-Elastic-affiliated contributor to Connectors. If we were a restaurant, we'd pin this code diff to our wall. :) We're very excited about your contribution!

@sjors101
Contributor Author

Thanks @seanstory and @artem-shelkovnikov. The connector framework is developing quite nicely, and the instructions in the contribution guide are great. We use various connectors intensively, so expect more contributions coming your way :)

@sjors101 sjors101 deleted the sharepoint-server-root-site branch February 15, 2024 20:29
Development

Successfully merging this pull request may close these issues.

[Sharepoint server] Option to crawl the sharepoint root site
3 participants