Questions regarding Scaladoc search #203

Open
valencik opened this issue Mar 30, 2024 · 10 comments

Comments

@valencik
Contributor

valencik commented Mar 30, 2024

@valencik Sorry for bumping this thread, but I was looking at the Scaladoc Search in Protosearch project idea and had a few questions about it.

  1. Would we be utilizing Lucille to parse Scaladocs? It seems like there is no functionality for that as of right now.
  2. How would this functionality be integrated into the in-browser search? I couldn't find much on that.

Answers to these questions would help me greatly with creating my proposal for Google Summer of Code. Thank you!

Originally posted by @VigneshSK17 in #188 (comment)

@valencik
Contributor Author

> Would we be utilizing Lucille to parse Scaladocs?

No. Lucille is a library for parsing and representing Lucene style queries.
Parsing Scaladoc, or otherwise getting the information out of Scaladoc, is a new and separate problem this project will have to tackle. There are a handful of avenues to explore here. Do we parse the Scaladoc text directly? Do we interop with the Scaladoc classes? Do we use TASTy? This project will absolutely require some experimentation here.
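
To make the first avenue (parsing the Scaladoc text directly) concrete, here is a minimal sketch that uses only the Scala standard library. The names `DocComment` and `NaiveScaladoc` are hypothetical and are not part of protosearch, Lucille, or Scaladoc. It assumes a comment made of a plain description followed by `@tag` lines, and it ignores wiki markup, code blocks, and inherited docs, so it is an illustration of the trade-off rather than a proposed implementation.

```scala
// Hypothetical, minimal model of a parsed Scaladoc comment (illustration only).
final case class DocComment(description: String, tags: List[(String, String)])

object NaiveScaladoc {

  /** Strips the comment delimiters and the leading `*` from each line. */
  def stripMarkers(raw: String): List[String] =
    raw.trim
      .stripPrefix("/**")
      .stripSuffix("*/")
      .linesIterator
      .map(_.trim.stripPrefix("*").trim)
      .toList

  /** Splits the cleaned lines into a free-text description and `@tag` entries. */
  def parse(raw: String): DocComment = {
    val lines = stripMarkers(raw).filter(_.nonEmpty)
    // Everything before the first `@tag` line is treated as the description.
    val (descLines, tagLines) = lines.span(!_.startsWith("@"))
    val tags = tagLines.foldLeft(List.empty[(String, String)]) {
      case (acc, line) if line.startsWith("@") =>
        // New tag: the name runs up to the first whitespace character.
        val (name, rest) = line.drop(1).span(!_.isWhitespace)
        (name, rest.trim) :: acc
      case ((name, text) :: tail, line) =>
        // Continuation line: append it to the most recent tag.
        (name, s"$text $line") :: tail
      case (Nil, _) =>
        // Not reachable here, because tagLines always starts with a tag line.
        Nil
    }
    DocComment(descLines.mkString(" "), tags.reverse)
  }
}
```

Calling `NaiveScaladoc.parse` on a comment with a one-line description, an `@param` line, and a two-line `@return` yields the description plus two tag entries, with the continuation line folded into the `return` text. Everything beyond this simple shape would need extra handling, which is the maintenance cost of a custom parser.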

> How would this functionality be integrated into the in-browser search?

This is another area with several possible options that will require experimentation. Ultimately, the index file should probably contain all the data we need and return it for each matching hit. Currently we can support this with stored fields. For the index we build as part of the Laika interop, we set the stored fields here:

(Field("body", analyzer, true, true, true), _.content),
(Field("title", analyzer, true, true, true), d => renderTitle(d.title, d.path)),
(Field("path", analyzer, true, true, false), d => renderLink(d)),

And you can see us accessing fields in worker.js:

const path = hit.fields.path
const link = "../" + hit.fields.path.replace(".txt", ".html")
const title = hit.fields.title
const preview = hit.fields.body.slice(0, 150) + "..."
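
The idea behind stored fields can be sketched in a few lines of Scala. This is a toy model under stated assumptions: `FieldSpec`, `Hit`, and `StoredFieldsSketch` are hypothetical stand-ins, not the protosearch types, and the only behavior modeled is that values of stored fields travel with a hit so the front end can build a link, a title, and a preview from them, as the worker.js lines above do.

```scala
// Hypothetical stand-ins for illustration only; these are NOT the protosearch types.
final case class FieldSpec(name: String, stored: Boolean)
final case class Hit(fields: Map[String, String])

object StoredFieldsSketch {

  /** Keeps only the values of fields marked as stored, as a hit would carry them. */
  def toHit(specs: List[FieldSpec], doc: Map[String, String]): Hit = {
    val storedNames = specs.filter(_.stored).map(_.name).toSet
    Hit(doc.filter { case (name, _) => storedNames.contains(name) })
  }

  /** Builds (link, title, preview) from a hit, following the worker.js lines.
    * Returns None if any of the three fields was not stored.
    * Note: Scala's `replace` substitutes every ".txt", while JavaScript's
    * string `replace` substitutes only the first; they agree for a path
    * that contains ".txt" once.
    */
  def render(hit: Hit): Option[(String, String, String)] =
    for {
      path  <- hit.fields.get("path")
      title <- hit.fields.get("title")
      body  <- hit.fields.get("body")
    } yield ("../" + path.replace(".txt", ".html"), title, body.take(150) + "...")
}
```

For Scaladoc search, the same shape would presumably apply: whatever the result list needs to display has to be a stored field in the index, since the browser has nothing else to read from.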

@VigneshSK17
Contributor

Makes sense. Would we potentially explore using another dependency like https://scalameta.org/docs/trees/scaladoc.html, or parse by hand?

I'll spend some time looking into parsing options, whether that be in a similar manner to scala-meta or a different method. Thanks for answering!

@valencik
Contributor Author

It would absolutely be fine to use another dependency if it helped with getting the needed Scaladoc information, similar to how our current Laika model depends on Laika to read and process Markdown files.

However, the specific link you've shared, https://scalameta.org/docs/trees/scaladoc.html, looks like a collection of links to Scalameta's Scaladoc documentation, not a module for reading Scaladoc.

@VigneshSK17
Contributor

Oh right, I was looking at an issue on that project and linked to the docs instead of that issue. I did find what seems to be a promising means of parsing Scaladoc (https://github.com/andyglow/scaladoc). It seems to be actively maintained as well, which is a plus. I'll look into how well this works for our purposes.

@valencik
Contributor Author

@VigneshSK17 Unfortunately, I have to recommend not investigating https://github.com/andyglow/scaladoc any further.
There is no license on that project, so all rights are reserved and it is not open source software.
Sorry.

@VigneshSK17
Contributor

Oops, my bad. I'm still not used to checking for things like that. I'll look for a properly licensed option for a bit, but will probably default to writing a somewhat custom Scaladoc parser.

@valencik
Contributor Author

It would be ideal if we can find some path to get the Scaladoc info from official Scala tooling. Writing something custom means maintaining something custom as Scala evolves.

My friend @samspills recently shared this Scalameta link which seems promising:

https://www.javadoc.io/doc/org.scalameta/trees_2.12/latest/scala/meta/internal/Scaladoc$.html

I'm unsure if Scalameta would work nicely for both Scala 2 and 3 though.
But the Scalameta docs make it seem like this would be easy enough to explore:

https://scalameta.org/docs/trees/guide.html#from-strings

@VigneshSK17
Contributor

Makes sense. This is what I thought the original link I sent was. It definitely looks like a good solution.

@valencik
Contributor Author

valencik commented Apr 1, 2024

Hey @VigneshSK17 I just wanted to note that the applications are due soon (April 2nd 18:00 UTC), and I'm more than happy to review rough drafts if you'd like 😄

You can email me at andrew.valencik@gmail.com.
Also, feel free to reach out on the Typelevel Discord.

@VigneshSK17
Contributor

Thank you! Because of an emergency, I don't think I'll be able to finish a proposal in time for you to review, but I'll use this space to ask questions so I can create a good first version.
