Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP][RFC] Hyperlink Refactor - Registry Version #793

Closed

Conversation

SlayerOfTheBad
Copy link
Contributor

@SlayerOfTheBad SlayerOfTheBad commented Aug 13, 2024

This pull request achieves most of the same goals as #790, but is structured in a similar way to orgmode.org.autocompletion.

This PR superseeds #790.

TODO: unit tests

Copy link
Member

@kristijanhusak kristijanhusak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR!

There's progress from the previous PR, but we still have few things to handle.

Since things are fairly complicated, we can chunk these changes a bit more for now. For example, we can leave autocompleting the links for a separate PR, and for now just build the sources and all the structure around it, and maybe have a follow on the top links class. From there we can later add autocompletion.

@@ -20,6 +20,7 @@ local auto_instance_keys = {
---@field capture OrgCapture
---@field clock OrgClock
---@field completion OrgCompletion
---@field links OrgLinkHandlerRegistry
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's name this just OrgLinks

Suggested change
---@field links OrgLinkHandlerRegistry
---@field links OrgLinks

end

function CustomId:follow()
local headlines = Org.files:get_current_file():find_headlines_with_property_matching('custom_id', self.custom_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should pass down files from the top through constructor instead of requiring orgmode directly.

return utils.echo_warning(('Could not find custom ID "%s".'):format(self.custom_id))
end

self.goto_oneof(headlines)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we are using self to call something, call it with a colon notation

Suggested change
self.goto_oneof(headlines)
self:goto_oneof(headlines)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to utils, as it seems more applicable than as a method of Internal

local completions = {}
for _, headline in pairs(headlines) do
local id = headline:get_property('CUSTOM_ID')
table.insert(completions, self:new(id):__tostring())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are using the metamethod to set up a string format of an entry, we should use tostring(self:new(id)). Other option is to have a custom to_string method and call that here directly. I'm ok with any of the options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Used tostring, just forgot it existed tbh.

return self
end

local id = headlines[1]:get_property('id')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we getting the id property here? Ids are handled through a separate handler.

Also, we need to consider that there are multiple headlines here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this was meant to implement the behaviour determined by org-id-link-to-use-id as described in https://orgmode.org/org.pdf#Handling%20Links.
Upon further reading however, this is resolved on storing a link, rather than on inserting it. I thought it happened at both. Removed

end

local function autocompletions_filenames(lead)
local Org = require('orgmode')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is a very WIP, but just a note that we should not require orgmode directly anywhere. We need to pass down dependencies from top.

---@param target string | nil
---@param disallow_file boolean?
---@return OrgLinkHandlerInternal | OrgLinkHandlerFile | nil
function Internal.parse(target, disallow_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This whole class is doing a bit too much, and also has the checks that we are trying to avoid with the sources.
Each source should do this check and know how to parse the value from the target. Some example code:

local sources = { Headline, CustomId, LineNumber, File, Plain }

for _, source in ipairs(sources) do
  local parsed = source:parse(target)
  if parsed then
  	return parsed
  end
end

This should happen in the top level class, not here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I split this up into a separate handler because these are a bit of a special case, on account of not having a <protocol>: prefix.

It seemed redundant to me to repeat the <protocol>: prefix handling in each of the LinkHandlers individually, so I put this logic in OrgLinks, but these five cases are special in the fact that they don't have prefixes.

These are also special in that these are the only allowed targets for id: and file: links after the ::, so it made handling of that easier.

file: is an even more special case because its <protocol>: prefix is optional, so if we say each 'internal' link should be able to parse itself in its entirety, either File needs special treatment to parse both the prefixed case and the non-prefixed case, or all LinkHandlers need to parse their own prefix.

It's a weird set of special cases, and I'm not sure there's a perfect solution, but I'm interested in your thoughts.

end

function OrgLinkHandlerRegistry:setup_builtin_handlers()
self:add_handler(require('orgmode.org.links.handlers.file'))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should instantiate the handlers here and pass down the dependencies.

Comment on lines +41 to +44
for k, v in pairs(Link) do
handler[k] = handler[k] or v
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this for now since we are not exposing adding custom handlers yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just as valid for protecting against internal handler's issues as it is for protecting custom handlers

Comment on lines 70 to 75
local function filter_by_prefix(lead)
return function(needle)
if lead == '' or needle:find('^' .. lead) then
return true
end
return false
end
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part can be just inlined where needed since it's a simple check

Suggested change
local function filter_by_prefix(lead)
return function(needle)
if lead == '' or needle:find('^' .. lead) then
return true
end
return false
end
end
local function filter_by_prefix(lead)
return function(needle)
return lead == '' or needle:find('^'..lead)
end
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied

@SlayerOfTheBad
Copy link
Contributor Author

Thanks for the review @kristijanhusak!

I've been processing a lot of the comments, but I'm left with a couple of questions:

maybe have a follow on the top links class.
Assuming you mean OrgLinks, How do you imagine this working?
Should the function signature be fun(link: OrgLinkHandler), and just call link:follow()?
OrgLinks doesn't really have anything it can follow, it's just a parser for a generic link.

And I'm currently of half a mind to split up each handler into a LinkFactory and a Link, so like:

local Id = Link:new('id')

function Id:new(id, target, files)
  -- Returns instance of Id
end

function Id:follow()
  -- Follows this instance
end

function Id:__tostring()
  -- Converts to link
end

local IdFactory = {}

function IdFactory:init(files)
  self.files = files
end

function IdFactory:new(id, target)
  return Id:new(id, target, self.files)
end

---@param input string
function IdFactory:parse(input)
  -- Parse id and target
  return self:new(id, target)
end

---@param lead string
function IdFactory:complete(lead)
  -- Find and return the completions
end

return IdFactory

This has the benefit of separating the link's functionality from the functionality to parse the link,
and ensures the files are always injected properly. Is this a structure that's desired?
If yes, should these two classes be split into separate files, or are they so interlinked they should live in the same file?
And how does this impact the structure for custom handlers?

(The complete function can be omitted from the factories for this PR as you suggested, but it's included here to illustrate the responsibilities of the Factory VS the responsibilities of the Link itself.)

@kristijanhusak
Copy link
Member

kristijanhusak commented Aug 15, 2024

I'll attempt to answer both of your comments with an example. This is what I have in mind:

Top level OrgLinks would look something like this:

local OrgLinks = {}

function OrgLinks:new(opts)
  local this = setmetatable({
    files = opts.files,
    sources = {},
    sources_by_name = {},
  }, OrgLinks)
  this:setup_builtin_sources()
end

function OrgLinks:setup_builtin_sources()
  self:add_source(require('orgmode.org.links.sources.headline'):new({ files = self.files }))
  self:add_source(require('orgmode.org.links.sources.custom_id'):new({ files = self.files }))
  --- etc
end

function OrgLinks:add_source(source)
  if self.sources_by_name[source:get_name()] then
    error('Link source ' .. source:get_name() .. ' already exists')
  end
  self.sources_by_name[source:get_name()] = source
  table.insert(self.sources, source)
end

---@param link string
function OrgLinks:follow(link)
  -- For example, link = 'file:~/Documents/notes.org::*headline'

  for _, source in ipairs(self.sources) do
    -- I named this `can_handle`, but we can think of a better name, maybe `source:matches` or `source:handles`
    if source:can_handle(link) then
      return source:follow(link)
    end

    -- Alternative is to atempt to follow immediately and return boolean if it was successful or not
    if source:follow(link) then
      return true
    end
  end
end

return OrgLinks

And then the headline link example would look like this:

local OrgLinkHeadline = {}

function OrgLinkHeadline:new(opts)
  local this = setmetatable({
    files = opts.files,
  }, OrgLinkHeadline)
end

---@param link string
function OrgLinkHeadline:can_handle(link)
  -- check if link have any of these formats
  -- * file:~/Documents/notes.org::*headline
  -- * *headline

  if 'matches_above_checks' then
    return true
  end

  return false
end

function OrgLinkHeadline:follow(link)
  local link_parts = self:parse_link_parts(link)
  -- if se use the altnerative approach to follow immediately in the for loop, we can just check if parsing parts was successful or not
  local headlines = self.files:find_headlines_by_title(link_parts.headline)
  return utils.goto_oenof(headlines)
end


return OrgLinkHeadline

And the custom id source would look like this:

local OrgLinkCustomId = {}

function OrgLinkCustomId:new(opts)
  local this = setmetatable({
    files = opts.files,
  }, OrgLinkCustomId)
end

---@param link string
function OrgLinkCustomId:can_handle(link)
  -- check if link have any of these formats
  -- * file:~/Documents/notes.org::#custom-id
  -- * #custom-id

  if 'matches_above_checks' then
    return true
  end

  return false
end

function OrgLinkCustomId:follow(link)
  local link_parts = self:parse_link_parts(link)
  -- if se use the altnerative approach to follow immediately in the for loop, we can just check if parsing parts was successful or not
  local headlines = self.files:find_headlines_with_property_matching('CUSTOM_ID', link_parts.custom_id)
  return utils.goto_oenof(headlines)
end

return OrgCustomId

We should not use factories since it complicates things additionally.

It seemed redundant to me to repeat the : prefix handling in each of the LinkHandlers individually, so I put this logic in OrgLinks, but these five cases are special in the fact that they don't have prefixes.

We should extract the parsing of it to a util or something and reuse it where necessary. In the top 2 examples we do need it to check if it's a file protocol. Note that /path/to/file::*headline is also a valid link for a headline (without a protocol), so there's more to it.

Hope this helps clearing up some of the misunderstandings.

@SlayerOfTheBad
Copy link
Contributor Author

Thanks for the response @kristijanhusak.

I'm willing to concede on parsing the protocol prefix within their own handler.
It's sensible enough, and the logic for it can indeed be fit into a reused utility function.

I do have a concern about the example you provided however.
It seems to suggest that links will be passed around as strings, and parsed any time it's accessed.
Is this actually what you would suggest, or is this just for terseness of the example?

@kristijanhusak
Copy link
Member

It seems to suggest that links will be passed around as strings, and parsed any time it's accessed.
Is this actually what you would suggest, or is this just for terseness of the example?

It's just an example. It can be an instance of something, like what currently is OrgLink.

@SlayerOfTheBad
Copy link
Contributor Author

Hello @kristijanhusak.

After some thinking, I've landed on the following option, and wanted to know your opinion on it before I implemented it for everything.

The following example described a handler for id:

---@class OrgFiles

---@class OrgLinkHandler
---@field protocol string
---@field parse fun(self : OrgLinkHandler, input : string): OrgLinkHandler

---@class OrgLinkHandlerId:OrgLinkHandler
---@field private files OrgFiles
---@field private id string
local LinkHandler = {}

---@param opts table
---@return OrgLinkHandler
function LinkHandler:new(opts)
    local this = {
        files = opts.files,
    }
    setmetatable(this, self)
    self.__index = self
    return this
end

---@param input string
---@return OrgLinkHandlerId | nil
function LinkHandler:parse(input)
    if not input:match("^id:") then return nil end

    return {
        id = input:sub(4)
    }
end

---@param input OrgLinkHandlerId
function LinkHandler.follow(input)
    -- Go to the file defined in `input`
end

---@param input OrgLinkHandlerId
function LinkHandler.__tostring(input)
    return 'id:' .. input.id
end

The following example describes OrgLinks

---Responsible for delegations to the different LinkHandlers
---@class OrgLinks
---@field private files OrgFiles
---@field private handlers OrgLinkHandler[]
---@field private handlers_by_name table<string, OrgLinkHandler>
local OrgLinks = {
    files = {},
    handlers = {},
    handlers_by_name = {},
}

function OrgLinks:setup_builtin_handlers()
    self:add_handler(require('orgmode.org.links.handlers.file'):new({ files = self.files }))
end

---@param handler OrgLinkHandler
function OrgLinks:add_handler(handler)
    if self.handlers_by_name[handler.protocol] then
        error('Completion handler ' .. handler.protocol .. ' already exists')
    end

    handler = self:wrap_parse(handler)

    self.handlers_by_name[handler.protocol] = handler
    table.insert(self.handlers, handler)
end

---@param handler OrgLinkHandler
---@return OrgLinkHandler
function OrgLinks:wrap_parse(handler)
    local internal_parse = handler.parse
    handler.parse = function(_handler, input)
        local this = internal_parse(_handler, input)
        setmetatable(this, _handler)
        _handler.__index = _handler
        return this
    end

    return handler
end

The essential linchpin is wrap_parse.
It modifies any registered handler such that it returns an object returned by the LinkHandler's parse has all of the LinkHandler's functions attached.
This allows us to call LinkHandler:parse(input):follow() or LinkHandler:follow(LinkHandler:parse(input)) as we see fit.
This

  1. Eliminates the Factory components
  2. Allows custom link handles to be defined as a single table
  3. Only requires a link handler's parse function to return an object that contains its own data. The functions are attached automatically.
  4. Gives the objects returned by parse access to the LinkHandler's internal fields (such as self.files in this case).

Does this seem sensible, or is this a bad idea?

@kristijanhusak
Copy link
Member

I'm not really liking the idea of modifying the handlers in that way.
I think this is mostly to simplify custom sources that we will add later, but we can address those differently.

  1. Eliminates the Factory components

I don't think we need this anyway.

  1. Allows custom link handles to be defined as a single table

I believe it will be only a single table. We just need to inject the files somehow. I gave some options at the end.

  1. Only requires a link handler's parse function to return an object that contains its own data. The functions are attached automatically.

I'm not sure what we would share besides files, since each source would need to implement their own parse and follow (and later autocompletion). Even if there's something, we can just expose a helper class that users can use to extend their own.

  1. Gives the objects returned by parse access to the LinkHandler's internal fields (such as self.files in this case).

For giving access to the files for custom sources, we can do 2 things:

  1. Not provide anything, and let user manually reference require('orgmode').files
  2. Assign the files manually when adding the source. Thankfully, it's just a table so we can assign things
local OrgLinks = {
    files = {},
    handlers = {},
    handlers_by_name = {},
}

function OrgLinks:setup_builtin_handlers()
    -- Note that I removed `files` here.
    self:add_handler(require('orgmode.org.links.handlers.file'):new())
end

---@param handler OrgLinkHandler
function OrgLinks:add_handler(handler)
    if self.handlers_by_name[handler.protocol] then
        error('Completion handler ' .. handler.protocol .. ' already exists')
    end

    --- assign files to the handler so they can use it
    handler.files = self.files

    self.handlers_by_name[handler.protocol] = handler
    table.insert(self.handlers, handler)
end

Let me know if I misunderstood what you are trying to solve here.

@SlayerOfTheBad
Copy link
Contributor Author

The problem I'm trying to solve is a certain combination of requirements, some self-imposed, some imposed in the review. I will try to enumerate them here.

For examples I will use a handler for id:

  1. Have OrgLinkHandler:parse(input: string) return an object (OrgLink) that
  • Contains all the fields it needs (id, files)
  • Contains all the actions that can be executed on that link (follow, __tostring, insert_description)
  1. Have fields that are the same between different OrgLink instances of the same type be automatically inserted (files)
  • Would normally do this through require('orgmode').files, as you've suggested for custom handlers, but you've also expressed a desire for these dependencies to be passed down from the top
  • This is usually done through a factory pattern, but you've rejected this for complexity reasons.
  1. Treat user-defined hyperlinks in the same way as internal hyperlinks.
  • Reduces complexity
  1. Not require the user to manually add the functions to the object returned from their OrgLinkHandler:parse.
  • Not nice if you want to just define your custom hyperlink as a table.

I don't think just adding files to the handlers is a great solution, as different types of links have different requirements. For example: The way I currently envision OrgLinkId, is having a field target: OrgLink, which stores the OrgLink associated with the part after ::. Thus
id:some-file::*some-title
would be parsed to

{
  id: 'some-file',
  files: OrgFiles,
  target: {
    headline: 'some-title',
    files: OrgFiles,
  },
}

To be able to parse it to something like that, OrgLinkHandlerId needs a reference to OrgLinks, so it can use that to parse *some-title, but OrgLinkHandlerHeadline does not need this.

Let me know if anything is unclear, or if you think I missed or misunderstood anything.

@kristijanhusak
Copy link
Member

I agree with first 3 points.

Regarding the 4th point:

  1. Not require the user to manually add the functions to the object returned from their OrgLinkHandler:parse.
  • Not nice if you want to just define your custom hyperlink as a table.

I think we have misunderstanding on what parse should do. In my example I didn't actually create a parse method, but can_handle. Parsing the input should be internal to each handler. With my example, can_handle would call an internal parse method, and if it's able to parse it, would return true. Then follow would do same thing and actually execute follow with the parsed value.
There is some double parsing here to avoid creating factories, and that's why I suggested to immediately attempt to follow. I'll put a more detailed example below.

I don't think just adding files to the handlers is a great solution, as different types of links have different requirements.

That's true, but I think we can pass files to all of them and have that as part of the interface they are implementing. If some source does not use it, that's fine. There's no overhead in that.

For example: The way I currently envision OrgLinkId, is having a field target: OrgLink, which stores the OrgLink associated with the part after ::. Thus
id:some-file::*some-title
To be able to parse it to something like that, OrgLinkHandlerId needs a reference to OrgLinks, so it can use that to parse *some-title, but OrgLinkHandlerHeadline does not need this.

We should not reference the "parent" class in the handlers. We can have some helpers in another utils file.
So if a specific handler needs a target, we can have a util helper parse_target('url with double colon::*something')
id format does not actually support target, since it points to a very specific entry. This is a good example why each handler should have their own parsing.

Here's a more concrete example of how I envisioned it. Mostly copied from my previous comment, with more details around parsing:

Top level OrgLinks would look something like this:

local OrgLinks = {}

function OrgLinks:new(opts)
  local this = setmetatable({
    files = opts.files,
    sources = {},
    sources_by_name = {},
  }, OrgLinks)
  this:setup_builtin_sources()
end

function OrgLinks:setup_builtin_sources()
  self:add_source(require('orgmode.org.links.sources.id'):new({ files = self.files }))
  self:add_source(require('orgmode.org.links.sources.custom_id'):new({ files = self.files }))
  --- etc
end

function OrgLinks:add_source(source)
  if self.sources_by_name[source:get_name()] then
    error('Link source ' .. source:get_name() .. ' already exists')
  end
  self.sources_by_name[source:get_name()] = source
  table.insert(self.sources, source)
end

---@param link string
function OrgLinks:follow(link)
  for _, source in ipairs(self.sources) do
    if source:follow(link) then
      return true
    end
  end
end

return OrgLinks

And then the id handler example would look like this:

local OrgLinkId = {}

function OrgLinkId:new(opts)
  local this = setmetatable({
    files = opts.files,
  }, OrgLinkId)
end

---@param link string
---@return boolean
function OrgLinkId:follow(link)
  local link_info = self:_parse(link)
  if not link_info then
    return false
  end
  local id = link_info.id
  -- Part below copied from `mappings:open_at_point`
  local files = self.files:find_files_with_property('id', id)
  if #files > 0 then
    if #files > 1 then
      utils.echo_warning(string.format('Multiple files found with id: %s, jumping to first one found', id))
    end
    vim.cmd(('edit %s'):format(files[1].filename))
    return
  end

  local headlines = self.files:find_headlines_with_property('id', id)
  if #headlines == 0 then
    return utils.echo_warning(string.format('No headline found with id: %s', id))
  end
  if #headlines > 1 then
    return utils.echo_warning(string.format('Multiple headlines found with id: %s', id))
  end
  local headline = headlines[1]
end

---Note that this is an internal method and should not be part of interface that handlers need to implement
---@private
---@param link string
---@return { id: string } | nil
function OrgLinkId:_parse(link)
  local id = link:match('^id:(.+)$')
  if not id then
    return nil
  end
  return { id = id }
end


return OrgLinkId

It's ok if we need to duplicate some of the code, as long as is scoped within a single handler. Once everything is finished, we can easily extract reusable parts.

Hope this clarifies things a bit.

@SlayerOfTheBad
Copy link
Contributor Author

I am sorry, but the structure you propose is not something I would enjoy writing. I do not have the time, nor the energy, to continue pushing this PR anymore. Feel free to use what I have made here as a basis for your own implementation of this.

@kristijanhusak
Copy link
Member

I very much appreciate all the time you put into this, and I'm sorry that it became a burden because of my strict guidelines.
Once I do the implementation, I'll definitely consult you about the custom sources, since you seem eager to start using them.
Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants