Excuse me! I want to ask a question that is not related to this repository #1

Closed
zdw189803631 opened this issue Oct 17, 2024 · 6 comments


zdw189803631 commented Oct 17, 2024

First of all, I'm sorry. I know I shouldn't ask this question here, but I couldn't find a suitable way to send a private message, and I'm not sure whether GitHub even has one!
In the issue Amaimersion/google-docs-utils#10, you left a message saying that you decompiled the LanguageTool code and found out how to get the selected text from Google Docs and how to insert text. Following your message, however, I can only insert text; I want to know how to get the selected text.
Most importantly, I also want to know how you decompiled the LanguageTool code. I think that is a very important skill, but I don't know where to start learning it.

Owner

swoorpious commented Oct 17, 2024

Hello, apologies for the late reply. I have been busy with school lately.

To begin with, I'd advise you to type random characters in docs and inspect the DOM. You will find svg elements that give you plenty of the data you need. I am assuming that you already have the DOM annotated with the extension id.

I will leave a few important class/id names here that you can just copy/paste directly.

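const GOOG_DEF = { // google docs definitions
    // iframe that receives the text/keyboard events
    IF_WINDOW_SELECT_PATH: ".docs-texteventtarget-iframe",
    // rect elements drawn for the current selection
    RECT_SELECT_PATH: ".kix-canvas-tile-selection > svg > rect",
    // caret of the current user
    CARET_SELECT_ID: "#kix-current-user-cursor-caret",
};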

Here are a few hints. Say you want to grab the selected text. One way is the approach I mentioned in the original thread: calculate the caret position, work out the index of the character from the caret's screen coordinates, and then select text to the left or right in code. The other way, the one LanguageTool uses, is to get the screen coordinates of the beginning and the end of the selection they want (they calculate these by measuring character widths with the font css) and then dispatch a series of keyboard and mouse/mousedrag events to create a RECT_SELECT_PATH in the DOM. They then dispatch a copy event to copy the selected text in the DOM (more on how that works below).
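A minimal sketch of the coordinate-to-index idea, not the code from either project. The function names and the injected `measureWidth` callback are hypothetical; in a page, the callback could wrap the canvas `measureText` call with the line's font css, as in `makeCanvasMeasurer`.

```javascript
// Hypothetical helper: map a caret x-coordinate to the nearest character
// boundary in one line of text. `measureWidth(text)` returns a width in px.
function charIndexAtX(lineText, lineLeft, caretX, measureWidth) {
    let bestIndex = 0;
    let bestDistance = Infinity;
    for (let i = 0; i <= lineText.length; i++) {
        // x-coordinate of the boundary that sits before character i
        const x = lineLeft + measureWidth(lineText.slice(0, i));
        const distance = Math.abs(x - caretX);
        if (distance < bestDistance) {
            bestDistance = distance;
            bestIndex = i;
        }
    }
    return bestIndex;
}

// Inverse direction: x-coordinate of the boundary before character `index`,
// e.g. to work out where a simulated selection should start or end.
function xAtCharIndex(lineText, lineLeft, index, measureWidth) {
    return lineLeft + measureWidth(lineText.slice(0, index));
}

// Browser-only measurer. Assumes `fontCss` is a valid CSS font shorthand.
function makeCanvasMeasurer(fontCss) {
    const ctx = document.createElement("canvas").getContext("2d");
    ctx.font = fontCss;
    return (text) => ctx.measureText(text).width;
}
```

This ignores kerning differences between canvas measurement and what docs actually renders, so treat the result as an estimate and verify it against the real caret.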

In the original thread - here - I attached a screenshot of Language Tool's source code. There are two functions called _selectFromLeftToRight and _selectFromRightToLeft.

  • You can see the series of events in the return statement there.
  • This series of events (from what I understand) is trying to mimic how a real person would select text with a mouse and keyboard.
  • If you have text selected, you can dispatch a CustomEvent named copy on IF_WINDOW_SELECT_PATH (iirc) and docs will put the selected text somewhere in IF_WINDOW_SELECT_PATH. It's pretty visible - you can try copying some text with ctrl+c and it will put text there 😂
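The copy trick from the last bullet could look roughly like this. It is a sketch under assumptions: `copySelectedText` is a hypothetical name, and both the dispatch target (the iframe's document) and the place the text is read from (the iframe's body) are guesses that should be checked by inspecting the iframe after a manual ctrl+c.

```javascript
const IF_WINDOW_SELECT_PATH = ".docs-texteventtarget-iframe";

// Dispatch a "copy" CustomEvent inside the text-event iframe, then read back
// whatever text ended up in the iframe's body.
// Assumption: docs fills the text in synchronously during the dispatch.
function copySelectedText(doc) {
    const iframe = doc.querySelector(IF_WINDOW_SELECT_PATH);
    if (!iframe || !iframe.contentDocument) {
        return null; // not a docs page, or the iframe is not accessible
    }
    const target = iframe.contentDocument;
    target.dispatchEvent(new CustomEvent("copy", { bubbles: true }));
    return target.body ? target.body.textContent : null;
}
```

In a content script this would be called as `copySelectedText(document)` while some text is selected.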

For my project, all I needed was to select a couple characters to the left or right - which, again, I did using caret index instead of LanguageTool's way.

firefox_MvpwOEQDSH__2024-10-18__00-24

As you might notice, the class name says kix-canvas-tile-selection, which is what I used in the code snippet above - only that instead of grabbing this, I grab the more useful rect inside it.

image

Similarly, you can access whatever is written inside the document, with all its font styling, in the DOM itself (data-font-css). The text is separated line-by-line, and the actual text (for that line) is in the aria-label property.

  • Every enter counts as new line 😉 I feel like new people get confused with this one. Some think after a period is a new line lol

More on how to calculate this in the original thread - here.
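As a rough sketch of reading that annotated text, assuming the text lives on rect elements carrying aria-label and data-font-css. The selector and the name `readAnnotatedText` are my assumptions; confirm the actual element type in the inspector first.

```javascript
// Collect the annotated text entries under `root` (the document or a page
// element). Each entry keeps the text, its font css and its screen position.
function readAnnotatedText(root) {
    return Array.from(root.querySelectorAll("rect[aria-label]")).map((rect) => {
        const box = rect.getBoundingClientRect();
        return {
            text: rect.getAttribute("aria-label"),
            fontCss: rect.getAttribute("data-font-css"),
            left: box.left,
            top: box.top,
        };
    });
}
```

Keeping the coordinates next to the text is what makes later caret or highlight calculations possible.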

image


Now, about "trying to reverse engineer LanguageTool". You can access LanguageTool's source code wherever your browser stores the extension. Just find the extension id in your browser and then search for it in the file system.
You can follow this: https://stackoverflow.com/a/21482465/13680585

Once you have found the source code for LanguageTool, you will have to beautify those files with one of the online tools, because the files are of course minified and reading them is a brain fuck. Some functions do retain their original names, though, and you can use ChatGPT to rename the rest of the variables to get more context on how it works.

Anyway, I hope you have fun working on this project and learn how it all functions (and how apps like docs fuck you up). Feel free to ask any more questions here, I can help by giving examples from my code/implementation, but expect delays since I am slow with replying 😄

Also this is funny
image

@zdw189803631
Author

Thank you very much for replying amidst your busy schedule. I am now able to obtain the text selected by the user and have also found the code for the browser extension.

get_selection

There are still a few questions that confuse me:

  1. Without Language Tool (LT) installed, can I get svg elements to be inserted into kix-canvas-tile-selection? Some SVG elements appear within the div tag with the class kix-canvas-tile-selection when text is selected by the user in Google Docs. Is this behavior due to Chrome or is it a result of the LT code? In this Stack Overflow post https://stackoverflow.com/questions/75019646/get-selected-text, it mentions the need for screen reader functionality / SVG reading. I searched the LT code globally for parts that include kix-canvas-tile-selection, but it seems there is no code that inserts SVG elements into this tag. I also conducted an experiment where I inserted a similar DOM fragment into the page, and no SVG elements appeared in this tag when I selected the text.

  2. How to get all text without using cmd + a? Out of curiosity, I want to obtain the entire article content at the beginning so that I can perform grammar checks on the full text, similar to LT. My approach is to simulate a cmd + a click and then dispatch a copy event to .docs-texteventtarget-iframe, after which I can print out the entire content. However, unlike LT, my current page remains in a selected state, which might seem strange to users. Do you have any other methods or suggestions on how I can achieve this without selecting the entire text visibly?

Thank you again for your previous reply, and I wish you success in your studies!!!

@zdw189803631
Author

Hello friend, through AI and your help, I found the answer to my question above. I will post it in the comment below after I sort it out.

@swoorpious
Owner

I don't quite understand what you mean by the first question.

But about the second question, doing cmd+a or ctrl+a is not a good idea, as you get the entire text as one massive chunk of text. The better approach would be to have a system where you only scan for grammar on the active page - which you can calculate by the caret index (there's a lot of things you can do with the caret in docs lol).

iirc, the text annotation is grouped page-by-page and then line-by-line. So with the caret position you can pretty much just get the active page and all the rect elements inside it without having to do the copy event shenanigans, where you also lose important data in case you want to add error-highlighting similar to LT: line breaks, coordinates of text, styling of text, etc.
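A small sketch of picking the active page from the caret position. It works on plain `{ top, height }` boxes, such as the values from getBoundingClientRect(), so the selector for page elements stays out of it; the thread does not give one, and the function names here are hypothetical.

```javascript
// Return the index of the page box that vertically contains the caret,
// or -1 if the caret sits between pages.
function findActivePageIndex(pageBoxes, caretBox) {
    const caretMidY = caretBox.top + caretBox.height / 2;
    return pageBoxes.findIndex(
        (box) => caretMidY >= box.top && caretMidY < box.top + box.height
    );
}

// Keep only the rect boxes whose top edge falls inside the given page box.
function rectsOnPage(rectBoxes, pageBox) {
    return rectBoxes.filter(
        (r) => r.top >= pageBox.top && r.top < pageBox.top + pageBox.height
    );
}
```

In the page, `caretBox` would come from the element matched by "#kix-current-user-cursor-caret", and the grammar scan would then run only over `rectsOnPage(...)` for the active page.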

swoorpious reopened this Oct 21, 2024

@zdw189803631
Author

image

In the source code of LT, there is a line of code like this. When I commented it out, gdoc no longer inserted SVG elements into `kix-canvas-tile-selection`.

Here is a comparison chart:

  1. With the line left in:
image
  2. With the line commented out:
image

According to some posts, one needs to apply to Google for a browser extension whitelist to have the privilege to implement such logic. I tried adding window._docs_annotate_canvas_by_ext = languageToolChromeExtensionId to my own developed extension, but it didn't work. Although the previous post Amaimersion/google-docs-utils#10 mentioned this, I thought at the time that the purpose was to add an HTML rendering method to the page. However, the rendering methods of tools like LT and Grammarly are still canvas, so I felt that this application might be outdated.

In the LT source code, I also found a getKixAppString method, which can obtain the entire content of the current document at any time, eliminating the need to execute cmd + a.

@swoorpious
Owner

Yep, adding window._docs_annotate_canvas_by_ext = languageToolChromeExtensionId to your code won't work. I tried it myself and it does not work. There is probably a bit more stuff that LT does that I haven't come across yet, or maybe there's a way/api to check if any extension with languageToolChromeExtensionId is actually installed because docs does not want to risk the document's security.

And about adding the svg - LT does not add those svg elements themselves; LT tells docs to verify and then docs adds those elements - pretty sure you figured it out already tho lol

Also yeah, there's probably a lot to be had in LT's source code. I had a tight schedule when I worked on my project, so I only looked for the bare minimum. By bare minimum, I mean I found simple workarounds that wouldn't cost me much time. Maybe in the future I will work on a docs library as a side project, lol. It shouldn't take me long this time now that I have most of the bullshit figured out.


I noticed in the image you sent, line 145, it says ExtensionConfigManager.getInstance(). There is probably something more than just setting a global variable that LT does to let docs know when to annotate the canvas.

You can dig around, maybe post in the original thread if you find something useful.
