Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add delete feature to combine pdf tool #14

Closed
wants to merge 4 commits into from

Conversation

Fahme
Copy link
Contributor

@Fahme Fahme commented Apr 11, 2021

Adds a delete feature to the merge pdf tool

Fixes on of the issues in
#10

@@ -23,6 +23,23 @@ async function mergePdfs(files: File[], addToDoc?: PDFDocument) {
};
}

// This is a work around for the pdf-js library because removePage method does not delete all
Copy link
Contributor Author

@Fahme Fahme Apr 11, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mkhatib i'm kinda lost here, according to Hopding/pdf-lib#140 (comment)

the maintainer mentions that removePage ONLY delete a reference of a page from the pdf leaving objects and sizes the same.

I went with the solution he mentioned on deleting all of the page reference objects etc .. assuming the user want to delete the whole page and reduce the size ?

What's your opinion ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better way might be to handle this similar to how we handle Ordering Pages. We do not recreate or reorder the actual PDF pages instead we just maintain a pages order list which is the indexes of the PDF pages and how they should be EVENTUALLY ordered when the user downloads the PDF.

I think the same can be achieved for deletion. You wouldn't touch the PDF file at all when deleting. Instead you just delete the "page index" from the pagesOrder list - react will re-render the list and only show the user the pages that exist in the pagesOrder - so the user will have instant feedback.

When the user is ready to click DOWNLOAD. We already only download the pages in that ordersPage so anything else won't be included. I think this is the right way to go about this - let me know if I am missing anything.

Although in order to address the delete issue you might need to do a simple modification on getOrderedPdf to use the order list instead of getting all the pages.

Let me know if this makes senes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is defiantly cleaner

Copy link
Contributor Author

@Fahme Fahme Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually one edge case i'm thinking for this approach is.

If the user wants to upload a new document and append to existing list after deleting a file for example, so when a file is dropped mergePdfs gets called re-creating all of the documents list (including the ones we deleted, since it's already referenced in original document and we only mutated the order page array) also a re-render happens to render the old, new, deleted pages.

So i'm thinking we might need to still remove the page from original document somehow.

If it doesn't make sense i can elaborate more ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right this will introduce an issue but not exactly what you described, I think it's solvable but let me describe what you're saying and what could happen in an example.

action actual pages in pdf pages order
upload 3 pages 1 2 3 1 2 3
reorders to 3 2 1 1 2 3 3 2 1
uploads 2 pages 1 2 3 4 5 3 2 1 4 5
delete index 2 1 2 3 4 5 3 1 4 5
upload 2 pages 1 2 3 4 5 6 7 3 1 4 5 5 6

The issue when a user uploads after a deletion (look at the repeated index in the last row). And the fix should be simple (I think). All is needed is to fix our reference of "page count" when appending the the document to reference the actual PDF document w're handling, instead of what we're doing now which is using pagesOrder list as references to what are the indexes of newly uploaded pages should be.

This is the bit of code needs changing (code change is between **):

      // Calculate new page order with the added pages at the end of the order.
      // To avoid losing previously ordered indexes.
     **const oldPdfDoc = doc;
     const newPagesIndexOffset = oldPdfDoc.getPageCount();**

      const oldPagesOrder = pagesOrder; 
      const oldPageCount = oldPagesOrder.length;
      const newlyAddedPagesOrder = new Array(pageCount - oldPageCount)
        .fill(0)
        .map((_, index) => **newPagesIndexOffset** + index + 1);
      const newOrder = [...oldPagesOrder, ...newlyAddedPagesOrder];
      setDoc(newDoc);
      setPDF(pdf);
      setPagesOrder(newOrder);

If we do that I think that last step in the table above would be:
| upload 2 pages | 1 2 3 4 5 6 7 | 3 1 4 5 6 7 |

Copy link
Contributor Author

@Fahme Fahme Apr 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that's what i.e. actually the indexes are duplicated, i'll look into this more not sure if the solution you introduced solves the issue.

Because the newPagesIndexOffset will always be the number of pages in the pdf as it's constant and they are not being modified.

for example a page of 5 the resulting newlyAddedPagesOrder would be something like
[6, 7, 8, 9] right ? or am i missing something.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the newPagesIndexOffset will always be the number of pages in the pdf as it's constant and they are not being modified.

for example a page of 5 the resulting newlyAddedPagesOrder would be something like
[6, 7, 8, 9] right ? or am i missing something.

Yes. Which is exactly what we want to fix this. We want to make sure the "order" we create refers to the proper pages.

An alternative to all of this is we could still keep the "Deleted page" and not remove it from Order but instead just "mark it as "deleted" visually. Not sure if this is nice but it does enable a feature of "undoing" deleting any page that is currently shown to be deleted. It just might be confusing or might get too crowded if humans delete a lot of pages

@ahmgeek ahmgeek requested review from ahmgeek and mkhatib April 12, 2021 06:59
@ahmgeek ahmgeek added the enhancement New feature or request label Apr 12, 2021
Comment on lines +159 to +164
<div
onClick={(e) => onDelete(pageNumber)}
className="absolute rounded-md p-1 bg-red-600 text-white text-xs cursor-pointer"
>
Delete
</div>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I suggest you move this to a button with icon x for example and to be styled on the top right corner of the file? since we are using tailwind I think it should be easy.

Copy link
Collaborator

@mkhatib mkhatib left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fahme great PR! Don't mind the comments, just wanted to be thorough and see what you think.

Comment on lines +159 to +164
<div
onClick={(e) => onDelete(pageNumber)}
className="absolute rounded-md p-1 bg-red-600 text-white text-xs cursor-pointer"
>
Delete
</div>
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide delete button when there's only one page left? Or maybe keep it and let people delete all the pages and start over by doing setPdf(undefined) above instead of that return?

Suggested change
<div
onClick={(e) => onDelete(pageNumber)}
className="absolute rounded-md p-1 bg-red-600 text-white text-xs cursor-pointer"
>
Delete
</div>
{pageOrder.length > 1 && (<div
onClick={(e) => onDelete(pageNumber)}
className="absolute rounded-md p-1 bg-red-600 text-white text-xs cursor-pointer"
>
Delete
</div>)
}


// A document needs to at least have one page
// @(TODO) show error message
if (currentPages.length < 1) return;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned below, we can either hide the delete button when there's only one page, OR let people delete all pages and do something line

Suggested change
if (currentPages.length < 1) return;
if (currentPages.length < 1) {
setPdf(undefined);
setDoc(undefined);
setPagesOrder([]);
return;
}

That should automatically create a new document next time they upload a new file.

Comment on lines +119 to +121
const newPageOrder = new Array(pageCount)
.fill(0)
.map((_, index) => index + 1);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't this fall into the same issue you found last time? Losing the order of the current pages? Instead you can call this method by index you want to delete instead of pageNumber and do this:

Suggested change
const newPageOrder = new Array(pageCount)
.fill(0)
.map((_, index) => index + 1);
const newPageOrder = [...pagesOrder];
newPageOrder.splice(deleteIndex, 1); // splice will mutate the array but that's ok, we already copied it.
setPagesOrder(newPagesOrder);

@@ -23,6 +23,23 @@ async function mergePdfs(files: File[], addToDoc?: PDFDocument) {
};
}

// This is a work around for the pdf-js library because removePage method does not delete all
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a better way might be to handle this similar to how we handle Ordering Pages. We do not recreate or reorder the actual PDF pages instead we just maintain a pages order list which is the indexes of the PDF pages and how they should be EVENTUALLY ordered when the user downloads the PDF.

I think the same can be achieved for deletion. You wouldn't touch the PDF file at all when deleting. Instead you just delete the "page index" from the pagesOrder list - react will re-render the list and only show the user the pages that exist in the pagesOrder - so the user will have instant feedback.

When the user is ready to click DOWNLOAD. We already only download the pages in that ordersPage so anything else won't be included. I think this is the right way to go about this - let me know if I am missing anything.

Although in order to address the delete issue you might need to do a simple modification on getOrderedPdf to use the order list instead of getting all the pages.

Let me know if this makes senes.

@mkhatib
Copy link
Collaborator

mkhatib commented Apr 16, 2021

btw, if you'd like we could jump on a video call and pair program on this (something @ahmgeek and I have been doing). Usually we do it on Sundays between 9am-1pm PST.

@Fahme
Copy link
Contributor Author

Fahme commented Apr 16, 2021

@mkhatib thanks for the invite but it will be quite late for me to work at that time. I'm going to tackle this PR during the weekends.

Feel free to submit a solution if you guys worked on this before.

@mkhatib
Copy link
Collaborator

mkhatib commented Apr 16, 2021 via email

@mkhatib
Copy link
Collaborator

mkhatib commented Dec 6, 2022

Thanks @Fahme for working on this. I just added the feature to delete pages in combine PDF tools. Closing this.

@mkhatib mkhatib closed this Dec 6, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants