Search Functionality #189

Closed
GlenBrauer opened this issue Apr 16, 2018 · 10 comments

@GlenBrauer

Is there a way to utilize the search functionality of PDF.js that you get when you click the magnifying icon from the toolbar?

@wojtekmaj
Owner

Hey!
React-PDF does not aim to be a fully fledged PDF reader. It only gives you an easy way to display PDFs so that you can build some UI around it. You can highlight some words in the text using a custom text renderer.

See:

https://github.com/wojtekmaj/react-pdf/blob/master/test/Test.jsx#L116-L125
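A minimal sketch of the matching logic such a custom text renderer could rely on, for readers who cannot follow the link. The helper names `escapeRegExp` and `splitByQuery` are hypothetical and not part of React-PDF; the sketch only splits a text item into matching and non-matching segments, and how those segments are turned into output (a string or React elements) depends on the React-PDF version in use.

```javascript
// Escape regex metacharacters so the query is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Split `text` into segments and flag the ones that match `query`
// (case-insensitive). Hypothetical helper, not a React-PDF API.
function splitByQuery(text, query) {
  if (!query) {
    return [{ text, match: false }];
  }
  // With one capturing group, String.prototype.split returns the matches
  // at odd indices and the text between them at even indices.
  const regex = new RegExp(`(${escapeRegExp(query)})`, "gi");
  return text
    .split(regex)
    .map((part, index) => ({ text: part, match: index % 2 === 1 }))
    .filter((segment) => segment.text !== "");
}
```

Inside a text renderer, each segment flagged with `match: true` would be wrapped in a highlighting element such as `<mark>`, and the remaining segments emitted unchanged.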

@wojtekmaj wojtekmaj self-assigned this Apr 17, 2018
@wojtekmaj wojtekmaj added the question Further information is requested label Apr 17, 2018
@GlenBrauer
Author

Thanks, I think that will achieve what I need. I am working on implementing a search/highlight feature now and am running into an issue: it seems that the text layer is not aligned correctly.

[Image: pdfissue]

@wojtekmaj
Owner

You might have canvas auto-scaled down to fit your container. Make sure to use proper Page width.

@GlenBrauer
Author

Thanks for all your help. I was hoping you might have some insight into a webpack worker issue I am having now. I am seeing the following error when starting my application:

[WARN] 404 - GET /b898199c77b935089e6c.worker.js (127.0.0.1) 314 bytes

This causes the viewer to sit at the "Loading Pdf" message when attempting to load a document.

I am using create-react-app and importing via import { Document, Page } from 'react-pdf/dist/entry.webpack';

@wojtekmaj
Owner

Hey, please keep one topic on one issue, otherwise we can't possibly have any control over it. Please see #164 - it seems like a similar issue.

@wahlforss

Hey @GlenBrauer! How did it go with the search and highlight feature? I'm thinking of building the same: programmatically searching and highlighting in the PDF.

@wojtekmaj
Owner

See also #212

@frontr-uk

"You might have canvas auto-scaled down to fit your container. Make sure to use proper Page width."

Can you elaborate on that, please?

@wojtekmaj
Owner

@frontr-uk If you're rendering canvas larger than its container, CSS might scale it down. You should make sure the width you're providing is exactly the width you are ending up with. Otherwise, some layers of Page components may be misaligned.
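As a rough illustration of the advice above, the width passed to Page can be clamped to the measured container width so that CSS never has to scale the canvas down. `resolvePageWidth` is a hypothetical helper, not part of React-PDF, and how the container is measured (for example with `getBoundingClientRect()` on a ref) is an assumption of this sketch.

```javascript
// Pick the width to hand to <Page width={...} /> so the rendered canvas
// is never wider than its container. Hypothetical helper for illustration.
function resolvePageWidth(containerWidth, preferredWidth) {
  // If the container has not been measured yet, fall back to the preference.
  if (!(containerWidth > 0)) {
    return preferredWidth;
  }
  // Round down so the canvas cannot overflow by a fraction of a pixel.
  return Math.floor(Math.min(containerWidth, preferredWidth));
}

// Assumed usage inside a component (shown as a comment, not runnable here):
//   const width = resolvePageWidth(containerRef.current.getBoundingClientRect().width, 800);
//   <Page pageNumber={1} width={width} />
```

The point of the design is that the value passed as `width` equals the width the canvas actually ends up with, which is the condition described above for keeping the layers aligned.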

@wjustice

wjustice commented Jul 12, 2021

For anyone looking to add search to their application, I made this simple usePdfTextSearch hook that uses PDF.js under the hood (already a dependency of react-pdf). You can see it in action in this CodeSandbox: https://codesandbox.io/s/distracted-khayyam-tzzou?file=/src/usePdfTextSearch.js:0-1618.

import { useState, useEffect } from "react";
import { pdfjs } from "react-pdf";

export const usePdfTextSearch = (file, searchString) => {
  const [pages, setPages] = useState([]);
  const [resultsList, setResultsList] = useState([]);

  useEffect(() => {
    pdfjs.getDocument(file).promise.then((docData) => {
      // Use the public numPages property rather than the private _pdfInfo field.
      const pageCount = docData.numPages;

      const pagePromises = Array.from(
        { length: pageCount },
        (_, pageNumber) => {
          return docData.getPage(pageNumber + 1).then((pageData) => {
            return pageData.getTextContent().then((textContent) => {
              return textContent.items.map(({ str }) => str).join(" ");
            });
          });
        }
      );

      return Promise.all(pagePromises).then((pages) => {
        setPages(pages);
      });
    });
  }, [file]);

  useEffect(() => {
    if (!searchString || !searchString.length) {
      setResultsList([]);
      return;
    }

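    /*
      Currently this regex is case-insensitive. This could be extended to be configurable.
      Or could be extended to be a fuzzy search. Fuzzy search would need a more
      complex return from the hook to be able to highlight the found term(s) in the view.
      EX: resultsList = Array<{ pageNumber: number, matchedTerms: Array<string> }>
    */
    // Escape regex metacharacters so the search string is matched literally;
    // a trailing "*" would only make the last character optional.
    const escaped = searchString.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    const regex = new RegExp(escaped, "i");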
    const updatedResults = [];

    pages.forEach((text, index) => {
      if (regex.test(text)) {
        updatedResults.push(index + 1);
      }
    });

    setResultsList(updatedResults);
  }, [pages, searchString]);

  return resultsList;
};
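The comment in the hook mentions a richer return shape, `Array<{ pageNumber, matchedTerms }>`. Below is a minimal sketch of how the matching step could produce it, written as a plain function over the extracted page texts so it can be tried outside React. The names `searchPages` and `escapeRegExp` are hypothetical, and this is a literal, case-insensitive match, not a fuzzy search.

```javascript
// Escape regex metacharacters so the search string is matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Given one text string per page, return the 1-based page numbers that
// contain the search string together with the matched terms as they
// appear in the text. Hypothetical helper for illustration.
function searchPages(pages, searchString) {
  if (!searchString) {
    return [];
  }
  const regex = new RegExp(escapeRegExp(searchString), "gi");
  const results = [];
  pages.forEach((text, index) => {
    // With the "g" flag, match() returns every occurrence or null.
    const matchedTerms = text.match(regex);
    if (matchedTerms) {
      results.push({ pageNumber: index + 1, matchedTerms });
    }
  });
  return results;
}
```

In the hook, the second effect could call such a function with `pages` and `searchString` and store the result instead of the bare page numbers.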
