Search and text parsing


What do you expect from a PDF reader? I would expect the best reading experience: an easy and pleasurable read. Secondly, I would also expect search support.

What we've done is add a PDF parsing engine that can find letters and words and also understand where they are placed on the page. To do that, we've included a decomposition and normalization engine that matches letters with glyphs and fonts.

The result is a fast and precise search engine and perfect highlighting of search results on the PDF page.
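To illustrate why letter positions matter for highlighting, here is a minimal Python sketch, not FastPdfKit code. It assumes the parser has already produced one record per letter, with a bounding box in page coordinates; the `Glyph` type and the `find_highlights` function are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Glyph:
    """One letter with its bounding box in page coordinates (hypothetical type)."""
    char: str      # assumed to be exactly one character
    x: float       # left edge
    y: float       # bottom edge
    width: float
    height: float


def find_highlights(glyphs, term):
    """Return one (x, y, width, height) rectangle per occurrence of term."""
    # Because every glyph holds one character, string indices map 1:1 to glyphs.
    text = "".join(g.char for g in glyphs)
    boxes = []
    start = text.find(term)
    while term and start != -1:
        run = glyphs[start:start + len(term)]
        left = min(g.x for g in run)
        bottom = min(g.y for g in run)
        right = max(g.x + g.width for g in run)
        top = max(g.y + g.height for g in run)
        boxes.append((left, bottom, right - left, top - bottom))
        start = text.find(term, start + 1)
    return boxes


# Usage: five letters laid out left to right on one line.
line = [Glyph(c, i * 10.0, 100.0, 10.0, 12.0) for i, c in enumerate("a cat")]
highlight = find_highlights(line, "cat")
```

The sketch ignores matches that wrap across lines, where a single rectangle would not be enough.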

We've also decided to support different types of character matching. You may need to search for precise words based on different versions of the same letter, like a, à and å. Our flexible engine can tell these versions apart and lets you choose among three search modes:

  • Smart (diacritic search): if you search for bezier, it will match both bezier and bèzier, but if you search for bèzier, it will match bèzier only;
  • Hard: if you search for bèzier, it will match bèzier only, and if you search for bezier, it will match bezier only;
  • Soft: whether you search for bezier or bèzier, it will match both bezier and bèzier.

If you are interested in these options, please take a look at the specific documentation for FPKSearchMode.
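The three match modes can be sketched in a few lines of Python using Unicode normalization. This is only an illustration of the behaviour described above, not the FastPdfKit implementation: the `matches` function and the mode strings "smart", "hard" and "soft" are hypothetical stand-ins, and the real options are the ones documented for FPKSearchMode.

```python
import unicodedata


def strip_marks(text):
    """Remove diacritics by decomposing (NFD) and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def matches(query, word, mode):
    """Compare a query with a candidate word under one of three modes."""
    # Compose first so that each accented letter is a single code point.
    q = unicodedata.normalize("NFC", query)
    w = unicodedata.normalize("NFC", word)
    if mode == "hard":
        # Exact match: accents must be identical on both sides.
        return q == w
    if mode == "soft":
        # Accents are ignored on both sides.
        return strip_marks(q) == strip_marks(w)
    if mode == "smart":
        # A plain letter in the query matches any accented variant;
        # an accented letter in the query matches only itself.
        if len(q) != len(w):
            return False
        return all(
            qc == wc or (strip_marks(qc) == qc and strip_marks(wc) == qc)
            for qc, wc in zip(q, w)
        )
    raise ValueError(f"unknown mode: {mode}")


# Usage: the examples from the text above.
smart_plain = matches("bezier", "bèzier", "smart")
smart_accent = matches("bèzier", "bezier", "smart")
```

The sketch is case-sensitive and compares whole words. Its smart mode also assumes that every accented letter has a precomposed form, which holds for common Latin text but not for every script.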

FastPdfKit obviously also supports:

  • Result highlight;
  • Search through closed documents;
  • Search on multiple documents at the same time;
  • List search results;
  • Look through results and go to the following one;
  • Text extraction;
  • Search with results skimming;
  • Zoom on results;
  • Multibyte characters encoding;
  • Font caching for high speed searches;
  • Different encoding support: Win ANSI, MacOS Roman, Unicode;
  • Unicode sequences and single chars;
  • 14 Adobe Standard Fonts.

Our PDF parsing engine is also needed to support:

  • Links between pages;
  • Links to web pages;
  • Links that open external documents;
  • Any kind of custom hyperlink.

We also enable you, with any paid license or with the reseller license, to get the detailed position of every letter in the document, in page coordinates.
