Convert PDF to text

Hello, I need to convert a PDF to text.

I do not have pdftotxt, although poppler-23.05.0-1 is installed.

I found pdftxt in the AUR, but pamac install pdftxt fails with an undefined reference to FT_Get_Char_Index.

I am still looking for a working solution in Manjaro.

Okular provides an option to export to plain text. Have you tried that?


The command to install anything from the AUR with pamac is build, not install; the latter is only for repo packages. Try:

pamac build pdftxt

And if you are using the Plasma desktop, okular can indeed also export a PDF to plain text, as @ajaychat3 says. I’m not sure what the default PDF viewer is in the other desktop environments, but it’s possible that they too have such a function built in.

Yes, Okular does the job, but it only works interactively, not from the command line. Fortunately, that is all I need today. Thanks.

I use the GNOME desktop, and its Document Viewer does not export to text.

Try pandoc. I have yet to find a conversion it can’t handle, although as far as I know it can write PDF but not read it, so it may not help here.

In the repos, the poppler package provides /usr/bin/pdftotext.

You can use it like this:

pdftotext input.pdf output.txt

Or, to try to preserve the layout:

pdftotext -layout input.pdf output.txt
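
If you have more than one file, the same command can be wrapped in a small loop. This is only a sketch: txt_name and convert_all are names I made up for illustration, and it assumes pdftotext from poppler is on your PATH.

```shell
#!/bin/sh
# Derive the output name: report.pdf -> report.txt (hypothetical helper).
txt_name() {
    printf '%s\n' "${1%.pdf}.txt"
}

# Convert every *.pdf in a directory (default: current directory).
# Hypothetical helper; it only wraps the pdftotext call shown above.
convert_all() {
    dir="${1:-.}"
    for pdf in "$dir"/*.pdf; do
        # Skip the unexpanded pattern when the directory has no PDFs.
        [ -e "$pdf" ] || continue
        pdftotext -layout "$pdf" "$(txt_name "$pdf")"
    done
}
```

Calling convert_all ~/Documents would then write a .txt file next to each PDF in that directory. Drop -layout if you prefer the text in reading order.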

If it isn't there, something must have gone wrong. Try reinstalling the package.

You probably just failed to find the command, which is there, because of a wrong assumption about its name.

The name of the command is:
pdftotext

not:
pdftotxt
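
A quick way to catch this kind of mix-up is to ask the shell whether each spelling resolves to anything. A minimal sketch, where have_cmd and check_names are hypothetical helper names of my own:

```shell
#!/bin/sh
# Succeed if the given name resolves to a command in the current shell.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print a one-line status for each name given (hypothetical helper).
check_names() {
    for name in "$@"; do
        if have_cmd "$name"; then
            printf '%s: found at %s\n' "$name" "$(command -v "$name")"
        else
            printf '%s: not found\n' "$name"
        fi
    done
}
```

Running check_names pdftotext pdftotxt shows which of the two spellings actually exists. On Manjaro, pacman -Ql poppler also lists every file the package installed, including what it put in /usr/bin.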
