PdfIndexer
OsProject | |
---|---|
id | pdfindexer |
state | |
owner | WolfgangFahl |
title | Java Library and Tool to Index and search PDF files using Apache Lucene and PDF Box |
url | https://github.com/WolfgangFahl/pdfindexer |
version | 0.0.11 |
description | |
date | 2018/08/22 |
since | |
until | |
Motivation
In one of our projects we were asked to check a few dozen PDF documents for consistency, so we needed a way to cross-reference the documents and find keywords. At the time there was no SimpleGraph project yet, so we created a special-purpose solution and made it available as open source.
Using in Docker
In Issue #4, peebles asked how to run the example in a Docker container.
Open a Java container that has access to the current directory:
# get a fresh version of the PDF Indexer
git clone https://github.com/WolfgangFahl/pdfindexer
# change to the directory
cd pdfindexer
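# run a Docker container with OpenJDK Java 8, mounting the current directory at /deploy
# (quoting "$(pwd)" keeps the mount working when the path contains spaces)
docker run --rm -it -v "$(pwd)":/deploy -w /deploy openjdk:8 bash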