500 errors while trying to use Tabula
I was recently working with a PDF in Tabula. I have used Tabula before to extract tables from PDFs, and it has worked great. This time, I could not get Tabula to start processing the PDF. Debugging the network request pointed to a temporary directory issue.
The error message was (ENOENT) No such file or directory - /tmp, which is a bit cryptic on Windows since it refers to a directory that is unlikely to exist. The error message did not include the full path for this directory, nor did it give a file or line number for the error in the source.
Since the issue is with the AJAX request to upload.json, I searched the code for that name. The search returns two files: tabula_web.rb and index.html. The latter is the front-end UI, so I went to the former to spot the issue.
I had previously dug into the AJAX request and knew that it was POSTing a parameter called files with the relevant info. That detail would not have mattered, because both code paths call is_valid_pdf with a parameter of file[:tempfile].path. This is supposed to refer to the path of the uploaded file. In my case, it seems execution was not getting into the is_valid_pdf function: I added a line to the top of that function, and it did not generate any output.
Given that, it looked like Tempfile did not have a valid directory to write to.
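To illustrate that failure mode, here is a minimal Ruby sketch. It is not Tabula's code: the helper name try_tempfile and the missing directory name are made up for the example. It only shows that Tempfile raises Errno::ENOENT when asked to create a file in a directory that does not exist, which matches the kind of error I was seeing.

```ruby
require "tempfile"
require "tmpdir"

# Hypothetical helper (not from Tabula): try to create a temp file in `dir`
# and report whether it worked.
def try_tempfile(dir)
  file = Tempfile.new("upload", dir)
  path = file.path
  file.close! # close and delete the temp file again
  "ok: #{path}"
rescue Errno::ENOENT => e
  # Raised when the target directory does not exist
  "failed: #{e.message}"
end

# A directory name that should not exist on this machine (assumption)
missing = File.join(Dir.tmpdir, "does-not-exist-#{Process.pid}")

puts try_tempfile(missing)    # expected to report a failure
puts try_tempfile(Dir.tmpdir) # expected to succeed
```

If the first call fails and the second succeeds, the problem is the directory rather than the file being uploaded.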
Checking the docs for Tempfile, you can see that it uses Dir.tmpdir as the output directory. This can be controlled by the environment variable $TMPDIR. Checking my environment variables, I had nothing set for this. I set this variable to the directory %USERPROFILE%/AppData/Local/Temp/tmp, which I had just created.
With this addition and a restart of Tabula, the files uploaded correctly.
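A quick way to check what Ruby will use is to print Dir.tmpdir with and without the variable set. The sketch below assumes a standard Ruby install (Tabula's bundled runtime may behave slightly differently), and tmpdir_with is a hypothetical helper written for this example. It uses a throwaway directory instead of my actual path.

```ruby
require "tmpdir"

# Hypothetical helper: report the directory Dir.tmpdir resolves to when
# TMPDIR is temporarily set to `value`, then restore the old setting.
def tmpdir_with(value)
  saved = ENV["TMPDIR"]
  ENV["TMPDIR"] = value
  Dir.tmpdir
ensure
  ENV["TMPDIR"] = saved # assigning nil removes the variable again
end

# Create a throwaway directory that exists and is writable,
# then point TMPDIR at it to see the override take effect.
Dir.mktmpdir("tabula-demo") do |dir|
  puts "default:  #{Dir.tmpdir}"
  puts "override: #{tmpdir_with(dir)}"
end
```

Note that the directory has to exist and be writable before Ruby will pick it up, which is why I created it first.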