Propagating Bonjour Messages through VPNs or how to make an AirPrint printer available on a different network

So I decided it would be a great idea to share a printer with a friend. After all, why should we both have one if we seldom use it?

Since we both use a FritzBox router, we established a VPN connection between the two. For this to work, we had to change the HTTPS port of one FritzBox (by incrementing it by one).

Once that worked, I was satisfied. But my friend has Apple devices and wanted to be able to use AirPrint. AirPrint makes use of Bonjour, but Bonjour announcements are sent as link-local multicast, so they can’t reach past the first hop and never cross the VPN. The guy who educated me about this also pointed out that one should look into Avahi for this sort of issue.

 

So what did I do? First I observed the behaviour of my printer via avahi-browse -avr in my network.
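The TXT records seen in this step are needed again later for the service file, so it can help to capture them in machine-readable form. Below is a minimal sketch, assuming avahi-browse is run with the additional -p (parsable) flag and that resolved records start with "=" and carry semicolon-separated fields in the order interface, protocol, name, type, domain, host name, address, port, TXT. The sample line and the function name are made up for illustration; check them against your own output.

```python
import shlex

# Hypothetical sample of one resolved record from `avahi-browse -avrp`.
SAMPLE = (r'=;eth0;IPv4;My\032Printer;_ipp._tcp;local;name.local;'
          r'192.168.0.1;631;"txtvers=1" "rp=ipp/print" "URF=W8,SRGB24"')


def parse_resolved_line(line):
    """Parse one resolved ('=') record; return None for any other line.

    Assumed field order:
    interface;protocol;name;type;domain;host;address;port;txt
    """
    if not line.startswith("="):
        return None  # '+' and '-' events carry no address or TXT data
    fields = line.rstrip("\n").split(";", 9)
    if len(fields) < 10:
        return None
    _, iface, proto, name, stype, domain, host, address, port, txt = fields
    # TXT entries are quoted "key=value" strings separated by spaces.
    txt_records = {}
    for entry in shlex.split(txt):
        key, _, value = entry.partition("=")
        txt_records[key] = value
    return {
        "name": name,  # left as printed, escapes such as \032 are not decoded
        "type": stype,
        "host": host,
        "address": address,
        "port": int(port),
        "txt": txt_records,
    }


if __name__ == "__main__":
    print(parse_resolved_line(SAMPLE))
```

The resulting "txt" dictionary is what would later go into the txt-record entries of the service definition.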

Then I set up a Raspberry Pi in the target network and installed Avahi (apt-get install libnss-mdns avahi-utils).

Write the following into /etc/avahi/hosts (replace 192.168.0.1 with the IP of the device you want to broadcast, and name.local with the device’s FQDN in the source network):

192.168.0.1 name.local

Create the service by creating a name.service file in /etc/avahi/services/ with the following content:

<?xml version="1.0" standalone='no'?><!--*-nxml-*-->
<!DOCTYPE service-group SYSTEM "avahi-service.dtd">
<service-group>
<name>NAME</name>
<service>
<type>_ipp._tcp</type>
<subtype>_universal._sub._ipp._tcp</subtype>
<host-name>name.local</host-name>
<port>631</port>
<txt-record>example=test</txt-record>

<!-- … add further txt-record entries here, based on your observation via avahi-browse -avr -->
</service>
</service-group>
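If the printer advertises many TXT records, typing the file by hand gets tedious. The sketch below generates the same structure from a dictionary. It is only an illustration: the function name is hypothetical, and the TXT values have to come from what avahi-browse -avr showed for your own printer.

```python
import xml.etree.ElementTree as ET


def build_service_xml(name, host_name, port, txt_records,
                      service_type="_ipp._tcp",
                      subtype="_universal._sub._ipp._tcp"):
    """Return the text of an Avahi static service file as a string."""
    group = ET.Element("service-group")
    ET.SubElement(group, "name").text = name
    service = ET.SubElement(group, "service")
    ET.SubElement(service, "type").text = service_type
    ET.SubElement(service, "subtype").text = subtype
    ET.SubElement(service, "host-name").text = host_name
    ET.SubElement(service, "port").text = str(port)
    # One <txt-record> per key/value pair; ElementTree escapes & and <.
    for key, value in txt_records.items():
        ET.SubElement(service, "txt-record").text = f"{key}={value}"
    header = ('<?xml version="1.0" standalone=\'no\'?>\n'
              '<!DOCTYPE service-group SYSTEM "avahi-service.dtd">\n')
    return header + ET.tostring(group, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values taken from the example above.
    print(build_service_xml("NAME", "name.local", 631, {"example": "test"}))
```

The output can be redirected into the name.service file on the Raspberry Pi.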

And you’re done! Your device’s address will now be broadcast in the target network.

 

Extracting text from PDF or “I’d rather shoot myself right now”

Updated on August 11, 2017

tl;dr: Use PDFBox for general text extraction tasks and use Tabula for tables.

If you ever find yourself in a situation where you want to get information out of a PDF document, you should reconsider first. Is there no other source available?

No? Okay, prepare for pain! Or listen to my advice, as I have already endured the pain:

Don’t try to find a Python tool that does the job. It seems like, at the moment, there aren’t any good ones out there.

Instead, use https://pdfbox.apache.org. It gave me the best results out of the box of all the tools I used, and I tested a lot of tools.

Update: If you want to extract tables from a PDF, there really is no way around Tabula. It has a very good browser-based GUI and a CLI. If you are trying to extract data programmatically, you might have to play around with the “--columns” option instead of relying on auto detection to get good results when working with loads of similar-looking PDFs.
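For batches of similar-looking PDFs, the column boundaries can be fixed once and reused for every file. Here is a rough sketch, assuming the tabula-java command-line jar and its --columns, --pages, --format and --outfile options. The jar path, file names and coordinates are placeholders, not values from a real document.

```python
import subprocess


def tabula_command(pdf_path, column_boundaries, out_path,
                   jar_path="tabula.jar", pages="all"):
    """Build a tabula-java call that uses fixed column boundaries.

    column_boundaries: X coordinates of the column borders, left to right.
    jar_path: placeholder, point it at the jar you downloaded.
    """
    columns = ",".join(str(x) for x in column_boundaries)
    return [
        "java", "-jar", jar_path,
        "--columns", columns,
        "--pages", pages,
        "--format", "CSV",
        "--outfile", out_path,
        pdf_path,
    ]


def extract_table(pdf_path, column_boundaries, out_path, **kwargs):
    """Run tabula-java; raises CalledProcessError if it exits non-zero."""
    cmd = tabula_command(pdf_path, column_boundaries, out_path, **kwargs)
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Only print the command here; the coordinates are made up.
    print(" ".join(tabula_command("report.pdf", [10.5, 200, 310.25],
                                  "report.csv")))
```

One way to find usable coordinates is to select the table once in the browser GUI and reuse the boundaries from there.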

 

 

Nextcloud Explorer Icon Integration

Zapem and I wrote a little script that allows you to easily add Nextcloud to your Windows Explorer.

You can find it here.

 

 

Test Post Please Ignore

don’t look (@°°@)