About

This is the code & project documentation for the percs website.

Percs was initially created to provide a basic, searchable index of the NSW pecuniary interest documents. These files are uploaded as large PDFs containing scanned images of ministers’ submissions: nearly 100 submissions, dumped into big, unsearchable, opaque files.

At one OpenAustralia pub meetup, Luke Bacon kicked off the effort by processing & splitting the files into individual per-minister submissions.

Chris Nilsson ran these files through OCR, indexed the extracted text, and wrapped this percs site around the result.

The aim is to digitise & index previous and ongoing years as needed.

Why? These folks are in charge of a good chunk of our money. Having their pecuniary interest declarations more accessible can only help keep things transparent and fair.

Of course, the process isn’t perfect. Many documents are handwritten, and difficult enough for humans to read, let alone the OCR.

But, it’s a start.

Source code

You can get the source code from: https://github.com/otherchirps/percs

Documentation

Usage instructions, etc, can be found here: http://percs.readthedocs.org/

Licensing

The site itself and its custom libraries are available at no charge, and are licensed under the Mozilla Public License 2.0. The third-party libraries (which actually do the cool stuff) have their own licenses.

PDFs are sourced from the NSW State Government, and are Copyright © State of New South Wales (NSW Parliament). To the best of my knowledge, I’m complying with their terms of use.

Contact

Site-specific suggestions and problems should go to the issue tracker.

For any other queries, you can reach me via email.