Open data files are frequently published in a form that is not “machine-readable”, that is, not structured in a way that allows the data to be processed directly by computer. In these cases, Benefacts manually extracts the financial, governance and narrative data, converting it from image files into digital content.

Most of the open data used by Benefacts comes in one of four forms:

  1. data files (for example in MS Excel) published specifically for re-use as open datasets (e.g. the list of schools published by the Department of Education & Skills)
  2. image files (for example in PDF or TIFF format) from which Benefacts extracts data by re-keying it or by optical character recognition (OCR) (e.g. the constitutions or annual financial statements of friendly societies published by the Registrar of Friendly Societies)
  3. digital datasets (provided as an XML feed) which Benefacts purchases from a licensed re-seller of Companies Registration Office data (e.g. the names and dates of appointment/resignation of company directors)
  4. public websites such as the public register of charities maintained by the Charities Regulatory Authority

In each case, Benefacts reviews the contents, checks them against prior data from the same source and data about the same organisation from other sources, and imports the data to the database.
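
The checking step described above can be pictured with a minimal Python sketch. It is illustrative only and does not describe Benefacts' actual system: the `Record` shape, the field names and the `review` function are all assumptions introduced here. It compares a newly acquired record with the prior record from the same source and with records about the same organisation from other sources, and lists the fields that disagree so that a person can review them before import.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """One organisation's data as received from one source (hypothetical shape)."""
    org_id: str
    source: str
    fields: dict


def differing_fields(new, reference):
    """Return the names of fields present in both records whose values differ."""
    shared = new.fields.keys() & reference.fields.keys()
    return sorted(name for name in shared if new.fields[name] != reference.fields[name])


def review(new, prior=None, others=()):
    """Collect discrepancies between a new record and its reference records.

    `prior` is the earlier record from the same source; `others` are records
    about the same organisation from other sources. Both are assumptions of
    this sketch. Records for a different organisation are ignored.
    """
    findings = {}
    if prior is not None:
        diffs = differing_fields(new, prior)
        if diffs:
            findings["prior:" + prior.source] = diffs
    for other in others:
        if other.org_id != new.org_id:
            continue
        diffs = differing_fields(new, other)
        if diffs:
            findings["other:" + other.source] = diffs
    return findings


# Usage with made-up data: income changed since the prior filing, and the
# name differs from the one held by a second (hypothetical) register.
new = Record("org-1", "register-a", {"name": "Example Society", "income": 1200})
prior = Record("org-1", "register-a", {"name": "Example Society", "income": 1000})
other = Record("org-1", "register-b", {"name": "Example Society Ltd"})
print(review(new, prior, [other]))
```

A difference from the prior record is not necessarily an error, since figures legitimately change from year to year. The sketch only flags what differs and leaves the decision to a reviewer.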

Only some of the data we acquire is re-published. The rest is used to inform our annual sector analysis reports and other data services.

In re-using data under open data licence terms from multiple sources, Benefacts has taken all reasonable care in preparing and publishing information and analysis using documents retrieved from regulatory sources, including financial statements prepared by third parties. We cannot guarantee the accuracy or completeness of the information contained and accept no responsibility for errors and omissions.


Last updated 21 May 2020