Very rough notes from the Digitising NZ books & Audience session
http://wellington2013.thatcamp.org/2013/11/28/very-rough-notes-from-the-digitising-nz-books-audience-session/ (28 Nov 2013)

What follows are rough minutes from this session. Apologies if anything is wrong or misleading – I did not recognise half of it 🙂

Between 1950 and 1959 there were 4,037 publications in NZ (a publication being a pamphlet of five or more pages, or a novel). Where are they, how do we get hold of them, and can they be freed?

  • Publications New Zealand has records for all of them, in MARC, federated to WorldCat. US copyright law restricts material from 1870 onwards. (See the sketch after this list.)
  • Some stuff will need consultation with Iwi
  • Reclaiming New Zealand’s Digitised Heritage project – Kiwi Alex
  • Today’s bibliographies from libraries may not match online or physical holdings: items can be in the stack, lost, moved, destroyed, or in storage. It may be possible to use cross-catalogue or cross-media data to track books down (e.g. mentions in Papers Past) – the sketch after this list shows one crude approach.
  • Maybe leave out unpublished data
  • How do we get institutions to lend the books for digitising? The National Library won’t lend out valuable material without a conservator’s report.
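Since Publications New Zealand holds everything in MARC, the cross-catalogue matching idea above can be prototyped fairly directly. A minimal sketch, assuming the third-party pymarc library (pip install pymarc); the export filename and the second title list are invented for illustration, not from any real Publications NZ data:

    # Sketch: read a MARC dump and fuzzily match titles against another
    # catalogue. Filename and candidate titles are hypothetical.
    from difflib import SequenceMatcher
    from pymarc import MARCReader

    def normalise(title):
        """Lower-case and strip trailing MARC punctuation for comparison."""
        return title.lower().strip(" ./:;")

    # Hypothetical titles from a second catalogue (or Papers Past mentions).
    other_catalogue = ["Station life in New Zealand", "New Zealand yearbook 1955"]

    with open("publications_nz_1950s.mrc", "rb") as fh:
        for record in MARCReader(fh):
            fields = record.get_fields("245")          # 245 = title statement
            if not fields or not fields[0].get_subfields("a"):
                continue
            title = normalise(fields[0].get_subfields("a")[0])
            for candidate in other_catalogue:
                score = SequenceMatcher(None, title, normalise(candidate)).ratio()
                if score > 0.9:                        # crude similarity threshold
                    print(f"possible match ({score:.2f}): {title!r} ~ {candidate!r}")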

What format, and how do we make it text-searchable?

  • Agree on something like METS/ALTO and Dublin Core, and federate with OAI-PMH and/or use DigitalNZ (a harvesting sketch follows this list).
  • NL scanned 300 dpi colour TIFF images per page, assembled into PDFs with the page image plus an OCR text layer.
  • E-books in EPUB, which is (more or less) zipped HTML – see the EPUB sketch below. Kindle uses Mobipocket, another format based on Open eBook.
  • OPDS is a syndication format – like RSS but for e-books.
  • stats.govt.nz/ ← fully searchable open access (XML) yearbook data.
  • TEI XML can be huge and possibly redundant for many use cases, but can be used to embed contextual semantic data (see the TEI sketch below) – www.tei-c.org/
  • Project Gutenberg offers many formats – but some are auto-generated from a master copy.
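To make the "federate with OAI-PMH" bullet concrete: OAI-PMH is just HTTP GET requests with a verb parameter, returning XML with an optional resumptionToken for paging. A minimal harvester in standard-library Python; the endpoint URL is a placeholder, not a real repository:

    # Sketch: harvest Dublin Core records over OAI-PMH with only the
    # standard library. The endpoint URL is a placeholder, not a real one.
    import urllib.parse
    import urllib.request
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    DC = "{http://purl.org/dc/elements/1.1/}"
    BASE = "https://repository.example.org/oai"   # hypothetical endpoint

    def harvest(base_url):
        """Yield (identifier, title) pairs, following resumption tokens."""
        params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
        while True:
            url = base_url + "?" + urllib.parse.urlencode(params)
            with urllib.request.urlopen(url) as resp:
                tree = ET.parse(resp)
            for rec in tree.iter(OAI + "record"):
                identifier = rec.findtext(OAI + "header/" + OAI + "identifier")
                title = rec.findtext(".//" + DC + "title")
                yield identifier, title
            token = tree.find(".//" + OAI + "resumptionToken")
            if token is None or not token.text:
                break
            # Subsequent pages carry only the verb and the token.
            params = {"verb": "ListRecords", "resumptionToken": token.text}

    for identifier, title in harvest(BASE):
        print(identifier, title)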
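The "EPUB is (more or less) zipped HTML" point is easy to verify: an EPUB is a ZIP whose first entry is a mimetype file, with META-INF/container.xml pointing at the package document and the content itself in XHTML. A sketch with the standard library (the filename is made up):

    # Sketch: peek inside an EPUB to show the "zipped HTML" claim.
    import xml.etree.ElementTree as ET
    import zipfile

    CNS = "{urn:oasis:names:tc:opendocument:xmlns:container}"

    with zipfile.ZipFile("some_book.epub") as z:
        # By spec the first entry is an uncompressed "mimetype" file.
        print(z.read("mimetype").decode())        # application/epub+zip
        # container.xml points at the OPF package document.
        container = ET.fromstring(z.read("META-INF/container.xml"))
        rootfile = container.find(".//" + CNS + "rootfile")
        print(rootfile.get("full-path"))          # e.g. OEBPS/content.opf
        # Everything else is ordinary XHTML, CSS and images.
        for name in z.namelist():
            if name.endswith((".xhtml", ".html")):
                print(name)

OPDS catalogues, mentioned in the same list, are Atom feeds whose entries link to files like these, so the same ElementTree approach applies to them.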
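And for the TEI bullet: the markup is verbose, but pulling both a plain-text stream (for a full-text index) and the embedded semantic data out of it is short work. A sketch assuming a hypothetical TEI P5 file digitised_book.xml:

    # Sketch: extract indexable text plus semantic tags from a TEI file.
    import xml.etree.ElementTree as ET

    TEI = "{http://www.tei-c.org/ns/1.0}"
    body = ET.parse("digitised_book.xml").find(".//" + TEI + "body")

    # Flatten all text for a full-text index.
    full_text = " ".join(body.itertext())

    # Harvest the contextual data the bullet above mentions; simplistic,
    # since it ignores markup nested inside the names themselves.
    people = [el.text for el in body.iter(TEI + "persName")]
    places = [el.text for el in body.iter(TEI + "placeName")]
    print(len(full_text), people[:5], places[:5])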

How do we make it available online?

  • Some data will be very specific and of little commercial or even research value.
  • Other material will have high commercial value.

Some have been digitised already.

  • Digitisation efforts are already under way at National Library.
  • Some material is in Google Books/HathiTrust, but sourced from the US (little communication with National Library NZ); it can be tricky to get material back from them.
  • Some is public domain, some copyrighted

Do we want to cover periodicals?

  • Quite probably yes!
  • RILM is already going about digitising them.

Who will host it, who will “own” it and maintain it over time?

  • National Library seems the most sensible fit
  • Public/Private collaboration; should we delete data that is objected to by one or two stakeholders, or preserve but restrict public access?
    • Reliance on corporate law and upholding contracts
    • Could the private organisations be not-for-profit, or similarly chartered?
    • Retain public ownership

Audience

Where is the data for matching your message with the right audience and the platform they tend to use? For example, to reach 16-year-olds we need Facebook, but not so much for retirees?!

  • Media studies – in primary and secondary education, there are “bring your own device” initiatives which may have good data.
  • Sometimes lack of demand is because people don’t know it’s available and/or don’t know they’re looking for it
  • Local information could be sliced and diced by locality, person, and so on (semantic metadata) and be highly relevant to the punters – a toy faceting sketch follows this list.
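As a toy illustration of that slicing-and-dicing idea – the records and field names here are invented, not from any real dataset:

    # Sketch: group records by a locality facet.
    from collections import defaultdict

    records = [
        {"title": "Borough council minutes", "locality": "Petone"},
        {"title": "School jubilee booklet", "locality": "Petone"},
        {"title": "Harbour board report", "locality": "Lyttelton"},
    ]

    by_locality = defaultdict(list)
    for rec in records:
        by_locality[rec["locality"]].append(rec["title"])

    for locality, titles in sorted(by_locality.items()):
        print(locality, "->", titles)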
Use of open source software in DH
http://wellington2013.thatcamp.org/2013/11/27/use-of-open-source-software-in-dh/ (27 Nov 2013)

A session around the use of FLOSS software and open standards/protocols in digital repositories, and the needs of digital humanities in particular. Discussions could centre around Fedora Commons, DSpace, Fez, Drupal and so on. How do we weigh rigid metadata schemata (MODS, MARC, DC-TERMS etc.) versus relying on full-text indexes, or open access versus access control, flexibility and complexity versus ease of use?
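One way to see the schema-versus-full-text trade-off in miniature is side by side in SQLite, whose FTS5 extension ships with most Python builds. Everything below is illustrative, not a description of how any of the platforms named above store data:

    # Sketch: rigid schema vs full-text index, in one tiny database.
    import sqlite3

    db = sqlite3.connect(":memory:")
    # "Rigid schema" side: a few DC-ish fields with fixed semantics.
    db.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, title TEXT, creator TEXT, date TEXT)")
    # "Full-text" side: one searchable blob per item.
    db.execute("CREATE VIRTUAL TABLE item_fts USING fts5(content)")

    db.execute("INSERT INTO item (title, creator, date) VALUES (?, ?, ?)",
               ("Station Life in New Zealand", "Barker, Lady", "1870"))
    db.execute("INSERT INTO item_fts (rowid, content) VALUES (?, ?)",
               (1, "Station Life in New Zealand, the full OCR text of the book"))

    # Structured query: precise, but only as good as the cataloguing.
    print(db.execute("SELECT title FROM item WHERE creator LIKE 'Barker%'").fetchall())
    # Full-text query: forgiving, but has no notion of creator vs title.
    print(db.execute("SELECT rowid FROM item_fts WHERE item_fts MATCH 'station'").fetchall())

In practice repository platforms tend to do both: structured fields for whatever cataloguers can supply, plus a full-text index (often Solr) over everything else.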
