What follows are rough minutes from this session. Apologies if anything is wrong or misleading – I did not recognise half of it 🙂
In 1950-59 there were 4037 publications in NZ (a publication being a pamphlet of 5+ pages, or a novel). Where are they, how do we get hold of them, and can they be freed?
- Publications NZ – has all of the things, in MARC. Federated to Worldcat. US Copyright law restricts stuff since 1870.
- Some stuff will need consultation with Iwi
- Reclaiming New Zealand’s Digitised Heritage project – Kiwi Alex
- Today’s bibliographies from libraries may not match online or physical storage. Stuff can be in stack, lost, moved, destroyed, storage, etc. It may be possible to use cross-catalogue data or cross-media data to track books down (e.g. mentions in Papers Past)
- Maybe leave out unpublished data
- How do we get institutions to lend the books to digitise? NL won’t lend out valuable material without conservator reporting.
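A rough sketch of the cross-catalogue idea above: normalise titles so that the same book can be matched between a bibliography and a library's holdings list even when the cataloguing differs. The record fields (`id`, `title`, `year`, `library`, `status`) and the sample records are invented for illustration and are not taken from Publications NZ or any real catalogue.

```python
import re
import unicodedata


def normalise_title(title: str) -> str:
    """Lower-case, strip macrons/accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_marks.lower())
    return " ".join(cleaned.split())


def match_catalogues(bibliography, holdings):
    """Pair each bibliography entry with holdings sharing normalised title and year."""
    index = {}
    for rec in holdings:
        key = (normalise_title(rec["title"]), rec["year"])
        index.setdefault(key, []).append(rec)
    return {
        entry["id"]: index.get((normalise_title(entry["title"]), entry["year"]), [])
        for entry in bibliography
    }


# Hypothetical sample records, invented for illustration only.
bibliography = [
    {"id": "b1", "title": "The Māori Canoe: A Guide", "year": 1952},
    {"id": "b2", "title": "Wellington Harbour Notes", "year": 1957},
]
holdings = [
    {"library": "Library A", "title": "the maori canoe - a guide", "year": 1952, "status": "stack"},
    {"library": "Library B", "title": "Wellington harbor notes", "year": 1957, "status": "storage"},
]

matches = match_catalogues(bibliography, holdings)
for entry_id, found in matches.items():
    print(entry_id, [(rec["library"], rec["status"]) for rec in found])
```

Exact matching on a normalised key is deliberately crude: the second record is missed because of a spelling variant, so a real effort would probably need fuzzy matching and a human check on top.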
What format, and how do we make it text-searchable?
- Agree on something like METS-ALTO and DC, and federate with OAI-PMH and/or use Digital NZ.
- NL scanned 300dpi colour TIFF images per page, into PDF with page image + OCR.
- e-books in EPUB, which is (more or less) zipped HTML. Kindle uses MobiPocket, another format based on Open eBook.
- OPDS is a syndication format – like RSS but for e-books.
- stats.govt.nz/ ← fully searchable open access (XML) yearbook data.
- TEI XML can be huge and possibly redundant for many use cases, but can be used to embed contextual semantic data – www.tei-c.org/
- Project Gutenberg offers many formats – but some are auto-generated from a master
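To illustrate the "EPUB is (more or less) zipped HTML" point above, here is a minimal sketch that builds a tiny EPUB-like archive in memory and reads the HTML pages back out with Python's standard `zipfile` module. The file names and content are made up, and the archive is not a valid EPUB (a real one also needs a package document with a manifest); it only shows the container layout.

```python
import io
import zipfile

# Build a tiny EPUB-like archive in memory. File names and content are
# hypothetical; a real EPUB also needs a package (OPF) file to validate.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as epub:
    # The mimetype entry goes first and is stored uncompressed.
    epub.writestr("mimetype", "application/epub+zip",
                  compress_type=zipfile.ZIP_STORED)
    # container.xml tells a reader where the package document lives.
    epub.writestr(
        "META-INF/container.xml",
        '<?xml version="1.0"?>\n'
        '<container version="1.0" '
        'xmlns="urn:oasis:names:tc:opendocument:xmlns:container">\n'
        '  <rootfiles>\n'
        '    <rootfile full-path="OEBPS/content.opf" '
        'media-type="application/oebps-package+xml"/>\n'
        '  </rootfiles>\n'
        '</container>\n',
        compress_type=zipfile.ZIP_DEFLATED,
    )
    # The book text itself is ordinary XHTML.
    epub.writestr(
        "OEBPS/chapter1.xhtml",
        '<html xmlns="http://www.w3.org/1999/xhtml">'
        "<body><p>Sample page text.</p></body></html>",
        compress_type=zipfile.ZIP_DEFLATED,
    )


def list_html_pages(epub_bytes: bytes) -> list:
    """Return the names of the (X)HTML files inside an EPUB archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return [name for name in archive.namelist()
                if name.endswith((".xhtml", ".html"))]


def read_mimetype(epub_bytes: bytes) -> str:
    """Return the content of the archive's mimetype entry."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as archive:
        return archive.read("mimetype").decode("ascii")


print(read_mimetype(buffer.getvalue()))
print(list_html_pages(buffer.getvalue()))
```

Because the page text is plain XHTML inside a zip, making scanned books text-searchable in this format mostly comes down to getting good OCR text into those pages.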
How do we make it available online?
- Some data will be very specific and of little commercial or even research value.
- Others will have high commercial value
Some have been digitised already.
- Digitisation efforts are already under way at National Library.
- Some stuff is in Google Books/HathiTrust, but sourced from the US (little comms with Nat Lib NZ); it can be tricky to get stuff from them
- Some is public domain, some copyrighted
Do we want to cover periodicals?
- Quite probably yes!
- RILM are going about digitising.
Who will host it, who will “own” it and maintain it over time?
- National Library seems the most sensible fit
- Public/Private collaboration; should we delete data that is objected to by one or two stakeholders, or preserve but restrict public access?
- Reliance on corporate law and upholding contracts
- Could the private organisations be not-for-profit, or similarly chartered?
- Retain public ownership
Audience
Where is the data for matching your message with the right audience and the platform they tend to use? For example, for 16-year-olds we need Facebook, but not so much for retirees?!
- Media studies – in primary and secondary education, there are “bring your own device” initiatives which may have good data.
- Sometimes lack of demand is because people don’t know it’s available and/or don’t know they’re looking for it
- Local information could be sliced-and-diced by locality, person, and so on (semantic metadata) and be highly relevant to the punters.
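A small sketch of what "slicing-and-dicing" by locality or person might look like once records carry that kind of metadata. The facet names (`locality`, `person`) and the sample records are assumptions for illustration, not an agreed schema.

```python
from collections import defaultdict


def build_facet_index(records, facets=("locality", "person")):
    """Map each facet value (e.g. a town or a name) to the titles that mention it."""
    index = {facet: defaultdict(list) for facet in facets}
    for rec in records:
        for facet in facets:
            for value in rec.get(facet, []):
                index[facet][value].append(rec["title"])
    return index


# Hypothetical records, invented for illustration only.
records = [
    {"title": "Pamphlet A", "locality": ["Ōtaki"], "person": ["Person X"]},
    {"title": "Pamphlet B", "locality": ["Ōtaki", "Levin"], "person": []},
]

index = build_facet_index(records)
print(index["locality"]["Ōtaki"])
print(index["person"]["Person X"])
```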