
Very rough notes from the Digitising NZ books & Audience session

Posted on November 28, 2013 by Jonathan Harker

What follows are rough minutes from this session. Apologies if any of this is wrong or misleading – I did not recognise half of it 🙂

In 1950-59 there were 4037 publications in NZ (a publication being a pamphlet of 5+ pages, or a novel). Where are they, how do we get hold of them, and can they be freed?

  • Publications NZ – has all of the things, in MARC, and is federated to WorldCat. US copyright law restricts stuff since 1870.
  • Some stuff will need consultation with Iwi
  • Reclaiming New Zealand’s Digitised Heritage project – Kiwi Alex
  • Today’s library bibliographies may not match what is actually held online or in physical storage. Items can be in the stacks, in storage, lost, moved or destroyed. It may be possible to use cross-catalogue or cross-media data to track books down (e.g. mentions in Papers Past)
  • Maybe leave out unpublished data
  • How do we get institutions to lend the books to digitise? NL won’t lend out valuable material without conservator reporting.
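
To make the cross-catalogue idea above a little more concrete, here is a minimal sketch of matching bibliography entries against a holdings list by normalised title and year. Everything in it is an assumption for illustration: the record fields (`id`, `title`, `year`, `location`) and the sample records are invented placeholders, not real catalogue data, and real MARC records would need proper parsing first. Note that macrons are stripped only to build the match key; the original titles are left untouched.

```python
import re
import unicodedata


def normalise_title(title):
    """Build a match key: lower-case, drop diacritics and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", no_marks.lower())
    return " ".join(cleaned.split())


def match_records(bibliography, holdings):
    """Map each bibliography id to the holdings sharing its (title key, year)."""
    index = {}
    for rec in holdings:
        key = (normalise_title(rec["title"]), rec["year"])
        index.setdefault(key, []).append(rec)
    return {
        entry["id"]: index.get((normalise_title(entry["title"]), entry["year"]), [])
        for entry in bibliography
    }


# Invented placeholder records (hypothetical, for illustration only).
bibliography = [
    {"id": "bib-1", "title": "An Example Pamphlet: Notes", "year": 1953},
    {"id": "bib-2", "title": "Another Example", "year": 1957},
]
holdings = [
    {"location": "stack", "title": "An example pamphlet - notes", "year": 1953},
]

matches = match_records(bibliography, holdings)
```

Entries that come back with an empty list are the ones that would need chasing through other sources, such as mentions in Papers Past.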

What format, and how do we make it text-searchable?

  • Agree on something like METS-ALTO and DC, and federate with OAI-PMH and/or use Digital NZ.
  • NL scanned 300dpi colour TIFF images per page, into PDF with page image + OCR.
  • e-books in EPUB, which is (more or less) zipped HTML. Kindle uses MobiPocket, another format based on Open eBook.
  • OPDS is a syndication format – like RSS but for e-books.
  • stats.govt.nz/ ← fully searchable open access (XML) yearbook data.
  • TEI XML can be huge and possibly redundant for many use cases, but can be used to embed contextual semantic data – www.tei-c.org/
  • Project Gutenberg offers many formats, but some are auto-generated from a master
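
As a rough illustration of the "EPUB is (more or less) zipped HTML" point above, the sketch below builds a tiny EPUB-style archive in memory with Python's standard `zipfile` module and then lists the HTML files inside it. It is deliberately incomplete: a valid EPUB also needs `META-INF/container.xml`, a package document and navigation, all omitted here. The function names and the `OEBPS/chapter1.xhtml` path are assumptions chosen for the example.

```python
import io
import zipfile


def build_minimal_epub(chapter_html):
    """Return the bytes of a tiny EPUB-style zip holding a single XHTML chapter."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry goes first and is stored without compression.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        # The content itself is ordinary (X)HTML, compressed like any zip member.
        zf.writestr("OEBPS/chapter1.xhtml", chapter_html,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_html_members(epub_bytes):
    """List the names of the HTML/XHTML files inside an EPUB-style archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return [name for name in zf.namelist()
                if name.endswith((".xhtml", ".html"))]


epub_bytes = build_minimal_epub("<html><body><p>Hello</p></body></html>")
html_files = list_html_members(epub_bytes)
```

The same `list_html_members` approach should work on a real `.epub` file read from disk, which is one cheap way to get at text for indexing.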

How do we make it available online?

  • Some data will be very specific and of little commercial or even research value.
  • Others will have high commercial value

Some have been digitised already.

  • Digitisation efforts are already under way at National Library.
  • Some stuff is in Google Books/HathiTrust, but sourced from the US (with little communication with Nat Lib NZ), and it can be tricky to get stuff from them
  • Some is public domain, some copyrighted

Do we want to cover periodicals?

  • Quite probably yes!
  • RILM are going about digitising.

Who will host it, who will “own” it and maintain it over time?

  • National Library seems the most sensible fit
  • Public/Private collaboration; should we delete data that is objected to by one or two stakeholders, or preserve but restrict public access?
    • Reliance on corporate law and upholding contracts
    • Could the private organisations be not-for-profit, or similarly chartered?
    • Retain public ownership

Audience

Where is the data for matching your message with the right audience and the platform they tend to use? For example, for 16-year-olds we need Facebook, but not so much for retirees?!

  • Media studies – in primary and secondary education, there are “bring your own device” initiatives which may have good data.
  • Sometimes lack of demand is because people don’t know it’s available and/or don’t know they’re looking for it
  • Local information could be sliced-and-diced by locality, person, and so on (semantic metadata) and be highly relevant to the punters.
Categories: Archives, Copyright, Metadata, Open Access, Session Notes, Text Mining

About Jonathan Harker

I live in Pukerua Bay, grow chillies, roast coffee and own a precocious spotty Bengal cat. I also play trombone in Orchestra Wellington from time to time. I joined Catalyst IT in 2006 and among many things we are fundamentally passionate about open source, open formats and digital sustainability. At Catalyst I am mainly a Moodle and Django developer and I am also a contributor to the University of Queensland's open source Fez digital repository system which is based on Fedora Commons.
  • Thurs 28 Nov 2013

    THATCamp Wellington 2013 will be held on Thurs 28 November at Victoria University, following the National Digital Forum conference. Follow us on Twitter for updates #thatcamp #wgtn13
All text and code on THATCamp Wellington 2013 is freely available for you to use, copy, adapt and distribute under a Creative Commons Attribution 3.0 Unported License as long as you link to THATCamp.org and the Center for History and New Media. The name "THATCamp" and the THATCamp logo are trademarks of the Center for History and New Media at George Mason University.
