Session Notes – THATCamp Wellington 2013
http://wellington2013.thatcamp.org

Communicating Resources to Researchers – Options suggested during session
http://wellington2013.thatcamp.org/2013/11/28/communicating-resources-to-researchers-options-suggested-during-session/
Thu, 28 Nov 2013 23:15:58 +0000

Firstly, thanks to those Campers who came to my roughly planned session and who gave such great input.

Although the purpose of the session was to talk about this issue across the GLAM sector as a whole, we used Archives New Zealand’s electronic finding aid Archway as a “test case”.

The following is a list of possible solutions the group came up with, ranging from very manual to more automated:

  • Capture contact details of researchers and their interests – contact them when new material becomes available
  • Set up webpage to publicise new material
  • Create an RSS feed of newly added items
  • Something similar to the Trove News Bot – easy to use; it checks Twitter for messages, turns them into queries against Trove’s newspaper database, and tweets the result – though this still requires a pull by researchers
  • Provide ability for researchers to save search criteria
  • Provide functionality to highlight/identify search results that have previously been presented to the researcher, with the ability for the researcher to filter these out of the result set
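
The last two options, saving search criteria and hiding results a researcher has already been shown, could be combined roughly as follows. This is only a minimal sketch in Python: the `SavedSearch` class, the record fields (`id`, `title`) and the in-memory catalogue are all hypothetical, and nothing here reflects how Archway actually works.

```python
from dataclasses import dataclass, field


@dataclass
class SavedSearch:
    """A researcher's stored query plus the ids of records already shown (hypothetical)."""
    keyword: str
    seen_ids: set = field(default_factory=set)

    def run(self, catalogue):
        """Return matching records not shown before, then remember them as seen."""
        matches = [r for r in catalogue if self.keyword.lower() in r["title"].lower()]
        new = [r for r in matches if r["id"] not in self.seen_ids]
        self.seen_ids.update(r["id"] for r in new)
        return new


# Illustrative records only; not real Archway data.
catalogue = [
    {"id": "R1", "title": "Harbour Board minutes"},
    {"id": "R2", "title": "Harbour dredging plans"},
]
search = SavedSearch("harbour")
first = search.run(catalogue)    # both records are new to this researcher
catalogue.append({"id": "R3", "title": "Harbour pilot logbook"})
second = search.run(catalogue)   # only the newly added record comes back
```

Keeping the seen ids with the saved search means the same query can be re-run on a schedule and only previously unseen material reported back.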

We also discussed other technologies currently in use, or provided, by various institutions:

  • British Library’s Mechanical Curator – undirected, haphazard, unplanned publishing of content
  • Digital NZ’s custom search builder – a tool that allows anyone to create a mini search engine across Digital NZ’s aggregated digital content, also to create an embeddable widget to share the content
  • Academia.edu – allows registered users to add research interests to their profile, and delivers content (research papers, etc.) that has been shared by other registered users and tagged with the chosen research interest phrase.
  • Google Scholar and other library search tools – a post-THATCamp investigation of Google Scholar revealed that the tool uses “robots” or “crawlers” to fetch files from websites for inclusion in the search results. This is the type of thing I was wondering whether researchers could create for themselves.

We discussed the need not only for appropriate tags against each item of material, so it can be categorised/classified, but also for metadata that captures when records are added or updated, so researchers can filter out items they may already have seen.
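
As a rough illustration of why the added/updated metadata matters, the sketch below builds a small RSS 2.0 feed containing only records added after a given date, carrying each record's tags as `category` elements. The record fields (`title`, `added`, `tags`), the sample data and the `example.org` link are assumptions made for the example, not anything an institution currently provides.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_feed(records, since):
    """Return RSS 2.0 XML listing the records added after `since`, newest first."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = "Newly added items"
    ET.SubElement(channel, "link").text = "https://example.org/new"  # placeholder URL
    ET.SubElement(channel, "description").text = "Records added since the last check"
    for rec in sorted(records, key=lambda r: r["added"], reverse=True):
        if rec["added"] <= since:
            continue  # the researcher may already have seen this one
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = rec["title"]
        ET.SubElement(item, "pubDate").text = format_datetime(rec["added"])
        for tag in rec["tags"]:
            ET.SubElement(item, "category").text = tag
    return ET.tostring(rss, encoding="unicode")


# Illustrative records only.
records = [
    {"title": "Lands and Survey correspondence",
     "added": datetime(2013, 11, 1, tzinfo=timezone.utc), "tags": ["land"]},
    {"title": "Immigration passenger lists",
     "added": datetime(2013, 11, 20, tzinfo=timezone.utc), "tags": ["immigration"]},
]
feed = build_feed(records, since=datetime(2013, 11, 10, tzinfo=timezone.utc))
```

Without a reliable "date added" on each record, neither the cut-off nor the `pubDate` could be produced, which is the point the group was making.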

The group also identified some possible funding/resourcing options:

  • Create partnerships with open source development organisations
  • Collaborate with National Library of New Zealand (this option relates specifically to the Archives NZ case)
  • Apply to Internet NZ for funding
  • Create a research question for an information studies masters student
Use of open source software – rough notes
http://wellington2013.thatcamp.org/2013/11/28/use-of-open-source-software-rough-notes/
Thu, 28 Nov 2013 04:40:41 +0000

An agenda

  • DH needs in particular
  • existing open source DR: DSpace, Fez, Fedora Commons, Drupal
  • Complexity vs usability
  • Institutional concerns
  • Software selection, procurement
    • Where to start looking for solutions, sifting information
    • Size of the community around an open source project – institutional risk.
    • crowdcrafting.org

Open source repository projects

  • DSpace and Fedora Commons – duraspace.org/
    • DSpace is the “turn-key” web repository system – www.dspace.org/
    • Fedora Commons is a pure repository framework
  • Fez – PHP web front-end developed by University of Queensland on Fedora Commons
  • Islandora – islandora.ca/ – Fedora Commons modules for Drupal 6 and 7
  • Project Hydra – Fedora Commons code for Ruby On Rails – projecthydra.org/

Open Online tools

Software Carpentry – learning programming

Proprietary systems

Rosetta from Ex Libris, at National Library.

How can OSS projects help bridge gaps?

  • Collate digital preservation policies, find the feature gaps, write some code!
  • e.g. Open Provenance Model, in Fedora Commons.
Very rough notes from the Digitising NZ books & Audience session
http://wellington2013.thatcamp.org/2013/11/28/very-rough-notes-from-the-digitising-nz-books-audience-session/
Thu, 28 Nov 2013 04:25:28 +0000

What follows are rough minutes from this session. Apologies if anything is wrong or misleading – I did not recognise half of it 🙂

In 1950-59, there were 4037 publications in NZ (publication = pamphlet of 5+ pages, or a novel). Where are they, how do we get hold of them, and can they be freed?

  • Publications NZ – has all of the things, in MARC. Federated to WorldCat. US copyright law restricts stuff since 1870.
  • Some stuff will need consultation with Iwi
  • Reclaiming New Zealand’s Digitised Heritage project – Kiwi Alex
  • Today’s bibliographies from libraries may not match online or physical storage. Stuff can be in stack, lost, moved, destroyed, storage, etc. It may be possible to use cross-catalogue data or cross-media data to track books down (e.g. mentions in Papers Past)
  • Maybe leave out unpublished data
  • How do we get institutions to lend the books to digitise? NL won’t lend out valuable material without conservator reporting.

What format, and how do we make it text-searchable?

  • Agree on something like METS-ALTO and DC, and federate with OAI-PMH and/or use Digital NZ.
  • NL scanned 300dpi colour TIFF images per page, into PDF with page image + OCR.
  • e-books in EPUB, which is (more or less) zipped HTML. Kindle uses Mobipocket, another format based on Open eBook.
  • OPDS is a syndication format – like RSS but for e-books.
  • stats.govt.nz/ ← fully searchable open access (XML) yearbook data.
  • TEI XML can be huge and possibly redundant for many use cases, but can be used to embed contextual semantic data – www.tei-c.org/
  • Project Gutenberg offers many formats – but some are auto-generated from a master
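
To make the “zipped HTML” remark about EPUB concrete, here is a small Python sketch that builds a toy EPUB-like archive in memory and then reads the HTML back out with the standard `zipfile` module. The file names and contents are made up for illustration, and the `container.xml` stub is deliberately not a valid container file, so the result is not a complete EPUB that a reader would accept.

```python
import io
import zipfile

buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as epub:
    # The EPUB container format expects "mimetype" as the first entry, stored uncompressed.
    epub.writestr("mimetype", "application/epub+zip", compress_type=zipfile.ZIP_STORED)
    epub.writestr("META-INF/container.xml", "<container/>")  # stub only, for illustration
    epub.writestr("OEBPS/chapter1.xhtml", "<html><body><p>Chapter one</p></body></html>")

# Reading it back: an ordinary zip reader is enough to reach the (X)HTML inside.
with zipfile.ZipFile(buffer) as epub:
    names = epub.namelist()
    html_files = [n for n in names if n.endswith((".xhtml", ".html"))]
    chapter = epub.read(html_files[0]).decode("utf-8")
```

Because the text sits in plain (X)HTML inside the archive, making digitised books text-searchable does not require any EPUB-specific tooling.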

How do we make it available online?

  • Some data will be very specific and of little commercial or even research value.
  • Others will have high commercial value

Some have been digitised already.

  • Digitisation efforts are already under way at National Library.
  • Some stuff is in Google Books/HathiTrust, but sourced from the US (with little communication with National Library NZ), so it can be tricky to get stuff from them
  • Some is public domain, some copyrighted

Do we want to cover periodicals?

  • Quite probably yes!
  • RILM are going about digitising.

Who will host it, who will “own” it and maintain it over time?

  • National Library seems the most sensible fit
  • Public/Private collaboration; should we delete data that is objected to by one or two stakeholders, or preserve but restrict public access?
    • Reliance on corporate law and upholding contracts
    • Could the private organisations be not-for-profit, or similarly chartered?
    • Retain public ownership

Audience

Where is the data for matching your message with the right audience and the platform they tend to use? For example, for 16-year-olds we need Facebook, but not so much for retirees?!

  • Media studies – in primary and secondary education, there are “bring your own device” initiatives which may have good data.
  • Sometimes lack of demand is because people don’t know it’s available and/or don’t know they’re looking for it
  • Local information could be sliced-and-diced by locality, person, and so on (semantic metadata) and be highly relevant to the punters.