TALK: Exchanging knowledge services for data

Background

Data collection from human subjects is always a challenge. Now that everyone and their dog is on the internet, it has become possible to collect data online. Some researchers have investigated the feasibility of doing so and have concluded that it is possible. These data can also be analysed to understand human learning, cognition, psychology, and possibly other topics of interest.

Most of these researchers seem to pay participants small sums in exchange for the data they produce. But an alternative exists when you consider what happens to all the data that users generate on interactive websites. Many companies mine it and sell it to advertisers, market researchers, and the like. Others use it to improve the user experience or to evaluate changes to the code base.

Proposed discussion

I’m interested in possible ways of providing useful services in exchange for (mostly) anonymised data, which can then be used for research. Twitter, for example, has created one of the largest corpora of speech-like text in history, which everyone from computer scientists to linguists to political scientists is analysing.

Other examples include Coursera and Khan Academy, both of which collect data on human learning in exchange for a free education. Other sites, such as Human Benchmark, don’t even really offer a service, and yet manage to collect impressive data sets.

So, what I propose is a discussion about

  • what types of human data are interesting, but difficult to collect
  • what kinds of services or formats could be used to entice people to produce that data
  • what are some reasons why these types of services have succeeded/failed in the past
  • what existing platforms/projects could be leveraged to facilitate data collection and service provision
  • etc.

Qualifications

I have some limited experience collecting data through websites, which has yielded interesting insights into vocabulary acquisition and rent pricing (yes, they are completely unrelated). I’m keen on pursuing this concept further to study the development of reading proficiency and speed in a second/foreign language.

Categories: Crowdsourcing, Data Mining, General, Research Methods, Session Proposals, Session: Talk, Teaching

About Tellya Later

My education and experience have primarily focused on the study of languages and language learning, but I have also taken tangential trips through microbiology and analytical chemistry. I've taught myself programming in the past few years and have started a few small online data collection projects. Despite my interest in online technology, I deeply dislike the potential it has for never forgetting anything, which is why I do not participate in this trendy social media thingamabob.

1 Response to TALK: Exchanging knowledge services for data

  1. Tellya Later says:

    THATcamp misc notes

    1 Session notes
    1.1 what types of human data are interesting, but difficult to collect

    Everything, basically, but especially data that can be analysed
    in its own right.

    1.2 what kinds of services or formats could be used to entice people to produce that data

    whatever it is, it needs to have a competitive or social aspect

    1.3 what are some reasons why these types of services have succeeded/failed in the past

    successful services need to enrich someone’s life

    difficult to find motivating factor

    hard to keep people coming back

    takes time to build an audience

    ensure it’s usable

    start from an existing need/demand

    have to have a way to filter out the garbage

    1.3.1 Ethical issues

    a privacy policy may help; people seem quite open

    ethical issues – who benefits? Ideally, both users’ and data
    collectors’ interests should be balanced, with an emphasis on users

    be upfront and not sneaky

    release data to public

    don’t be like Google+, where they claim to make it beneficial to users, but actually it’s not

    1.3.2 Be a member of the community
    1.3.3 help users be social with each other
    1.4 what existing platforms/projects could be leveraged to facilitate data collection and service provision

    mix and mash: open government initiative in NZ

    five stars of open access

    others?

    1.5 problems with volunteer crowdsourcing and gamification
    1.5.1 not representative group?
    2 history example

    How could this idea be used to generate new data that would be
    worth analysing for historians?

    get children to record their grandparents’ stories

    and have contests
