IASSIST 2013

Data Innovation: Increasing Accessibility, Visibility and Sustainability

IASSIST Workshop Sessions

Morning sessions (09:00-12:00)

Classroom Sessions

Title | Presenter(s) | Venue
Using the Data Curation Profile as a Means to Engage Researchers | D. Scott Brandt (Purdue University Libraries) | GESIS Ost
Access Policies and Licensing for Archives and Repositories | Laurence Horton (Data Service Infrastructure for the Social Sciences and Humanities project, DASISH) | GESIS West I
Using OLAP Techniques for Data Presentation and Analysis | Chris Leowski and Andreea Gheorghe (University of Toronto) | Laurentius

Computer lab sessions

Title | Presenter(s) | Venue
Introduction to R and Reproducible Research | Harrison Dekker and Tim Dennis (UC Berkeley) | GESIS Schulungsraum
da|ra: How to obtain a DOI name for my social and economic research data? | Brigitte Hausstein (GESIS - Leibniz Institute for the Social Sciences) | GESIS West II

Afternoon sessions (13:00-16:00)

Classroom Sessions

Title | Presenter(s) | Venue
UK Institutional Partnership Training Workshop: Costing, Appraising and Managing Data for Social Science Research | Louise Corti (UK Data Archive) and Jared Lyle (ICPSR) | GESIS West I

Computer lab sessions

Title | Presenter(s) | Venue
Data Visualization and R | Ryan Womack (Rutgers University) | GESIS Schulungsraum
CharmStats: Coding and Harmonization of Statistics | Kristi Winters (GESIS - Leibniz Institute for the Social Sciences) | GESIS West II

Workshop abstracts

Using the Data Curation Profile as a Means to Engage Researchers

Venue

  • GESIS Ost
    09:00-12:00

Presenter(s)/Coordinator(s)

  • D. Scott Brandt
    Purdue University Libraries

Engaging researchers in discussions about data may be new, unfamiliar territory for many librarians. This workshop provides training in the application and use of the Data Curation Profile Toolkit (http://docs.lib.purdue.edu/dcp), an instrument used to elicit information about data from researchers. The Toolkit can be used to facilitate data discussions, to identify research data needs, and to help plan for development of data services. It provides a flexible structure for conducting an interview, and can facilitate discussion with researchers about what they may want to do with data beyond its immediate use. Profiles which are developed out of interviews with researchers can be published in the Data Curation Profiles Directory. The goal of the workshop is to build knowledge and skills to discuss data with researchers. Learning is facilitated by presenting scenarios and working through hands-on exercises with the Toolkit, which includes an Interviewer Manual and Interview Worksheet. 

Participants are asked to download the Toolkit prior to the workshop.

Access Policies and Licensing for Archives and Repositories

Venue

  • GESIS West I
    09:00-12:00 

Presenter(s)/Coordinator(s)

  • Laurence Horton
    Data Service Infrastructure for the Social Sciences and Humanities project (DASISH)

The workshop combines expertise from the DASISH project, which covers five European Social Science and Humanities research infrastructures. It will focus on data administration policies for user management. The workshop is of interest to two groups of people in organizations or institutions:

  • Those who are in the process of setting up a data archive or repository
  • Those who already have a repository (for data or publications) and wish to systematically consider access policies and licensing as part of a long-term digital preservation and reuse strategy

Topics covered include:

  • Data submission
    License and acquisition agreements between data producers and archives/preservation services: how do you get data into your archive or repository and ensure you can continue preserving and disseminating it in the future?
  • Introducing the DASISH training module
    Presentation of an Access Policies and Licensing training module, with discussion of its content and structure.
  • Responsibilities for subsequent data reuse
    What can archives do to ensure that license terms are understood and respected? How can legal requirements on data protection and security be met?
  • Comparing existing license schemes
    Which licenses are currently in use, and what are the relevant differences between them?
  • Secure data licenses
    Access policies and conditions of reuse for sensitive personal data. Legal requirements on data protection and security and how they can be met.

Using OLAP Techniques for Data Presentation and Analysis

Venue

  • Laurentius
    09:00-12:00

Presenter(s)/Coordinator(s)

  • Chris Leowski, Andreea Gheorghe
    University of Toronto

Preceded by a crash course in the underlying theory of OLAP (Online Analytical Processing) cubes, the workshop will focus on using OLAP cubes in the social sciences for data presentation and analysis, mostly of Canadian census data and some economic tables from CANSIM. Participants will be offered online hands-on exercises in slicing, dicing, aggregation and disaggregation of data.
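As a rough illustration of the four operations named above, the following minimal Python sketch models a cube as a flat list of records. It is not the workshop's software or data: the dimension names (province, year, age_group), the function names, and all figures are invented for this example.

```python
from collections import defaultdict

# Toy fact table: each row is one cell of a cube with three dimensions
# (province, year, age_group) and one measure (population).
# All values are invented for illustration; they are not census figures.
FACTS = [
    {"province": "Ontario", "year": 2006, "age_group": "0-14", "population": 10},
    {"province": "Ontario", "year": 2006, "age_group": "15-64", "population": 40},
    {"province": "Ontario", "year": 2011, "age_group": "0-14", "population": 12},
    {"province": "Ontario", "year": 2011, "age_group": "15-64", "population": 44},
    {"province": "Quebec", "year": 2006, "age_group": "0-14", "population": 7},
    {"province": "Quebec", "year": 2006, "age_group": "15-64", "population": 30},
    {"province": "Quebec", "year": 2011, "age_group": "0-14", "population": 8},
    {"province": "Quebec", "year": 2011, "age_group": "15-64", "population": 31},
]


def slice_cube(facts, dimension, value):
    """Slice: fix one dimension to a single value."""
    return [row for row in facts if row[dimension] == value]


def dice_cube(facts, allowed):
    """Dice: keep rows whose dimension values fall in the allowed sets."""
    return [
        row for row in facts
        if all(row[dim] in values for dim, values in allowed.items())
    ]


def roll_up(facts, keep_dims, measure="population"):
    """Aggregate: sum the measure over every dimension not in keep_dims."""
    totals = defaultdict(int)
    for row in facts:
        key = tuple(row[dim] for dim in keep_dims)
        totals[key] += row[measure]
    return dict(totals)


# Aggregation: totals per year across all provinces and age groups.
by_year = roll_up(FACTS, ["year"])

# Disaggregation (drill-down): the same totals broken out by province.
by_province_year = roll_up(FACTS, ["province", "year"])

# Slice on year 2011, then aggregate by province.
by_province_2011 = roll_up(slice_cube(FACTS, "year", 2011), ["province"])

# Dice: a sub-cube restricted on two dimensions at once.
ontario_children = dice_cube(
    FACTS, {"province": {"Ontario"}, "age_group": {"0-14"}}
)
```

Keeping more dimensions in `keep_dims` disaggregates the totals, and keeping fewer aggregates them, which is why one function covers both directions in this sketch. A real OLAP server would precompute and index such aggregates rather than scan the rows each time.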

UK Institutional Partnership Training Workshop: Costing, Appraising and Managing Data for Social Science Research

Venue

  • GESIS West I
    13:00-16:00

Presenter(s)/Coordinator(s)

  • Louise Corti
    UK Data Archive
  • Jared Lyle
    ICPSR

When it comes to dealing with the ever-increasing commitments of research data, both the UK Data Service and ICPSR continue to see institutions struggle with the challenges of domain specificity: how do we help our local researchers cost, plan and manage social science data? How do we then appraise and curate the mixed bag of data that a social scientist might have created?

In this workshop we will showcase our collaborative support and training materials. They are used to support two groups: institutional repository managers charged with appraising, ingesting and managing social science research data from local academics, and research support staff who must ensure compliance with the data management responsibilities set out in almost all research applications.

In this session participants will get a chance to try our exercises on:

  • Costing short- and longer-term data management
  • How to write a good data management plan
  • Appraising data for social science research
  • Creating sufficient context for data collections
  • Creating Data Centre ‘compliant’ metadata records for local repositories

Introduction to R and Reproducible Research

Venue

  • GESIS Schulungsraum
    09:00-12:00 

Presenter(s)/Coordinator(s)

  • Harrison Dekker, Tim Dennis
    UC Berkeley

R is an open-source statistical computing environment, comparable to SAS, Stata, and SPSS, widely used across disciplines as disparate as bioinformatics and finance. A key feature of the R platform is a “package” system that allows users to easily share code, data, and documentation via community-supported repositories like CRAN, R-Forge, and Bioconductor. Thousands of high-quality, well-documented packages are currently available and the number continues to grow.

The workshop will focus on R commands for data manipulation and descriptive statistics and the use of the RStudio development environment. Examples and exercises will cover common social science data tasks like subsetting, merging, and creating categorical variables. In addition, the workshop will introduce workflow practices that promote reproducibility and that can be implemented in R or any comparable statistics package.

No previous experience with R will be assumed, but participants should be familiar with at least one other statistics package.
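The three data tasks named above (subsetting, merging, and creating a categorical variable) can be sketched in a few lines. The example below is not taken from the workshop materials and is written in plain Python rather than R, purely for illustration. The records, variable names, and age bands are all invented.

```python
# Invented toy records; these are not real survey data.
RESPONDENTS = [
    {"id": 1, "age": 23, "region_code": "N"},
    {"id": 2, "age": 47, "region_code": "S"},
    {"id": 3, "age": 68, "region_code": "N"},
]

# Hypothetical lookup table keyed by region code.
REGIONS = {"N": "North", "S": "South"}


def subset(rows, region_code):
    """Subsetting: keep only the rows for one region."""
    return [row for row in rows if row["region_code"] == region_code]


def merge_regions(rows, regions):
    """Merging: attach the region name from the lookup table to each row."""
    return [{**row, "region": regions[row["region_code"]]} for row in rows]


def age_band(age):
    """Categorical variable: recode a numeric age into a labelled band."""
    if age < 30:
        return "under 30"
    if age < 65:
        return "30-64"
    return "65+"


def prepare(rows, regions):
    """Run the steps in a fixed order so the result can be regenerated."""
    merged = merge_regions(rows, regions)
    return [{**row, "age_band": age_band(row["age"])} for row in merged]


north_only = subset(RESPONDENTS, "N")
prepared = prepare(RESPONDENTS, REGIONS)
```

Keeping every transformation in a script that starts from the untouched input, as `prepare` does here, is one simple way to make an analysis repeatable. The same pattern carries over to R or any comparable package.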

da|ra: How to obtain a DOI name for my social and economic research data?

Venue

  • GESIS West II
    09:00-12:00

Presenter(s)/Coordinator(s)

  • Brigitte Hausstein
    GESIS - Leibniz Institute for the Social Sciences

The GESIS - Leibniz Institute for the Social Sciences and the ZBW Leibniz Information Center for Economics are offering a DOI registration service for social and economic research data. As members of DataCite, GESIS and ZBW pursue the goal of promoting and establishing uniform standards for the acceptance of research data as independent citable scientific objects.

The DOI registration service da|ra was introduced in 2010, and most of the leading German social science research data centers are now using it. The workshop will introduce da|ra, its policy and metadata schema, and the functionality of the system. The main focus of the workshop will be on hands-on exercises, complemented by a presentation. Participants will thus learn how to register DOI names with da|ra:

1. da|ra policy and metadata schema

2. Registering a DOI name for test data sets generated in the social sciences and in economics. Participants will get the chance to try out the different workflows provided by da|ra: the web interface and the XML upload/web service.

The workshop will also show the interaction between da|ra and the DataCite Metadata Store. Another component will be a discussion of the specific metadata requirements for registration.

Data Visualization and R

Venue

  • GESIS Schulungsraum
    13:00-16:00

Presenter(s)/Coordinator(s)

  • Ryan Womack
    Rutgers University

This workshop will focus on principles and techniques for the visualization of data, with an equal emphasis on theory and implementation. Drawing on classic works by Cleveland (Visualizing Data), Tufte (The Visual Display of Quantitative Information), and Wilkinson (The Grammar of Graphics), a range of best practices for visualization will be illustrated. Recently developed techniques for large-scale, 3D, and interactive visualization will also be discussed. This discussion will be based on works such as Graphics of Large Datasets: Visualizing a Million (Unwin, Theus, and Hofmann), the Handbook of Data Visualization (Chen, Härdle, and Unwin), and Trends in Interactive Visualization: A State of the Art Survey (Liere, Adriaansen and Zudilova-Seinstra). For each of these approaches, methods for creating similar graphics in the R open-source statistical language will be demonstrated, using packages such as ggplot2, lattice, and rggobi. Prior familiarity with R is helpful but not required.

CharmStats: Coding and Harmonization of Statistics

Venue

  • GESIS West II
    13:00-16:00

Presenter(s)/Coordinator(s)

  • Kristi Winters
    GESIS - Leibniz Institute for the Social Sciences

The software program CharmStats 1.0 (Coding and Harmonization of Statistics) will provide a structured approach to data harmonization by allowing researchers to: 1) download harmonization protocols; 2) document variable coding and harmonization processes; 3) access variables from existing datasets for harmonization; and 4) create harmonization protocols for publication and citation. It will be open source and free to all users. The workshop will explain the software interface, show participants how to find variables using the program, and walk them through creating a harmonized variable.