OSDC Annual Workshop Edinburgh - June 17-21, 2013
Big Data and Cloud Computing Workshop
The second OSDC workshop, funded in part by the NSF PIRE program, will gather young researchers and experts in Edinburgh to address today’s challenges and to develop an understanding of how best to exploit our growing wealth of data. How can we formulate questions and then use evidence from data to answer them without inadvertently introducing bias? How can we integrate data across boundaries from heterogeneous sources? How can we build machinery - hardware, software and internet protocols - to enable this on a global scale? The participants, from North and South America, Japan, and Europe, will engage in talks, discussions, hands-on tutorials, and debate. The outcome should be new ideas, new understanding, and lasting international alliances pursuing collaborative research.
Organization:
Prof. Robert Grossman, University of Chicago
Prof. Heidi Alvarez, Florida International University
Prof. Malcolm Atkinson, University of Edinburgh
Registration:
REGISTRATION IS NOW CLOSED
Workshop details:
Location: Informatics Forum, 10 Crichton Street, School of Informatics, University of Edinburgh
Dates: 17 to 21 Jun 2013
Workshop 2013 Program
Post-workshop Survey
Presentations
DAY 1 (Mon, 17-Jun 2013)
Session 1: Welcome & Introduction, Chair: Malcolm Atkinson, U of Edinburgh
- Welcome and “Why are we here?”, Malcolm Atkinson, U of Edinburgh
- Using the OSDC for Data-Intensive Research, Robert Grossman, UChicago
Session 2: OSDC PIRE participant self-introduction session, Chair: Heidi Alvarez, FIU
Session 3: Using the Open Science Data Cloud for Data-Intensive Research, Chair: Robert Grossman, UChicago
- Tutorial and Hands on OSDC exercise, Allison Heath, UChicago
DAY 2 (Tues, 18-Jun 2013)
Session 4: UChicago Applications & Infrastructure 10-20 Minute Talks, Chair: Robert Grossman, UChicago
- Bionimbus - Managing, Analyzing and Sharing Large Genomic Datasets, Allison Heath, UChicago
- Tukey: The OSDC User Interface, Matt Greenway, UChicago
- Yates: The OSDC Automation Infrastructure, Rafael Suarez, UChicago
- The Namibia Flood Detection Dashboard, Zac Flamig, U of Oklahoma
- An Overview of Matsu: An Open Standards-Based Cloud Infrastructure of Earth Science Data, Robert Grossman, UChicago
- UDR: A utility for synchronizing large remote data sets, Allison Heath, UChicago
- MoSGrid – Molecular simulations in a distributed environment, Sandra Gesing, U of Edinburgh (en route from U of Tübingen, Germany to Notre Dame, Indiana, USA)
- Technical Interactive Session – Walk through an OSDC usage exercise, Allison Heath, UChicago
Session 5: The Research Bazaar, Malcolm Atkinson, U of Edinburgh
- Classification of large scale distributed workflows, Michael Lewis
- Retroviral Links to Cancer, Gilbert Cole
- The MITRE Elastic Goal-Directed Simulation Framework (MEG), Christine Harvey
- The Domination of Three-Dimensional Chessboards by Bishops, Joshua Eisenberg
- UvA Data Service, Pedro D. Bello-Maldonado
DAY 3 (Wed, 19-Jun 2013)
Session 6: Data-Intensive Streaming Methods, Chair: Malcolm Atkinson, U of Edinburgh
- Motivation and Architecture of Data-Streaming Systems, Malcolm Atkinson, U of Edinburgh
- A practical introduction to DISPEL, Paul Martin (assisted by: Michelle Galea, Amy Krause, Alessandro Spinuso and Iraklis Klampanos), U of Edinburgh
Session 7: Collaborative and distributed e-Science, Heidi Alvarez, FIU
- Data-Intensive Seismology, Iraklis Klampanos, U of Edinburgh
- Multi-workflow Systems and Editors, Sandra Gesing, U of Edinburgh (en route from U of Tübingen, Germany to Notre Dame, Indiana, USA)
- The EFFORT Science Gateway, Rosa Filgueira, U of Edinburgh
- Sustainable Smart e-Infrastructure, Paola Grosso, University of Amsterdam
- System and Network Engineering, Cees de Laat
- PRAS-DT: Portable, Reliable, and Automatic Streaming Data Transfer, Christine Harvey and Rosa Filgueira, MITRE and University of Edinburgh
- Reflections on the achievements of PIRE Fellows, Heidi Alvarez, Florida International University
Session 8: Data-Intensive Science: Principles, Examples & Questions, Robert Grossman, UChicago
- A Principled Approach to Provenance, James Cheney, U of Edinburgh
- Using Big Data to build decision support tools in Agriculture, Karen Langona, U of Sao Paulo
- Case Studies in Running Many Simulations on Many Clusters, Clouds and Supercomputers, Shantenu Jha, Rutgers University
- What are the most powerful data-intensive methods and how should we use them?, Robert Grossman, UChicago
Session 9: Break-out Group Discussions on the OSDC & Data-Intensive Research Challenges, Chair: Robert Grossman
DAY 4 (Thurs, 20-Jun 2013)
Session 10: Practical technologies for advancing the state of the art, Malcolm Atkinson, University of Edinburgh
- Mastering Complex Internet Infrastructure to support Science, Cees de Laat, University of Amsterdam
- Data-Intensive Research in ITRI/AIST, Isao Kojima, AIST
- A Brief Introduction to SAGA and BigJob, Shantenu Jha, Rutgers University
Session 11: Parallel Sessions
Session 11A: Research in AIST, Isao Kojima
Session 11B: Research in Amsterdam and Sao Paulo, Cees de Laat and Karen Langona
Session 11C: Research in Edinburgh, Malcolm Atkinson, U of Edinburgh
- Edinburgh: A Hands-on Introduction to SAGA and BigJob, Shantenu Jha, Rutgers University
- Provenance for Seismology, Alessandro Spinuso, KNMI & ORFEUS
- Designing Python Libraries for Rock Physics, Rosa Filgueira, University of Edinburgh
- Introduction to iRODS handling distributed scientific data, TBA, EUDAT consortium
- Data Integration and Analysis for Systems Medicine, Ian Overton, MRC Human Genetics Unit, Western General Hospital, Edinburgh
Session 12: What have we learnt & where next?, Heidi Alvarez, Malcolm Atkinson & Bob Grossman
### Results from the OSDC - PIRE Challenge
PIRE Challenge details and Instructions
Team | Title | Presentation |
---|---|---|
Zac Flamig, Warren Cole, and Rafael Suarez | Examining Vegetation Recovery Time after a Small Scale Disaster using MODIS Data and the OSDC | (Slides) |
Pedro D. Bello-Maldonado and Matthew Greenway | Integrating the UvA Data Service with the OSDC | (Slides) (Paper) |
Joshua Miller, Spencer Claxton, Alice Mukora, and Eric Griffis (First Place) | Stratosphere: A multilevel data-driven social-network for cloud computing | (Paper) |
Joseph Korpela and Matt Greenway | Augmenting OSDC’s Datascope with a Finderscope | (Paper) |
Warren Cole and Sandra Gesing | Retroviral Links to Cancer | (Slides) (Paper) |
Maria T. Patterson, Joshua D. Eisenberg, and Rafael Suarez (Third Place) | The addition of solar activity (space weather) data to the Open Science Data Cloud in order to facilitate cross-disciplinary studies | (Slides) (Paper) |
Joshua Eisenberg and Maria Patterson | Cloud Query | (Slides) (Paper) |
Michael Lewis and Matthew Greenway | Extending OSDC toolset for cross disciplinary discoveries | (Slides) |
Christine Harvey and Rafael Suarez (Second Place) | Organ Procurement and Transplantation Network (OPTN) Database on the OSDC | (Slides) (Paper) |