Open Science Data Cloud PIRE

Providing training in data intensive computing using the Open Science Data Cloud.

Universidade Federal Fluminense 2015 Projects

2015 Projects

Below are possible projects for 2015 summer PIRE fellows to become involved with at UFF in Rio de Janeiro, Brazil. These projects involve working with graph structures in big datasets, data mining, provenance, GPU programming, and data visualization.

Dominoes - an interactive exploratory tool for relationships among project artifacts

Dominoes is an approach for analyzing software repositories with thousands of artifacts by considering multiple perspectives of the software development data. To obtain the necessary computational power, we model the data and its relationships as matrices, which makes it possible to process them efficiently on GPU (Graphics Processing Unit) based architectures. Dominoes supports automated exploration of different relationships among project artifacts, and users have the flexibility to combine and compose these relationships interactively. Our solution organizes data extracted from software repositories into multiple matrices that can be treated as domino pieces (e.g., [commit | method]). Connecting such pieces corresponds to a set of matrix operations, which derive additional domino pieces. These derived domino pieces represent specific project entity relationships (e.g., the number of commits in which two methods co-occurred) and can be used for further exploration. As an evaluation of the Dominoes framework, we present two exploratory case studies based on Apache Derby. First, we use Dominoes to show how dependencies among artifacts can be derived. Then, we identify the expertise of developers by considering the commits that developers make to artifacts. We show that identifying relationships among 34,335 elements across 7,578 commits takes about 0.2 minutes on a GPU, while the same processing on a CPU takes about 413 minutes. Identifying developer expertise over a set of 34,335 files and 36 developers takes about 0.1 minutes on a GPU, whereas on a CPU it takes 324 minutes.
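The idea of connecting domino pieces through matrix operations can be illustrated with a minimal sketch. It is not the Dominoes implementation: it runs on the CPU in plain Python, the toy matrices are made up, and the helper names (`transpose`, `matmul`, `connect`) are hypothetical. It only shows how a [commit | method] piece could yield a [method | method] co-occurrence piece and, combined with an assumed [developer | commit] piece, a [developer | method] piece.

```python
# Hypothetical toy data, not taken from Apache Derby.
# [commit | method]: entry (i, j) is 1 if commit i touched method j.
commit_method = [
    [1, 1, 0],
    [1, 0, 1],
    [1, 1, 0],
]

# [developer | commit]: entry (d, i) is 1 if developer d authored commit i.
developer_commit = [
    [1, 0, 1],
    [0, 1, 0],
]


def transpose(matrix):
    """Flip a piece, e.g. [commit | method] becomes [method | commit]."""
    return [list(column) for column in zip(*matrix)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    b_columns = transpose(b)
    return [
        [sum(x * y for x, y in zip(row, column)) for column in b_columns]
        for row in a
    ]


def connect(left, right):
    """Connect pieces [A | B] and [B | C] into a derived piece [A | C]."""
    if len(left[0]) != len(right):
        raise ValueError("pieces do not share a matching side")
    return matmul(left, right)


# [method | commit] x [commit | method] -> [method | method]:
# entry (j, k) counts the commits in which methods j and k co-occurred.
method_method = connect(transpose(commit_method), commit_method)

# [developer | commit] x [commit | method] -> [developer | method]:
# entry (d, j) counts the commits by developer d that touched method j.
developer_method = connect(developer_commit, commit_method)

print(method_method)
print(developer_method)
```

In this toy example, methods 0 and 1 co-occur in two commits, so `method_method[0][1]` is 2. Because each derived piece is just a matrix product, the same composition maps naturally onto GPU matrix kernels, which is the property the description above relies on.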

Game flux analysis with provenance

Winning or losing a game session is the final consequence of a series of decisions and actions made during the game. Analyzing and understanding the events, mistakes, and fluxes of a concrete gameplay session may be useful for different reasons: understanding problems related to gameplay, data mining of specific situations, and even understanding educational and learning aspects in serious games. We introduce a novel approach based on provenance concepts to model and represent a game flux. We model the game data and map it to provenance in order to generate a provenance graph for analysis. As an example, we instantiate our proposed conceptual framework and graph generation in a serious game, allowing developers and designers to identify possible mistakes and failures in gameplay design by analyzing the provenance graph generated from collected gameplay data.
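As a rough illustration of the kind of analysis described above, the following sketch builds a small provenance-style graph from a gameplay log and walks it backwards from the final outcome. Everything here is an assumption made for illustration: the log format, the event labels, the `influenced_by` relation, and the function names are hypothetical and do not come from the project's framework.

```python
from collections import defaultdict

# Hypothetical gameplay log: (event id, kind, label, ids of influencing events).
events = [
    ("e1", "activity", "player picks up key", []),
    ("e2", "activity", "player skips health pack", []),
    ("e3", "activity", "player opens door", ["e1"]),
    ("e4", "activity", "enemy attacks player", ["e3"]),
    ("e5", "entity", "player health reaches zero", ["e2", "e4"]),
    ("e6", "activity", "player opens chest", ["e1"]),
]


def build_provenance_graph(log):
    """Return node metadata and, per node, the events that influenced it."""
    nodes = {}
    influenced_by = defaultdict(list)
    for event_id, kind, label, causes in log:
        nodes[event_id] = (kind, label)
        influenced_by[event_id].extend(causes)
    return nodes, influenced_by


def trace_back(influenced_by, target):
    """Collect every event that directly or indirectly influenced target."""
    seen = set()
    stack = [target]
    while stack:
        current = stack.pop()
        for cause in influenced_by.get(current, []):
            if cause not in seen:
                seen.add(cause)
                stack.append(cause)
    return seen


nodes, influenced_by = build_provenance_graph(events)
causes_of_loss = trace_back(influenced_by, "e5")

for event_id in sorted(causes_of_loss):
    kind, label = nodes[event_id]
    print(event_id, kind, label)
```

Tracing back from the losing state `e5` reaches `e1` through `e4` but not `e6`, since opening the chest had no recorded influence on the outcome. A designer inspecting such a trace could see, for instance, that skipping the health pack contributed to the loss.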