Article

Implementing a genomic data management system using iRODS in the Wellcome Trust Sanger Institute

Journal

BMC BIOINFORMATICS
Volume 12

Publisher

BIOMED CENTRAL LTD
DOI: 10.1186/1471-2105-12-361

Background: Increasingly large amounts of DNA sequencing data are being generated within the Wellcome Trust Sanger Institute (WTSI). The traditional file system struggles to handle this growing volume of sequence data. A good data management system therefore needs to be implemented and integrated into the current WTSI infrastructure. Such a system enables good management of the IT infrastructure of the sequencing pipeline and allows biologists to track their data.

Results: We have chosen a data grid system, iRODS (the integrated Rule-Oriented Data System), to act as the data management system for the WTSI. iRODS provides a rule-based approach to system management, which makes data replication much easier and provides extra data protection. Unlike the metadata provided by traditional file systems, the metadata system of iRODS is comprehensive and allows users to define their own application-level metadata. Users and IT experts in the WTSI can then query the metadata to find and track data. The aim of this paper is to describe how we designed and used iRODS as a data management system, from both the system and the user viewpoints. Details are given about the problems faced and the solutions found when iRODS was implemented. A simple use case describing how users within the WTSI use iRODS is also introduced.

Conclusions: iRODS has been implemented and works as the production system for the sequencing pipeline of the WTSI. Both biologists and IT experts can now track and manage data, which was not previously possible. This novel approach allows biologists to define their own metadata and query the genomic data using those metadata.
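To make the two ideas in the abstract concrete (a rule that replicates data on ingest, and user-defined metadata that can be queried), here is a minimal in-memory sketch in Python. It is not the iRODS API and not the WTSI implementation: the Catalog and DataObject classes, the resource names, the file paths and the attribute names (study, sample) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DataObject:
    """A stored file plus its user-defined metadata (hypothetical model)."""
    path: str
    metadata: dict = field(default_factory=dict)
    replicas: list = field(default_factory=list)


class Catalog:
    """Toy in-memory catalog; a real data grid keeps this in a database."""

    def __init__(self, resources):
        self.resources = list(resources)  # names of storage resources
        self.objects = {}

    def put(self, path, **metadata):
        # Stand-in for a rule fired on ingest: replicate to every resource.
        obj = DataObject(path, dict(metadata), list(self.resources))
        self.objects[path] = obj
        return obj

    def query(self, **conditions):
        # Return paths whose metadata matches every attribute=value pair.
        return sorted(
            path
            for path, obj in self.objects.items()
            if all(obj.metadata.get(k) == v for k, v in conditions.items())
        )


# Assumed example data: paths and attribute values are made up.
catalog = Catalog(resources=["site_a", "site_b"])
catalog.put("/seq/run1/lane1.bam", study="study_x", sample="s1")
catalog.put("/seq/run1/lane2.bam", study="study_x", sample="s2")
catalog.put("/seq/run2/lane1.bam", study="study_y", sample="s3")

print(catalog.query(study="study_x"))
# prints ['/seq/run1/lane1.bam', '/seq/run1/lane2.bam']
```

The point of the sketch is the separation of concerns: files are located by what they are (attribute values set by the user) rather than by where they sit in a directory tree, and replication happens as a side effect of ingest rather than as a manual step.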

Recommended

Article Multidisciplinary Sciences

The UK10K project identifies rare variants in health and disease

Klaudia Walter, Josine L. Min, Jie Huang, Lucy Crooks, Yasin Memari, Shane McCarthy, John R. B. Perry, ChangJiang Xu, Marta Futema, Daniel Lawson, Valentina Iotchkova, Stephan Schiffels, Audrey E. Hendricks, Petr Danecek, Rui Li, James Floyd, Louise V. Wain, Ines Barroso, Steve E. Humphries, Matthew E. Hurles, Eleftheria Zeggini, Jeffrey C. Barrett, Vincent Plagnol, J. Brent Richards, Celia M. T. Greenwood, Nicholas J. Timpson, Richard Durbin, Nicole Soranzo, Senduran Bala, Peter Clapham, Guy Coates, Tony Cox, Allan Daly, Petr Danecek, Yuanping Du, Richard Durbin, Sarah Edkins, Peter Ellis, Paul Flicek, Xiaosen Guo, Xueqin Guo, Liren Huang, David K. Jackson, Chris Joyce, Thomas Keane, Anja Kolb-Kokocinski, Cordelia Langford, Yingrui Li, Jieqin Liang, Hong Lin, Ryan Liu, John Maslen, Shane McCarthy, Dawn Muddyman, Michael A. Quail, Jim Stalker, Jianping Sun, Jing Tian, Guangbiao Wang, Jun Wang, Yu Wang, Kim Wong, Pingbo Zhang, Ines Barroso, Ewan Birney, Chris Boustred, Lu Chen, Gail Clement, Massimiliano Cocca, Petr Danecek, George Davey Smith, Ian N. M. Day, Aaron Day-Williams, Thomas Down, Ian Dunham, Richard Durbin, David M. Evans, Tom R. Gaunt, Matthias Geihs, Celia M. T. Greenwood, Deborah Hart, Audrey E. Hendricks, Bryan Howie, Jie Huang, Tim Hubbard, Pirro Hysi, Valentina Iotchkova, Yalda Jamshidi, Konrad J. Karczewski, John P. Kemp, Genevieve Lachance, Daniel Lawson, Monkol Lek, Margarida Lopes, Daniel G. MacArthur, Jonathan Marchini, Massimo Mangino, Iain Mathieson, Shane McCarthy, Yasin Memari, Sarah Metrustry, Josine L. Min, Alireza Moayyeri, Dawn Muddyman, Kate Northstone, Kalliope Panoutsopoulou, Lavinia Paternoster, John R. B. Perry, Lydia Quaye, J. Brent Richards, Susan Ring, Graham R. S. Ritchie, Stephan Schiffels, Hashem A. Shihab, So-Youn Shin, Kerrin S. Small, Maria Soler Artigas, Nicole Soranzo, Lorraine Southam, Timothy D. Spector, Beate St Pourcain, Gabriela Surdulescu, Ioanna Tachmazidou, Nicholas J. Timpson, Martin D. Tobin, Ana M. 
Valdes, Peter M. Visscher, Louise V. Wain, Klaudia Walter, Kirsten Ward, Scott G. Wilson, Kim Wong, Jian Yang, Eleftheria Zeggini, Feng Zhang, Hou-Feng Zheng, Richard Anney, Muhammad Ayub, Jeffrey C. Barrett, Douglas Blackwood, Patrick F. Bolton, Gerome Breen, David A. Collier, Nick Craddock, Lucy Crooks, Sarah Curran, David Curtis, Richard Durbin, Louise Gallagher, Daniel Geschwind, Hugh Gurling, Peter Holmans, Irene Lee, Jouko Lonnqvist, Shane McCarthy, Peter McGuffin, Andrew M. McIntosh, Andrew G. McKechanie, Andrew McQuillin, James Morris, Dawn Muddyman, Michael C. O'Donovan, Michael J. Owen, Aarno Palotie, Jeremy R. Parr, Tiina Paunio, Olli Pietilainen, Karola Rehnstrom, Sally I. Sharp, David Skuse, David St Clair, Jaana Suvisaari, James T. R. Walters, Hywel J. Williams, Ines Barroso, Elena Bochukova, Rebecca Bounds, Anna Dominiczak, Richard Durbin, I. Sadaf Farooqi, Audrey E. Hendricks, Julia Keogh, Gae Lle Marenne, Shane McCarthy, Andrew Morris, Dawn Muddyman, Stephen O'Rahilly, David J. Porteous, Blair H. Smith, Ioanna Tachmazidou, Eleanor Wheeler, Eleftheria Zeggini, Saeed Al Turki, Carl A. Anderson, Dinu Antony, Ines Barroso, Phil Beales, Jamie Bentham, Shoumo Bhattacharya, Mattia Calissano, Keren Carss, Krishna Chatterjee, Sebahattin Cirak, Catherine Cosgrove, Richard Durbin, David R. Fitzpatrick, James Floyd, A. Reghan Foley, Christopher S. Franklin, Marta Futema, Detelina Grozeva, Steve E. Humphries, Matthew E. Hurles, Shane McCarthy, Hannah M. Mitchison, Dawn Muddyman, Francesco Muntoni, Stephen O'Rahilly, Alexandros Onoufriadis, Victoria Parker, Felicity Payne, Vincent Plagnol, F. Lucy Raymond, Nicola Roberts, David B. Savage, Peter Scambler, Miriam Schmidts, Nadia Schoenmakers, Robert K. Semple, Eva Serra, Olivera Spasic-Boskovic, Elizabeth Stevens, Margriet van Kogelenberg, Parthiban Vijayarangakannan, Klaudia Walter, Kathleen A. Williamson, Crispian Wilson, Tamieka Whyte, Antonio Ciampi, Celia M. T. Greenwood, Audrey E. 
Hendricks, Rui Li, Sarah Metrustry, Karim Oualkacha, Ioanna Tachmazidou, ChangJiang Xu, Eleftheria Zeggini, Martin Bobrow, Patrick F. Bolton, Richard Durbin, David R. Fitzpatrick, Heather Griffin, Matthew E. Hurles, Jane Kaye, Karen Kennedy, Alastair Kent, Dawn Muddyman, Francesco Muntoni, F. Lucy Raymond, Robert K. Semple, Carol Smee, Timothy D. Spector, Nicholas J. Timpson, Ruth Charlton, Rosemary Ekong, Marta Futema, Steve E. Humphries, Farrah Khawaja, Luis R. Lopes, Nicola Migone, Stewart J. Payne, Vincent Plagnol, Rebecca C. Pollitt, Sue Povey, Cheryl K. Ridout, Rachel L. Robinson, Richard H. Scott, Adam Shaw, Petros Syrris, Rohan Taylor, Anthony M. Vandersteen, Jeffrey C. Barrett, Ines Barroso, George Davey Smith, Richard Durbin, I. Sadaf Farooqi, David R. Fitzpatrick, Matthew E. Hurles, Jane Kaye, Karen Kennedy, Cordelia Langford, Shane McCarthy, Dawn Muddyman, Michael J. Owen, Aarno Palotie, J. Brent Richards, Nicole Soranzo, Timothy D. Spector, Jim Stalker, Nicholas J. Timpson, Eleftheria Zeggini, Antoinette Amuzu, Juan Pablo Casas, John C. Chambers, Massimiliano Cocca, George Dedoussis, Giovanni Gambaro, Paolo Gasparini, Tom R. Gaunt, Jie Huang, Valentina Iotchkova, Aaron Isaacs, Jon Johnson, Marcus E. Kleber, Jaspal S. Kooner, Claudia Langenberg, Jian'an Luan, Giovanni Malerba, Winfried Maerz, Angela Matchan, Josine L. Min, Richard Morris, Borge G. Nordestgaard, Marianne Benn, Susan Ring, Robert A. Scott, Nicole Soranzo, Lorraine Southam, Nicholas J. Timpson, Daniela Toniolo, Michela Traglia, Anne Tybjaerg-Hansen, Cornelia M. van Duijn, Elisabeth M. van Leeuwen, Anette Varbo, Peter Whincup, Gianluigi Zaza, Eleftheria Zeggini, Weihua Zhang

NATURE (2015)

Article Multidisciplinary Sciences

Whole-genome sequence-based analysis of thyroid function

Peter N. Taylor, Eleonora Porcu, Shelby Chew, Purdey J. Campbell, Michela Traglia, Suzanne J. Brown, Benjamin H. Mullin, Hashem A. Shihab, Josine Min, Klaudia Walter, Yasin Memari, Jie Huang, Michael R. Barnes, John P. Beilby, Pimphen Charoen, Petr Danecek, Frank Dudbridge, Vincenzo Forgetta, Celia Greenwood, Elin Grundberg, Andrew D. Johnson, Jennie Hui, Ee M. Lim, Shane McCarthy, Dawn Muddyman, Vijay Panicker, John R. B. Perry, Jordana T. Bell, Wei Yuan, Caroline Relton, Tom Gaunt, David Schlessinger, Goncalo Abecasis, Francesco Cucca, Gabriela L. Surdulescu, Wolfram Woltersdorf, Eleftheria Zeggini, Hou-Feng Zheng, Daniela Toniolo, Colin M. Dayan, Silvia Naitza, John P. Walsh, Tim Spector, George Davey Smith, Richard Durbin, J. Brent Richards, Serena Sanna, Nicole Soranzo, Nicholas J. Timpson, Scott G. Wilson

NATURE COMMUNICATIONS (2015)

Article Multidisciplinary Sciences

Improved imputation of low-frequency and rare variants using the UK10K haplotype reference panel

Jie Huang, Bryan Howie, Shane McCarthy, Yasin Memari, Klaudia Walter, Josine L. Min, Petr Danecek, Giovanni Malerba, Elisabetta Trabetti, Hou-Feng Zheng, Giovanni Gambaro, J. Brent Richards, Richard Durbin, Nicholas J. Timpson, Jonathan Marchini, Nicole Soranzo

NATURE COMMUNICATIONS (2015)

Article Biochemistry & Molecular Biology

Ensembl 2009

T. J. P. Hubbard, B. L. Aken, S. Ayling, B. Ballester, K. Beal, E. Bragin, S. Brent, Y. Chen, P. Clapham, L. Clarke, G. Coates, S. Fairley, S. Fitzgerald, J. Fernandez-Banet, L. Gordon, S. Graf, S. Haider, M. Hammond, R. Holland, K. Howe, A. Jenkinson, N. Johnson, A. Kahari, D. Keefe, S. Keenan, R. Kinsella, F. Kokocinski, E. Kulesha, D. Lawson, I. Longden, K. Megy, P. Meidl, B. Overduin, A. Parker, B. Pritchard, D. Rios, M. Schuster, G. Slater, D. Smedley, W. Spooner, G. Spudich, S. Trevanion, A. Vilella, J. Vogel, S. White, S. Wilder, A. Zadissa, E. Birney, F. Cunningham, V. Curwen, R. Durbin, X. M. Fernandez-Suarez, J. Herrero, A. Kasprzyk, G. Proctor, J. Smith, S. Searle, P. Flicek

NUCLEIC ACIDS RESEARCH (2009)

Article Biochemistry & Molecular Biology

Ensembl 2011

Paul Flicek, M. Ridwan Amode, Daniel Barrell, Kathryn Beal, Simon Brent, Yuan Chen, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Leo Gordon, Maurice Hendrix, Thibaut Hourlier, Nathan Johnson, Andreas Kaehaeri, Damian Keefe, Stephen Keenan, Rhoda Kinsella, Felix Kokocinski, Eugene Kulesha, Pontus Larsson, Ian Longden, William McLaren, Bert Overduin, Bethan Pritchard, Harpreet Singh Riat, Daniel Rios, Graham R. S. Ritchie, Magali Ruffier, Michael Schuster, Daniel Sobral, Giulietta Spudich, Y. Amy Tang, Stephen Trevanion, Jana Vandrovcova, Albert J. Vilella, Simon White, Steven P. Wilder, Amonida Zadissa, Jorge Zamora, Bronwen L. Aken, Ewan Birney, Fiona Cunningham, Ian Dunham, Richard Durbin, Xose M. Fernandez-Suarez, Javier Herrero, Tim J. P. Hubbard, Anne Parker, Glenn Proctor, Jan Vogel, Stephen M. J. Searle

NUCLEIC ACIDS RESEARCH (2011)

Article Biochemistry & Molecular Biology

Ensembl 2012

Paul Flicek, M. Ridwan Amode, Daniel Barrell, Kathryn Beal, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Laurent Gil, Leo Gordon, Maurice Hendrix, Thibaut Hourlier, Nathan Johnson, Andreas K. Kaehaeri, Damian Keefe, Stephen Keenan, Rhoda Kinsella, Monika Komorowska, Gautier Koscielny, Eugene Kulesha, Pontus Larsson, Ian Longden, William McLaren, Matthieu Muffato, Bert Overduin, Miguel Pignatelli, Bethan Pritchard, Harpreet Singh Riat, Graham R. S. Ritchie, Magali Ruffier, Michael Schuster, Daniel Sobral, Y. Amy Tang, Kieron Taylor, Stephen Trevanion, Jana Vandrovcova, Simon White, Mark Wilson, Steven P. Wilder, Bronwen L. Aken, Ewan Birney, Fiona Cunningham, Ian Dunham, Richard Durbin, Xose M. Fernandez-Suarez, Jennifer Harrow, Javier Herrero, Tim J. P. Hubbard, Anne Parker, Glenn Proctor, Giulietta Spudich, Jan Vogel, Andy Yates, Amonida Zadissa, Stephen M. J. Searle

NUCLEIC ACIDS RESEARCH (2012)

Article Biochemistry & Molecular Biology

Ensembl 2013

Paul Flicek, Ikhlak Ahmed, M. Ridwan Amode, Daniel Barrell, Kathryn Beal, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Susan Fairley, Stephen Fitzgerald, Laurent Gil, Carlos Garcia-Giron, Leo Gordon, Thibaut Hourlier, Sarah Hunt, Thomas Juettemann, Andreas K. Kaehaeri, Stephen Keenan, Monika Komorowska, Eugene Kulesha, Ian Longden, Thomas Maurel, William M. McLaren, Matthieu Muffato, Rishi Nag, Bert Overduin, Miguel Pignatelli, Bethan Pritchard, Emily Pritchard, Harpreet Singh Riat, Graham R. S. Ritchie, Magali Ruffier, Michael Schuster, Daniel Sheppard, Daniel Sobral, Kieron Taylor, Anja Thormann, Stephen Trevanion, Simon White, Steven P. Wilder, Bronwen L. Aken, Ewan Birney, Fiona Cunningham, Ian Dunham, Jennifer Harrow, Javier Herrero, Tim J. P. Hubbard, Nathan Johnson, Rhoda Kinsella, Anne Parker, Giulietta Spudich, Andy Yates, Amonida Zadissa, Stephen M. J. Searle

NUCLEIC ACIDS RESEARCH (2013)

Article Biochemistry & Molecular Biology

Ensembl 2014

Paul Flicek, M. Ridwan Amode, Daniel Barrell, Kathryn Beal, Konstantinos Billis, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Stephen Fitzgerald, Laurent Gil, Carlos Garcia Giron, Leo Gordon, Thibaut Hourlier, Sarah Hunt, Nathan Johnson, Thomas Juettemann, Andreas K. Kaehaeri, Stephen Keenan, Eugene Kulesha, Fergal J. Martin, Thomas Maurel, William M. McLaren, Daniel N. Murphy, Rishi Nag, Bert Overduin, Miguel Pignatelli, Bethan Pritchard, Emily Pritchard, Harpreet S. Riat, Magali Ruffier, Daniel Sheppard, Kieron Taylor, Anja Thormann, Stephen J. Trevanion, Alessandro Vullo, Steven P. Wilder, Mark Wilson, Amonida Zadissa, Bronwen L. Aken, Ewan Birney, Fiona Cunningham, Jennifer Harrow, Javier Herrero, Tim J. P. Hubbard, Rhoda Kinsella, Matthieu Muffato, Anne Parker, Giulietta Spudich, Andy Yates, Daniel R. Zerbino, Stephen M. J. Searle

NUCLEIC ACIDS RESEARCH (2014)

Article Biochemistry & Molecular Biology

Ensembl 2015

Fiona Cunningham, M. Ridwan Amode, Daniel Barrell, Kathryn Beal, Konstantinos Billis, Simon Brent, Denise Carvalho-Silva, Peter Clapham, Guy Coates, Stephen Fitzgerald, Laurent Gil, Carlos Garcia Giron, Leo Gordon, Thibaut Hourlier, Sarah E. Hunt, Sophie H. Janacek, Nathan Johnson, Thomas Juettemann, Andreas K. Kaehaeri, Stephen Keenan, Fergal J. Martin, Thomas Maurel, William McLaren, Daniel N. Murphy, Rishi Nag, Bert Overduin, Anne Parker, Mateus Patricio, Emily Perry, Miguel Pignatelli, Harpreet Singh Riat, Daniel Sheppard, Kieron Taylor, Anja Thormann, Alessandro Vullo, Steven P. Wilder, Amonida Zadissa, Bronwen L. Aken, Ewan Birney, Jennifer Harrow, Rhoda Kinsella, Matthieu Muffato, Magali Ruffier, Stephen M. J. Searle, Giulietta Spudich, Stephen J. Trevanion, Andy Yates, Daniel R. Zerbino, Paul Flicek

NUCLEIC ACIDS RESEARCH (2015)

Article Multidisciplinary Sciences

A rare variant in APOC3 is associated with plasma triglyceride and VLDL levels in Europeans

Nicholas J. Timpson, Klaudia Walter, Josine L. Min, Ioanna Tachmazidou, Giovanni Malerba, So-Youn Shin, Lu Chen, Marta Futema, Lorraine Southam, Valentina Iotchkova, Massimiliano Cocca, Jie Huang, Yasin Memari, Shane McCarthy, Petr Danecek, Dawn Muddyman, Massimo Mangino, Cristina Menni, John R. B. Perry, Susan M. Ring, Amadou Gaye, George Dedoussis, Aliki-Eleni Farmaki, Paul Burton, Philippa J. Talmud, Giovanni Gambaro, Tim D. Spector, George Davey Smith, Richard Durbin, J. Brent Richards, Steve E. Humphries, Eleftheria Zeggini, Nicole Soranzo

NATURE COMMUNICATIONS (2014)

Article Biochemistry & Molecular Biology

Ensembl 2008

P. Flicek, B. L. Aken, K. Beal, B. Ballester, M. Caccamo, Y. Chen, L. Clarke, G. Coates, F. Cunningham, T. Cutts, T. Down, S. C. Dyer, T. Eyre, S. Fitzgerald, J. Fernandez-Banet, S. Graf, S. Haider, M. Hammond, R. Holland, K. L. Howe, K. Howe, N. Johnson, A. Jenkinson, A. Kahari, D. Keefe, F. Kokocinski, E. Kulesha, D. Lawson, I. Longden, K. Megy, P. Meidl, B. Overduin, A. Parker, B. Pritchard, A. Prlic, S. Rice, D. Rios, M. Schuster, I. Sealy, G. Slater, D. Smedley, G. Spudich, S. Trevanion, A. J. Vilella, J. Vogel, S. White, M. Wood, E. Birney, T. Cox, V. Curwen, R. Durbin, X. M. Fernandez-Suarez, J. Herrero, T. J. P. Hubbard, A. Kasprzyk, G. Proctor, J. Smith, A. Ureta-Vidal, S. Searle

NUCLEIC ACIDS RESEARCH (2008)

Article Biochemistry & Molecular Biology

Ensembl 2007

T. J. P. Hubbard, B. L. Aken, K. Beal, B. Ballester, M. Caccamo, Y. Chen, L. Clarke, G. Coates, F. Cunningham, T. Cutts, T. Down, S. C. Dyer, S. Fitzgerald, J. Fernandez-Banet, S. Graf, S. Haider, M. Hammond, J. Herrero, R. Holland, K. Howe, K. Howe, N. Johnson, A. Kahari, D. Keefe, F. Kokocinski, E. Kulesha, D. Lawson, I. Longden, C. Melsopp, K. Megy, P. Meidl, B. Overduin, A. Parker, A. Prlic, S. Rice, D. Rios, M. Schuster, I. Sealy, J. Severin, G. Slater, D. Smedley, G. Spudich, S. Trevanion, A. Vilella, J. Vogel, S. White, M. Wood, T. Cox, V. Curwen, R. Durbin, X. M. Fernandez-Suarez, P. Flicek, A. Kasprzyk, G. Proctor, S. Searle, J. Smith, A. Ureta-Vidal, E. Birney

NUCLEIC ACIDS RESEARCH (2007)

Article Biochemistry & Molecular Biology

Ensembl 2006

E. Birney, D. Andrews, M. Caccamo, Y. Chen, L. Clarke, G. Coates, T. Cox, F. Cunningham, V. Curwen, T. Cutts, T. Down, R. Durbin, X. M. Fernandez-Suarez, P. Flicek, S. Graf, M. Hammond, J. Herrero, K. Howe, V. Iyer, K. Jekosch, A. Kahari, A. Kasprzyk, D. Keefe, F. Kokocinski, E. Kulesha, D. London, I. Longden, C. Melsopp, P. Meidl, B. Overduin, A. Parker, G. Proctor, A. Prlic, M. Rae, D. Rios, S. Redmond, M. Schuster, I. Sealy, S. Searle, J. Severin, G. Slater, D. Smedley, J. Smith, A. Stabenau, J. Stalker, S. Trevanion, A. Ureta-Vidal, J. Vogel, S. White, C. Woodwark, T. J. P. Hubbard

NUCLEIC ACIDS RESEARCH (2006)

Article Biochemistry & Molecular Biology

Ensembl 2005

T Hubbard, D Andrews, M Caccamo, G Cameron, Y Chen, M Clamp, L Clarke, G Coates, T Cox, F Cunningham, V Curwen, T Cutts, T Down, R Durbin, XM Fernandez-Suarez, J Gilbert, M Hammond, J Herrero, H Hotz, K Howe, V Iyer, K Jekosch, A Kahari, A Kasprzyk, D Keefe, S Keenan, F Kokocinsci, D London, I Longden, G McVicker, C Melsopp, P Meidl, S Potter, G Proctor, M Rae, D Rios, M Schuster, S Searle, J Severin, G Slater, D Smedley, J Smith, W Spooner, A Stabenau, J Stalker, R Storey, S Trevanion, A Ureta-Vidal, J Vogel, S White, C Woodwark, E Birney

NUCLEIC ACIDS RESEARCH (2005)
