This SAND report provides the technical progress through April 2005 of the Sandia-led project, "Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling," funded by the DOE Office of Science Genomics:GTL Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacteria known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these - 4 -pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. The ultimate goal of this effort is develop and apply new experimental and computational methods needed to generate a new level of understanding of how the Synechococcus genome affects carbon fixation at the global scale. Anticipated experimental and computational methods will provide ever-increasing insight about the individual elements and steps in the carbon fixation process, however relating an organism's genome to its cellular response in the presence of varying environments will require systems biology approaches. Thus a primary goal for this effort is to integrate the genomic data generated from experiments and lower level simulations with data from the existing body of literature into a whole cell model. We plan to accomplish this by developing and applying a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats. These challenges are unprecedented in high performance scientific computing and necessitate the development of a companion computational infrastructure to support this effort. More information about this project can be found at www.genomes-to-life.org Acknowledgment We want to gratefully acknowledge the contributions of: Grant Heffelfinger1*, Anthony Martino2, Brian Palenik6, Andrey Gorin3, Ying Xu10,3, Mark Daniel Rintoul1, Al Geist3, Matthew Ennis1, with Pratul Agrawal3, Hashim Al-Hashimi8, Andrea Belgrano12, Mike Brown1, Xin Chen9, Paul Crozier1, PguongAn Dam10, Jean-Loup Faulon2, Damian Gessler12, David Haaland1, Victor Havin4, C.F. Huang5, Tao Jiang9, Howland Jones1, David Jung3, Katherine Kang14, Michael Langston15, Shawn Martin1, Shawn Means1, Vijaya Natarajan4, Roy Nielson5, Frank Olken4, Victor Olman10, Ian Paulsen14, Steve Plimpton1, Andreas Reichsteiner5, Nagiza Samatova3, Arie Shoshani4, Michael Sinclair1, Alex Slepoy1, Shawn Stevens8, Charlie Strauss5, Zhengchang Su10, Ed Thomas1, Jerilyn Timlin1, WimVermaas13, Xiufeng Wan11, HongWei Wu10, Dong Xu11, Grover Yip8, Erik Zuiderweg8 *Author to whom correspondence should be addressed (gsheffe@sandia.gov) 1. Sandia National Laboratories, Albuquerque, NM 2. Sandia National Laboratories, Livermore, CA 3. Oak Ridge National Laboratory, Oak Ridge, TN 4. Lawrence Berkeley National Laboratory, Berkeley, CA 5. Los Alamos National Laboratory, Los Alamos, NM 6. University of California, San Diego 7. University of Illinois, Urbana/Champaign 8. University of Michigan, Ann Arbor 9. University of California, Riverside 10. University of Georgia, Athens 11. University of Missouri, Columbia 12. National Center for Genome Resources, Santa Fe, NM 13. Arizona State University 14. The Institute for Genomic Research 15. University of Tennessee 5 Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL8500.
Our aim is to determine the network of events, or the regulatory network, that defines an immune response to a bio-toxin. As a model system, we are studying T cell regulatory network triggered through tyrosine kinase receptor activation using a combination of pathway stimulation and time-series microarray experiments. Our approach is composed of five steps (1) microarray experiments and data error analysis, (2) data clustering, (3) data smoothing and discretization, (4) network reverse engineering, and (5) network dynamics analysis and fingerprint identification. The technological outcome of this study is a suite of experimental protocols and computational tools that reverse engineer regulatory networks provided gene expression data. The practical biological outcome of this work is an immune response fingerprint in terms of gene expression levels. Inferring regulatory networks from microarray data is a new field of investigation that is no more than five years old. To the best of our knowledge, this work is the first attempt that integrates experiments, error analyses, data clustering, inference, and network analysis to solve a practical problem. Our systematic approach of counting, enumeration, and sampling networks matching experimental data is new to the field of network reverse engineering. The resulting mathematical analyses and computational tools lead to new results on their own and should be useful to others who analyze and infer networks.
This SAND report provides the technical progress through April 2005 of the Sandia-led project, ''Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling'', funded by the DOE Office of Science GenomicsGTL Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO{sub 2} are important terms in the global environmental response to anthropogenic atmospheric inputs of CO{sub 2} and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacteria known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microamy experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. The ultimate goal of this effort is develop and apply new experimental and computational methods needed to generate a new level of understanding of how the Synechococcus genome affects carbon fixation at the global scale. Anticipated experimental and computational methods will provide ever-increasing insight about the individual elements and steps in the carbon fixation process, however relating an organism's genome to its cellular response in the presence of varying environments will require systems biology approaches. Thus a primary goal for this effort is to integrate the genomic data generated from experiments and lower level simulations with data from the existing body of literature into a whole cell model. We plan to accomplish this by developing and applying a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats. These challenges are unprecedented in high performance scientific computing and necessitate the development of a companion computational infrastructure to support this effort.
This SAND report provides the technical progress through October 2004 of the Sandia-led project, %22Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling,%22 funded by the DOE Office of Science Genomes to Life Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO2 are important terms in the global environmental response to anthropogenic atmospheric inputs of CO2 and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacteria known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these - 4 - pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. The ultimate goal of this effort is develop and apply new experimental and computational methods needed to generate a new level of understanding of how the Synechococcus genome affects carbon fixation at the global scale. Anticipated experimental and computational methods will provide ever-increasing insight about the individual elements and steps in the carbon fixation process, however relating an organism's genome to its cellular response in the presence of varying environments will require systems biology approaches. Thus a primary goal for this effort is to integrate the genomic data generated from experiments and lower level simulations with data from the existing body of literature into a whole cell model. We plan to accomplish this by developing and applying a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats. These challenges are unprecedented in high performance scientific computing and necessitate the development of a companion computational infrastructure to support this effort. More information about this project, including a copy of the original proposal, can be found at www.genomes-to-life.org Acknowledgment We want to gratefully acknowledge the contributions of the GTL Project Team as follows: Grant S. Heffelfinger1*, Anthony Martino2, Andrey Gorin3, Ying Xu10,3, Mark D. Rintoul1, Al Geist3, Matthew Ennis1, Hashimi Al-Hashimi8, Nikita Arnold3, Andrei Borziak3, Bianca Brahamsha6, Andrea Belgrano12, Praveen Chandramohan3, Xin Chen9, Pan Chongle3, Paul Crozier1, PguongAn Dam10, George S. Davidson1, Robert Day3, Jean Loup Faulon2, Damian Gessler12, Arlene Gonzalez2, David Haaland1, William Hart1, Victor Havin3, Tao Jiang9, Howland Jones1, David Jung3, Ramya Krishnamurthy3, Yooli Light2, Shawn Martin1, Rajesh Munavalli3, Vijaya Natarajan3, Victor Olman10, Frank Olken4, Brian Palenik6, Byung Park3, Steven Plimpton1, Diana Roe2, Nagiza Samatova3, Arie Shoshani4, Michael Sinclair1, Alex Slepoy1, Shawn Stevens8, Chris Stork1, Charlie Strauss5, Zhengchang Su10, Edward Thomas1, Jerilyn A. Timlin1, Xiufeng Wan11, HongWei Wu10, Dong Xu11, Gong-Xin Yu3, Grover Yip8, Zhaoduo Zhang2, Erik Zuiderweg8 *Author to whom correspondence should be addressed (gsheffe%40sandia.gov) 1. Sandia National Laboratories, Albuquerque, NM 2. Sandia National Laboratories, Livermore, CA 3. Oak Ridge National Laboratory, Oak Ridge, TN 4. Lawrence Berkeley National Laboratory, Berkeley, CA 5. Los Alamos National Laboratory, Los Alamos, NM 6. University of California, San Diego 7. University of Illinois, Urbana/Champaign 8. University of Michigan, Ann Arbor 9. University of California, Riverside 10. University of Georgia, Athens 11. University of Missouri, Columbia 12. National Center for Genome Resources, Santa Fe, NM Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
This SAND report provides the technical progress through June 2004 of the Sandia-led project, ''Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling'', funded by the DOE Office of Science Genomes to Life Program. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO{sub 2} are important terms in the global environmental response to anthropogenic atmospheric inputs of CO{sub 2} and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. In this project, we will investigate the carbon sequestration behavior of Synechococcus Sp., an abundant marine cyanobacteria known to be important to environmental responses to carbon dioxide levels, through experimental and computational methods. This project is a combined experimental and computational effort with emphasis on developing and applying new computational tools and methods. Our experimental effort will provide the biology and data to drive the computational efforts and include significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Computational tools will be essential to our efforts to discover and characterize the function of the molecular machines of Synechococcus. To this end, molecular simulation methods will be coupled with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes. In addition, we will develop a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. The ultimate goal of this effort is develop and apply new experimental and computational methods needed to generate a new level of understanding of how the Synechococcus genome affects carbon fixation at the global scale. Anticipated experimental and computational methods will provide ever-increasing insight about the individual elements and steps in the carbon fixation process, however relating an organism's genome to its cellular response in the presence of varying environments will require systems biology approaches. Thus a primary goal for this effort is to integrate the genomic data generated from experiments and lower level simulations with data from the existing body of literature into a whole cell model. We plan to accomplish this by developing and applying a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats. These challenges are unprecedented in high performance scientific computing and necessitate the development of a companion computational infrastructure to support this effort.
The U.S. Department of Energy recently announced the first five grants for the Genomes to Life (GTL) Program. The goal of this program is to ''achieve the most far-reaching of all biological goals: a fundamental, comprehensive, and systematic understanding of life.'' While more information about the program can be found at the GTL website (www.doegenomestolife.org), this paper provides an overview of one of the five GTL projects funded, ''Carbon Sequestration in Synechococcus Sp.: From Molecular Machines to Hierarchical Modeling.'' This project is a combined experimental and computational effort emphasizing developing, prototyping, and applying new computational tools and methods to elucidate the biochemical mechanisms of the carbon sequestration of Synechococcus Sp., an abundant marine cyanobacteria known to play an important role in the global carbon cycle. Understanding, predicting, and perhaps manipulating carbon fixation in the oceans has long been a major focus of biological oceanography and has more recently been of interest to a broader audience of scientists and policy makers. It is clear that the oceanic sinks and sources of CO(2) are important terms in the global environmental response to anthropogenic atmospheric inputs of CO(2) and that oceanic microorganisms play a key role in this response. However, the relationship between this global phenomenon and the biochemical mechanisms of carbon fixation in these microorganisms is poorly understood. The project includes five subprojects: an experimental investigation, three computational biology efforts, and a fifth which deals with addressing computational infrastructure challenges of relevance to this project and the Genomes to Life program as a whole. Our experimental effort is designed to provide biology and data to drive the computational efforts and includes significant investment in developing new experimental methods for uncovering protein partners, characterizing protein complexes, identifying new binding domains. We will also develop and apply new data measurement and statistical methods for analyzing microarray experiments. Our computational efforts include coupling molecular simulation methods with knowledge discovery from diverse biological data sets for high-throughput discovery and characterization of protein-protein complexes and developing a set of novel capabilities for inference of regulatory pathways in microbial genomes across multiple sources of information through the integration of computational and experimental technologies. These capabilities will be applied to Synechococcus regulatory pathways to characterize their interaction map and identify component proteins in these pathways. We will also investigate methods for combining experimental and computational results with visualization and natural language tools to accelerate discovery of regulatory pathways. Furthermore, given that the ultimate goal of this effort is to develop a systems-level of understanding of how the Synechococcus genome affects carbon fixation at the global scale, we will develop and apply a set of tools for capturing the carbon fixation behavior of complex of Synechococcus at different levels of resolution. Finally, because the explosion of data being produced by high-throughput experiments requires data analysis and models which are more computationally complex, more heterogeneous, and require coupling to ever increasing amounts of experimentally obtained data in varying formats, we have also established a companion computational infrastructure to support this effort as well as the Genomes to Life program as a whole.