Publications

Results 26–30 of 30

Search results

Jump to search filters

Detection and reconstruction of error control codes for engineered and biological regulatory systems

May, Elebeoba; Johnston, Anna M.; Hart, William E.; Watson, Jean-Paul; Pryor, Richard J.; Rintoul, Mark D.

A fundamental challenge for all communication systems, engineered or living, is the problem of achieving efficient, secure, and error-free communication over noisy channels. Information theoretic principals have been used to develop effective coding theory algorithms to successfully transmit information in engineering systems. Living systems also successfully transmit biological information through genetic processes such as replication, transcription, and translation, where the genome of an organism is the contents of the transmission. Decoding of received bit streams is fairly straightforward when the channel encoding algorithms are efficient and known. If the encoding scheme is unknown or part of the data is missing or intercepted, how would one design a viable decoder for the received transmission? For such systems blind reconstruction of the encoding/decoding system would be a vital step in recovering the original message. Communication engineers may not frequently encounter this situation, but for computational biologists and biotechnologist this is an immediate challenge. The goal of this work is to develop methods for detecting and reconstructing the encoder/decoder system for engineered and biological data. Building on Sandia's strengths in discrete mathematics, algorithms, and communication theory, we use linear programming and will use evolutionary computing techniques to construct efficient algorithms for modeling the coding system for minimally errored engineered data stream and genomic regulatory DNA and RNA sequences. The objective for the initial phase of this project is to construct solid parallels between biological literature and fundamental elements of communication theory. In this light, the milestones for FY2003 were focused on defining genetic channel characteristics and providing an initial approximation for key parameters, including coding rate, memory length, and minimum distance values. A secondary objective addressed the question of determining similar parameters for a received, noisy, error-control encoded data set. In addition to these goals, we initiated exploration of algorithmic approaches to determine if a data set could be approximated with an error-control code and performed initial investigations into optimization based methodologies for extracting the encoding algorithm given the coding rate of an encoded noise-free and noisy data stream.

More Details

Towards a biological coding theory discipline

Proposed for publication in New Thesis.

May, Elebeoba

How can information required for the proper functioning of a cell, an organism, or a species be transmitted in an error-introducing environment? Clearly, similar to engineering communication systems, biological systems must incorporate error control in their information transmissino processes. if genetic information in the DNA sequence is encoded in a manner similar to error control encoding, the received sequence, the messenger RNA (mRNA) can be analyzed using coding theory principles. This work explores potential parallels between engineering communication systems and the central dogma of genetics and presents a coding theory approach to modeling the process of protein translation initiation. The messenger RNA is viewed as a noisy encoded sequence and the ribosoe as an error control decoder. Decoding models based on chemical and biological characteristics of the ribosome and the ribosome binding site of the mRNA are developed and results of applying the models to the Escherichia coli K-12 are presented.

More Details

An error-correcting code framework for genetic sequence analysis

Proposed for publication in Journal of The Franklin Institute.

May, Elebeoba

A fundamental challenge for engineering communication systems is the problem of transmitting information from the source to the receiver over a noisy channel. This same problem exists in a biological system. How can information required for the proper functioning of a cell, an organism, or a species be transmitted in an error introducing environment? Source codes (compression codes) and channel codes (error-correcting codes) address this problem in engineering communication systems. The ability to extend these information theory concepts to study information transmission in biological systems can contribute to the general understanding of biological communication mechanisms and extend the field of coding theory into the biological domain. In this work, we review and compare existing coding theoretic methods for modeling genetic systems. We introduce a new error-correcting code framework for understanding translation initiation, at the cellular level and present research results for Escherichia coli K-12. By studying translation initiation, we hope to gain insight into potential error-correcting aspects of genomic sequences and systems.

More Details

Coding theory based models for protein translation initiation in prokaryotic organisms

May, Elebeoba

Our research explores the feasibility of using communication theory, error control (EC) coding theory specifically, for quantitatively modeling the protein translation initiation mechanism. The messenger RNA (mRNA) of Escherichia coli K-12 is modeled as a noisy (errored), encoded signal and the ribosome as a minimum Hamming distance decoder, where the 16S ribosomal RNA (rRNA) serves as a template for generating a set of valid codewords (the codebook). We tested the E. coli based coding models on 5' untranslated leader sequences of prokaryotic organisms of varying taxonomical relation to E. coli including: Salmonella typhimurium LT2, Bacillus subtilis, and Staphylococcus aureus Mu50. The model identified regions on the 5' untranslated leader where the minimum Hamming distance values of translated mRNA sub-sequences and non-translated genomic sequences differ the most. These regions correspond to the Shine-Dalgarno domain and the non-random domain. Applying the EC coding-based models to B. subtilis, and S. aureus Mu50 yielded results similar to those for E. coli K-12. Contrary to our expectations, the behavior of S. typhimurium LT2, the more taxonomically related to E. coli, resembled that of the non-translated sequence group.

More Details
Results 26–30 of 30
Results 26–30 of 30