My Dominant Hemisphere

The Official Weblog of 'The Basilic Insula'

Posts Tagged ‘translational research’

Seeking Profundity In The Mundane

leave a comment »


Seeking A New Vision (via Jared Rodriguez/Truthout CC BY-NC-SA license)

The astronomer Carl Sagan once said:

It has been said that astronomy is a humbling and character-building experience. There is perhaps no better demonstration of the folly of human conceits than this distant image of our tiny world. To me, it underscores our responsibility to deal more kindly with one another, and to preserve and cherish the pale blue dot, the only home we’ve ever known.

— in Pale Blue Dot

And likewise Frank Borman, astronaut and Commander of Apollo 8, the first mission to fly around the Moon, said:

When you’re finally up on the moon, looking back at the earth, all these differences and nationalistic traits are pretty well going to blend and you’re going to get a concept that maybe this is really one world and why the hell can’t we learn to live together like decent people?

Why is it, I wonder, that we, the human race, have the tendency to reach such profound truths only when placed in an extraordinary environment? Do we have to train and become astronomers or cosmonauts to appreciate our place in the universe? To find respect for and to cherish what we’ve been bestowed with? To care about each other, our environment and this place that we are loath to remember is the one home for all of life as we know it?

There is much to be learned by reflecting upon this idea. Our capacity to gain wisdom and feel impressed really does depend on the level to which our experiences deviate from the banal, doesn’t it? Ask what a grain of food means to somebody who has never had the luxury of a mediocre middle-class life. Ask a lost child what it must be like to have finally found his mother. Or question the rejoicing farmer who has just felt rain-drops on his cheeks, bringing hope after a painful drought.

I’m sure you can think of other examples that speak volumes about the way we, consciously or not, program ourselves to look at things.

The other day, I was re-reading an old article about the work of the biomathematician Steven Strogatz. He mentioned how, as a high-school student studying science, he was asked to drop down on his knees and measure the dimensions of floors, graph the time periods of pendulums and figure out the speed of sound from resonating air columns in hollow tubes partly filled with water. Each time, the initial reaction was one of dreariness and insipidity. But he would soon realize how these mundane experiments in reality acted as windows to profound discoveries – such as the idea that resonance is something without which atoms wouldn’t come together to form material objects, or how a pendulum’s time period, when graphed, reflects a specific mathematical equation.

There he was – peering into the abstruse and finding elegance in the mundane. The phenomenon reminded me of a favorite quote:

The real voyage of discovery consists not in seeking new landscapes, but in having new eyes.

Marcel Proust

For that’s what Strogatz, like Sagan and Borman, was essentially experiencing. A new vision about things. But with an important difference – he was doing it by looking at the ordinary, not by gazing at extraordinary galaxies and stars through a telescope. Commonplace stuff that, when examined closely, suddenly was ordinary no more. Something that had just as much potential to change man’s perspective of himself and his place in the universe.

I think it’s important to realize this. The universe doesn’t just exist out there among the celestial bodies that lie beyond normal reach. It exists everywhere. Here; on this earth. Within yourself and your environment and much closer to home.

Perhaps that’s why we’ve made much scientific progress by this kind of exploration. By looking at ordinary stuff using ordinary means, but with extraordinary vision. And successful scientists have proven again and again the value of doing things this way.

The concept of hand-washing to prevent the spread of disease, for instance, wasn’t born out of a sophisticated randomized clinical trial, but from a modest accounting of mortality rates in a much less developed epidemiologic study. The obstetrician who stumbled upon this profound discovery, long before Pasteur postulated the germ theory of disease, was Ignaz Semmelweis, later to be known as the “savior of mothers”. His new vision led to the discovery of something so radical that the medical community of his day rejected it, and his results were never seriously looked at during his lifetime (so much for peer review, eh?). The doctor struggled with this till his last breath, suffering in an insane asylum and ultimately dying at the young age of 47.

That smoking is tied to lung cancer was first conclusively established by an important prospective cohort study, done largely by mailing a series of questionnaires to smoking and non-smoking physicians over a period of time, asking how they were doing. Yes, even questionnaires, when used intelligently, can be more than just unremarkable pieces of paper; they can be gateways that open our eyes to our magnificent universe!

From the polymath and physician Copernicus’s seemingly pointless calculations on the positions of planets, to the dreary routine of looking at microbial growth in petri dishes by the physician Koch, to the physicist and polymath Young‘s proposal of a working theory for color vision, to the physician John Snow’s phenomenal work on preventing cholera by studying water wells long before the microbe was even identified, time and time again we have learned about the enormous implications of science on the cheap, and science of the mundane. There’s wisdom in applying the KISS (Keep It Simple, Stupid) principle to science after all! Even in the more advanced, technologically replete scientific studies.

More on the topic of finding extraordinary ideas in ordinary things, I was reminded recently of a couple of enchanting papers and lectures. One was about finding musical patterns in the sequence of our DNA. And the second was an old but interesting paper [1] that proposes a radical model for the biology of the cell and that seeks to reconcile the paradoxes that we observe in biological experiments. That there could be some deep logical underpinning to the maxim, “biology is a science of exceptions”, is really quite an exciting idea:

Surprise is a sign of failed expectations. Expectations are always derived from some basic assumptions. Therefore, any surprising or paradoxical data challenges either the logical chain leading from assumptions to a failed expectation or the very assumptions on which failed expectations are based. When surprises are sporadic, it is more likely that a particular logical chain is faulty, rather than basic assumptions. However, when surprises and paradoxes in experimental data become systematic and overwhelming, and remain unresolved for decades despite intense research efforts, it is time to reconsider basic assumptions.

One of the basic assumptions that make proteomics data appear surprising is the conventional deterministic image of the cell. The cell is commonly perceived and traditionally presented in textbooks and research publications as a pre-defined molecular system organized and functioning in accord with the mechanisms and programs perfected by billions of years of biological evolution, where every part has its role, structure, and localization, which are specified by the evolutionary design that researchers aim to crack by reverse engineering. When considered alone, surprising findings of proteomics studies are not, of course, convincing enough to challenge this image. What makes such a deterministic perception of the cell untenable today is the massive onslaught of paradoxical observations and surprising discoveries being generated with the help of advanced technologies in practically every specialized field of molecular and cell biology [12–17].

One of the aims of this article is to show that, when reconsidered within an alternative framework of new basic assumptions, virtually all recent surprising discoveries as well as old unresolved paradoxes fit together neatly, like pieces of a jigsaw puzzle, revealing a new image of the cell–and of biological organization in general–that is drastically different from the conventional one. Magically, what appears as paradoxical and surprising within the old image becomes natural and expected within the new one. Conceptually, the transition from the old image of biological organization to a new one resembles a gestalt switch in visual perception, meaning that the vast majority of existing data is not challenged or discarded but rather reinterpreted and rearranged into an alternative systemic perception of reality.

— (CC BY license)

Intrigued yet 🙂 ? Well then, go ahead and give it a look!

And as mentioned earlier in the post, one could extend this concept of seeking out phenomenal truths in everyday things to many other fields. As a photography buff, I can tell you that ordinary and boring objects can really start to get interesting when viewed up close and magnified. A traveler who takes the time to immerse himself in the communities he’s exploring, much like Xuanzang or Wilfred Thesiger or Ibn Battuta, suddenly finds that what is to be learned is vast and all the more enjoyable.

The potential to find and learn things with this new way to envision our universe can be truly revolutionary. If you’re good at it, it soon becomes hard to ever get bored!


  1. Kurakin, A. (2009). Scale-free flow of life: on the biology, economics, and physics of the cell. Theoretical Biology and Medical Modelling, 6(1), 6. doi:10.1186/1742-4682-6-6

Copyright Firas MR. All Rights Reserved.

“A mote of dust, suspended in a sunbeam.”


Written by Firas MR

November 13, 2010 at 10:48 am

Contrasts In Nerdity & What We Gain By Interdisciplinary Thinking

leave a comment »


Where Do You Fit In This Paradigm? (via xkcd CC BY-NC license)

I’ve always been struck by how nerds can act differently in different fields.

An art nerd is very different from a tech nerd. Whereas the former could go on and on about brush strokes, lighting patterns, mixtures of paint, which drawing belongs to which artist, etc., the latter can engage in ad infinitum discussions about the architecture of the internet, how operating systems work, whose grip on Assembly is better, why their code works better, etc.

And what about math and physics nerds? They tend to show their feathers off by displaying their understanding of chaos theory, why imaginary numbers matter, and how we are all governed by “laws of nature”, etc.

How about physicians and med students? Well, like most biologists, they’ll compete with each other by showing off how much anatomy, physiology, biochemistry or drug-property detail they can remember, who’s up to date on the most recent clinical trial statistics (sort of like a fan of cricket/baseball statistics), why their technique of proctoscopy is better than somebody else’s, the latest morbidity/mortality rates following a given procedure, etc.

And you could actually go on about nerds in other fields too – historians (who remembers what date or event), political analysts (who understands the Thai royal family better), farmers (who knows the latest in pesticides), etc.

Each type has its own traits that reflect the predominant mindset (at the highest of intellectual levels) when it comes to approaching their respective subject matter. And nerds, being who they are, can tend to take it all to their heads and think they’ve found that place – of ultimate truth, peace and solace. That they are, at last, “masters” of their subjects.

I’ve always found this phenomenon to be rather intriguing. Because in reality, things are rarely that simple – at least when it comes to “mastery”.

In medicine for instance, the nerdiest of nerds out there will be proud and rather content with the vast statistics, nomenclature, and learn-by-rote information that he has finally been able to contain within his head. Agreed, being able to keep such information at the tip of one’s tongue is an achievement, considering the bounds of average human memory. But what about the fact that he has no clue as to what fundamentally drives those statistics, why one drug works for a condition whereas another drug with the same properties (i.e. properties that medical science knows of) fails or has lower success rates, etc.? A physicist nerd would approach this matter as something that lies at the crux of an issue – so much so that he would lose sleep until he found some model or theory that explains it mathematically, in a way that seems logical. But a medical nerd? He’s very different. His geekiness just refuses to go there, because of the discomforting feeling that he has no idea whatsoever! More stats and names to rote please, thank you!

I think one of the biggest lessons we learn from the really great stalwarts of human history is that they refused to let such stuff get to their heads. The constant struggle to find and maintain humility in knowledge was central to how they saw themselves.

… I can live with doubt and uncertainty and not knowing. I think it’s much more interesting to live not knowing than to have answers which might be wrong. I have approximate answers and possible beliefs and different degrees of certainty about different things, but I’m not absolutely sure of anything and there are many things I don’t know anything about, such as whether it means anything to ask why we’re here, and what the question might mean. I might think about it a little bit and if I can’t figure it out, then I go on to something else, but I don’t have to know an answer, I don’t feel frightened by not knowing things, by being lost in a mysterious universe without having any purpose, which is the way it really is so far as I can tell. It doesn’t frighten me.

Richard Feynman speaking with Horizon, BBC (1981)

The scientist has a lot of experience with ignorance and doubt and uncertainty, and this experience is of great importance, I think. When a scientist doesn’t know the answer to a problem, he is ignorant. When he has a hunch as to what the result is, he is uncertain. And when he is pretty darn sure of what the result is going to be, he is in some doubt. We have found it of paramount importance that in order to progress we must recognize the ignorance and leave room for doubt. Scientific knowledge is a body of statements of varying degrees of certainty – some most unsure, some nearly sure, none absolutely certain.

Now, we scientists are used to this, and we take it for granted that it is perfectly consistent to be unsure – that it is possible to live and not know. But I don’t know whether everybody realizes that this is true. Our freedom to doubt was born of a struggle against authority in the early days of science. It was a very deep and very strong struggle. Permit us to question – to doubt, that’s all – not to be sure. And I think it is important that we do not forget the importance of this struggle and thus perhaps lose what we have gained.

What Do You Care What Other People Think?: Further Adventures of a Curious Character by Richard Feynman as told to Ralph Leighton


An Interdisciplinary Web of a Universe (via Clint Hamada @ Flickr; CC BY-NC-SA license)

Besides being an important aspect for high-school students to consider when deciding what career path to pursue, I think that these nerd-personality-traits also illustrate the role that interdisciplinary thinking can play in our lives and how it can add tremendous value in the way we think. The more one diversifies, the more his or her thinking expands — for the better, usually.

Just imagine a nerd who’s cool about art, physics, math or medicine, etc. — all put together, in varying degrees. What would his perspective of his subject matter and of himself be like? Would he make the ultimate translational research nerd? It’s not just the knowledge one could potentially piece together, but the mindset that one would begin to gradually develop. After all, we live in an enchanting web of a universe, where everything intersects everything!

Copyright Firas MR. All Rights Reserved.

“A mote of dust, suspended in a sunbeam.”



Written by Firas MR

November 12, 2010 at 12:00 am

A Brief Tour Of The Field Of Bioinformatics

with 10 comments

This is an example of a full genome sequencing machine. It is the ABI PRISM 3100 Genetic Analyzer. Sequencers like it completely automate the process of sequencing the entire genome. Yes, even yours! [Courtesy: Wikipedia]

Some Background Before The Tour

Ahoy readers! I’ve had the opportunity to read a number of books recently. Among them, is “Developing Bioinformatics Computer Skills” by Cynthia Gibas and Per Jambeck. I dived into the book straight away, having no basic knowledge at all of what comprises the field of bioinformatics. Actually, it was quite like the first time I started medical college. On our first day, we were handed a tiny handbook on human anatomy, called “Handbook Of General Anatomy” by B D Chaurasia. Until actually opening that book, absolutely no one in the class had any idea of what Medicine truly was. All we had with us were impressions of charismatic white-coats who could, as if by magic, diagnose all kinds of weird things by the mere touch of a hand. Not to mention, legendary tales from the likes of Discovery Channel. Oh yes, our expectations were of epic proportions 😛 . As we flipped through the pages of that little book, we were flabbergasted by the sheer volume of information that one had to rote. It had soon become clear to us, what medicine was all about – Physiology is the study of normal body functions akin to physics, Anatomy is the study of the structural organization of the human body a la geography … – and this set us on the path to learning to endure an avalanche of learn-by-rote information for the rest of our lives.

Bioinformatics is shrouded in mystery for most medics, because so many of these ideas are completely new. The technologies are new. The data available are new. Before the human genome was sequenced, there was virtually no point in using computers to understand genes and alleles. Most of what needed to be sorted out could be done by hand. But now that we have huge volumes of data, and data that are growing at an exponential rate at that, it makes sense to use computers to connect the dots and frame hypotheses. I guess bioinformatics is a conundrum to most other people too – whether you come from a math background, a computer science background or a biology background, we all have something missing from our repertoire of knowledge and skills.

What is the rationale behind using computation to understand genes? In earlier times, all we had were a couple of known genes. We had the tools of Mendelian genetics and linkage analysis to solve most of the genetic mysteries. The human genome project changed that. We are suddenly flooded not only with sequences that we don’t know anything about, but also with the gigantic hurdle of finding relationships between them. To give you a sense of the magnitude of the numbers we’re talking about here: we could simplify DNA’s 3-D structure and represent the entire genetic code contained in a single polynucleotide strand of the human genome as a string of letters A, C, G or T, each representing a given nucleotide base, in a long sequence (like so …..ATCGTTACGTAAAA…..). Since it has been found that this strand is approximately 3 billion bases long, its entire length comes to 3 billion bytes. That’s because each letter A, T, C or G could be thought of as being represented by a single ASCII character, and an ASCII character occupies 1 byte of data. Since we are talking about two complementary strands within a molecule of DNA, the amount of information within the genome is 6 billion bytes§. But human cells are diploid! So the amount of DNA information in the nucleus of a single human cell is 12 billion bytes! That’s 12 gigabytes of data neatly packed into the DNA sequence of every cell – and we haven’t even begun to talk about the 3-D structure of DNA or the sequence and 3-D structure of RNA and proteins yet!

§ Special thanks to Martijn for bringing this up in the comments: If you really think about it for a moment, bioinformaticians don’t need to store the sequences of both the DNA strands of a genome in a computer, because the sequence of one strand can be derived from the other – they are complementary by definition. If you store 3 billion bytes from one strand, you can easily derive the complementary 3 billion bytes of information on the other strand, provided that the two strands are truly complementary and there aren’t any blips of mismatch mutations between them. Using this concept, you can get away with storing 3 billion bytes and not 6 billion bytes to capture the information in the human genome.

Special thanks also to Dr. Atul Butte ¥ of Stanford University, who dropped by to say that a programmer really doesn’t need a full byte to store a nucleic acid base. A base can be represented by 2 bits (e.g. 00 for A, 11 for C, 01 for G and 10 for T). Since 1 byte contains 8 bits, a byte can actually hold 4 bases, without compression. So 3 billion bases can be held within 750,000,000 bytes. That’s 715 megabytes (1 megabyte = 1048576 bytes), which can easily fit onto an extended-length CD-ROM (not even a DVD). So the entire genetic code from a single polynucleotide strand of the human genome can easily fit onto a single CD-ROM. Since human cells are diploid, with two CD-ROMs – one for each set of chromosomes – you can capture this information for both sets of chromosomes. [go back]

To compound the issue, we don’t have a taxonomy system in place to describe the sequences we have. When Linnaeus invented his taxonomy system for living things, he used basic morphologic criteria to classify organisms. If it walked like a duck and talked like a duck, it was a duck! But how do you apply this reasoning to genes? You might think, why not classify them by organism? But there’s a more subtle issue here too. Some of these genetic sequences can be classified into various categories – is this sequence a promoter, an exon or an intron, or could it be a sequence that plays a role in growth, death, inflammatory response, and so on? Not only that, many sequences can be found in more than one organism. So how do you solve the problem of classification? Man’s answer to this problem is simple – you don’t!

Here’s how we can get away with that. Simply create a relational database using MySQL, PostgreSQL or what have you, create appropriate links between sequence entries, their functions, etc., and run queries to find relationships – voila, there you have it! This was our first step in developing bioinformatics as a field: building databases. You can do this with a genetic sequence (a string of letters A for ‘adenine‘, C for ‘cytosine‘, G for ‘guanine‘ and T for ‘thymine‘, represented like so ATGGCTCCTATGCGGTTAAAATTT….) or with an RNA sequence (a string of letters A for ‘adenine’, C for ‘cytosine’, G for ‘guanine’ and U for ‘uracil‘, like so …AUGGCACCCU…) or even a protein sequence (a string of 20 possible letters, each letter representing one amino acid). By breaking down and simplifying a 3-D structure this way, you can suddenly enhance data storage, retrieval and, more importantly, analysis between:

  1. Two or more sequences of DNA
  2. Two or more sequences of RNA
  3. Two or more sequences of Protein

You can even find relationships between:

  1. A DNA sequence and an RNA sequence
  2. An RNA sequence and a Protein sequence
  3. A DNA sequence and a Protein sequence
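As a sketch of this first step – building databases – here is a minimal in-memory example using SQLite (which ships with Python). The schema, organisms, functions and sequences are all invented for illustration:

```python
import sqlite3

# Illustrative only: schema, organisms and sequences are made up for the example.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE sequences (
        id INTEGER PRIMARY KEY,
        organism TEXT,
        kind TEXT,      -- 'DNA', 'RNA' or 'protein'
        function TEXT,
        residues TEXT
    )
""")
conn.executemany(
    "INSERT INTO sequences (organism, kind, function, residues) VALUES (?, ?, ?, ?)",
    [
        ("Drosophila melanogaster", "DNA", "eye development", "ATGGCTCCTATG"),
        ("Homo sapiens", "DNA", "eye development", "ATGGCACCGATG"),
        ("Homo sapiens", "protein", "oxygen transport", "MVLSPADKTNVK"),
    ],
)
# Query by annotated function instead of taxonomy:
hits = conn.execute(
    "SELECT organism, residues FROM sequences "
    "WHERE function = ? AND kind = 'DNA' ORDER BY id",
    ("eye development",),
).fetchall()
print(hits)
```

Real resources like GenBank use far richer schemas and controlled vocabularies, but the principle – linking sequence entries to annotations and querying across them – is the same.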

If you can represent the spatial coordinates of the molecules within a protein’s 3-D structure as cartesian coordinates (x, y, z), you can not only analyze structure within a given protein, but also try to predict the best possible 3-D structure for a protein that is hypothetically synthesized from a given DNA or RNA sequence. In fact, that is the Holy Grail of bioinformatics today: how to predict protein structure from a DNA sequence, and consequently, how to manipulate protein structure to suit your needs.

The Tour Begins

Let’s take a tour of what bioinformatics holds for us.

The Ability To Build Relational Databases

We have already discussed this above.

Local Sequence Comparison

An example of sequence alignment. Alignment of 27 avian influenza hemagglutinin protein sequences colored by residue conservation (top) and residue properties (bottom) [Courtesy: Wikipedia]

Before we delve into the idea of sequence comparisons further, let’s take an example from the bioinformatics book I mentioned, to understand how sequence comparisons help in the real world. It speaks of a gene-knockout experiment that targets a specific sequence in the fruit fly’s (Drosophila melanogaster) genome. Knocking this sequence out results in the flies’ progeny being born without eyes. By knocking this gene – called eyeless – out, you learn that it somehow plays an important role in eye development in the fruit fly. There’s a similar (but not quite the same) condition in humans called aniridia, in which eyes develop in the usual manner except for the lack of an iris. Researchers were able to identify the particular gene that causes aniridia, also called aniridia. By inserting the aniridia gene into an eyeless-knockout Drosophila’s genome, they observed that suddenly its offspring bore eyes! Remarkable, isn’t it? Somehow there’s a connection between two genes separated not only by species, but also by genera and phyla. To discern how each of these genes functions, you proceed by asking whether the two sequences could be the same. How similar might they be, exactly? To answer this question you could do an alignment of the two sequences. This is the absolute basic kind of stuff we do in sequence analysis.

Instead of doing it by hand (which could be possible if the sequences being compared were small), you could find the best alignment between these two long sequences using a program such as BLAST. There are a number of ways BLAST can work. Because the two sequences may have only certain regions that fit nicely, with other regions that don’t – called gaps – you can have multiple ways of aligning them side by side. But what you are interested in is finding the best fit, the one that maximizes how much they overlap with each other (and minimizes gaps). Here’s where computer science comes into play. In order to maximize overlap, you use the concept of ‘dynamic programming‘. It is helpful to understand dynamic programming as an algorithm rather than a program per se (it’s not like you’ll be sitting in front of a computer and writing code if you want to compare eyeless and aniridia; the BLAST program will do the dirty work for you, using dynamic programming code that’s built into it). Amazingly enough, dynamic programming is not as hi-fi as you might think. It is apparently the same strategy used in many computer spell-checkers! Little did the bioinformaticians who first developed dynamic programming techniques in genetics know that the concept had been discovered far earlier. There are apparently many such cases in bioinformatics where scientists keep reinventing the wheel, purely because it is such an interdisciplinary field! One of the most common algorithms that is a subset of dynamic programming and that is used for aligning specific sequences within a genome is the Smith-Waterman algorithm. Besides dynamic programming, another useful approach in bioinformatics is what is called a greedy algorithm. In a greedy algorithm, you are interested in maximizing overlap at each baby-step as you construct the alignment, without consideration of the final overlap. In other words, it doesn’t matter to you how the sequences overlap in the end, as long as each step of the way during the alignment process you maximize overlap. Other concepts in alignment include using a (substitution) matrix of possible scores when two letters – one from each sequence – overlap, and trying to maximize scores using dynamic programming. Common matrices for this purpose are BLOSUM-62, BLOSUM-45 and PAM (Point Accepted Mutation).
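To make the dynamic-programming idea concrete, here is a toy, score-only Smith-Waterman in Python. Real tools like BLAST layer heuristics and substitution matrices such as BLOSUM-62 on top of this core recurrence; the scoring values below are arbitrary choices for the sketch:

```python
def smith_waterman(a: str, b: str, match: int = 2,
                   mismatch: int = -1, gap: int = -2) -> int:
    """Best local alignment score between a and b (score-only Smith-Waterman)."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            diag = H[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment never drops below zero: a bad region can simply be discarded.
            H[i][j] = max(0, diag, H[i - 1][j] + gap, H[i][j - 1] + gap)
            best = max(best, H[i][j])
    return best

print(smith_waterman("GATTACA", "ATTAC"))  # 10: 'ATTAC' matches exactly (5 matches x 2)
print(smith_waterman("AAAA", "TTTT"))      # 0: nothing aligns locally
```

A full implementation would also keep traceback pointers to recover the aligned substrings, not just the score.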

So now that we know the basic idea behind sequence alignment, here’s what you can actually do in sequence analysis:

  1. Using alignment, find a sequence from a database (e.g. GenBank from the NCBI) that maximizes overlap between it and a sequence that isn’t yet in the database. This way, if you discover some new sequence, you can find relationships between it and known sequences. If the sequence in the database is associated with a given protein, you might be able to look for that protein in your specimen. This is called pairwise alignment.
  2. Just as you can compare two sequences and find out if there is a statistically significant association between them or not, you can also compare multiple sequences at once. This is called multiple sequence alignment.
  3. If certain regions of two sequences are the same, it can be inferred that they are conserved across species or organisms despite environmental stresses and evolution. A sequence encoding development of the eye is very likely to remain unchanged across multiple species for which sight is an essential function to survive. Here comes another interesting concept – phylogenetic relationships between organisms at a genetic level. Using alignment it is possible to develop phylogenetic trees and phylogenetic networks that link two or more gene sequences and as a consequence find related proteins.
  4. Similar to finding evolutionary homology between sequences as above, one could also look for homology between protein structures – motifs – and then conclude that the regions of DNA encoding these proteins have a certain degree of homology.
  5. There are tools in sequence analysis that look at features characteristic of known functioning regions of DNA and see if the same features exist in a random sequence. This process is called gene finding. You’re trying to discover functionality in hitherto unknown sequences of DNA. This is important, as the vast majority of genetic code is, as far as we know, non-functional random junk. Could there be some region in this vast ocean of randomness that might, just might, have an interesting function? Gene finding uses software that looks for tRNA-encoding regions, promoter sites, open reading frames, exon-intron splicing regions, … – in short, the whole gamut of what we know is characteristic of functional code – in random junk. Once a statistically significant result is obtained, you’re ready to test it in a lab!
  6. A special situation in sequence alignment is whole genome alignment (or global alignment). That is, finding the best fit between entire genomes of different organisms! Despite how arduous this sounds, the underlying ideas are pretty similar to local sequence alignment. One of the most common dynamic programming algorithms used in whole genome alignment is the Needleman–Wunsch algorithm.
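The Needleman–Wunsch algorithm mentioned in the last item differs from local alignment mainly in its initialization and in not flooring scores at zero: leading and trailing gaps cost points, because the whole of both sequences must be aligned. A toy version returning just the optimal global score (scoring values are arbitrary choices for the sketch) might look like:

```python
def needleman_wunsch(a: str, b: str, match: int = 1,
                     mismatch: int = -1, gap: int = -2) -> int:
    """Optimal global alignment score (score-only Needleman-Wunsch)."""
    rows, cols = len(a) + 1, len(b) + 1
    F = [[0] * cols for _ in range(rows)]
    # Global alignment penalizes leading gaps from the start.
    for i in range(rows):
        F[i][0] = i * gap
    for j in range(cols):
        F[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            F[i][j] = max(
                F[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch),
                F[i - 1][j] + gap,  # gap in b
                F[i][j - 1] + gap,  # gap in a
            )
    return F[-1][-1]

print(needleman_wunsch("ACGT", "AGT"))  # 1: three matches, one gap
```

Whole-genome aligners can't afford a full quadratic table over billions of bases, so in practice they combine this idea with anchoring and other heuristics.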

Many of the things discussed for sequence analysis of DNA, have equal counterparts for RNA and proteins.

Protein Structure Property Analysis

Say that you have an amino acid sequence for a protein, and there’s nothing in the databases that matches your sequence. In order to build a 3-D model of this protein, you’ll need to predict what could be the best possible shape given the constraints of bond angles, electrostatic forces between constituent atoms, etc. There’s a specific technique that warrants mentioning here – the Ramachandran plot – which takes information on steric hindrance and plots the probabilities for different 3-D structures of an amino acid sequence. With a 3-D model, you could try to predict this protein’s chemical properties (such as pKa). You could also look for active sites on this protein – the crucial regions that bind to substrates – based on known structures of active sites from other proteins… and so on.
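As a small taste of predicting properties from sequence alone (far simpler than full 3-D modelling), here is a sketch computing the grand average of hydropathy (GRAVY) using the Kyte–Doolittle scale; the peptide string is made up for the example:

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle value over the sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

# A made-up peptide; a negative GRAVY suggests a mildly hydrophilic protein.
print(round(gravy("MVLSPADKTNVK"), 3))
```

Sliding a window of this score along a sequence is a classic quick-and-dirty way to spot candidate membrane-spanning (hydrophobic) stretches.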

This figure depicts an unrooted phylogenetic tree for myosin, a superfamily of proteins. [Courtesy: Wikipedia]

Protein Structure Alignment

This is when you try to find the best fit between two protein structures. The idea is very similar to sequence alignment, only this time the algorithms are a bit different. In most cases they are computationally intensive and rely on heuristics and iterative refinement. You could build phylogenetic trees based on structural evolutionary homology too.
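
The basic yardstick behind most structure comparison is the RMSD (root-mean-square deviation) between corresponding atoms. Here’s a minimal sketch, assuming the two structures have already been superposed and their atoms paired up – the genuinely hard parts that real structure-alignment algorithms solve:

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) atom coordinates, assumed already superposed and paired."""
    if len(coords_a) != len(coords_b):
        raise ValueError("structures must have equal numbers of atoms")
    sq = sum((ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
             for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b))
    return math.sqrt(sq / len(coords_a))

# Two tiny made-up 'structures' differing by 1 Å at the second atom:
print(rmsd([(0, 0, 0), (1, 0, 0)], [(0, 0, 0), (1, 0, 1)]))
```

A low RMSD suggests the two folds are similar; structural homology inferred this way can then feed into phylogenetic trees.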

Protein Fingerprint Analysis

This is basically using computational tools to identify relationships between two or more proteins by analyzing their break-down products – their peptide fingerprints. Using protein fragments, it is possible to compare entire cocktails of different proteins. How does the protein mixture from a human retinal cell compare to the protein mixture from the retinal cell of a mouse? This kind of work is called proteomics, because you’re comparing the entire protein complement of one organism with that of another. You could also analyze protein fragments from different cells within the same organism to see how they might have evolved or developed.
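
As a toy sketch of the idea, you could treat each protein digest as a set of peptide fragment masses and score the overlap between two fingerprints with a simple Jaccard index. The masses below are invented for illustration; real proteomics pipelines match measured masses against databases within error tolerances:

```python
def fingerprint_similarity(masses_a, masses_b):
    """Jaccard similarity between two sets of peptide fragment masses:
    shared fragments divided by all distinct fragments."""
    a, b = set(masses_a), set(masses_b)
    return len(a & b) / len(a | b)

# Hypothetical peptide masses (Da) from two digests – made-up numbers:
human_retina = [861.1, 1045.6, 1530.7, 2211.1]
mouse_retina = [861.1, 1045.6, 1602.8, 2211.1]
print(fingerprint_similarity(human_retina, mouse_retina))  # → 0.6
```

Three of the five distinct fragments are shared, hinting at related proteins.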

DNA Micro-array Analysis

A DNA microarray is a slide with hundreds of tiny dots on it, each dot carrying DNA probes for a particular gene. Genetic material from a sample of cells is tagged with a fluorescent marker and washed over the slide; a dot glows under UV (or another form of) light if the cells are expressing the corresponding gene – that is, transcribing it into RNA, which in turn may be translated into protein. By applying the same population of cells to the slide and measuring the amount of light coming from each dot, you can develop a gene expression profile for those cells. You could then study the expression profiles of these cells under different environmental conditions to see how they behave and change.

You could also inoculate different dots with different cell populations and study how their expression profiles differ. Example: normal gastric epithelium vs cancerous gastric epithelium.

Of course, you could try looking at all these light-emitting dots with your eyes and counting manually. If you want to take a shot at it, you might even be able to tell the difference between the different levels of brightness between dots! But why not use computers to do the job for you? There are software tools out there that can measure these expression profiles quantitatively.
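
A minimal sketch of that quantitative step: given spot intensities for two conditions, compute a log2 ratio per gene (positive = up-regulated, negative = down-regulated). The gene names and intensities here are invented for illustration, and real pipelines also background-correct and normalize across the slide:

```python
import math

def log2_ratios(reference, sample):
    """Per-gene log2 fold change of sample intensity over reference intensity."""
    return {gene: math.log2(sample[gene] / reference[gene]) for gene in reference}

# Made-up fluorescence intensities for three spots:
normal_epithelium = {"TP53": 400.0, "MYC": 150.0, "GAPDH": 1200.0}
cancer_epithelium = {"TP53": 100.0, "MYC": 1200.0, "GAPDH": 1180.0}

for gene, ratio in log2_ratios(normal_epithelium, cancer_epithelium).items():
    print(f"{gene}: {ratio:+.2f}")   # e.g. MYC up 8-fold → +3.00
```

In this toy comparison, MYC looks strongly up-regulated in the cancerous cells, TP53 down-regulated, and the housekeeping gene GAPDH essentially unchanged.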

Primer Design

There are many experiments, and indeed diagnostic tests, that use artificially synthesized DNA sequences as anchors flanking a specific region of interest in the DNA of a cell, so that this region can be amplified – that is, copied many times over. These flanking sequences are called primers. Applications include, for example, amplifying the genetic material of HIV to better detect its presence or absence in a patient’s blood. This kind of test or experiment is called the polymerase chain reaction (PCR). There are a number of other applications of primers, such as gene cloning, nucleic acid hybridization, etc. Primers ought to be constructed in specific ways that prevent them from forming loops or binding to non-specific sites on the cell’s DNA. How do you find the best candidate for a primer? Of course, computation!
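
To give a flavour of what primer-design software computes, here’s a sketch of two classic screening numbers: GC content and the Wallace-rule melting-temperature estimate (Tm ≈ 4×(G+C) + 2×(A+T), a rough guide for short oligos only). The primer sequence below is made up, and real tools such as Primer3 additionally screen for hairpins, primer-dimers and off-target binding:

```python
def gc_content(primer):
    """Fraction of G and C bases in the primer."""
    return (primer.count("G") + primer.count("C")) / len(primer)

def wallace_tm(primer):
    """Wallace-rule melting temperature (°C): 4*(G+C) + 2*(A+T).
    Only a rough estimate, reasonable for short oligos (~14-20 nt)."""
    gc = primer.count("G") + primer.count("C")
    return 4 * gc + 2 * (len(primer) - gc)

primer = "ATGCGCATTTAGCAGTGCAA"   # a made-up 20-mer, not a real primer
print(gc_content(primer), wallace_tm(primer))
```

A designer would typically aim for roughly 40-60% GC and similar Tm values for the two primers of a pair so they anneal under the same cycling conditions.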


Metabolomics

A fancy word for modeling metabolic pathways and their relationships using computational analyses. How does the glycolytic pathway relate to some random metabolic pathway found in the neurons of the brain? Computational tools help identify potential relationships between all of these different pathways and help you map them. In fact, there are metabolic pathway maps out there on the web that continually get updated to reflect this fascinating area of ongoing research.
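
One simple way to see the computational side: represent pathways as nodes in a graph, connect those that share metabolites, and search for routes between them. The miniature network below is made up for illustration, not curated data (databases like KEGG hold the real maps):

```python
from collections import deque

# Toy pathway network: an edge means "shares a metabolite with".
network = {
    "glycolysis": ["TCA cycle", "pentose phosphate"],
    "TCA cycle": ["glycolysis", "oxidative phosphorylation"],
    "pentose phosphate": ["glycolysis", "nucleotide synthesis"],
    "oxidative phosphorylation": ["TCA cycle"],
    "nucleotide synthesis": ["pentose phosphate"],
}

def pathway_route(network, start, goal):
    """Breadth-first search for a shortest chain of linked pathways."""
    queue, seen = deque([[start]]), {start}
    while queue:
        route = queue.popleft()
        if route[-1] == goal:
            return route
        for nxt in network.get(route[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(route + [nxt])
    return None   # no connection found

print(pathway_route(network, "glycolysis", "nucleotide synthesis"))
```

Scaled up to thousands of reactions, this kind of graph traversal is what lets the web-based pathway maps answer "how does pathway A relate to pathway B?"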

I guess that covers a whole lot of what bioinformatics is all about. When it comes to definitions, some people say that bioinformatics is the application part whereas computational biology is the part that mainly deals with the development of algorithms.

Neologisms Galore!

As you can see, some fancy new words have come into existence as a result of all this frenzied activity:

  • Genomics: Strictly speaking, the study of entire genomes of organisms/cells. In bioinformatics, this term is applied to any studies on DNA.
  • Transcriptomics: Strictly speaking, the study of entire transcriptomes (the RNA complement of DNA) of organisms/cells. In bioinformatics, this term is applied to any studies on RNA.
  • Proteomics: Strictly speaking, the study of the entire complement of proteins made by organisms/cells. In bioinformatics, this term is applied to any studies on proteins. Structural biology is a special branch of proteomics that explores the 3-D structure of proteins.
  • Metabolomics: Strictly speaking, the study of the complete set of metabolites in organisms/cells. In bioinformatics, this term is applied to any studies on metabolic pathways and their inter-relationships.

Real World Impact

So what can all of this theoretical ‘data-dredging’ give us anyway? Short answer – hypotheses. Once you have a theoretical hypothesis for something, you can test it in the lab. Without forming intelligent hypotheses, humanity might very well take centuries to experiment with every possible permutation or combination of the data that has been amassed so far – and which, mind you, continues to grow as we speak!

Thanks to bioinformatics, we are now discovering genetic relationships between different diseases that were hitherto considered completely unrelated – such as diabetes mellitus and rheumatoid arthritis! Scientists like Dr. Atul Butte and his team are trying to reclassify all known diseases using all of the data that we’ve been able to gather from genomics. Soon, the days of the traditional International Classification of Diseases (ICD) might be gone. We might some day have a genetic ICD!

Sequencing of individual human genomes (technology for this already exists and many commercial entities out there will happily sequence your genome for a fee) could help in detecting or predicting disease susceptibility.

Proteins could be substituted between organisms (à la pig and human insulin) and, better yet, completely manipulated to suit an objective – such as drug delivery or effectiveness. Knowing a DNA sequence would give you enough information to predict protein structure and function, giving you yet another tool in diagnosis.

And the list of possibilities is endless!

Bioinformatics is thus man’s attempt at making biology and medicine a predictive science 🙂 .

Further Reading

What with exams just a couple of months away, I haven’t had the chance to read any books on bioinformatics other than “Developing Bioinformatics Computer Skills“. I found it a little too dense, especially in the last couple of chapters, so I would only recommend it as an introductory text to someone who already has some knowledge of computer algorithms. Because different algorithms have different caveats and statistical gotchas, it makes sense to have a sound understanding of what each of them does. Although the authors have done a pretty decent job of describing the essentials, their explanations of the algorithms and how they really function are a bit complicated for the average biologist. It’s difficult to recommend books I haven’t read, but here are two I consider worth exploring in the future:

Understanding Bioinformatics
Understanding Bioinformatics by Marketa Zvelebil and Jeremy Baum

Introduction to Bioinformatics: A Theoretical and Practical Approach
Introduction to Bioinformatics: A Theoretical And Practical Approach by Stephen Krawetz and David Womble

To refresh my knowledge of molecular biology and genetics, I’m considering the following:

Molecular Biology of the Cell
Molecular Biology Of The Cell by Bruce Alberts et al

Molecular Biology Of The Gene by none other than James D Watson himself, et al (of ‘Watson & Crick‘ DNA-model fame)

Let me know if you have any other suggested readings in the comments [1].

There are also a number of excellent OpenCourseWare lectures on bioinformatics out on the web. For beginners though, I suggest Dr. Daniel Lopresti’s (Lehigh University) fantastic high-level introduction to the field here. Also don’t forget to check out “A Short Course On Synthetic Genomics” by George Church and Craig Venter for a fascinating overview of what might lie ahead in the future! In the race to sequence the human genome, Craig Venter headed the main private company that posed competition to the NIH’s project. His group of researchers ultimately developed a much faster way to sequence the genome than had previously been imagined – the whole-genome shotgun sequencing method.

Hope you’ve enjoyed this high level tour. Do send in your thoughts, suggestions and corrections!

UPDATE 1: Check out Dr. Eric Lander‘s (one of the stalwarts behind the Human Genome Project) excellent 2005 lecture at The Royal Society, called Beyond the Human Genome Project – Medicine in the 21st Century, which gives you the big picture on this topic.

UPDATE 2: Also check out NEJM’s special review on Genomics called Genomics — An Updated Primer.

Copyright © Firas MR. All rights reserved.

Your feedback counts:

1. Dr. Atul Butte suggests checking out some of the excellent material at NCBI’s Bookshelf.

Readability grades for this post:

Flesch reading ease score: 57.4
Automated readability index: 10.8
Flesch-Kincaid grade level: 9.7
Coleman-Liau index: 11.5
Gunning fog index: 13.4
SMOG index: 12.2

Powered by ScribeFire.

Evidence Based Medicine in Developing Countries

with 7 comments


UPDATE 1: Check out multimedia from recent international meetings of the Cochrane Collaboration that have touched on this topic: here, here and here.

Have developing countries actually been active in EBM (evidence-based medicine)? This was a question that kept ringing in my head during a discussion I had with some of my buds recently. Speak to any Joe medic in any of the medical establishments in a country like India, and you can’t help feeling that developing countries have, for the most part, become consumers of research that cannot be applied to them. These medics are not only taught but also tested on guidelines developed by a plethora of alien organizations – NICE (National Institute for Clinical Excellence, UK), SIGN (Scottish Intercollegiate Guidelines Network, UK), Cochrane (UK), ACP (American College of Physicians, US), CDC (Centers for Disease Control, US), NIH (National Institutes of Health, US) and many others – in their curricula. Most of these guidelines have been produced for patient populations that are entirely foreign to them.

The only international body with a modicum of relevance to their lives and those of their patients – and one which cuts across all geographical and cultural lines – is the WHO (World Health Organization). Some might argue that such an enormous and overarching agency as the WHO is intrinsically incapable of producing practice guidelines that are sufficiently context-centric to be of any use. The WHO certainly has a lot of responsibility on its hands, and it really is difficult to produce guidelines that apply to all geo-cultural contexts. Indeed, the WHO has produced only a handful of guidelines to date.

India, and developing countries like it, desperately need indigenous agencies to construct and regulate guidelines appropriate to their peoples’ resources and needs. It is extremely common, for example, to see guidelines from some agency taken lightly solely because of resource constraints (transportation problems, lack of appropriate instruments, etc.). The actions a clinician takes under these constraints need to be backed by evidence. The whole idea of EBM is that actions should be based on the ‘best available’ collective body of scientific evidence pertaining to a problem – pathological, economic, whatever. Doesn’t it make sense, then, to look for ‘evidence’ backing a given course of action for our problems?


We do have bodies like the ICMR (Indian Council of Medical Research) making progress, but honestly we aren’t doing enough. Over the course of my undergrad career, perhaps the only ICMR guidelines we came across were a handful of appendices at the back of a pediatrics textbook. I mean, come on! We can do better than that, right? The arguments linking this appalling void to decreased government funding are no doubt valid. Budgets allocated to healthcare are grossly below the minimum ‘5% of Gross Domestic Product’ standard set by the WHO and, quite surprisingly, have kept declining. Amidst this budget-strapping, public healthcare establishments are overwhelmed by the demand for clinicians whose focus is the manual delivery of healthcare services rather than research. In the ‘medical automobile’, these clinicians are just too busy being passengers in the back seat to care about driving.

This unbalanced emphasis has had a profound impact on the very nature of our medical society, and its effects are visible from the moment students enroll in medical institutes. Students are not even remotely exposed to the tenets underlying academic medicine, and there is absolutely no mentorship mechanism in place at any level, all the way up to post-graduation and beyond. Departmental research is obscenely underfunded, and students lack motivation to get involved in the absence of a nurturing environment. To make matters worse, owing to the abject lack of any academic medical component whatsoever in their curricula, students find it near impossible to take time out to engage in any form of academic activity at all. Even if they do manage it, their efforts often receive no curricular credit. Post-graduate students take the thesis requirement casually and often resort to a trial-and-error hodgepodge approach in the absence of necessary guidance. The situation finally spirals down into a vicious cycle where the blind lead the blind.

End result: institutes in chaos whose sole purpose is to produce, en masse, semi-literate manual clinicians of low innovative potential who can’t even search or appraise the medical literature, let alone use it properly.

Let’s try to understand why fixing this is the need of the hour. The current void not only paralyzes our education system but also our fragile economy. How does it degrade our economy? Well, without national guidelines there can’t be a just audit system in healthcare establishments. Without audits, resources are squandered and quality of care declines. When quality declines, the disease burden in the population rises, and that in turn leads to an economic vicious cycle as national productivity declines.

How do we solve this?

  1. Government funding of healthcare ought to increase. Clearly, providing concessions and subsidies to private establishments hasn’t produced results and most definitely isn’t going to. Private establishments only care about making money – from the public or the government – and that’s all. Unless incentives are provided for them to engage in academic medicine or research, they aren’t going to bear the torch. In a developing country like India, the sheer demand for manual services forms a competing interest for these entities.
  2. Even where public funding is lacking, it is still possible to conduct meaningful research. Some of the most groundbreaking research comes out of very small undertakings. It didn’t take a million dollars for us to realize the benefits of surgical asepsis.
  3. Hierarchical translational research bodies ought to be created – private, public, or a mix of the two. Guidelines need to be produced and taught at medical schools. Students should no longer have to put up with the arbitrary whims of their superiors in the face of inapplicable guidelines in their textbooks.
  4. Audit systems should be enforced at all healthcare establishments. Students and practitioners should be taught how to audit their departments or practices.
  5. An academic component should be incorporated into the medical curriculum at all career grades – whether optional or otherwise. Mentorship mechanisms should be brought into place and could be incentive driven. Sources of funding and grants should be made more accessible and greater in number.

I hope readers have found this post interesting 🙂 . Do care to leave behind your comments.

Readability grades for this post:

Kincaid: 11.0
ARI: 12.2
Coleman-Liau: 14.7
Flesch Index: 49.1/100
Fog Index: 14.7
Lix: 50.3 = school year 9
SMOG-Grading: 13.0

Powered by Kubuntu Linux 7.10

Copyright © 2006 – 2008 Firas MR. All rights reserved.