As high-throughput technologies such as Next Generation Sequencing, Microarrays and Mass Spectrometry get cheaper, more and more labs are using them as part of their research. However, with limited budgets, no means of hiring a full-time Bioinformatician and the difficulty of analysing large and complex datasets, researchers are often left stranded with huge amounts of data that still need to be mined properly. As a result, data is often shelved for an unknown future, under-mined or never submitted to a public repository. In this article we discuss the pros and cons of the various options for getting the right Bioinformatics out of these high-throughput OMICS datasets.
The Bad – Getting the Data analysed from the Core Facility
This might seem to be the easiest and cheapest solution, but wait before you add $100 to the cost of your samples for data analysis. Your data ends up in an automated pipeline with little human intervention, and all you get is more results but little valuable information.
It sounds cheap, fast and accurate, so why is it a bad idea? There are multiple reasons.
- Inappropriate Quality Control: The quality control step is normally missing, for two reasons. First, if the quality is low and the facility reports it, they may have to sequence more, which adds to their cost. Second, quality is a heuristic parameter, and it takes a trained Bioinformatician to go through different aspects of the raw data before making any judgement. Low-quality reads, and read quality dropping towards the tail end of sequencing, are very common. The ideal solution at this point is to remove low-quality reads and trim the remaining ones, with a trained Bioinformatician deciding how. Unfortunately, automated pipelines can't make these judgement calls, and you can end up with sub-optimal results.
- Analytics which generate data rather than information: Automated pipelines are designed to suit a wide range of experiments rather than to focus on your experiment. The result is that you get results but no information. Most of the clients who come to us have already paid a Core Facility for data analysis and were not able to make any sense of those results.
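To make the quality control point concrete, here is a minimal Python sketch of tail trimming and read filtering. It is an illustration only, not the method any particular Core Facility or company uses. It assumes Phred+33 encoded FASTQ quality strings, and the function names and the thresholds (a minimum quality of 20, a minimum length of 30) are hypothetical defaults that a Bioinformatician would tune after looking at the raw data.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 is assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def trim_tail(sequence, quality_string, min_quality=20):
    """Trim low-quality bases from the tail (3' end) of a read.

    Bases are dropped from the end until a base with a score of at
    least min_quality is reached. The threshold is an assumed default.
    """
    scores = phred_scores(quality_string)
    end = len(scores)
    while end > 0 and scores[end - 1] < min_quality:
        end -= 1
    return sequence[:end], quality_string[:end]


def filter_reads(reads, min_quality=20, min_length=30):
    """Trim every (sequence, quality) pair and drop reads that become too short."""
    kept = []
    for sequence, quality in reads:
        trimmed_seq, trimmed_qual = trim_tail(sequence, quality, min_quality)
        if len(trimmed_seq) >= min_length:
            kept.append((trimmed_seq, trimmed_qual))
    return kept
```

In practice the thresholds are chosen per dataset, for example after inspecting per-base quality plots, which is exactly the judgement call that a fixed pipeline skips.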
The Good – Getting the data analysed by potential collaborators
This may seem to be a good option; however, it can be challenging to find potential collaborators. Moreover, your collaborator might be busy with other big projects and may not be in a position to give enough time to your data. More often than not, the collaborator's understanding and the Life Scientist's expectations don't match, primarily because the collaborating Bioinformatician has a limited understanding of the Biology. Most professional Bioinformaticians come from a non-biology background, and it becomes hard for the two sides to understand each other's requirements and expectations.
The Best – Getting Professionals to work on your data
Getting your own full-time Bioinformatician with relevant experience would be the best option for getting the right informatics from your data. However, finding such a person for a period as short as a single project can be incredibly hard. And when the project is complete, he or she leaves the organization, but you may still need to look back at the data or the results.
One sustainable solution is to work with Industry partners like us. This is always cheaper than hiring your own Bioinformatician. In addition, they bring the knowledge and experience of working on dozens of similar projects. Your data is handled in a highly professional manner, and the data and results can always be mined further at a later date.
Automated Pipelines vs. Professional Bioinformaticians
Take Sequencing as an example. An automated pipeline treats all data as similar, with sequencing errors within limits. You will get results quickly, but they may not be the most accurate. Now consider the same data being analysed by a Bioinformatician or a professional company like us. We go thoroughly through all the quality metrics, remove bad reads and trim reads where required so as to obtain maximum alignment. When it comes to alignment, we take one sample and work on it thoroughly to find the best algorithm and alignment parameters for maximum alignment. Once that is done, the other samples are aligned with the same set of parameters.
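The "tune on one sample, then reuse" approach can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not a description of any specific aligner. The callables `alignment_rate` and `align`, and the parameter name `max_mismatches` in the usage note, are hypothetical stand-ins for whatever aligner and settings are actually being evaluated.

```python
def pick_best_parameters(candidates, alignment_rate):
    """Return the parameter set with the highest alignment rate on a pilot sample.

    candidates: list of parameter dictionaries to try (hypothetical settings).
    alignment_rate: callable that runs the pilot sample with one parameter
        set and returns the fraction of reads aligned (assumed helper).
    """
    best_params, best_rate = None, -1.0
    for params in candidates:
        rate = alignment_rate(params)
        if rate > best_rate:
            best_params, best_rate = params, rate
    return best_params, best_rate


def align_all(samples, params, align):
    """Align every sample with the same parameter set chosen on the pilot sample.

    samples: mapping of sample name to reads.
    align: callable taking (reads, params) and returning an alignment result
        (assumed helper wrapping the chosen aligner).
    """
    return {name: align(reads, params) for name, reads in samples.items()}
```

For example, with candidates such as `{"max_mismatches": 0}`, `{"max_mismatches": 2}` and `{"max_mismatches": 4}`, the pilot sample is aligned once per candidate, and the winning setting is then passed to `align_all` for the remaining samples.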
We have developed considerable expertise in downstream data analysis to ensure that our clients get the information they are looking for rather than sheets and sheets of data. Please visit our website at http://seqome.com
We would like to hear your comments on the article. Email us at firstname.lastname@example.org