Monday, 17 December 2018

Trapping a pet bioinformatician for the lab

OK, this isn't about how to get a pet bioinformatician but a response to the many students who ask me which questions to ask when choosing a PhD position. Bioinformatics skills are in demand and labs typically have an abundance of data. A PhD student can be seen as the best way to get the skills in-house to process that data - in the words of Francis Crick 'solving other people's crossword puzzles'. Without the right environment you will not thrive and grow, and thus your opportunities to move on become limited. You become trapped as a pet bioinformatician.

The task is to find a PhD position where you can grow in skills, grow in your scientific knowledge and build a reputation by asking and answering new questions well. To do that you need infrastructure and mentors, both official and unofficial, who can help and inspire.

I would also encourage you to read the post by Professor Geoff Barton which explains a bit more about the UK academic system.

Question 0 - Why do you want to do a PhD?
Yes, I number things from 0. This isn't a question to ask of your prospective supervisor but of yourself. Why do you want to spend the next 4 years doing further study? In the UK this is typically a three-year research project under a single supervisor, often after a period of training and rotation projects. Doing a bioinformatics PhD can be a great thing to do even without a view to a career in academia. There are very few supported roles where you get the opportunity to learn and develop at such a rate. But you need motivation and self-discipline to get the best out of it. So having decided that you want to go for it, and don't want to become a pet, here are questions to ask of potential positions.

Question 1 - What is the environment like? 
Look at the institute. How many other bioinformaticians are there in the institute? How many of them are in leadership (PI, management) roles? Without an environment that has experts in your field it is difficult to get suitable recognition. Are there bioinformatics seminars? Is there a bioinformatics support group (mailing list/coffee morning/Slack channel) to allow you to interact with other bioinformaticians outside your group? What is the lab culture like? Have they heard of Athena SWAN and how do they approach that? A culture of overwork does nobody any good.

Question 2 - What about the lab?
A key bioinformatics skill, indeed it is an intended learning outcome (ILO) for several of my courses, is the ability to look new things up and apply them. However, it is very difficult as a baby bioinformatician to know whether what you have found is the best way to do things. Who in the lab, with whom you will have day-to-day contact, is competent to train you in the things you will need to learn?

Question 3 - What about the training?
What are the opportunities for training? Are there formal courses run in the institute? Do you have training events like Software Carpentry? Has your potential supervisor even heard of Software Carpentry? If there are external courses/workshops available, will you be able to go to them?

Question 4 - Is the supervisor competent to supervise a bioinformatics project?
A serious but challenging question for a prospective student to ask. How do you assess whether your supervisor can actually supervise? Look at their publications. Ask them who did the bioinformatics on them. Does the supervisor have any bioinformatics skills beyond the web interface of their favourite database? Would they recognise a command line, or be able to give you guidance on how to structure an R or Python module tailored to your work? They don't necessarily need specific skills if there are others in the lab who have them, but they should be able to appreciate the technical details of what you are doing. And if the PI is reliant on others, how stable is their position?

Question 5 - Whose question is it anyway?
A key trap for bioinformatics students is to get carried away with the joy of developing technical skills and forget to apply them to the biology. Building new tools and learning new skills is part of the trade, but it isn't all of it. Good tools are hard to build well, and there is a tool graveyard full of PhD projects. Ultimately the purpose is to ask and answer questions about biology and further your understanding. Will you have the freedom to develop your own lines of enquiry or will you be stuck developing new ways to answer other people's questions?

Question 6 - Can you do it your way?
The most important thing in science is the question, not the means to the answer. A supervisor who insists you only use a specific tool (typically because that is the only one they know) should be a red flag. There are many good reasons for a lab to settle on and require (in the absence of a good case to the contrary) that you use certain tools so that the knowledge gained is shareable - e.g. Python/git and certain testing frameworks. But that is not what is at issue here. Can you suggest alternative ways to the answer or does the lab head insist it is only done with a certain package?
A related question is time to read more broadly - will that be encouraged? It is hard to think outside the box if your mind is always focussed within it.

Question 7 - Does it add up?
Will you have suitable statistics and analytical support? By this I mean those who can advise rather than do it for you. Is the supervisor statistically savvy - do they know what they are doing or are statistics an afterthought? If all their stats are done in Excel, that could be a bad sign. Do they encourage you to enhance your analysis toolkit or is stats a pernicious evil between them and publication?

Question 8 - Leading lady or supporting cast?
Which bioinformatics-heavy papers have been published from the lab and where was the bioinformatician in the author list? If the intellectual direction of the paper is set through the analysis then a bioinformatician should be the lead author. How many non-tool description papers has the lab published where PhD students, and particularly bioinformatics PhD students, are first author? What does the supervisor envisage from your project if you need data to be produced by others in the lab? Will you even be allowed to ask that question? Is your project to develop new insights in existing data with a specific question or questions, or is it to develop new methods and ways of thinking about more general questions? These require different approaches.

Question 9 - Can you meet the lab?
If you can't meet the lab then this is a big red flag. Find out from them what the PI is like to work for, where the lab members see themselves going and why. How many conferences do they get to attend? How much collaborative work do they do with other labs? You must be able to meet them without the supervisor present - a supervisor who is paranoid about what their lab says about them is unlikely to foster a good working relationship.

Question 10 - What resources will you have access to?
Is there a supported research computing environment or will you be maintaining your own machine (and having to learn sysadmin skills as well)? Are there supported databases, file storage etc. that are sufficient for your needs? Does the lab have its own group repository? Can you get access to the compute you need, when you need it?

1 comment:

  1. And more good advice here that is not bioinformatics lab specific: https://kamounlab.tumblr.com/post/188810954020/how-to-select-a-phd-lab
