Tom Wemyss

Getting my genome sequenced




Last Black Friday, I saw a great deal on whole genome sequencing from Dante Labs. I didn’t enter into the process with any particular disease or hypothesis to investigate, but, given the amount of time I’ve spent during my degree carrying out genome sequence analysis, I thought it might be interesting to pursue, and see what I could find out about my own genome.

Confirming a prior suspicion

As part of my work on colour vision in John Mollon’s lab a few years ago, I carried out some colour vision tests on myself. While my colour vision was normal, it was notably different in some measures to others within the lab.

One of these tests was a Rayleigh match. In this test, observers are asked to look at a circle of light. One semicircle (usually the lower one) consists of a pure yellow light, made up primarily of 589 nm light. The other semicircle consists of a mixture of red (665 nm) and green (545 nm) light. The proportions of red and green can be changed by the person carrying out the test. Their task is to make the two semicircles look the same - the colour of the red/green mixture should appear the same as the colour of the pure yellow. Furthermore, the brightness of the two semicircles should appear the same - so the brightness of the yellow section can also be adjusted by the participant.

Completing a Rayleigh match gives two measures: (1) the ratio of red to green when the colours of the two fields appear the same; and (2) the brightness setting of the yellow when the two fields appear the same.

In general, people who are more sensitive to red light will require less red to make the colours of the two fields appear the same. My match required more red than those of other observers in the lab, which suggested I was less sensitive to red light. The main factor that affects sensitivity to red light in people with normal colour vision is the spectral sensitivity of the cones. The cones are cells within the eye which respond to colour. There are (usually) three classes of cones, which each respond (very roughly) to blue, green, and red light.

However, the colours (more properly, wavelengths) of light to which the cones respond are affected by the sequence of the genes that encode their pigments. My supervisor suspected that I had a relatively common variant, present in about 40% of European males, in which a Serine amino acid at position 180 of the long-wavelength (red) cone pigment is replaced with an Alanine, discussed here.

At the time, this sounded reasonable, but it was untestable. However, now that I had my genome sequence, I could just browse it, look for OPN1LW, the gene encoding the long-wavelength-sensitive cone pigment, and check which amino acid I had at position 180. Sure enough, I had Alanine at position 180.
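The check itself is simple once a coding sequence is in hand: find the codon for the residue of interest and translate it. Below is a minimal Python sketch of that step. It uses a made-up toy sequence rather than real OPN1LW data, and the function name `residue_at` is hypothetical, so treat it as an illustration of the idea rather than the method I actually used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def residue_at(coding_sequence, position):
    """Return the one-letter amino acid at a 1-based residue position."""
    if position < 1:
        raise ValueError("position is 1-based and must be >= 1")
    start = (position - 1) * 3
    codon = coding_sequence[start:start + 3].upper()
    if len(codon) != 3:
        raise ValueError("sequence is too short for that position")
    return CODON_TABLE[codon]


# Toy sequences (NOT real OPN1LW): they differ only in the third codon.
toy_serine = "ATGGCCTCT"   # Met, Ala, Ser
toy_alanine = "ATGGCCGCT"  # Met, Ala, Ala

print(residue_at(toy_serine, 3))   # S (Serine)
print(residue_at(toy_alanine, 3))  # A (Alanine)
```

With real data, the codon would come from reads aligned to a reference genome, which this sketch does not attempt.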

Storing the data

Confirming a relatively inconsequential piece of data about my genome was pretty fun. However, our genomes also contain information which can be used as disease predictors. It’s not immediately obvious how private our own genome data needs to be kept - on one hand, we literally leave DNA around us all the time, but on the other hand, it’s data which can affect the price of health insurance.

I haven’t examined my genome for disease predictors, and I don’t intend to. That being said, I’d like to keep my data around, just in case there’s anything I want to look into in future, and I’d like to do so securely. I ended up encrypting the files. To do this, I used a GPG key (separate from my normal key) which is stored on a memory stick. The encrypted files then stay on my computer. This isn’t really great practice - anything truly confidential should only be accessed on a PC which is “air gapped”, meaning it’s not physically connected to a network or the internet. However, it’s important to remember the threat model here - in this case, it’s mostly protecting against lower-effort attacks, such as viruses which may copy all files from a computer to somewhere else. If anyone wanted someone’s genome sequence enough to do a targeted attack, there are a multitude of easier ways to get that data.
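For anyone wanting to do something similar, here is a rough Python sketch that wraps the `gpg` command-line tool to encrypt a file to a dedicated public key. The file name and key name are placeholders, the helper functions are hypothetical, and it assumes `gpg` is installed with the recipient’s public key already imported; it is not a record of the exact commands I ran.

```python
import subprocess


def gpg_encrypt_command(path, recipient):
    """Build the gpg argument list that encrypts `path` to `recipient`."""
    return [
        "gpg",
        "--output", f"{path}.gpg",  # write ciphertext next to the original
        "--encrypt",
        "--recipient", recipient,   # key ID or email of the dedicated key
        path,
    ]


def encrypt_file(path, recipient):
    """Run gpg; raises CalledProcessError if gpg exits with an error."""
    subprocess.run(gpg_encrypt_command(path, recipient), check=True)


# Placeholder names, for illustration only.
print(" ".join(gpg_encrypt_command("genome.vcf.gz", "genome-key@example.com")))
```

Only the public key is needed to encrypt, so the private key can stay on the memory stick until something needs decrypting. The unencrypted original still has to be deleted separately.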

Thoughts on Dante Labs

Of course, when it comes to security, the company sequencing the genome also warrants consideration. In this case, Dante Labs did not have an unblemished history. There had been complaints about delays in their sequencing and about used kits being sent back out to customers. However, the company appeared to have responded to these issues appropriately - I haven’t seen any new reports about kit mishaps, and they upgraded to newer Illumina sequencing machines to improve their throughput.

In my case, as far as I can tell, there were no issues. The genome data seems very likely to be mine, as opposed to someone else’s - it predicts hair colour and eye colour well, and it has the Ser180Ala polymorphism. They were also relatively fast in my case - the cheaper kits promised an 8-week lead time, and I received my results just 21 days after posting the parcel, which was impressive, especially given the Christmas holidays fell within that period. The coverage (the average number of times a particular site was sequenced) was also pretty close to the 30x that they advertise.
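As a rough guide to what a figure like 30x means, average coverage can be estimated as the total number of sequenced bases divided by the genome length. The Python sketch below uses illustrative round numbers (read count, read length and genome size are assumptions, not figures from my own data), and the function name is hypothetical.

```python
def mean_coverage(n_reads, read_length, genome_length):
    """Estimate average coverage as total sequenced bases / genome length."""
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    return n_reads * read_length / genome_length


# Illustrative values only: 620 million reads of 150 bases each,
# against a genome of roughly 3.1 billion bases.
coverage = mean_coverage(620_000_000, 150, 3_100_000_000)
print(f"{coverage:.1f}x")  # 30.0x
```

This is only an average: individual sites will be covered more or less deeply than the headline figure.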

Concluding thoughts

At current non-promotional prices, I’m not convinced that genome sequencing makes all that much sense unless there’s a disease that’s being screened for. That being said, I don’t regret my decision to get my genome sequenced. As our understanding of the link between genes and other factors (such as response to pharmaceutical drugs, the effect of diet, and disease) increases, I’m glad to have the option to come back to my genome and get some more evidence-driven conclusions about my health.