Big Data a Big Deal for Genome Research
By Geoff Geddes


Whether it’s perfume or your friend’s trip photos, less is more; when it comes to amassing data for the Efficient Dairy Genome Project, however, the more the merrier. Groundbreaking research in applying genomics to feed efficiency and methane emissions requires a lot of information and, as you might expect, gathering and analyzing big data is a big job.

“I handle almost all the data produced by the research farm, and because of the number of animals involved, it’s a huge amount,” said Dave Seymour, PhD Candidate – Animal Science & Nutrition in the Department of Animal Biosciences at the University of Guelph.

With 200 animals in the trial over the last two years, each one visiting the automatic feeder up to 30 times per day, the project has compiled about 9.5 million feed intake records to date. The only consolation is that feed data is recorded automatically, which is not the case for methane emissions.

Manual labor

“One of the biggest challenges is actually getting the methane measurements. We test four times a day from Monday to Friday, so I either have to do the testing myself or find volunteers. Driving out to the barn when roads are bad can be interesting, but it has to be done.”

Once the information is collected, the next task is figuring out how to put it all together in a way that makes sense. There are records for everything from milk production to body weight and body condition scores, many of them drawn from multiple sources, for each of the 200 animals. If you’re getting a headache just thinking about it, you’re not alone.

“Most of my time is spent organizing the data for analysis, as it must be organized in a certain way for the software to read it properly. I describe the process as trying to pick up the background noise amongst all the commotion and home in on what’s really happening.”

On the feed side, another challenge is that there’s no set definition of what feed efficiency entails. Many people use a measure called residual feed intake, a type of statistical model that scores an animal’s feed efficiency relative to a group of other animals; it’s essentially a ranking system.
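For readers who want to see the idea behind residual feed intake in concrete terms, here is a minimal sketch in Python. It is not the project's model: it assumes a single predictor (milk energy output) fitted by ordinary least squares, whereas models used in practice typically include several predictors, and the function name, cow labels and numbers are all hypothetical. Each animal's score is its actual intake minus the intake the group-wide fit predicts for it, so a lower score suggests a more efficient animal relative to the group.

```python
from statistics import mean


def residual_feed_intake(intake, milk_energy):
    """Actual intake minus intake predicted by a one-predictor least-squares fit.

    Simplifying assumption: milk energy output is the only predictor.
    """
    x_bar = mean(milk_energy)
    y_bar = mean(intake)
    sxx = sum((x - x_bar) ** 2 for x in milk_energy)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(milk_energy, intake))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual = observed intake - intake expected for that level of output
    return [y - (intercept + slope * x) for x, y in zip(milk_energy, intake)]


# Hypothetical animals and values, for illustration only
cows = ["A", "B", "C", "D"]
milk_energy = [20.0, 25.0, 30.0, 35.0]  # energy output in milk (arbitrary units)
intake = [20.0, 23.0, 24.0, 29.0]       # feed intake (arbitrary units)

rfi = residual_feed_intake(intake, milk_energy)

# Rank from lowest residual (ate less than expected) to highest
ranking = sorted(zip(cows, rfi), key=lambda pair: pair[1])
for name, score in ranking:
    print(name, round(score, 2))
```

Because the residuals come from a fit across the whole group, they always sum to roughly zero, which is why the measure behaves like a ranking: an animal's score depends on which other animals it is compared with.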

“I’ve been trying to develop another measure that looks solely at the individual animal based on how much energy they take in from their diet and how much they expend producing milk and through other bodily processes that consume energy. That work is coming to a close and hopefully it will produce a more robust measure of feed efficiency that can be used in more situations.”

Big data, big impact

While the end goal of the Efficient Dairy Genome Project is finding genetic markers for traits important to industry, the analysis behind that effort is only as good as the observations made and data produced. Put it all together, and the impact could be huge for industry.

“The Journal of Dairy Science has been constantly publishing papers on feed efficiency and methane emission in the last couple of years, so this project is really timely. We’re working on a paper of our own now and hoping it will be well received by the scientific community. Clearly, this research is touching on aspects of importance to industry.”

For the average dairy farmer, recent developments around NAFTA and the TPP highlight the need to stay competitive. With feed costs representing their greatest expense, anything science can do to lower those costs will be a welcome development.

Unlike with those tedious trip photos, the more progress producers see, the better off they’ll be.