Data That's Out of This World

How astronomers are using big data


Scientists have already gathered enough data to make many major astronomical discoveries. There's so much of it, though, that they haven't yet been able to sift through it all to make those discoveries.

Earlier this year, for example, scientists discovered a massive group of black holes at the center of the Milky Way. Some of the data behind that discovery was collected around 20 years ago.

An astronomical amount of data

The advanced observatories operating today cover the entire electromagnetic spectrum and collect huge amounts of data. The Hubble Space Telescope, for instance, transmits approximately 150 gigabits of raw data each week. A gigabit, for reference, is equal to 125 megabytes.
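To put the Hubble figure in more familiar units, here is a small sketch of the conversion. It assumes decimal (SI) prefixes, where one gigabit is 1,000 megabits, and 8 bits per byte; the function names are made up for illustration.

```python
BITS_PER_BYTE = 8
MEGA_PER_GIGA = 1000  # assumption: decimal (SI) prefixes, not binary ones


def gigabits_to_megabytes(gigabits: float) -> float:
    """Convert gigabits to megabytes."""
    return gigabits * MEGA_PER_GIGA / BITS_PER_BYTE


def gigabits_to_gigabytes(gigabits: float) -> float:
    """Convert gigabits to gigabytes."""
    return gigabits / BITS_PER_BYTE


one_gigabit_in_megabytes = gigabits_to_megabytes(1)

# Hubble's approximate weekly output, as quoted in the article.
weekly_gigabytes = gigabits_to_gigabytes(150)

# Rough yearly total, assuming the weekly rate holds for 52 weeks.
yearly_gigabytes = weekly_gigabytes * 52

print(one_gigabit_in_megabytes, weekly_gigabytes, yearly_gigabytes)
```

Under those assumptions, 150 gigabits a week works out to just under 19 gigabytes a week, or a little under a terabyte a year.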

As technology improves, telescopes can collect more and more data. The amount of data we have about the universe doubles roughly every year thanks to this technology, and that pace of advancement isn't slowing any time soon.
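A quick sketch shows how fast yearly doubling adds up. It assumes the doubling period stays fixed at one year, which is a simplification; the function name is hypothetical.

```python
def growth_factor(years: float, doubling_period_years: float = 1.0) -> float:
    """How many times larger a quantity is after `years` of steady doubling."""
    return 2 ** (years / doubling_period_years)


# After a decade of doubling every year, the archive is about
# a thousand times larger than when it started.
after_ten_years = growth_factor(10)
print(after_ten_years)
```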

Once it's completed, the Large Synoptic Survey Telescope (LSST), for example, will be able to capture an image of the entire southern sky in three nights. The Hubble Space Telescope, as powerful as it is, would need 120 years to achieve this feat. In 10 years, LSST will create around 30 petabytes of data, which equals 30,000 terabytes.
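As a rough sense of scale, the 10-year total can be turned into an average nightly figure. This sketch assumes decimal units (1 petabyte = 1,000 terabytes) and that the data is spread evenly over every night of the decade, which real observing schedules will not match exactly.

```python
TERABYTES_PER_PETABYTE = 1000  # assumption: decimal (SI) prefixes

total_petabytes = 30   # figure quoted in the article
survey_years = 10      # figure quoted in the article
nights_per_year = 365  # assumption: observing every night, no downtime

total_terabytes = total_petabytes * TERABYTES_PER_PETABYTE
average_terabytes_per_night = total_terabytes / (survey_years * nights_per_year)

print(total_terabytes)
print(round(average_terabytes_per_night, 1))
```

Under those assumptions, the telescope would average a little over 8 terabytes per night.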

The Square Kilometre Array (SKA) will be the largest radio telescope in existence when it's completed in 2020. It will consist of thousands of dishes and as many as a million low-frequency antennas, and in just one day it could collect as much data as the entirety of the internet. It's expected to be 10,000 times more powerful than any telescope that's currently in operation.

The opportunities and challenges of big data

Such vast amounts of data present incredible opportunities but also challenges. Soon, there will be too much information for scientists to look at even a representative sample of all the data available.

That’s where data analysis technologies such as machine learning come in. Increasingly, astronomers will need to rely on algorithms to sift through data to identify patterns and unusual events.
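To give a feel for what "identifying unusual events" can mean in practice, here is a deliberately simple sketch. It is not machine learning and not what any particular observatory uses; it is a basic statistical outlier check (a modified z-score based on the median) applied to made-up brightness readings. The function name, threshold and data are all illustrative assumptions.

```python
from statistics import median


def flag_unusual(values, threshold=3.5):
    """Return the indices of readings that sit far from the typical value.

    Uses the modified z-score, which relies on the median and the median
    absolute deviation (MAD) so that one extreme reading does not distort
    what counts as "typical".
    """
    center = median(values)
    mad = median(abs(v - center) for v in values)
    if mad == 0:
        return []  # all readings are essentially identical
    return [
        i for i, v in enumerate(values)
        if 0.6745 * abs(v - center) / mad > threshold
    ]


# Made-up brightness readings for one star; one of them spikes.
brightness = [10.1, 10.0, 9.9, 10.2, 10.0, 14.8, 10.1, 9.8]
unusual = flag_unusual(brightness)
print(unusual)
```

A check like this only catches single readings that stand out. The appeal of machine learning is that it can pick up subtler patterns across millions of objects, but the basic idea of letting software narrow down what a person needs to look at is the same.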

Another shift that may help scientists handle astronomical data comes from the fact that much of the data that observatories collect becomes publicly available. That means that virtually anyone with internet access can analyze and interpret the data and, potentially, make groundbreaking discoveries.

That’s a big change from the past when astronomers stored data on photographic plates. They sometimes published it in catalogs, but it was typically difficult to access data from observatories you weren’t directly associated with.

In the future, though, amateur astronomers may make some of the biggest discoveries, with nothing but a computer and an internet connection.

Another challenge associated with these large volumes of data is the processing power they require. Building huge telescopes isn't enough. We also need facilities that can process and store the information they generate.

SKA will require supercomputers that have more than 100 petaflops of raw processing power. That’s about as much as one hundred million personal computers made in 2013.
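The comparison above implies a particular speed for each personal computer, which a short calculation makes explicit. It takes the article's round numbers at face value (exactly 100 petaflops and exactly one hundred million PCs), so the result is only an order-of-magnitude figure.

```python
FLOPS_PER_PETAFLOP = 10**15
FLOPS_PER_GIGAFLOP = 10**9

ska_flops = 100 * FLOPS_PER_PETAFLOP  # "more than 100 petaflops", rounded down
personal_computers = 100_000_000      # "one hundred million"

flops_per_pc = ska_flops // personal_computers
gigaflops_per_pc = flops_per_pc // FLOPS_PER_GIGAFLOP

print(gigaflops_per_pc)
```

In other words, the comparison treats each 2013 personal computer as performing about one billion operations per second.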

A new kind of astronomy

What this all means is that we need to take a new approach to astronomy. Making observations is only part of the process. Interpreting the data is becoming an increasingly crucial part of making discoveries.

It’s now common to do astronomical work without ever touching a telescope. We have so much data that there’s plenty left to be discovered in the information we’ve already collected.

This opens up doors to many who might not traditionally be able to do astronomical work. Data scientists will play an increasingly vital role, and people all over the world, including citizen scientists, can now make groundbreaking discoveries about our universe.

