LSST Lights the Way Towards the Frontiers of Data-driven Discovery

New window on the Universe will drive advances in extracting knowledge from massive amounts of data

Anonym

Physics has long been the crucible in which Big Data has been wrought. High-energy particle accelerators and large telescope arrays have long deluged researchers with ever-increasing amounts of data for analysis. But even the Large Hadron Collider at CERN in Switzerland, which produces and stores 25 petabytes of data per year, pales in comparison to the challenges presented to scientists by the Large Synoptic Survey Telescope (LSST) project.

Credit: LSST 
Construction officially began in August on LSST, an 8.4-meter reflecting telescope with a very wide field of view (3.5 degrees in diameter) and a 3.2-gigapixel prime-focus digital camera. When fully completed sometime in the early 2020s, that camera will take a 15-second exposure every 20 seconds, capturing up to 30 terabytes of data per night and effectively imaging the entire southern sky every few nights from its mountaintop perch in Chile.

 Credit: LSST 
Factoring in scheduled maintenance time, cloudy nights and the like, the camera will take over 200,000 pictures - 1.28 petabytes - per year. With data volumes that large, the challenge goes beyond simply collecting the images; it lies in mining and extracting scientific knowledge from petabytes of highly complex data.
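As a rough sanity check, the annual figure can be reproduced from the numbers quoted above. The sketch below assumes each pixel is stored as 2 bytes (16-bit raw data); that pixel depth is an assumption made for illustration, not something stated here, and the function name is hypothetical.

```python
# Figures quoted in the article
PIXELS_PER_IMAGE = 3.2e9    # 3.2-gigapixel camera
IMAGES_PER_YEAR = 200_000   # pictures per year after maintenance and weather

# Assumption for illustration only: 16-bit (2-byte) raw pixels
BYTES_PER_PIXEL = 2


def annual_raw_volume_petabytes(images=IMAGES_PER_YEAR,
                                pixels=PIXELS_PER_IMAGE,
                                bytes_per_pixel=BYTES_PER_PIXEL):
    """Return the raw image volume per year in petabytes (1 PB = 1e15 bytes)."""
    total_bytes = images * pixels * bytes_per_pixel
    return total_bytes / 1e15


print(f"{annual_raw_volume_petabytes():.2f} PB per year")
```

Under that assumption, 200,000 images of 3.2 billion pixels each come to 1.28 petabytes, matching the yearly total quoted above.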
 
The telescope itself is innovative, using a tertiary mirror that will be cast at the same time and from the same substrate as its primary. The LSST design provides a very wide, undistorted view that will enable the recording of 20 billion cosmological objects 800 times each over a ten-year data collection period. The database will contain approximately 500 attributes for each survey object and grow to something like 100 petabytes in size.

Credit: LSST
 
In the end, the project will provide a color movie of the universe visible through the whole southern sky that plays back all its changes over ten years. That's the synoptic part. The survey will catalog the orbits of asteroids down to 100 meters in diameter that could potentially impact the Earth, detect weak gravitational lens signatures of dark energy and dark matter, and create an exquisitely accurate map of the Milky Way that will enable study of the structure and evolutionary history of our home galaxy.

Credit: LSST Project Office 
 
What's really exciting, however, is that scientists don't yet know what discoveries will be contained in that 500-dimensional data cache. Finding correlations among the various attributes will require advances in data management, mining and analysis before the science can even begin. It is hoped that such a vast volume of complex data will lead to serendipitous discoveries. Near-real-time alerts will be issued when the system automatically detects objects that have changed in position or brightness. New algorithms will run on machines poring over massive volumes of data, searching for something interesting without knowing beforehand what necessarily constitutes being interesting. It will have to be done that way, because the task is beyond what any one human or group of humans can accomplish. In a way, LSST appears to be an outpost on the frontier of data-driven analytics and discovery.
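To make the idea of automated alerts concrete, here is a minimal sketch of flagging objects whose position or brightness has changed relative to a reference catalog. It is not the LSST pipeline: the `Measurement` class, the `find_alerts` function, the thresholds and the sample values are all hypothetical, chosen only to illustrate the principle.

```python
import math
from dataclasses import dataclass


@dataclass
class Measurement:
    ra_deg: float      # right ascension, degrees
    dec_deg: float     # declination, degrees
    magnitude: float   # apparent brightness, astronomical magnitudes


def shift_arcsec(ref, new):
    """Approximate angular shift between two nearby sky positions."""
    mean_dec = math.radians((ref.dec_deg + new.dec_deg) / 2)
    d_ra = (new.ra_deg - ref.ra_deg) * math.cos(mean_dec)
    d_dec = new.dec_deg - ref.dec_deg
    return math.hypot(d_ra, d_dec) * 3600.0


def find_alerts(reference, tonight, max_shift_arcsec=1.0, max_mag_change=0.5):
    """Return (object_id, reason) pairs for objects that moved or changed brightness."""
    alerts = []
    for object_id, new in tonight.items():
        ref = reference.get(object_id)
        if ref is None:
            continue  # previously unseen objects need separate handling, omitted here
        if shift_arcsec(ref, new) > max_shift_arcsec:
            alerts.append((object_id, "position"))
        if abs(new.magnitude - ref.magnitude) > max_mag_change:
            alerts.append((object_id, "brightness"))
    return alerts


# Made-up sample data, for illustration only
reference = {
    1: Measurement(10.0, -30.0, 20.0),
    2: Measurement(11.0, -30.0, 18.0),
    3: Measurement(12.0, -30.0, 19.0),
}
tonight = {
    1: Measurement(10.0, -30.0, 20.1),    # essentially unchanged
    2: Measurement(11.0, -30.0, 17.0),    # one magnitude brighter
    3: Measurement(12.001, -30.0, 19.0),  # moved by about 3 arcseconds
}
print(find_alerts(reference, tonight))
```

A real system would work from image differences and measurement uncertainties rather than fixed thresholds, but the basic comparison against a known reference is the same in spirit.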