Conquering Big Data by stream computing

AUT academic says his research on managing data from mega science projects can revolutionise enterprise use of analytics.

There is big data, and then there is mind-bogglingly enormous data; the latter is the scale on which Mahmoud Mahmoud has focused his research for the last three years. And he says his work will be a “paradigm shift” in the way businesses use big data in the future.

The AUT University computer scientist has been teaching on and off for the better part of a decade, and is currently finishing his doctorate. He came to New Zealand in 1994 from Kuwait, where he was raised and educated.

Mahmoud started his career as a graphic designer, but followed a childhood passion for computers to his current position.

“I have always been a computer geek, even as a little child I remember while the other kids were doing their reports on ancient Egypt using colouring pencils and paper, I did mine using a word processor on my computer,” recalls Mahmoud.

Since 2009, Mahmoud and his team at AUT’s Institute of Radio Astronomy and Space Research (IRASR) have been working on ways to glean useful information from the enormous quantity of data that is produced by mega-science projects like the Square Kilometre Array (SKA).

IRASR was a part of the joint bid with Australia’s Commonwealth Scientific and Industrial Research Organisation to build the SKA radio telescope project in the Australia-New Zealand region.

Mahmoud’s research led to a paper published in late 2011 on the use of stream-computing to analyse the data as it is produced, instead of storing it to be mined later. He explains that stream-computing is much like putting your finger in the air to gauge which way the wind is blowing: it is quick and relatively effective.

“With stream-computing, rather than storing the data we store the queries we want to apply to it. We probe the data with questions using the queries, and are given real-time answers as it comes by,” says Mahmoud.

“The idea is you don’t need to wait until there is downtime to process the information. You can immediately elicit out of the stream the relevant information without needing to store it.

“When you consider that 99 percent of the data collected is likely to be nothing but noise, this saves a lot of time, and money wasted on storage.”
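The idea of storing queries rather than data can be illustrated with a minimal Python sketch. It is not taken from Mahmoud's research or from any particular stream-computing product; the function `run_standing_queries`, the `sensor_readings` source, and the threshold values are all hypothetical names and numbers chosen for illustration.

```python
from typing import Callable, Dict, Iterable, Iterator, Tuple

# A "standing query" here is simply a named predicate held in memory.
# The samples themselves are never written to storage.
Query = Callable[[float], bool]


def run_standing_queries(
    stream: Iterable[float], queries: Dict[str, Query]
) -> Iterator[Tuple[str, float]]:
    """Yield (query_name, sample) for each sample that matches a stored query."""
    for sample in stream:
        for name, matches in queries.items():
            if matches(sample):
                yield name, sample
        # Nothing else keeps a reference to the sample, so readings that
        # match no query (the "noise") are simply dropped.


def sensor_readings() -> Iterator[float]:
    """Hypothetical source: mostly low-level noise with two strong readings."""
    yield from [0.1, 0.3, 0.2, 9.7, 0.4, 0.1, 12.5, 0.2]


# Illustrative queries; the thresholds are assumptions, not real values.
queries = {
    "strong_signal": lambda x: x > 5.0,
    "very_strong_signal": lambda x: x > 10.0,
}

hits = list(run_standing_queries(sensor_readings(), queries))
for name, value in hits:
    print(f"{name}: {value}")
```

In this sketch only the three matching results are ever retained; the six noise readings pass through once and are discarded, which is the storage saving the quote describes.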

Mahmoud says stream-computing could be valuable for businesses that want to use the large amounts of data created by various online sources to make better business decisions.

“Businesses are in the age of information overload. You have stock prices, market data, Twitter, Facebook, SMS, blogs — all the information just coming out of your ears and being wasted,” says Mahmoud.

“Each one of those points of information can lead to better forecasting and decision making when harnessed correctly, but it’s also important it is collected in a reasonable amount of time to give businesses agility and a competitive edge.”

One example is the financial sector, where banks and other financial institutions constantly monitor market data for the latest trends. Mahmoud says stream-computing would enable those businesses to make better decisions on the fly, without waiting several hours for the information to be compiled and analysed.
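As a rough illustration of on-the-fly market monitoring, the Python sketch below flags a price as soon as it deviates sharply from the average of the few prices before it, keeping only a small fixed-size window in memory. The function `rolling_alerts`, the window size, the 5 percent threshold, and the sample prices are assumptions made for this example and do not describe any real trading system.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def rolling_alerts(
    prices: Iterable[float], window: int = 3, threshold: float = 0.05
) -> Iterator[Tuple[float, float]]:
    """Yield (price, previous_mean) when a price differs from the mean of
    the previous `window` prices by more than `threshold` (as a fraction)."""
    recent = deque(maxlen=window)  # only the last `window` prices are kept
    for price in prices:
        if len(recent) == window:
            mean = sum(recent) / window
            if abs(price - mean) / mean > threshold:
                yield price, mean
        recent.append(price)  # the oldest price is dropped automatically


# Hypothetical price ticks arriving one at a time.
ticks = [100.0, 101.0, 99.0, 100.0, 110.0, 100.0]
alerts = list(rolling_alerts(ticks))
for price, mean in alerts:
    print(f"price {price} deviates from recent mean {mean}")
```

The alert for the jump to 110.0 is produced the moment that tick arrives, rather than after a batch job over stored history.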

He says projects like SKA will help speed up research and development in stream-computing by bringing in business interest, but only if the infrastructure is interoperable with what is currently used in the enterprise.

Mahmoud’s research uses IBM’s InfoSphere Streams technology as its parallelisation middleware to manage CPU usage and to hold queries. He says other stream-computing infrastructure, like that used by CERN for its Large Hadron Collider research, uses highly customised components which would be difficult to replicate for business use.

“At CERN they use a protocol called White Rabbit to query their data. This is a very comprehensive system, but it’s not interoperable with other protocols,” says Mahmoud. “They manufacture everything right down to layer one, it needs special hardware and routers which couldn’t be used by most modern businesses.”

He says further research could help realise the “Holy Grail” of cloud computing, which is contextual search and answers.

“At the end of the day people should be able to use language specific to their domain or expertise, whether you are in the medical field or financial field or whatever, to ask questions to this stream, and it will give you a contextually aware answer.

“This is the Holy Grail for cloud computing, and while we are not there yet our research is heading in that direction.”
