Talking in code

Open sourcing and how the oil and gas industry can benefit

In 2003 and 2004 Google published two technology papers, “The Google File System” [1] and “MapReduce: Simplified Data Processing on Large Clusters” [2]. While not containing the actual code, these two papers set out the concepts behind Google’s unique search engine. In doing so, Google effectively open sourced their innovation.

The insights contained within these two papers allowed Doug Cutting to develop a software framework called Hadoop, which he named after his son’s toy elephant. The framework provided by Hadoop underpins much of the big data and analytics revolution we are witnessing today. True to its origins in the Google papers, Hadoop is open source software: the source code is freely available, and users have the right to update and improve it, sharing those improvements with the wider community. This collaborative community of users builds on the work of others, applying the principle that many minds are better than one to drive improvement more quickly than would be possible in a closed software development setting. The community includes companies like IBM and Oracle, who have standardised on Hadoop and contribute to the open source code. Facebook and Yahoo! also use Hadoop for their collation and analysis of data, with Yahoo! making their source code available to the open source community. In 2014 it was estimated that more than half the Fortune 50 companies used Hadoop [3], and Visa claim their analysis model built on Hadoop “has identified $2 billion in potential annual incremental fraud opportunities, and given it the chance to address those vulnerabilities before that money was lost” [4].
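The map and reduce idea at the heart of the second paper [2] is simple enough to sketch in a few lines. What follows is an illustrative Python word count, not Hadoop code; the real value of a framework like Hadoop lies in running these two phases, and the shuffle between them, across thousands of machines:

```python
from collections import defaultdict

def map_phase(documents):
    """Map: emit a (word, 1) pair for every word in every document."""
    for doc in documents:
        for word in doc.split():
            yield word.lower(), 1

def shuffle(pairs):
    """Shuffle: group values by key, as the framework does between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine each word's counts into a total."""
    return {word: sum(counts) for word, counts in groups.items()}

docs = ["big data", "big clusters process big data"]
counts = reduce_phase(shuffle(map_phase(docs)))
print(counts["big"])  # 3
```

Because the map and reduce functions are independent for each document and each word, the framework can partition the work freely, which is what makes the approach scale.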

Google’s decision to make their insights available gave us Hadoop; Hadoop and its open source community gave us the big data revolution; now Google are offering their big data software as a business. Doug Cutting is quoted in Thomas L. Friedman’s book, ‘Thank You for Being Late’, as saying, “Google is living a few years in the future, and they send us letters from the future in these papers and we are all following along and now they are following us and it’s all beginning to be two-way”. As we move into the future, open source is driving the acceleration of the big data revolution.

Open source successes

Learning from the success of open source, GE have applied the concept to Predix, their Industrial Internet of Things (IIoT) software platform. They describe Predix as a “multi-tenant ‘gated community’ model” [5] where “independent third parties can also build apps and services on the platform, allowing businesses to extend their capabilities easily by tapping into the industrial app ecosystem” [6]. GE believe there may be a $225 billion market for the Predix platform and applications by 2020 [7]. But it is not just software that can benefit from the open source approach: GE’s Open Innovation initiative runs contests to solve engineering problems, providing a problem and selected intellectual property to the global community in an effort to crowdsource solutions. In 2013 GE presented a challenge to GrabCAD’s open engineering community: redesign a jet engine bracket so that it could be 3D printed at a lower weight while still carrying the engine loads. The winner, M Arie Kurniawan from Indonesia, produced a design that reduced the bracket weight by 84% [8] while still meeting the requirements for axial and torsional loads. He had no aviation experience.

What can open sourcing big data mean for the oil and gas industry?

The industry can participate fully in the big data revolution, and indeed is starting to, for example by using Predix to harness the power of the IIoT. We can also follow GE’s lead and put problems to a global community of makers and problem solvers to find innovative solutions born of collaboration. The real benefit, we believe, lies in combining these open source examples: it is time for the oil and gas industry to start open sourcing their big data. If there are valuable insights to be gleaned from analysing the data on one piece of equipment, on one platform, in one operating environment, this value only scales up when we consider the data from all the equipment on the platform, and the effect compounds when we consider multiple platforms from multiple operators. Visa attribute their fraud detection success to the ability to detect patterns in a far larger data set than they had been able to analyse previously.

In a similar way, analysing the data from multiple assets, with all their similarities and differences, will allow far more sensitive pattern recognition, leveraging the larger data set to drive innovative solutions to the challenges of operational efficiency. More than just increasing the data set, open sourcing operational data will allow operators to collaborate and innovate at a scale that is simply not possible for any one company to achieve. The increased sensitivity of pattern recognition may allow previously disparate suppliers to collaborate and identify means of solving efficiency challenges. By sharing their operational excellence, operators can learn from each other and raise the overall efficiency of the industry to a new level, and if we are learning how to make operating assets more efficient, then we can read these learnings back into the concept phase to design and build more efficient assets from the outset.
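As a toy illustration of why a pooled data set sharpens pattern recognition, consider a simple three-sigma outlier test. The asset names and vibration readings below are invented for the sketch: within one asset’s handful of readings a drifting value stays inside the detection threshold, but against the larger pooled baseline it stands out clearly.

```python
import statistics

def flag_anomalies(readings, threshold=3.0):
    """Return readings lying more than `threshold` sample standard
    deviations from the mean of the data set."""
    mean = statistics.mean(readings)
    stdev = statistics.stdev(readings)
    return [x for x in readings if abs(x - mean) > threshold * stdev]

# Hypothetical pump-vibration readings (mm/s) from three operators' assets.
asset_a = [2.1, 2.0, 2.2, 2.1, 2.0]
asset_b = [2.2, 2.1, 2.0, 2.1, 2.2]
asset_c = [2.1, 2.2, 2.0, 3.5, 2.1]   # one drifting reading

# Against asset_c's own five readings, the drift is within three sigma.
print(flag_anomalies(asset_c))   # []

# Pooled across all assets, the baseline tightens and the drift is flagged.
pooled = asset_a + asset_b + asset_c
print(flag_anomalies(pooled))    # [3.5]
```

The point is not the statistics, which are deliberately crude, but the scaling: every extra asset added to the pool tightens the baseline against which every other asset’s behaviour is judged.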

At io we employ both single and double loop learning. Identifying an issue and correcting it is single loop learning, whereas identifying an issue, correcting it, and then challenging the policies, processes and thinking that allowed the issue to arise is double loop learning. If asset performance is open sourced, it not only allows operators to improve the efficiency of operations, it also allows io to adopt these learnings and challenge the status quo, starting with the end in mind: more certainty, efficiency and value for our clients.

Many people have discussed collaboration as the solution to many of the challenges posed by the potential ‘lower for longer’ environment, but how many in the operating and contracting community are bold enough to truly collaborate? Concerns may exist that knowledge leakage means a loss of competitive edge, but when obsolescence is accelerating, speed of innovation matters more than stocks of knowledge. This concept was described in the 2009 Harvard Business Review article, “Abandon Stocks, Embrace Flows” [9], which postulated that knowledge stocks were being replaced by knowledge flows as the most important source of value. In this world, while you may know the best way to run an asset today, that knowledge will be obsolete tomorrow, and the only way to keep pace with the advances is to participate in the flow of knowledge facilitated by open source. At io we believe in embracing new ideas, new technologies and innovations as much as in challenging the status quo. The world is changing, and it is changing the way we do business. The oil and gas industry can benefit now by adapting to changes and opportunities like open sourcing data, to ensure it does not become an obsolete industry.

[1] https://static.googleusercontent.com/media/research.google.com/en//archive/gfs-sosp2003.pdf
[2] http://static.googleusercontent.com/media/research.google.com/es/us/archive/mapreduce-osdi04.pdf
[3] http://fortune.com/2014/06/30/hadoop-how-open-source-project-dominate-big-data/
[4] http://blogs.wsj.com/cio/2013/03/11/visa-says-big-data-identifies-billions-of-dollars-in-fraud
[5] https://www.predix.com/overview
[6] https://www.predix.com/sites/default/files/predix-the-industrial-internet-platform.pdf
[7] https://sloanreview.mit.edu/case-study/ge-big-bet-on-data-and-analytics/
[8] https://www.wired.com/2014/04/how-ge-plans-to-act-like-a-startup-and-crowdsource-great-ideas/
[9] https://hbr.org/2009/01/abandon-stocks-embrace-flows