We are all in the web


The key to future technological innovation lies in big data. How will our lives and our concept of privacy change? The opinion of Stephen Brobst.

It’s old news that websites and mobile apps track our movements for commercial purposes, to discover our tastes and generate customised retail offers. The history of our online actions is recorded with the help of small files called cookies, leaving a digital footprint: where we used an app or visited a site; what we clicked on, bought or shared on social media. This enormous and heterogeneous quantity of information is called big data. Stephen Brobst is one of the leading experts on creating tools that analyse online behaviour in order to identify broader patterns of behaviour.

How aware are internet users of their digital footprint?

Most people are aware. Facebook is a free service, but since nothing is really free, this means you are trading your data for that service. It also depends on how sophisticated people are. Maybe the average person doesn’t know that if he has a Gmail account, Google reads his emails. It is all about trade: if you need help, you have to provide information in exchange. For example, if you are using Google Maps or Waze, they have to know your location to take you where you want to go. You can turn off location tracking, but for most people the value of the service outweighs the cost of someone knowing where they are all the time.

What are the main differences between structured data and big data in terms of analysis?

Structured data comes largely from the business processes of automated systems (like financial transactions). It is mainly internally focused and the information being captured is transactional. Big data, on the other hand, has three phases. The first phase is clickstream data. If you are an online retailer like eBay, traditional transaction data would record every purchase. Big data implies interaction below the level of the transaction. It is not just every purchase; it is every click and every search that led to that purchase. The second phase is collecting social media data, such as likes, shares and comments. This data is unstructured, but there’s a huge amount of signal in the noise, simply waiting to be connected to other information. The third phase is sensor data. You may have heard the term ‘internet of things’. Everything will be sensor-enabled. Let’s take the example of automobile manufacturing. The traditional data collected would be which cars you sell, costs, suppliers, etc., but with sensors in the vehicle, it is possible to see how cars are being driven. We are talking about all the data we create that helps automobile manufacturers design a better, safer vehicle. For example, Volvo has announced that by 2020 nobody shall die in a Volvo car. With sensor technology, we can predict that you are about to run into the car in front of you and take action to prevent an accident. ‘We’ is not a person; ‘we’ is a car that has all the data that can save your life.
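
The difference between transaction data and clickstream data can be made concrete with a small sketch. This is only an illustration of the idea described above, not a description of how eBay or any other retailer actually stores its data; the event list, the field names and both function names are invented for the example.

```python
from collections import defaultdict

# Hypothetical clickstream for an online retailer: every interaction is an
# event, not just the final purchase. All field names and values are invented.
EVENTS = [
    {"session": "s1", "type": "search", "item": "running shoes"},
    {"session": "s1", "type": "click", "item": "shoe-A"},
    {"session": "s1", "type": "click", "item": "shoe-B"},
    {"session": "s1", "type": "purchase", "item": "shoe-B"},
    {"session": "s2", "type": "search", "item": "tent"},
    {"session": "s2", "type": "click", "item": "tent-X"},
]


def transactions(events):
    """Traditional view: keep only the completed purchases."""
    return [e for e in events if e["type"] == "purchase"]


def paths_to_purchase(events):
    """Clickstream view: for each purchase, the events that led up to it."""
    history = defaultdict(list)  # events seen so far, per session
    paths = []
    for event in events:  # events are assumed to arrive in time order
        session = event["session"]
        if event["type"] == "purchase":
            paths.append({"bought": event["item"], "path": list(history[session])})
            history[session].clear()
        else:
            history[session].append((event["type"], event["item"]))
    return paths


if __name__ == "__main__":
    print(transactions(EVENTS))
    print(paths_to_purchase(EVENTS))
```

In this toy data the transaction view contains a single record, while the clickstream view also shows the search and the two clicks that preceded the purchase. Session `s2`, where someone searched and clicked but did not buy, never appears in the transaction view at all.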

EU legislation on cookies will soon take effect. Will it raise internet users’ awareness?

Cookies, like most politicians, are ten years behind. Cookies are an outdated technology that really only works for desktop browsing. Mobile use is the future, and this law is already obsolete before it has even been enacted.

What can regular, offline businesses gain from big data?

Regular business is what I would call omnichannel. Today, if you don’t have an online presence, you don’t have a successful business. Big data upgrades a traditional business: you can track not only what your customers bought, but also what they looked at, how much they are influenced by promotions, etc. This information makes you more agile than your competitors. For example, a very traditional activity is going to a football match. There is one company in the US that sells event tickets. They have an app that gives you a lot of information that makes football a better experience. They solve your problem of where to park and help you find your seat, and your car after the event. By combining all this data through big data analysis, a traditional experience also becomes more profitable for retailers.

How can big data analysis help sectors like healthcare and medical trials?

Healthcare is one of my favourite big data topics because it has the most potential. The opportunity to collect clinical data, much as sensors do in a car, is even more valuable. We’re starting with some basic stuff like wearable devices that record sleeping patterns, heart rate and activity during the day. Google, in partnership with Novartis, is developing contact lenses that measure blood sugar levels in your body in real time, which is really important for a diabetic. One step further is the collection of all data in one place: diagnoses, drug prescriptions, x-ray images, medical and family history and, in the future, your complete genome. This way we can treat a disease by customising the drugs and the treatment. To avoid any privacy issue, we have to make sure that the data is owned by the consumer, so the best solution is for our governments to own and host a database. It could also reduce healthcare costs and increase quality.

The use of big data is becoming global and more people are needed in the field. What are the key skills that a data scientist or analyst needs to possess?

There are six characteristics. A good data scientist is a very curious person. He has to have good intuition and must be able to figure out where to focus. The third thing is to know how to gather data, i.e. not to be afraid of technology and to use it as a tool. The next area is statistics: it is not enough just to be a mathematician. You should know how to use historical data to create a model and predict the future. Finally, he needs good communication skills in order to explain why big data is a must-have for every business. The reality is that it is very hard to find all these characteristics in one person, and that’s why we have to invest in training people. It turns out that graduates in applied sciences like physics or sociology are more inclined to become data scientists.