How is data collected and then converted into information that helps city councils make decisions? We follow its journey from the physical world (in the field) to the digital world (storage, AI, business applications, etc.).

Data processing systems in smart environments can be truly baffling to the uninitiated. At the very start of the chain, sensors in the field collect information such as temperature readings, bin fill levels and CO2 levels, and transform it into digital data. This field data is supplemented with information from external systems, such as weather forecasts.

And at the other end of this labyrinthine journey, you have the complex world of applications, which are intended to guide decision-making and, as a result, action. These include business applications used by city technical departments, or web and mobile applications for operations personnel and end-users.

But there’s a whole part of the process that is more obscure, between the connected object and the application. It consists of the building blocks that enhance raw data, delivering value-added information that helps decisions to be made and action to be taken. The first step in this “black box” process involves collecting data from a wide range of sensors, which means being able to support all of the communication protocols used by the sensors.

Convergence towards a data lake

Data of all types, and indeed formats, then converge towards a data lake, a pool of “unstructured, unaltered data,” which serves as a storage area in which data scientists can easily get bogged down if it is not properly organised. “They can spend 75% of their time cleaning the data before being able to interpret it,” says Gwendal Azous, IoT consultant at Axians, the VINCI Energies ICT brand.

For example, depending on the type of sensor, temperature data readings can be expressed in Celsius or in Fahrenheit. To make the process more effective, the data needs to be made uniform.

To do that, Axians runs a software programme at the pre-storage stage that standardises the data. This means that a temperature reading in Fahrenheit is automatically converted into degrees Celsius if the client has chosen that standard with its integrator. The same goes for dates, whose formats vary from country to country. “The way in which a data lake is organised,” says Edouard Henry-Biabaud, business development manager at Axians, “is key to the system’s effectiveness.”
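To make the standardisation step concrete, here is a minimal Python sketch of a pre-storage normaliser. It is not Axians' software: the field names (`sensor_id`, `value`, `unit`, `timestamp`, `date_format`) and the function names are assumptions made for illustration. It also assumes each sensor declares its own date layout, since a string such as 03/04/2024 cannot be reliably guessed as day-first or month-first.

```python
from datetime import datetime


def to_celsius(value, unit):
    """Convert a temperature reading to degrees Celsius."""
    unit = unit.strip().upper()
    if unit in ("C", "CELSIUS"):
        return float(value)
    if unit in ("F", "FAHRENHEIT"):
        return (float(value) - 32.0) * 5.0 / 9.0
    raise ValueError(f"unsupported temperature unit: {unit!r}")


def to_iso_date(text, fmt):
    """Parse a timestamp written in a declared layout and return ISO 8601."""
    return datetime.strptime(text, fmt).isoformat()


def normalise(reading):
    """Return a standardised copy of a raw reading (hypothetical schema)."""
    return {
        "sensor_id": reading["sensor_id"],
        "temperature_c": round(to_celsius(reading["value"], reading["unit"]), 2),
        "timestamp": to_iso_date(reading["timestamp"], reading["date_format"]),
    }


# Example: a Fahrenheit sensor that reports day-first dates.
raw = {
    "sensor_id": "s1",
    "value": 71.6,
    "unit": "F",
    "timestamp": "31/01/2024 14:30",
    "date_format": "%d/%m/%Y %H:%M",
}
clean = normalise(raw)
```

Doing the conversion once, before storage, means every downstream application can rely on a single unit and a single date format.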

Once it has been standardised, the data follows a path through the various databases that make up the data lake. These have specific retention or persistence (data life span), speed of access, and storage capacity characteristics, based on the type of processing to be applied to the data.

Hot, warm, and cold databases

With so-called hot databases, for instance SQL-type relational systems, data sits on fast storage and can be accessed and processed in near real time. They are used, for example, for the object repository and for monitoring data that requires immediate processing or action. Data retention time is limited, running from one to four weeks.

Beyond this limit, data is not deleted but transferred to a cold database, designed for archiving. In this kind of database, data access time is less critical.

Warm databases are the domain of big data. Combining quantity and speed of processing, they drive data analytics processes and smart algorithms.

“Available on average for at least one year (depending on the number of use cases and amount of data), this data is used to perform predictive analytics aimed at optimising maintenance, anticipating behaviours, and making decisions prior to an event,” points out Edouard Henry-Biabaud.
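The retention figures above can be sketched as a simple age-based lookup. The article does not spell out exactly how data moves between the three tiers, so the cascade below (hot, then warm, then cold) is an assumption, as are the function name `select_tier` and the choice of four weeks and 365 days as thresholds.

```python
from datetime import datetime, timedelta

# Assumed thresholds, taken from the figures quoted in the article.
HOT_RETENTION = timedelta(weeks=4)    # upper bound of the one-to-four-week range
WARM_RETENTION = timedelta(days=365)  # "at least one year"


def select_tier(recorded_at, now):
    """Return the tier expected to hold a reading of the given age."""
    age = now - recorded_at
    if age < timedelta(0):
        raise ValueError("reading is dated in the future")
    if age <= HOT_RETENTION:
        return "hot"
    if age <= WARM_RETENTION:
        return "warm"
    return "cold"


now = datetime(2024, 6, 1)
recent = select_tier(datetime(2024, 5, 20), now)    # 12 days old
seasonal = select_tier(datetime(2024, 1, 1), now)   # about five months old
archived = select_tier(datetime(2022, 1, 1), now)   # more than two years old
```

In a real deployment the thresholds would depend on the number of use cases and the volume of data, as the quote above notes.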

A key stage in processing data is contextualising it. “Raw data such as ‘It is 22 degrees’ is not sufficient to determine if it is important or not. But as soon as it is cross-referenced with month and location-related data, it becomes more meaningful. 22° in January in Paris is hot. It’s normal in August. Once combined by data analysts, the data gets enhanced and delivers information which, thanks to AI, then generates insights,” explains Henry-Biabaud.
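The Paris example can be expressed as a small Python sketch that cross-references a reading with its month and location. The monthly norms below are rough placeholder values chosen for illustration, not measured data, and the function name `contextualise` and the five-degree tolerance are likewise assumptions.

```python
# Rough, illustrative monthly mean temperatures in degrees Celsius.
# Placeholder values for this sketch only, not measured data.
ILLUSTRATIVE_NORMS_C = {
    ("Paris", 1): 5.0,   # January
    ("Paris", 8): 20.0,  # August
}


def contextualise(temp_c, city, month, tolerance_c=5.0):
    """Label a reading by comparing it with the norm for that city and month."""
    norm = ILLUSTRATIVE_NORMS_C[(city, month)]
    deviation = temp_c - norm
    if deviation > tolerance_c:
        label = "unusually hot"
    elif deviation < -tolerance_c:
        label = "unusually cold"
    else:
        label = "normal"
    return {
        "temperature_c": temp_c,
        "city": city,
        "month": month,
        "deviation_c": deviation,
        "label": label,
    }


january = contextualise(22.0, "Paris", 1)
august = contextualise(22.0, "Paris", 8)
```

The same raw value of 22 degrees yields two different labels once the month is taken into account, which is the enrichment the quote describes.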

Gwendal Azous adds another example: “Based on three measurements like pedestrian flow, weather, and car-park occupancy rate, we can predict how busy a shop will be.”

Applications that inform action

Once it has been collected, standardised and stored in the appropriate database, data then helps to drive end-use applications. The benefit of this architecture for a town or business is that it makes it possible to have a wide range of different applications, which pull the necessary data from the various hot, cold, and warm databases, in other words from this single “data lake” repository.

Various types of applications access the databases through APIs (Application Programming Interfaces). These include business applications, which provide a clear picture of a use case such as public lighting, air quality management, or waste collection. This kind of application enables city technical departments, often organised by area of activity, to access the information that is relevant to them so that they can carry out their tasks.

They also include cross-cutting applications, which correlate information from various use cases, delivering a fuller, less silo-based, perspective of city operations. A hypervisor can be installed atop the system to facilitate the visualisation and consequently the integration of the key information generated by the various applications.

From data collection through to processing, a city’s information system set-up produces insights that inform decision-making because, at the end of the day, it’s well-informed humans who decide, not algorithms!