Analyzing Time-Series Sensor Data with Elasticsearch
Use the power of Elasticsearch and Kibana to visualize your IoT sensor data in an easy and intuitive way.
Data is the new oil of the digital economy, and just like crude oil, it has little real value unless it is refined and distilled.
Storing sensor data and finding patterns in it is a key problem to be solved in the IoT space. Most IoT PaaS providers charge a hefty sum of money to solve it. Interestingly, there are several general-purpose tools that are very good at visualizing data and surfacing insights. Elasticsearch, built on Apache Lucene, is one such tool: it abstracts away the complexity of storing large numbers of records in a searchable database. It is hugely popular for e-commerce, log management, and other use cases that need a searchable data index. Coupled with an intuitive and elegant visualization tool, Kibana, the complete Elastic Stack provides an end-to-end solution for your analytics needs.
In this guide, we will explore using Elasticsearch to visualize IoT sensor data. We will use IoTIFY's DBTool to quickly populate virtual sensor data into the database.
Approximate time: 30 minutes.
Objective: At the end of this guide, you will be able to visualize sensor data in a variety of ways and then take it further to develop your own solution on top.
So let’s get started.
Setting up Elasticsearch and Kibana is beyond the scope of this guide. We highly recommend trying the hosted platform at elastic.co: it is developed and maintained by Elastic itself and comes with a 14-day free trial to get you going. For this guide, we will use the hosted version of the platform at https://www.elastic.co/cloud. You could also download a copy of Elasticsearch and Kibana and host them on your own server.
Sign up with Elastic and create your first cluster. In this guide, we will be using Elasticsearch 5.1.2. Note the Elastic password that is generated automatically while the cluster is being configured. Save it in a safe place, as you will need it soon to log in to Kibana.
Wait until the cluster is up and running. Creating a new cluster usually takes 3-5 minutes. You can check the status of the cluster in the Overview - Node section. Once the cluster is created, go to the Configuration tab in the left menu and enable Kibana in your configuration.
Endpoints: Take note of these two endpoint URLs in the overview section.
HTTP: We will use the HTTPS endpoint in IoTIFY DBTool as the Elasticsearch URL.
Kibana: We will use this endpoint to log in to Kibana with username elastic.
Use the Kibana URL in the overview section as shown above to log in to Kibana, using the username elastic and the password generated previously by the system. You will see the following screen.
Don’t worry about the missing index pattern; we will create an index after we insert data into Elasticsearch shortly.
As a first step, we need to create a user account for IoTIFY DBTool that can insert data into Elasticsearch.
In the Kibana UI, go to Management -> Elasticsearch -> Users and create a new user named iotify. For this example, we will assign this user the superuser role so that it can both create an index and insert records into the database. In a production environment, you may want to create separate roles for creating indexes and inserting data into them. Configure it as follows:
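The same user can also be created without the UI, through the X-Pack security REST API that ships with Elasticsearch 5.x. The sketch below (Python standard library only) merely builds the request so you can see its shape; the cluster URL and passwords are placeholders you must replace, and nothing is actually sent.

```python
import json
from urllib.request import Request

# Placeholder values -- substitute your own cluster endpoint and passwords.
CLUSTER = "https://abcd.eu-west-1.aws.found.io:9243"

def build_create_user_request(username, password, roles):
    """Build an X-Pack security request that creates (or updates) a user.

    In Elasticsearch 5.x the endpoint is POST /_xpack/security/user/<name>.
    """
    body = json.dumps({"password": password, "roles": roles}).encode()
    return Request(
        CLUSTER + "/_xpack/security/user/" + username,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_create_user_request("iotify", "<iotify-password>", ["superuser"])
print(req.full_url)  # where the request would be sent
```

To actually send it, pass the request to `urllib.request.urlopen` (or any HTTP client) authenticated as the built-in elastic admin user.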
Once the user is created, it’s time to insert data into Elasticsearch using DB Tool.
If you don’t have an account at iotify.io, sign up for free. The free account allows you to insert enough data to follow the guide. If you already have a way to insert data into your cluster, skip to Step 5 (in that case, you probably won’t need this guide anyway).
Go to the Database section in the IoTIFY application and create a new template.
For this use case, we will create a simple template in the Database Tool as follows:
Explanation: temperature is a number with a mean value of 35 and a standard deviation of 15; date is the timestamp field for each record. It is important to use moment() here because the time format produced by moment().format() is automatically recognized by Elasticsearch's dynamic mapping as a date field.
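For reference, the template above boils down to records of the following shape. Here is a small Python sketch of the same logic; the field names and distribution parameters match the template, while the start date and everything else are illustrative:

```python
import random
from datetime import datetime, timedelta

def generate_records(n, start, interval_s=5, mean=35.0, stddev=15.0):
    """Mimic the DBTool template: one record every interval_s seconds."""
    records = []
    for i in range(n):
        ts = start + timedelta(seconds=i * interval_s)
        records.append({
            # Same idea as moment().format(): an ISO-8601 string,
            # which Elasticsearch's dynamic mapping detects as a date.
            "date": ts.isoformat(),
            # Normally distributed reading, mean 35, stddev 15.
            "temperature": random.gauss(mean, stddev),
        })
    return records

sample = generate_records(100, datetime(2017, 2, 1))
print(sample[0])
```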
Save this template with the name "sensor".
This is how the template editor will look once you have created the template.
Once the template is ready, click on the Preview button to make sure everything works as expected. Then click on the Generate menu at the top to open the database generation dialog.
In the generation dialog, select the sensor template you created earlier on the left side, then specify the following values in the form:
Name of Database: This will be used as the Elasticsearch index name. We will use iotify as the index name for now.
Number of Records: In this example we will use 100 records, but you can generate many more depending on your account type.
Reference Date: The timestamp of the first record. Set it to a recent value such as today or yesterday.
Interval: The duration between consecutive records. In this example, we will use 5 seconds.
Important: Providing a correct reference date and interval is essential for this demo. Please don’t leave these fields blank.
Elasticsearch URL: Get the HTTPS endpoint of your newly created Elasticsearch cluster (from the cluster overview). It is important to use HTTPS (the secure version) rather than HTTP.
Add the iotify username and password created in Step 2 to the HTTPS URL. If your original URL is https://abcd.eu-west-1.aws.found.io:9243, after adding your username and password it becomes https://iotify:&lt;password&gt;@abcd.eu-west-1.aws.found.io:9243. Remember to add the @ character at the end of the password.
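If you prefer to build the credential-embedded URL programmatically (useful when the password contains characters that must be percent-encoded), a small Python sketch using only the standard library could look like this; the password s3cret is of course a placeholder:

```python
from urllib.parse import quote, urlsplit, urlunsplit

def with_basic_auth(url, username, password):
    """Embed user:password@ into an https URL, escaping special characters."""
    parts = urlsplit(url)
    netloc = "{}:{}@{}".format(
        quote(username, safe=""), quote(password, safe=""), parts.netloc
    )
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))

url = with_basic_auth("https://abcd.eu-west-1.aws.found.io:9243", "iotify", "s3cret")
print(url)  # https://iotify:s3cret@abcd.eu-west-1.aws.found.io:9243
```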
Click on the Start Generation button to insert sample records into the database. If everything goes well, you will see a green progress bar moving forward; otherwise, you will see an error indication.
A common mistake here is not inserting the correct HTTPS URL from the Cluster, or missing the @ character after the password.
If the above step was successful, the data has been inserted into the Elasticsearch index. Now it’s time to analyze it. Go to Kibana once again and provide the index name iotify (if you generated data under a different name, use that instead):
Thanks to dynamic mapping, Elasticsearch will automatically recognize the date field as a time field. Leave the default value in the Time-field name box and click Create.
If the date field has not been recognized as a time field, you probably didn’t create the template properly or entered an incorrect index name above.
Now go to the Discover section in Kibana. Most likely you will see a blank screen; don’t panic just yet. The data is already in the database, you simply need to select the right time range to view your records. Click on the time picker in the top right corner and set the range to This Month. Remember, you specified the reference date and interval in DB Tool while generating the records.
And voila, you will see this:
Great, so Elasticsearch has populated your records into its database. Now it’s time to play with them.
As a final step, we will create a line chart visualization of the average sensor value. Click on the Visualize tab in Kibana and select the line chart type.
Choose iotify as the index.
Provide the following values on the left side of the chart, as shown in the picture below.
Initially, you will see only a green dot on the chart. The reason is simple: you have inserted only 100 records, 5 seconds apart, while the chart’s horizontal time range is very large. Simply click and drag around the green dot to zoom in on the date range.
Once you have zoomed in sufficiently on the exact date range, you will see the sensor data pattern as follows:
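Under the hood, this line chart corresponds roughly to a date_histogram aggregation with a nested avg sub-aggregation, which you could also POST to the /iotify/_search endpoint yourself. A sketch of such a query body follows; the aggregation names over_time and avg_temperature are just labels chosen here:

```python
import json

def avg_over_time_query(interval="1m"):
    """Roughly the aggregation Kibana runs for the line chart:
    bucket records by time, then average temperature per bucket."""
    return {
        "size": 0,  # we only want aggregation results, not raw hits
        "aggs": {
            "over_time": {
                "date_histogram": {"field": "date", "interval": interval},
                "aggs": {
                    "avg_temperature": {"avg": {"field": "temperature"}}
                },
            }
        },
    }

print(json.dumps(avg_over_time_query(), indent=2))
```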
Next, we will plot the distribution of the temperature readings in buckets of 5 degrees (i.e. how many values fell between 0 and 5, how many between 5 and 10, and so on).
Go to the Visualize menu again and choose the Vertical Bar chart. Provide the configuration values as shown in the picture below. Make sure the time range at the top is set to This Month.
As we originally used a mean value of 35 in our template, the largest number of records falls in the buckets around 35.
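What the bar chart computes is essentially Elasticsearch's histogram aggregation with an interval of 5. The same bucketing can be sketched in a few lines of Python; the sample readings are made up for illustration:

```python
from collections import Counter

def histogram(values, bucket_size=5):
    """Count readings per bucket, like Elasticsearch's histogram
    aggregation: a value v lands in the bucket starting at
    (v // bucket_size) * bucket_size."""
    counts = Counter(int(v // bucket_size) * bucket_size for v in values)
    return dict(sorted(counts.items()))

readings = [31.2, 33.8, 36.5, 29.9, 41.0, 34.4]
print(histogram(readings))  # {25: 1, 30: 3, 35: 1, 40: 1}
```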
From here, you could take the analysis further, for example:
- Show instances where the sensor value remains abnormally high for a long time
- Filter out spurious sensor values (those that do not fit the pattern)
- Correlate two sensor values to predict a condition (e.g. high humidity and low temperature indicate rain)
We would be happy to learn more about your thoughts in our forum.
Follow us on Twitter to keep receiving interesting content.
Hope you enjoyed it!