Most MQTT brokers don’t provide any built in mechanism for logging historical data for later analysis.
However due to the publish/subscribe nature of MQTT is is easy to monitor, and log a data stream or streams using an MQTT client.
In fact this method may in many circumstances be better than logging the data on the broker/server especially when the broker is being shared.
Logging Methods
There are two common ways of storing sensor data.
- A file
- A Database
Generally logging to a text file is faster and easier than logging to a database.
However if the data needs to be queried at a later date then logging to a database makes this much easier.
ref– Using a database for logging
Database Types
There are two database types to choose from
- SQL – SQlite, MYSQL etc
- NOSQL -MongoDB etc
ref– SQL or NoSQL
Sensor Data
Many sensors like state sensors (ON/OFF), temperature etc will fluctuate very little over the course of a day so does it makes sense to store the same sensor value every x seconds?
Take for example a light sensor that publishes its status every second.
After 1 hour it has published 3600 status messages.
If we assume that the light status hasn’t changed during that period then it doesn’t make much sense to store the 3600 status messages.
The only messages that need to be stored are changed messages.
With this method if our light changed state twice in the hour then we would store 2 messages.
Over a day assuming 20 state changes we would store 20 messages rather than 86,400(24*3600) messages.
However even if the sensor does send data a regular intervals regardless of state then it doesn’t make sense to store it.
Therefore a data logger should have the ability to only log changes.
Notes:
1. The sensor needs to publish with the retain message flag set.
2. A sensor shouldn’t publish a time stamp with the sensor data if it needs to be logged. If it does the message will appear to have changed and the data logger would need to take this into account.
Here is a screen shot of some real data that I collected from a public broker.
Notice the only change is the time stamp. Therefore if the time stamp is published with the data then it is a good idea to remove it if possible or not to include it in the changed data check.
Sensor Data Format
Sensor data can be simple state or value data e.g On/OFF,26.1 or more complex JSON encoded data.
MQTT Data and Topics
MQTT data is organised in topics and if you are monitoring several MQTT topics then you may want to log each topic stream to its own file rather than logging all topics to a single file.
Data Loggers
I have created several data loggers for logging IOT data and specifically MQTT data.
Simple Python MQTT Data logger Python script Logs data to text file as JSON encoded Data.
SQL MQTT data Logger Python script which logs data to a sqlite database.
Simple MQTT Topic logger -Python script Logs data by topic to a text file as JSON encoded Data
Node-Red Data and Topic Logger- Node Red Flow with web user interface that lets you choose to log by topic or not.
Logging Data To a SQL Database with Node-Red-How to store data in a SQL database using SQlite
Resources and related tutorials
- Simple Controllable MQTT Sensor Simulator in Python
- MQTT Sensors and Network Traffic Observations
- MQTT Topic and Payload Design Notes
- Checking Active MQTT Client Connections
Hi Steve
Great site. I often find myself back on your site regarding MQTT things.
I note one post from Christian Valerius regarding monitoring of bee hives / apiary
I am working on somthing similar, so it may be great to share and exchange ideas with him.
Would you mind sharing my email address with him to see if he would be interested in doing this ?
Many thanks Steve
I have forwarded your comment
I am trying to run your logger as a cron job @reboot. The logger starts and creates the log file, but it is empty. When I run the logger from a console, things work as expected. Any clue why it doesn’t work as a cron job?
Peter
No sorry But I did find this
https://stackoverflow.com/questions/6766730/python-multithreading-issue-in-cronjob
If that helps or doesn’t let me know and I will try it out myself
Rgds
Steve
I’m running a number of esp-32 microcontrollers with ethernet, wired into fire panels around our base. if any of the relays in the panel trigger (fire, trouble, supervisor), an event is published to a topic, specific for that panel / relay. There is a heartbeat signal that is also published. On the broker, a python script inserts the event into MySQL, and the heartbeat event to a separate table, that updates (not inserts) that topic. Another python script emails events and heartbeat change (loss / re-aquire) to a distro list on our exchange server. A web page on our server uses ajax to display the status of each topic it discovers, color-coded for clear, active, or loss of heartbeat (green / red / gray). That web page is monitored by the base fire dept. A separate microcontroller subscribes to various topics and activates strobes and alarms as necessary. A very clean and functional solution that saved us a few hundred $$ on a central monitoring system.
Steve, brilliant work and easy to understand.
I am working on a system to monitor several beehives (beeyards) collecting several physical units (such as weight, temperature, humidity, sound level, …). I will need the data (logging periode 1 year and longer) for analytics of the bee’s behavious in relation to weather (e.g. precipitation) and other circumstances. Hence historic data are important. SQlite seems to be the best choice, as my MQTT broker in running on a Raspberry PI.
Rgds Christian
Good luck with the project.
You said that “A sensor shouldn’t publish a timestamp with the sensor data if it needs to be logged.” Why not? In my thinking, it seems almost elemental that you would want a timestamp from the sensor to accompany the sensor data, regardless of logging status.
When logging sensor data you can choose to log only changes if you send a timestamp then the data always changes.
So when looking at what data you send then you should also consider what you are doing with it. Even if you send the timestamp you could always ignore it when considering a change. The data logger available on the site doesn’t do this but you can always change that.
When creating sensors being able to disable sending a time stamp would be useful as well as only sending data changes.
Rgds
Steve
I believe using a time series dB like influx would allow you to log just the data without timestamp and the DB will automatically timestamp it for you. This is probably only useful when latency in data logging is not as crucial.
I tried influxdb but it was difficult to setup and a bit of an overkill for most I believe.
Rgds
Steve
Thanks for you project.
is there a project which allows to visualize the information of the log files in a graphical interface?
Hi
No not at the moment but I will be doing a video on using node-red graphs to display both real time and stored data in a few weeks time as time permits.
What programming language would you be using ? Python ?
rgds
steve
Hi Steve, thanks a lot for your tutorials! Your website is a great resource.
Would you consider a guide on using Grafana to display SQL data?
I will take a look tks for the suggestion
Rgds
Steve
Hi Steve, your work is amazing. Your work helped me a lot. I am using this data logger with little modifications in my major project of IoT. I am building an IOT based system which gathers sensor data for monitoring activities of elderly people. Your explanation is also good.
Thanks.
Tks for the comment and good luck with your project
rgds
steve