Logging MQTT Sensor Data

Most MQTT brokers don’t provide any mechanism for logging historical data for later analysis.

However due to the publish/subscribe nature of MQTT is is easy to monitor, and log a data stream or streams using an MQTT client.

In fact this method may in many circumstances better than logging the data on the broker/server especially when the broker is being shared.

Logging Methods

There are two common ways of storing sensor data.

  • A file
  • A Database

Generally logging to a text file is faster and easier than logging to a database.

However if the data needs to be queried at a later date then logging to a database makes this much easier.

ref Using a database for logging

Database Types

There are two database types to choose from

  • SQL – SQlite, MYSQL etc
  • NOSQL -MongoDB etc

refSQL or NoSQL

Sensor Data

Many sensors like state sensors (ON/OFF), temperature etc will fluctuate very little over the course of a day so does it makes sense to store the same sensor value every x seconds?

Take for example a light sensor that publishes its status every second.

After 1 hour it has published 3600 status messages.

If we assume that the light status hasn’t changed during that period then it doesn’t make much sense to store the 3600 status messages.

The only messages that need to be stored are changed messages.

With this method if our light changed state twice in the hour then we would store 2 messages.

Over a day assuming 20 state changes we would store 20 messages rather than 86,400(24*3600) messages.

Preferably it would be better if the sensor only sent changes as this would also reduce the network traffic.

However even if the sensor does send data a regular intervals regardless of state then it doesn’t make sense to store it.

Therefore a data logger should have the ability to only log changes.

Notes:

1. The sensor needs to publish with the retain message flag set.

2. A sensor shouldn’t publish a time stamp with the sensor data if it needs to be logged.

If it does the message will appear to have changed and the data logger would need to take this into account.

Here is a screen shot of some real data that I collected from a public broker.

Notice the only change is the time stamp. Therefore if the time stamp is published with the data then it is a good idea to remove it if possible or not to include it in the changed data check.

data-logging

Sensor Data Format

Sensor data can be simple state or value data e.g On/OFF,26.1of more complex JSON encoded data.

MQTT Data and topics

MQTT data is organised in topics and if you are monitoring several  MQTT topics then you may want to log each topic stream to its own file rather than logging all topics to a single file.

Data Loggers

I have created several data loggers for logging IOT data and specifically MQTT data.

Simple Python MQTT Data logger Python script Logs data to text file as JSON encoded Data.

SQL MQTT data Logger  Python script which logs data to a sqlite database.

Simple MQTT Topic logger -Python script Logs data by topic to a text file as JSON encoded Data

Node-Red Data and Topic Logger- Node Red Flow with web user interface that lets you choose to log by topic or not.

Logging Data To a SQL Database with Node-Red-How to store data in a SQL database using SQlite

Resources and related tutorials

Please rate? And use Comments to let me know more

17 comments

  1. Hi Steve
    Great site. I often find myself back on your site regarding MQTT things.
    I note one post from Christian Valerius regarding monitoring of bee hives / apiary
    I am working on somthing similar, so it may be great to share and exchange ideas with him.
    Would you mind sharing my email address with him to see if he would be interested in doing this ?
    Many thanks Steve

  2. I am trying to run your logger as a cron job @reboot. The logger starts and creates the log file, but it is empty. When I run the logger from a console, things work as expected. Any clue why it doesn’t work as a cron job?

  3. I’m running a number of esp-32 microcontrollers with ethernet, wired into fire panels around our base. if any of the relays in the panel trigger (fire, trouble, supervisor), an event is published to a topic, specific for that panel / relay. There is a heartbeat signal that is also published. On the broker, a python script inserts the event into MySQL, and the heartbeat event to a separate table, that updates (not inserts) that topic. Another python script emails events and heartbeat change (loss / re-aquire) to a distro list on our exchange server. A web page on our server uses ajax to display the status of each topic it discovers, color-coded for clear, active, or loss of heartbeat (green / red / gray). That web page is monitored by the base fire dept. A separate microcontroller subscribes to various topics and activates strobes and alarms as necessary. A very clean and functional solution that saved us a few hundred $$ on a central monitoring system.

  4. Steve, brilliant work and easy to understand.

    I am working on a system to monitor several beehives (beeyards) collecting several physical units (such as weight, temperature, humidity, sound level, …). I will need the data (logging periode 1 year and longer) for analytics of the bee’s behavious in relation to weather (e.g. precipitation) and other circumstances. Hence historic data are important. SQlite seems to be the best choice, as my MQTT broker in running on a Raspberry PI.

    Rgds Christian

  5. You said that “A sensor shouldn’t publish a timestamp with the sensor data if it needs to be logged.” Why not? In my thinking, it seems almost elemental that you would want a timestamp from the sensor to accompany the sensor data, regardless of logging status.

    1. When logging sensor data you can choose to log only changes if you send a timestamp then the data always changes.
      So when looking at what data you send then you should also consider what you are doing with it. Even if you send the timestamp you could always ignore it when considering a change. The data logger available on the site doesn’t do this but you can always change that.
      When creating sensors being able to disable sending a time stamp would be useful as well as only sending data changes.
      Rgds
      Steve

      1. I believe using a time series dB like influx would allow you to log just the data without timestamp and the DB will automatically timestamp it for you. This is probably only useful when latency in data logging is not as crucial.

        1. I tried influxdb but it was difficult to setup and a bit of an overkill for most I believe.
          Rgds
          Steve

  6. Thanks for you project.

    is there a project which allows to visualize the information of the log files in a graphical interface?

    1. Hi
      No not at the moment but I will be doing a video on using node-red graphs to display both real time and stored data in a few weeks time as time permits.
      What programming language would you be using ? Python ?
      rgds
      steve

      1. Hi Steve, thanks a lot for your tutorials! Your website is a great resource.
        Would you consider a guide on using Grafana to display SQL data?

  7. Hi Steve, your work is amazing. Your work helped me a lot. I am using this data logger with little modifications in my major project of IoT. I am building an IOT based system which gathers sensor data for monitoring activities of elderly people. Your explanation is also good.
    Thanks.

Leave a Reply

Your email address will not be published. Required fields are marked *