This is a discussion post and these are currently my thoughts on this topic. I would be grateful for comments and feedback.
MQTT is still in the early stages of deployment and is currently used mainly on private networks.
However, as adoption of the MQTT protocol for information distribution grows, the number of public brokers and topics will probably increase substantially, just as happened with websites in the early days of the internet.
Currently there are no established topic naming conventions, and there is no established method of publishing data over MQTT to a public audience.
So if you decide you want to make your data available over MQTT as opposed to, or in addition to, a traditional website, how would you do it?
There are several public brokers available, but they are currently used mainly for testing purposes. See MQTT broker options
If you decided to use one of these brokers how would your intended audience know you were using it?
Let’s say, for example, you want to publish airplane arrival and departure information. How would a user know which broker and topic you were publishing the data on?
One solution, and the one I expect to be used in the early adoption phases, is to use the existing website to link to the data.
However, going forward I would expect that MQTT-only solutions will need to be created.
MQTT Directories and Search Engines
The “how do I find a website?” problem that arose in the early days of the web was solved first by directories, notably Yahoo, and later by search engines.
How Search Engines Work and MQTT
Search as we currently know it is only possible due to the nature of web pages and links.
If web pages didn’t contain links to other sites, then the only way a search engine could learn about a new website would be for the website owner to add the site to the search engine manually.
This is the nature of web directories, and because they need to be manually maintained they were largely replaced by search.
Because MQTT messages don’t contain well-defined structured data like web pages, and don’t contain hyperlinks, search engines cannot crawl an MQTT data feed and discover new topics and brokers.
Therefore a directory-type structure is the most likely solution, but with automatic registration and maintenance.
MQTT Server and Topic Naming
With websites, the domain name identifies the website and also the server hosting it.
In MQTT the topic and the MQTT server hosting that topic are not related.
In fact a topic could be made available on several MQTT brokers.
So for a user to subscribe to a topic, e.g. airline arrival information, they would need to know:
- The broker or server name hosting that topic
- The topic name.
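To make the two-part requirement concrete, here is a minimal sketch of how a link on an existing website could carry both pieces of information. The mqtt://host[:port]/topic link format, the FeedAddress class and the parse_feed_link function are my own assumptions for illustration and are not part of any MQTT standard. Port 1883 is the usual unencrypted MQTT port.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class FeedAddress:
    """The two things a subscriber must know: the broker and the topic."""
    host: str
    port: int
    topic: str


def parse_feed_link(link: str, default_port: int = 1883) -> FeedAddress:
    """Split a hypothetical mqtt://host[:port]/topic link into its parts."""
    parts = urlparse(link)
    if parts.scheme != "mqtt" or not parts.hostname:
        raise ValueError(f"not an mqtt:// link: {link!r}")
    # Everything after the first "/" is treated as the topic name.
    topic = parts.path.lstrip("/")
    if not topic:
        raise ValueError("link does not name a topic")
    return FeedAddress(parts.hostname, parts.port or default_port, topic)


# Example link: the broker name and topic are made up for illustration.
address = parse_feed_link("mqtt://broker.example.com/airline/arrivals")
print(address.host, address.port, address.topic)
```

A client would then connect to address.host and subscribe to address.topic. Topics containing characters with a special meaning in URLs (such as #) would need escaping, which this sketch does not attempt.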
However, if the airline already has a web page displaying flight information over HTTP, then it can easily transition that page to use MQTT over WebSockets and deliver the changes using MQTT.
The end user would use traditional search to locate the web page, and would be unaware that MQTT was being used to transmit the data. See flight arrivals
However, what if the airline doesn’t currently have a website, and just wants to make the data available over MQTT?
Ideas for Locating MQTT topics on a Broker
To locate brokers and topics, information about them could be published to a collection of known topics.
To make these known topics available on all brokers, they would need to be reserved.
This would remove the need for a client to subscribe to all topics to discover which topics were being used.
On a public broker a client would need to discover whether the topic it wanted to use was already in use before it published data on that topic.
The $SYS topic is already semi-reserved and could serve as a starting point.
All root topics starting with a $ could be classed as reserved, and reserved topics should be in all capital letters.
A broker could publish a list of reserved topics on that broker using the reserved topic $RESERVED.
Only the broker administrator would be allowed to modify this list.
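The naming rule suggested above (a root level starting with $, written in capitals) is simple enough to check in code. This is only a sketch of the proposed convention, not something brokers enforce today, and the function names are my own.

```python
def is_reserved(topic: str) -> bool:
    """Return True if the root level of the topic starts with '$'."""
    root = topic.split("/", 1)[0]
    return len(root) > 1 and root.startswith("$")


def follows_reserved_convention(topic: str) -> bool:
    """Return True if the topic is reserved and its root is all capitals."""
    root = topic.split("/", 1)[0]
    return is_reserved(topic) and root == root.upper()


for name in ("$SYS/broker/uptime", "$TOPICS/bbc/radio1234", "$topics/bbc", "bbc/radio1234"):
    print(name, is_reserved(name), follows_reserved_convention(name))
```

A client could run a check like this before publishing, and a broker could use the same test to decide whether a publish needs administrator rights.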
Topic Discovery Using a Reserved Topic - $TOPICS
Every broker could maintain a reserved topic called $TOPICS or $_TOPICS.
There are two possible approaches:
1. Publishing clients could be responsible for publishing information about their topics on this root topic.
2. Alternatively the broker could publish a topic list.
If we take the example of a client publishing, the process could go something like this.
Suppose you wanted to use the topic bbc/radio1234:
Step 1 - Subscribe to $TOPICS/bbc/radio1234.
If a message is received then this topic is taken, and you must choose another topic.
Step 2 - If no message is received then publish a message on $TOPICS/bbc/radio1234 with the retain flag set to reserve this topic.
Step 3 - You can now publish on bbc/radio1234.
Step 4 - If you no longer need this topic then clear the retained message on $TOPICS/bbc/radio1234 by publishing an empty message with the retain flag set.
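The four steps can be sketched without a real broker by modelling only the retained-message behaviour they rely on. The RetainedStore class below is an in-memory stand-in that I have made up for illustration; it is not an MQTT client, and the function names are assumptions. It does mirror one real MQTT rule: publishing an empty retained payload removes the retained message.

```python
from typing import Dict, Optional


class RetainedStore:
    """In-memory stand-in for a broker's retained messages (illustration only)."""

    def __init__(self) -> None:
        self._retained: Dict[str, bytes] = {}

    def publish_retained(self, topic: str, payload: bytes) -> None:
        # As in MQTT, an empty retained payload clears the retained message.
        if payload:
            self._retained[topic] = payload
        else:
            self._retained.pop(topic, None)

    def read_retained(self, topic: str) -> Optional[bytes]:
        # A new subscriber would be sent this message, if there is one.
        return self._retained.get(topic)


def claim_topic(store: RetainedStore, topic: str, info: bytes) -> bool:
    """Steps 1 and 2: check $TOPICS/<topic>, then reserve it if it is free."""
    if not info:
        raise ValueError("info must not be empty, or nothing would be retained")
    registry_topic = f"$TOPICS/{topic}"
    if store.read_retained(registry_topic) is not None:
        return False  # Step 1: a message exists, so the topic is taken
    store.publish_retained(registry_topic, info)  # Step 2: reserve it
    return True  # Step 3: the caller may now publish on the topic itself


def release_topic(store: RetainedStore, topic: str) -> None:
    """Step 4: clear the reservation with an empty retained message."""
    store.publish_retained(f"$TOPICS/{topic}", b"")


store = RetainedStore()
print(claim_topic(store, "bbc/radio1234", b'{"title": "Radio show xyz"}'))
print(claim_topic(store, "bbc/radio1234", b'{"title": "Another client"}'))
release_topic(store, "bbc/radio1234")
```

Against a real broker the check in step 1 and the publish in step 2 are separate operations, so two clients could both see the topic as free and both claim it. A real scheme would probably need the broker to arbitrate.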
The message could contain information similar to that included on web pages in the title, description and keywords tags, along with other control information such as a timestamp.
- Title: Voice transcript of radio show xyz
- Description: Live feed of voice transcript of radio show xyz
- Keywords: radio, show, bbc, transcript
- Timestamp: Used along with the retain period to remove stale retained messages
- Retain: Retain period in days
It could also contain user defined keys.
The message would be JSON encoded.
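As a rough sketch, the registration message and the stale check might look like this. The lower-case key names, the use of Unix seconds for the timestamp, and the function names are all my assumptions; nothing here is an agreed format.

```python
import json
import time
from typing import List, Optional

SECONDS_PER_DAY = 86400


def build_topic_info(title: str, description: str, keywords: List[str],
                     retain_days: int, now: Optional[float] = None) -> str:
    """Build the JSON payload published on $TOPICS/<topic>."""
    record = {
        "title": title,
        "description": description,
        "keywords": keywords,
        # Assumption: timestamp is whole seconds since the Unix epoch.
        "timestamp": int(time.time() if now is None else now),
        "retain": retain_days,  # retain period in days
    }
    return json.dumps(record)


def is_stale(payload: str, now: Optional[float] = None) -> bool:
    """Return True once the retain period after the timestamp has passed."""
    record = json.loads(payload)
    current = time.time() if now is None else now
    expires_at = record["timestamp"] + record["retain"] * SECONDS_PER_DAY
    return current > expires_at


payload = build_topic_info(
    "Voice transcript of radio show xyz",
    "Live feed of voice transcript of radio show xyz",
    ["radio", "show", "bbc", "transcript"],
    retain_days=7,
)
print(payload)
print(is_stale(payload))
```

User-defined keys could simply be added to the same JSON object, since readers that do not recognise a key can ignore it.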
Finding brokers could be made easier if each broker published a list of known MQTT brokers. This is especially useful if the topic is available on multiple brokers.
Again, this could use a reserved topic such as:
$BROKERS or $_BROKERS
The brokers could be divided into regions or countries similar to the way DNS is organised.
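One way to picture the DNS-like organisation is to put the region and country into the topic levels beneath $BROKERS. The layout $BROKERS/<region>/<country>/<broker host> used below is purely an assumption for illustration, as are the function names and the example host names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def broker_topic(region: str, country: str, host: str) -> str:
    """Build a directory topic using the hypothetical four-level layout."""
    return f"$BROKERS/{region}/{country}/{host}"


def group_by_region(topics: Iterable[str]) -> Dict[str, List[str]]:
    """Collect broker host names per region, ignoring unrelated topics."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for topic in topics:
        levels = topic.split("/")
        if len(levels) == 4 and levels[0] == "$BROKERS":
            grouped[levels[1]].append(levels[3])
    return dict(grouped)


directory = [
    broker_topic("eu", "uk", "broker1.example.com"),
    broker_topic("eu", "de", "broker2.example.com"),
    broker_topic("na", "us", "broker3.example.com"),
]
print(group_by_region(directory))
```

With a layout like this a client interested in one region could subscribe to just that branch, for example $BROKERS/eu/#, rather than the whole directory.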
Extending the Concept to User/Organisational Topic Roots
If an organisation uses a topic root of abccorp then they could use the topic abccorp/$TOPICS to announce the topics that they are publishing.
Every broker would broadcast a list of topics, or topic roots, that are currently active on that broker.
Root brokers would maintain a list of MQTT brokers like DNS maintains a list of domain names.
MQTT itself gives a client no standard way to find out which other clients are currently connected to a broker.
However, a simple mechanism would be for a client to publish its connection status on a reserved topic, e.g. $CONNECTED or $CLIENTS.
This would take the form of a topic of $CLIENTS/Client-id with the status in the payload:
1 or True for connected and 0 or False for not connected.
Note: Mosquitto already provides a mechanism for restricting access to a topic by client ID, using the pattern keyword in the ACL. See Mosquitto ACL
Topics Subscribed to
In order to know the popularity of a particular topic it would be good to know how many clients are currently subscribed to a particular topic.
This information could easily be provided by the broker.
However, it could also be collated by clients if each client published a list of the topics it was publishing or subscribing to.
Again, the connecting client could publish this alongside its connection status, e.g.
Topic: $CONNECTED/Client-id or $CLIENTS/Client-id
Payload: a JSON-encoded string of the form:
Status: True or False, Topics: topic list
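Here is a minimal sketch of that status message, together with how a client could collate topic popularity from the messages it receives. The lower-case key names status and topics and the function names are my assumptions, and the payloads are built locally rather than received from a broker.

```python
import json
from collections import Counter
from typing import Iterable, List, Tuple


def status_message(client_id: str, connected: bool,
                   topics: List[str]) -> Tuple[str, str]:
    """Return the (topic, payload) pair a client would publish."""
    topic = f"$CONNECTED/{client_id}"
    payload = json.dumps({"status": connected, "topics": topics})
    return topic, payload


def topic_usage(payloads: Iterable[str]) -> Counter:
    """Count how many connected clients list each topic."""
    counts: Counter = Counter()
    for payload in payloads:
        record = json.loads(payload)
        if record["status"]:  # ignore clients reporting as disconnected
            counts.update(set(record["topics"]))  # one count per client
    return counts


messages = [
    status_message("client-1", True, ["bbc/radio1234", "airline/arrivals"]),
    status_message("client-2", True, ["bbc/radio1234"]),
    status_message("client-3", False, ["bbc/radio1234"]),
]
usage = topic_usage(payload for _, payload in messages)
print(usage.most_common())
```

This only counts what clients choose to report, so figures collated this way would be less reliable than figures published by the broker itself.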
There is clearly a lot of work to be done on this topic. Even if public brokers don’t become popular, having a standardized or generally accepted topic structure would be beneficial to private deployments.
How far this idea goes is uncertain, but I will be developing a series of demo scripts that implement the ideas developed here.
If there is enough interest in this topic I will make it a GitHub project.
This post is effectively a trial balloon and I would be glad of any feedback that you may have.
Resources and References
- The Homie convention – Attempts to standardize sensor discovery.
- Introduction to MQTT + Sparkplug for IIoT
- Checking Active MQTT Client Connections
- MQTT Topic and Payload Design Notes
- Republish HTML Data Over MQTT (Flight Arrivals)