[IoT] Stream Analytics reference data updates

If you’ve read my post on Azure Stream Analytics, you’ve seen how to configure a reference data blob that incoming IoT data is compared against, for example to check certain thresholds. The reference data is stored in Azure blob storage, within a specific folder structure.

Now what about updating that file? I found that updates I made to my blob weren’t picked up by the ASA job when I published them. The folder structure I was using was like this:

devicerules/{date}/{time}/devicerules.json

ASA monitors that pattern and picks up a blob once the {date} and {time} values in its path line up with the current date and time. This way you can publish reference data that takes effect right now (using DateTime.Now), but also data that takes effect in the future. Also, when you start a job with a date in the past, ASA will use the correct reference data depending on the date and time of the incoming stream data. More information about this can be found here: https://azure.microsoft.com/en-us/documentation/articles/stream-analytics-use-reference-data/.
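To make that selection rule concrete, here is a minimal sketch of the idea in Python. It is not how ASA is implemented. It only illustrates "use the most recent blob whose encoded date and time is not later than the event time". The function names are hypothetical, and the sketch assumes a yyyy-MM-dd date format and an hour-only time format.

```python
from datetime import datetime

def parse_effective_time(path: str) -> datetime:
    """Read the date and hour encoded in a path such as
    'devicerules/2016-04-21/09/devicerules.json' (assumed layout)."""
    _, date_part, time_part, _ = path.split("/")
    return datetime.strptime(f"{date_part} {time_part}", "%Y-%m-%d %H")

def select_reference_blob(paths, event_time):
    """Return the path with the latest encoded time that is <= event_time,
    or None when no blob is effective yet."""
    candidates = [p for p in paths if parse_effective_time(p) <= event_time]
    if not candidates:
        return None
    return max(candidates, key=parse_effective_time)

paths = [
    "devicerules/2016-04-21/09/devicerules.json",
    "devicerules/2016-04-21/11/devicerules.json",
    "devicerules/2016-04-22/08/devicerules.json",
]
# An event at 10:30 falls between the 09 and 11 blobs, so the 09 blob applies.
print(select_reference_blob(paths, datetime(2016, 4, 21, 10, 30)))
```

A blob with a future date and time in its path is simply ignored until that moment arrives, which is what makes scheduling updates ahead of time possible.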

Ok, so why is my job not reading the correct data? In particular, it seemed like it would pick up the first update I wrote to the file but not subsequent ones. In the above pattern, you need to configure the format used for {date} and {time}. This is done in the portal where you configure the job’s input. The weird thing is that for {time}, the portal doesn’t give you a lot of options:

[Screenshot: time format selection for the reference data input in the new portal]

This has been confirmed as a bug in the new portal: you can only select hours. This means that your reference blob is stored in a structure like:

devicerules/2016-04-21/09/devicerules.json

If you now store the rules again within the same hour, your logic will (or should) overwrite the above file with a new one. ASA does not pick up overwrites of existing files, so it keeps using the version that was published first.

There are more suitable time formats, but the new portal doesn’t let you select them. To use one, head over to the old portal (manage.windowsazure.com) and edit the job there. You’ll see some more options:

[Screenshot: time format options in the old portal]

So now you should store your blob within the following folder structure:

devicerules/2016-04-21/09-49/devicerules.json

Now that we’ve got the minutes in the format, ASA will pick up new blobs as long as the file is not republished within the same minute. For reference data this should not be an issue. If your reference data is even more volatile than that, you should probably look for other ways to analyse the incoming data.
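As an illustration of why the extra granularity matters, the sketch below builds the blob path for two publishes within the same hour, once with an hour-only time format and once with hours and minutes. The function name and the `time_format` parameter are hypothetical, and the sketch assumes the timestamps are in UTC. Uploading the blob itself is left out.

```python
from datetime import datetime, timezone

def reference_blob_path(when: datetime, time_format: str = "%H-%M") -> str:
    """Build the blob path for a publish time.
    '%H' mirrors the hour-only format, '%H-%M' the format with minutes."""
    date_part = when.strftime("%Y-%m-%d")
    time_part = when.strftime(time_format)
    return f"devicerules/{date_part}/{time_part}/devicerules.json"

first = datetime(2016, 4, 21, 9, 12, tzinfo=timezone.utc)
second = datetime(2016, 4, 21, 9, 49, tzinfo=timezone.utc)

# Hour-only format: both publishes map to the same path, so the second
# one overwrites the first and is not picked up.
print(reference_blob_path(first, "%H"))
print(reference_blob_path(second, "%H"))

# Format with minutes: each publish gets its own path.
print(reference_blob_path(first))
print(reference_blob_path(second))
```

When publishing in practice you would pass something like `datetime.now(timezone.utc)`, and a publish that lands in the same minute as the previous one would still collide.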
