The goal was to develop and deploy a microservices-based IoT application on AWS that monitors per-second data from physical IoT devices.
Considering the modular nature of the application, a microservices architecture works best in this scenario: at a high level, the services can be split into data consumption and data processing.
With a traditional monolithic architecture (a tightly coupled application), all changes must be pushed at once, which makes continuous integration impractical. In a microservices architecture, the application is built as a set of loosely coupled services, each implementing a business capability. The microservices communicate with each other over lightweight HTTP/REST.
Consider a business application with 5 different business functionalities, and a Sprint N that changes only Business Functionality 3 and has to be released.
In the traditional monolithic architecture, even a change to a single business functionality required a full release of the app, unnecessarily redeploying functionalities that were not touched. With the microservices architecture, the functionalities are broken down into services, which makes them easier to deploy and continuously integrate. When it comes to scaling, we can scale up or down only the services that need it, not the full app. Netflix, eBay, Twitter, and PayPal have all evolved from monolithic to microservices architectures. The prominent microservices were:
The Data Collection Service is a thin proxy service that exposes a single API, consumes messages from the sensors, and publishes them to the queue (in this case AWS SQS, Simple Queue Service).
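As a minimal sketch of this ingestion step, the following uses an in-memory `queue.Queue` as a stand-in for SQS (in production this would be a call to the SQS API); the function and field names are illustrative assumptions, not the actual service code:

```python
import json
import queue
import time

# Stand-in for AWS SQS; in the real service this would be an SQS client.
sensor_queue = queue.Queue()

def collect_reading(device_id, payload):
    """Single ingestion API: wrap the raw sensor payload and enqueue it."""
    message = {
        "device_id": device_id,
        "received_at": time.time(),  # server-side receive timestamp
        "payload": payload,
    }
    sensor_queue.put(json.dumps(message))
    return message
```

The service itself stays stateless: it only wraps and forwards, leaving all processing to the consumer side of the queue.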
The Data Processing Service listens to the queue (AWS SQS), consumes messages from it, and saves them in the time-series database, which in this case was AWS DynamoDB. AWS Timestream was released only a few days ago and does not yet have reviews or data points; however, both DynamoDB and Timestream are well suited to IoT-style data. Note that only the data from the IoT devices is stored in this database; for the other business-level data we used a relational MySQL database.
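The consumer side can be sketched the same way, with a dict standing in for the DynamoDB table (keyed by device and timestamp, mirroring a partition-key/sort-key layout). All names here are illustrative assumptions:

```python
import json
import queue

# Stand-ins: the queue plays the role of SQS, the dict the role of DynamoDB.
sensor_queue = queue.Queue()
time_series_table = {}  # (device_id, timestamp) -> payload

def process_one():
    """Consume one message from the queue and persist it in the time-series store."""
    raw = sensor_queue.get()
    msg = json.loads(raw)
    key = (msg["device_id"], msg["timestamp"])
    time_series_table[key] = msg["payload"]
    return key
```

In the real service this loop runs continuously, polling SQS and batching writes where possible.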
These are a collection of microservices that expose business-specific APIs.
The Auth Service dealt with the login APIs. It communicated with the API Gateway Service to validate users and user tokens. The app was stateless, so all tokens were stored in the database rather than in file-based sessions.
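A minimal sketch of this stateless token scheme, with a dict standing in for the database token table (function names and the TTL are assumptions for illustration):

```python
import secrets
import time

# Stand-in for the token table in the database; the app itself holds no session state.
token_table = {}

def login(user_id, ttl_seconds=3600):
    """Issue an opaque token and persist it with an expiry."""
    token = secrets.token_hex(16)
    token_table[token] = {"user_id": user_id,
                          "expires_at": time.time() + ttl_seconds}
    return token

def validate(token):
    """Return the user id for a valid, unexpired token, else None."""
    entry = token_table.get(token)
    if entry is None or entry["expires_at"] < time.time():
        return None
    return entry["user_id"]
```

Because validity lives entirely in the database, any instance of the service can answer a validation request, which is what makes horizontal scaling straightforward.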
The Notification Service sent notifications to the end business users in the form of:
The API Gateway Service was a proxy service (a Zuul proxy) for all API calls entering the system from outside (browsers, mobile apps, or anything else). The outside world knows only the gateway service; based on the URL, the gateway routes each call to the appropriate internal service.
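The routing idea can be sketched as a longest-prefix match over a route table, similar in spirit to how Zuul maps URL prefixes to services. The route table entries and service URLs below are hypothetical:

```python
# Hypothetical route table: URL prefix -> internal service base URL.
ROUTES = {
    "/auth": "http://auth-service",
    "/notifications": "http://notification-service",
    "/data": "http://data-collection-service",
}

def route(path):
    """Resolve an external path to an internal service URL (longest prefix wins)."""
    for prefix, service in sorted(ROUTES.items(), key=lambda kv: -len(kv[0])):
        if path.startswith(prefix):
            return service + path[len(prefix):]
    return None  # unknown routes are rejected at the gateway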
The Eureka Service is an auto-discovery service that facilitates communication between the microservices. It is a simple Spring Boot app that automatically discovers each microservice deployed in the environment.
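For reference, registering a service with Eureka is config-driven; a typical Spring Cloud client fragment looks like the following (the service name, host, and port here are assumptions, not the project's actual values):

```yaml
spring:
  application:
    name: data-collection-service   # name under which the service registers
eureka:
  client:
    service-url:
      defaultZone: http://eureka-host:8761/eureka/   # Eureka server location
```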
The Config Service is a basic service that provides external properties to all the other services, since they are config-driven. The actual properties files are stored on a Git server.
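With Spring Cloud Config, pointing the Config Service at the Git-backed properties repository is a one-key fragment; the repository URI below is a placeholder assumption:

```yaml
spring:
  cloud:
    config:
      server:
        git:
          uri: https://git.example.com/config-repo.git   # properties repo on the Git server
```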
Along with the services, several components are involved. The following components were used across the whole lifecycle:
Each of the microservices used is described in the section above.
Since the data coming in from the IoT devices is inherently time-based, it is best stored in a time-series database. For this we propose either AWS DynamoDB or AWS Timestream.
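The access pattern this choice optimizes for is "all readings for one device over a time range". A sketch of that pattern, with a dict standing in for a table whose partition key is the device id and whose sort key is the timestamp (names and sample values are illustrative):

```python
# Stand-in for a time-series table: (device_id, timestamp) -> reading.
time_series_table = {
    ("s1", 100): {"temp": 20},
    ("s1", 160): {"temp": 21},
    ("s2", 100): {"temp": 18},
}

def query_range(device_id, start_ts, end_ts):
    """Return one device's readings within [start_ts, end_ts], in time order."""
    return [
        (ts, reading)
        for (dev, ts), reading in sorted(time_series_table.items())
        if dev == device_id and start_ts <= ts <= end_ts
    ]
```

In DynamoDB this maps to a single Query on the device's partition with a range condition on the sort key, rather than a scan.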
AWS Timestream is particularly new, having just been released, whereas DynamoDB has been in use for years.
The messages from the Data Collection Service need to be processed before being stored in the database, which is a classic use case for a queue. AWS SQS (Simple Queue Service) provides a basic, simple, yet effective queue service on AWS.
CDN - the Content Delivery Network is where all the images related to the movies, theatres, and distributors are stored. AWS S3 is the best option here given its simplicity and ease of use. We debated using AWS CloudFront, but the business use case did not involve much streaming, so we settled on AWS S3.
All the other business data is structured and can therefore be stored in a relational database. We opted for AWS RDS for MySQL for this use case.
The following diagram gives a detailed view of the deployment architecture we used: