The Response from Generative AI depends on Our Intelligence more than the Intelligence within It

In-Short

CaveatWisdom

Caveat:

It is easy to type a question and get a response from generative AI; however, it is important to get the right answer for the context, because the Large Language Models (LLMs) behind generative AI are designed only to predict the next word, and they can hallucinate if they don't get the context right or if they don't have the required information within them.

Below is a screenshot of the above example and the response from the Gen AI model in Amazon Bedrock.

Wisdom:

  1. Don't be 100% sure that whatever Generative AI says is true; it can often be false, and you need to apply your critical thinking.
  2. It is important to understand the limitations of Generative AI and frame your prompts to get the right answers.
  3. Be specific about what you need and provide detailed, precise prompts.
  4. Give clear instructions instead of jargon and overly complex phrases.
  5. Ask open-ended questions instead of questions that can be answered with a yes or no.
  6. Give proper context along with the purpose of your request.
  7. Break down complicated tasks into simple tasks.
  8. Choose the right model for the task.
  9. Consider the cost factor of different models for different tasks. Sometimes traditional AI is much less costly than Generative AI.

In-Detail

In this post I will be using different LLMs available in the Amazon Bedrock service to demonstrate where the models can go wrong and show you how to write prompts in the right manner to get meaningful answers from the appropriate model.

Amazon Bedrock

Amazon Bedrock is a fully managed, serverless, pay-as-you-go service offering multiple foundation models for Generative AI through a single API.

In this repo I have discussed how to develop an Angular Web App for any scenario, accessing Amazon Bedrock from the backend. In this post I will be using the Bedrock Playground in the AWS Console.

Understanding Basics

Tokens – These are basic units of text or code that LLMs use to process and generate language. These can be individual characters, parts of words, words, or parts of sentences. These tokens are then assigned numbers which, in turn, are put into a vector that becomes the actual input to the first neural network of the LLM.

Some rules of thumb with respect to tokens (see the sketch after the lists below):

  • 1 token ~= 4 chars in English
  • 1 token ~= ¾ words
  • 100 tokens ~= 75 words

Or

  • 1-2 sentence ~= 30 tokens
  • 1 paragraph ~= 100 tokens
  • 1,500 words ~= 2048 tokens
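
As a rough illustration of the rules of thumb above, here is a minimal sketch that estimates a token count from character and word counts. The real tokenizer is model-specific, so treat the numbers only as approximations.

```python
# Rough token estimate based on the ~4 characters/token and ~3/4 word/token rules of thumb.
def estimate_tokens(text: str) -> int:
    by_chars = len(text) / 4             # ~4 characters per token in English
    by_words = len(text.split()) / 0.75  # ~3/4 of a word per token
    return round((by_chars + by_words) / 2)

print(estimate_tokens("Amazon Bedrock offers multiple foundation models through a single API."))
```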

Some common configuration options you find in Amazon Bedrock playground across models are Temperature, Top P and Maximum Length.

Temperature – By increasing the temperature you make the model more creative; by decreasing it the model becomes more stable and you get repetitive completions. By setting the temperature to zero you can disable random sampling and get deterministic results.

Top P – It is the percentile of probability from which tokens are sampled. If the value is less than 1.0, only the corresponding top percentile of options is considered, which results in more stable and repetitive results.

Maximum Length – You can control the number of tokens generated by the model by defining the maximum number of tokens.
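
Here is a minimal sketch of how these parameters are passed when invoking a model from code rather than from the playground, assuming the Amazon Titan Text request schema (other Bedrock models use different body formats):

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

body = json.dumps({
    "inputText": "Explain what a token is in one sentence.",
    "textGenerationConfig": {
        "temperature": 0,      # 0 disables random sampling for deterministic output
        "topP": 1.0,           # sample from the full probability distribution
        "maxTokenCount": 256,  # upper bound on the number of generated tokens
    },
})

response = bedrock.invoke_model(modelId="amazon.titan-text-express-v1", body=body)
print(json.loads(response["body"].read())["results"][0]["outputText"])
```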

I have used the above parameters with different variations in different models and posting only the important examples in this post.

The art of writing prompts to get right answers from Generative AI is called Prompt Engineering.

Some of the Prompting Techniques are as follows:

Zero-Shot Prompting

LLMs are trained to follow the instructions given by us, so we can start the prompt with the instructions on what to do with the given information.

In the below example we give the instruction at the start, asking the model to classify the text for sentiment analysis.
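
A zero-shot prompt of this shape (the wording here is illustrative, not the exact text in the screenshot) could look like the following:

```python
# Zero-shot: the instruction comes first, followed by the text to classify; no examples are given.
zero_shot_prompt = """Classify the sentiment of the following text as Positive, Negative or Neutral.

Text: The delivery was late and the package arrived damaged.
Sentiment:"""
```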

Few-Shot Prompting

Here we enable the model for in-context learning where we provide examples in the prompt which serve as conditioning for subsequent examples where we would like the model to generate a response.
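
A few-shot prompt of this kind (illustrative wording) provides a handful of labelled examples and leaves the last one for the model to complete:

```python
# Few-shot: labelled examples condition the model before the final, unlabelled input.
few_shot_prompt = """Text: The staff were friendly and helpful. // Positive
Text: I waited an hour and nobody answered my call. // Negative
Text: The product works exactly as described. // Positive
Text: The app keeps crashing every time I open it. //"""
```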

In the below example the model Titan Text G1 is unable to predict the right sentiment with few-shot prompts. It even says that it is unable to predict a negative opinion!

The same prompt works with another model, AI21 Labs' Jurassic-2 Mid. So you need to test and be wise before choosing the right model for your task.

Few-Shot Prompting Limitation 

When dealing with complex reasoning tasks Few-Shot Prompting is not a perfect technique.

You can see below, even after giving examples, the model fails to give the right answer.

In the last group of numbers (15, 32, 5, 13, 82, 7, 1), which is the question, the odd numbers are 15, 5, 13, 7 and 1, and their sum (15+5+13+7+1=41) is 41, which is an odd number; but the model (Jurassic-2 Mid) says, "The answer is True", agreeing that it is an even number.

The above failure leads us to the next technique, Chain-of-Thought.

Chain-Of-Thought Prompting

In Chain-of-Thought (CoT) technique we explain to the model with examples on how to solve a problem in a step-by-step process.

I am repeating the same example discussed in Few-Shot Prompting above, with a variation: the example now explains to the model how to pick out the odd numbers, add them up, and check whether their sum is even or not, before stating the answer True or False.
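
A chain-of-thought prompt for this task (illustrative wording) spells out the intermediate steps in the worked example, so the model is nudged to reason the same way on the question:

```python
# Chain-of-Thought: the worked example shows the step-by-step reasoning, not just the final label.
cot_prompt = """The odd numbers in this group add up to an even number: 4, 8, 9, 15, 12, 2, 1.
A: The odd numbers are 9, 15 and 1. Adding them gives 9 + 15 + 1 = 25. 25 is odd. The answer is False.

The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
A:"""
```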

In the below screenshots, you can see that the models Jurassic-2 Mid, Titan Text G1 – Express and Jurassic-2 Ultra don't do well even after being given the example with Chain-of-Thought. This shows their limitations. At the same time, we can see that Claude v2 does an excellent job of reasoning and arriving at the answer with the step-by-step process, that is, Chain-of-Thought.

Zero-Shot Chain-Of-Thought Prompting

In this technique we directly instruct the model to think step by step and give the answer for complex reasoning tasks.
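
The zero-shot variant simply appends the reasoning instruction to the question, with no worked example (illustrative wording):

```python
# Zero-shot Chain-of-Thought: no examples, only an instruction to reason step by step.
zero_shot_cot_prompt = """The odd numbers in this group add up to an even number: 15, 32, 5, 13, 82, 7, 1.
Think step by step, then state whether the answer is True or False."""
```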

One of the nice features of Amazon Bedrock is that we can compare the models’ side by side and decide the appropriate one for our use case.

In the below example I compared Jurassic-2 Ultra, Titan Text G1 – Express and Claude v2. We can see that Claude v2 does an excellent job; however, its cost is also on the higher side.

So, it is again our intelligence that decides which model to use for the task at hand, considering the cost factor.

Prompt Chaining

Breaking down a complex task into smaller tasks and using the output of one small task as the input to the next task is called Prompt Chaining.
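
Below is a minimal sketch of prompt chaining with the Bedrock API, assuming the Amazon Titan Text schema; the helper function and the placeholder document are hypothetical. The response of the first call is fed into the second prompt:

```python
import json
import boto3

bedrock = boto3.client("bedrock-runtime")

def invoke_titan(prompt: str) -> str:
    # Assumes the Amazon Titan Text request/response format; other models differ.
    body = json.dumps({"inputText": prompt,
                       "textGenerationConfig": {"temperature": 0, "maxTokenCount": 1024}})
    response = bedrock.invoke_model(modelId="amazon.titan-text-express-v1", body=body)
    return json.loads(response["body"].read())["results"][0]["outputText"]

document_text = "<project brief goes here>"  # placeholder input document

# Step 1: a small task - extract the requirements.
requirements = invoke_titan(f"List the functional requirements described below.\n\n{document_text}")

# Step 2: the output of step 1 becomes the input of the next task.
architecture = invoke_titan(f"Suggest AWS architecture considerations for these requirements:\n\n{requirements}")
```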

Tree-Of-Thoughts

This technique extends prompt chaining by asking the Gen AI to act as different personas or SMEs and then chaining the response from one persona as input to another persona.

Below are the screenshots of the example, in which I have given a document describing a complex cloud migration project of a global renewable energy company.

In the first step I have asked the model to act as a Business Analyst and give Functional and Non-Functional requirements of the project.

Next, the response from the first step is given as input, and the model is asked to act as a Cloud Architect and give the architecture considerations as per the functional and non-functional requirements.

Best Practices in Implementing Security Groups for Web Application on AWS

In-Short

CaveatWisdom

Caveat: It's easy to assign a large VPC-wide CIDR range (e.g., 10.0.0.0/16) as the source in Security Groups for private instances and avoid painful debugging of data flow; however, this opens our systems to a plethora of security vulnerabilities. For example, a compromised system in the network can affect all other systems in the network.

Wisdom:

  1. Create and maintain separate private subnets for each tier of the application.
  2. Only allow the required traffic to instances; you can do this easily by assigning the "Previous Tier's Security Group" as the source (from where the traffic is allowed) in the in-bound rule of the "Present Tier's Security Group".
  3. Keep Web Servers as private and always front them with a managed External Elastic Load Balancer.
  4. Access the servers through Session Manager in AWS Systems Manager.

In-Detail

Some Basics

A Security Group is an instance-level firewall where we can apply allow rules for in-bound and out-bound traffic. In fact, security groups are associated with the Elastic Network Interfaces (ENIs) of EC2 instances, through which data flows.

We can apply multiple security groups to an instance; all the rules from all the security groups associated with the instance are aggregated and applied.

Connection Tracking

Security Groups are stateful, that means when a request is allowed in Inbound rules, corresponding response is automatically allowed and no need to apply outbound rules explicitly. This is achieved by tracking the connection state of the traffic to and from the instance.

It is to be noted that connections can be throttled if traffic increases beyond max number of connections. If all traffic is allowed for all ports (0.0.0.0/0 or ::/0) for both in-bound and out-bound traffic then that traffic is not tracked.

Scenario

Let's take a three-tier web application where the front end or API receiving the traffic from users is the web tier, the application logic API lies in the app tier, and the database is in the third tier.

Directly exposing the web servers to the open internet is a big vulnerability; it is always better to keep them in a private subnet and front them with a load balancer in a public subnet.

It is better to maintain separate private subnets for each tier with their own auto scaling groups.

Overall, we can have one public subnet and three private subnets in each availability zone where we host the application. It is recommended to use at least two availability zones for high availability.

The architecture for our three-tier web application can be as below.

    Architecture for 3-tier Web Application

    Chaining Security Groups

    In the above architecture Security Groups are chained from one tier to the next tier. We need to create a separate security group for each tier and a security group for load balancer in the public subnet. For Application load balancer, we need to select at least 2 subnets, 1 in each availability zone.

    Implementing Chaining of Security Groups

    1. A Security Group ALB-SG for an External Application Load Balancer should be created with the source open to the internet (0.0.0.0/0) in the Inbound rule for all traffic on HTTPS port 443. TLS/SSL can be terminated at the ALB, which can take on the heavy lifting of encryption and decryption. An ID for ALB-SG will be created automatically, let's say sg-0a123.
    2. For the Web tier, a Security Group Web-SG with ALB-SG sg-0a123 as the source in the Inbound rule on HTTP port 80 should be created. With this rule only connections from the ALB are allowed to the web servers. Let the ID created for Web-SG be sg-0b321.
    3. For the App tier, a Security Group App-SG with Web-SG sg-0b321 as the source in the Inbound rule on custom port 8080 should be created. With this rule only connections from instances with the Web-SG security group are allowed to the app servers. Let the ID created for App-SG be sg-0c456.
    4. For the Database tier, a Security Group DB-SG with App-SG sg-0c456 as the source in the Inbound rule on MySQL/Aurora port 3306 should be created. With this rule only connections from instances with the App-SG security group are allowed to the database servers. Let the ID created for DB-SG be sg-0d654.
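
The same chaining can be scripted; below is a minimal sketch with boto3 (the security group names, ports and the VPC ID are illustrative). Each tier's inbound rule references the previous tier's security group ID instead of a CIDR range, which is the essence of the chaining.

```python
import boto3

ec2 = boto3.client("ec2")
vpc_id = "vpc-0123456789abcdef0"  # hypothetical VPC ID

def create_sg(name: str, description: str) -> str:
    return ec2.create_security_group(GroupName=name, Description=description, VpcId=vpc_id)["GroupId"]

def allow_from_sg(sg_id: str, source_sg_id: str, port: int) -> None:
    # Inbound rule whose source is another security group, not a CIDR range.
    ec2.authorize_security_group_ingress(
        GroupId=sg_id,
        IpPermissions=[{"IpProtocol": "tcp", "FromPort": port, "ToPort": port,
                        "UserIdGroupPairs": [{"GroupId": source_sg_id}]}])

alb_sg = create_sg("ALB-SG", "External ALB")
web_sg = create_sg("Web-SG", "Web tier")
app_sg = create_sg("App-SG", "App tier")
db_sg = create_sg("DB-SG", "Database tier")

ec2.authorize_security_group_ingress(  # only the ALB is open to the internet
    GroupId=alb_sg,
    IpPermissions=[{"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
                    "IpRanges": [{"CidrIp": "0.0.0.0/0"}]}])
allow_from_sg(web_sg, alb_sg, 80)    # web tier accepts traffic only from the ALB
allow_from_sg(app_sg, web_sg, 8080)  # app tier accepts traffic only from the web tier
allow_from_sg(db_sg, app_sg, 3306)   # DB tier accepts traffic only from the app tier
```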

    Running Containers on AWS as per Business Requirements and Capabilities

    We can run containers with EKS, ECS, Fargate, Lambda, App Runner, Lightsail, OpenShift, or just on EC2 instances on the AWS Cloud. In this post I will discuss how to choose the AWS service based on our organization's requirements and capabilities.

    In-Short

    CaveatWisdom

    Caveat: Meeting the business objectives and goals can become difficult if we don’t choose the right service based on our requirements and capabilities.

    Wisdom:

    1. Understand the complexity of your application based on how many microservices and how they interact with each other.
    2. Estimate how your application scales based on business.
    3. Analyse the skillset and capabilities of your team and how much time you can spend for administration and learning.
    4. Understand the policies and priorities of your organization in the long-term.

    In-Detail

    You may wonder why we have many services for running the containers on AWS. One size does not fit all. We need to understand our business goals and requirements and our team capabilities before choosing a service.

    Let us understand each service one by one.

    All the services which are discussed below require the knowledge of building containerized images with Docker and running them.

    Running Containers on Amazon EC2 Manually

    You can deploy and run containers on EC2 Instances manually if you have just 1 to 4 applications like a website or any processing application without any scaling requirements.

    Organization Objectives:

    1. Run just 1 to 4 applications on the cloud with high availability.
    2. Have full control at the OS level.
    3. Have standard workload all the time without any scaling requirements.

    Capabilities Required:

    1. Team should have full understanding of AWS networking at VPC level including load balancers.
    2. Configure and run a container runtime like the Docker daemon.
    3. Deploy application containers manually on the EC2 instances by accessing them through SSH.
    4. Knowledge of maintaining the OS on EC2 instances.

    The cost is predictable if there is no scaling requirement.

    The disadvantages in this option are:

    1. We need to maintain the OS and docker updated manually.
    2. We need to constantly monitor the health of running containers manually.

    What if you don’t want to take the headache of managing EC2 instances and monitoring the health of your containers? – Enter Amazon Lightsail

     Running Containers with Amazon Lightsail

    The easiest way to run containers is Amazon Lightsail. To run containers on Lightsail we just need to define the power of the node (EC2 instance) required and the scale, that is, how many nodes. If the number of container instances is more than 1, then Lightsail copies the container across the multiple nodes you specify. Lightsail uses ECS under the hood and manages the networking.
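
A minimal sketch of creating such a service with boto3 (the service name, power and scale values are illustrative):

```python
import boto3

lightsail = boto3.client("lightsail")

# The only sizing decisions are the power of each node and the scale (number of nodes);
# Lightsail copies the container across the nodes and manages the networking.
lightsail.create_container_service(
    serviceName="my-web-app",  # hypothetical service name
    power="micro",             # node size
    scale=2,                   # number of nodes
)
```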

    Organization Objectives:

    1. Run multiple applications on the cloud with high availability.
    2. Infrastructure should be fully managed by AWS with no maintenance.
    3. Have standard workload and scale dynamically when there is need.
    4. Minimal and predictable cost with bundled services including load balancer and CDN.

    Capabilities Required:

    1. Team should have just knowledge of running containers.

    Lightsail can scale dynamically, but this has to be managed manually; we cannot implement auto scaling based on triggers like an increase in traffic.

    What if you need more features like building a CI/CD pipeline, integration with a Web Application Firewall (WAF) at the edge locations? – Enter AWS App Runner

     

    Running Containers with AWS App Runner

    AWS App Runner is one more easy service for running containers. We can implement auto scaling and secure the traffic with AWS WAF and other services like private endpoints in a VPC. App Runner connects directly to the image repository and deploys the containers. We can also integrate with other AWS services like CloudWatch, CloudTrail and X-Ray for advanced monitoring capabilities.

    Organization Objectives:

    1. Run multiple applications on the cloud with high availability.
    2. Infrastructure should be fully managed by AWS with no maintenance.
    3. Auto Scale as per the varying workloads.
    4. Implement high security features like traffic filtering and isolating workloads in a private secured environment.

    Capabilities Required:

    1. Team should have just knowledge of running containers.
    2. AWS knowledge of services like WAF, VPC, CloudWatch is required to handle the advanced requirements.

    App Runner supports full-stack web applications, including front-end and backend services. At present App Runner supports only stateless applications; stateful applications are not supported.

    What if you need to run the containers in a serverless fashion, i.e., an event driven architecture in which you run the container only when needed (invoked by an event) and pay only for the time the process runs to service the request? – Enter AWS Lambda.

    Running Containers with AWS Lambda

    With Lambda, you pay only for the time your container function runs, in milliseconds, and for the amount of RAM you allocate to the function; if your function runs for 300 milliseconds to process a request, then you pay only for that time. You need to build your container image with a base image provided by AWS. The base images are open source, made by AWS, and preloaded with a language runtime and the other components required to run a container image on Lambda. If we choose our own base image, then we need to add the appropriate runtime interface client for our function so that we can receive invocation events and respond accordingly.

    Organization Objectives:

    1. Run multiple applications on the cloud with high availability.
    2. Infrastructure should be fully managed by AWS with no maintenance.
    3. Auto Scale as per the varying workloads.
    4. Implement high security features like traffic filtering and isolating workloads in a private secured environment.
    5. Implement event-based architecture.
    6. Pay only for the requests processed, with no idle-time cost for the apps.
    7. Seamlessly integrate with other services like API Gateway where throttling is needed.

    Capabilities Required:

    1. Team should have just knowledge of running containers.
    2. Team should have deep understanding of AWS Lambda and event-based architectures on AWS and other AWS services.
    3. Existing applications may need to be modified to handle the event notifications and integrate with runtime client interfaces provided by the Lambda Base images.

    We need to be aware of the limitations of Lambda: it is stateless, the maximum time a Lambda function can run is 15 minutes, and it provides only temporary storage for buffer operations.

    What if you need more transparency i.e., access to underlying infrastructure at the same time the infrastructure is managed by AWS? – Enter AWS Elastic Beanstalk.

    Running Containers with AWS Elastic Beanstalk

    We can run any containerized application on AWS Elastic Beanstalk, which will deploy and manage the infrastructure on your behalf. We can create and manage separate environments for development, testing and production use, and deploy any version of our application to any environment. We can do rolling deployments or Blue/Green deployments. Elastic Beanstalk provisions the infrastructure, i.e., VPC, EC2 instances and load balancers, with CloudFormation templates developed with best practices.

    For running containers Elastic Beanstalk uses ECS under-the-hood. ECS provides the cluster running the docker containers, Elastic Beanstalk manages the tasks running on the cluster.

    Organization Objectives:

    1. Run multiple applications on the cloud with high availability.
    2. Infrastructure should be fully managed by AWS with no maintenance.
    3. Auto Scale as per the varying workloads.
    4. Implement high security features like traffic filtering and isolating workloads in a private secured environment.
    5. Implement multiple environments for development, staging and production.
    6. Deploy with strategies like Blue / Green and Rolling updates.
    7. Access to the underlying instances.

    Capabilities Required:

    1. Team should have just knowledge of running containers.
    2. Foundational knowledge of AWS and Elastic Beanstalk is enough.

    What if you need to implement more complex microservices architecture with advanced functionality like service mesh and orchestration? Enter Elastic Container Service Directly

    Running Containers with Amazon Elastic Container Service (Amazon ECS)

    When we want to implement a complex microservices architecture with orchestration of containers, then ECS is the right choice. Amazon ECS is a fully managed service with built-in best practices for operations and configuration. It removes the headache and complexity of managing the control plane and gives the option to run our workloads anywhere, in the cloud and on premises.

    ECS gives two launch types to run tasks: Fargate and EC2. Fargate is a serverless option with low overhead with which we can run containers without managing infrastructure. EC2 is suitable for large workloads which require consistently high CPU and memory.

    A Task in ECS is a blueprint of our microservice; it can run one or more containers. We can run tasks manually for applications like batch jobs, or with the Service Scheduler, which ensures the scheduling strategy for long-running stateless microservices. The Service Scheduler orchestrates containers across multiple availability zones by default using task placement strategies and constraints.
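
A minimal sketch of registering a task definition and letting the service scheduler keep two copies running on Fargate; the names, image URI, subnets and security group are illustrative, and a real Fargate task would also need an execution role:

```python
import boto3

ecs = boto3.client("ecs")

# Register the task definition - the blueprint of the microservice.
ecs.register_task_definition(
    family="orders-service",  # hypothetical microservice
    requiresCompatibilities=["FARGATE"],
    networkMode="awsvpc",
    cpu="256",
    memory="512",
    containerDefinitions=[{
        "name": "orders",
        "image": "123456789012.dkr.ecr.us-east-1.amazonaws.com/orders:latest",
        "portMappings": [{"containerPort": 8080}],
        "essential": True,
    }],
)

# The service scheduler keeps the desired number of tasks running across availability zones.
ecs.create_service(
    cluster="my-cluster",
    serviceName="orders-service",
    taskDefinition="orders-service",
    desiredCount=2,
    launchType="FARGATE",
    networkConfiguration={"awsvpcConfiguration": {
        "subnets": ["subnet-0aaa", "subnet-0bbb"],
        "securityGroups": ["sg-0ccc"],
    }},
)
```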

    Organization Objectives:

    1. Run complex microservices architecture with high availability and scalability.
    2. Orchestrate the containers as per complex business requirements.
    3. Integrate with AWS services seamlessly.
    4. Low learning curve for the team which can take advantage of cloud.
    5. Infrastructure should be fully managed by AWS with no maintenance.
    6. Auto Scale as per the varying workloads.
    7. Implement high security features like traffic filtering and isolating workloads in a private secured environment.
    8. Implement complex DevOps strategies with managed services for CI/CD pipelines.
    9. Access to the underlying instances for some applications and at the same time have a serverless option for some other workloads.
    10. Implement service mesh for microservices with a managed service like App Mesh.

    Capabilities Required:

    1. Team should have knowledge of running containers.
    2. Intermediate level of understanding of AWS services is required.
    3. Good knowledge of ECS orchestration and scheduling configuration will add much value.
    4. Optionally Developers should have knowledge of services mesh implementation with App mesh if it is required.

    What if you need to migrate existing on-premises container workloads running on Kubernetes to the Cloud or what if the organization policy states to adopt open-source technologies? – Enter Amazon Elastic Kubernetes Service.

     

    Running Containers with Amazon Elastic Kubernetes Service (Amazon EKS)

    Amazon EKS is a fully managed service for the Kubernetes control plane, and it gives the option to run workloads on self-managed EC2 instances, managed EC2 instances or the fully managed serverless Fargate service. It removes the headache of managing and configuring the Kubernetes control plane, with built-in high availability and scalability. EKS is an upstream implementation of the CNCF-released Kubernetes version, so all the workloads presently running on on-premises K8s will work on EKS. It also gives the option to extend and use the same EKS console on premises with EKS Anywhere.

    Organization Objectives:

    1. Adopt open-source technologies as a policy.
    2. Migrate existing workloads on Kubernetes.
    3. Run complex microservices architecture with high availability and scalability.
    4. Orchestrate the containers as per complex business requirements.
    5. Integrate with AWS services seamlessly.
    6. Infrastructure should be fully managed by AWS with no maintenance.
    7. Auto Scale as per the varying workloads.
    8. Implement high security features like traffic filtering and isolating workloads in a private secured environment.
    9. Implement complex DevOps strategies with managed services for CI/CD pipelines.
    10. Access to the underlying instances for some applications and at the same time have a serverless option for some other workloads.
    11. Implement service mesh for microservices with a managed service like App Mesh.

    Capabilities Required:

    1. Team should have knowledge of running containers.
    2. Intermediate level of understanding of AWS services is required, and a deep understanding of networking on AWS for Kubernetes will help a lot; you can read my previous blog here.
    3. The learning curve is high with Kubernetes, and the team should spend sufficient time learning it.
    4. Good knowledge of EKS orchestration and scheduling configuration.
    5. Optionally, developers should have knowledge of service mesh implementation with App Mesh if it is required.
    6. Team should have knowledge of handling Kubernetes updates; you can refer to my vlog here.

     

    Running Containers with Red Hat OpenShift Service on AWS (ROSA)

    If the organization manages its existing workloads on Red Hat OpenShift and wants to take advantage of the AWS Cloud, then we can migrate easily to Red Hat OpenShift Service on AWS (ROSA), which is a managed service. We can use ROSA to create Kubernetes clusters using the Red Hat OpenShift APIs and tools, and have access to the full breadth and depth of AWS services. We can also access Red Hat OpenShift licensing, billing, and support, all directly through AWS.

     

    I have seen many organizations adopt multiple services to run their container workloads on AWS; it is not necessary to stick to one kind of service. In a complex enterprise architecture it is recommended to keep all options open and adapt as the business needs change.

    Interweaving Purpose-Built Databases in the Microservices Architecture

    It is best practice to have a separate database for each microservice based on its purpose. In this post we will understand how to analyse the purpose based on a scenario and choose the right database.

    In-Short

    CaveatWisdom

    Caveat: We can easily run into cost overruns if we do not choose the right database and design it properly based on the purpose of our application.

    Wisdom:

    1. Understand the access patterns (queries) that you make on your database.
    2. Understand how your database storage scales; will it be in terabytes or petabytes?
    3. Analyse what is most important for your application among Consistency, Availability and Partition Tolerance.
    4. Choose Purpose-Built databases on AWS cloud based on Application Purpose.

    In-Detail

    Scenario

    Let us consider we are developing a loan processing application for a Bank. The Loan could be an Auto Loan, Home Loan, Personal Loan or any other loan.

    Requirements of Scenario

    1. Customer visits the loan application portal, reviews all the loan products and interest rates, and applies for a loan. The portal should give a smooth experience to the customer without any latencies. Once the application is submitted it should be acknowledged immediately and queued for processing.
    2. The bank expects a huge volume of loan applications across regions with its marketing efforts; loan application data should be stored in a scalable database, and as it is very important data it should be replicated in multiple regions for high availability.
    3. While processing the loan, the creditworthiness of the customer has to be analysed.
    4. Customer profiles and loan documents with a content management system should be stored in a secured, scalable database.
    5. Based on the creditworthiness of the customer, loan documents should be sent for final manual approval.
    6. A loan account for the customer should be created and loan transaction data should be maintained in it. We also need to do ad hoc queries on the transaction data in relation to floating interest rates and repayment schedules for generating different statements and reports.
    7. Loan application data should be sent to a data warehouse for marketing and customer analytics. In the same data warehouse, data from other sources and products will be ingested to improve marketing strategies and showcase relevant products to customers.
    8. Immutable records of loan application events should be maintained for regulatory and compliance purposes, and these records should also be securely shared with the insurer of the asset created by the customer with the loan.

    Note: The architecture is simplified for purpose of discussion, real production scenario architecture could be much more complex.

    Architecture Brief

    Customer facing loan application portal is a static website hosted on Amazon S3 with the CloudFront integration. API Calls are made to API Gateway which transfers the request data to Lambda functions for processing. Fanout mechanism with a combination of SNS and SQS is adopted to process and ingest data to multiple databases parallelly. Process workflow including manual approval is handled by AWS Step Functions. SNS is used for internal notifications and SES is used to intimate the status of the loan by email to the customer.

    Command Query Responsibility Segregation (CQRS) pattern is adopted in the above architecture with separate lambda functions for ingesting data and processing the data.

    Analysing Purpose and Choosing the Database

    1. When handling customer queries, especially at the time of acquiring a new customer or selling a new product to an existing customer, the response time should be very low. To give a very good experience to the customer, all frequent general query responses, session state and data that can afford to be stale should be cached. Amazon ElastiCache for Redis is a managed distributed in-memory data store built for this kind of purpose. It gives a high-performance, microsecond-latency caching solution. It comes with multi-AZ capability for high availability.
    2. Loan application data will be mostly key-value pairs like loan amount, loan type, customer ID, etc. As per requirement 2, a huge volume of loan data has to be stored and retrieved for processing, and at the same time it should be replicated to multiple regions for very high availability. Amazon DynamoDB is a key-value store database which can give single-digit millisecond latency even at petabyte scale. It has an inherent capability to replicate the data to other regions with Global Tables and by enabling DynamoDB Streams. So, this is suitable for storing loan application data and triggering the loan processing Lambda function with DynamoDB Streams (see the sketch after this list).
    3. As per the requirement 3 of the scenario, creditworthiness of the customer has to be analysed before arriving at a decision of sanctioning loan amount to the customer. It is assumed that bank collects data of customers from various sources and also maintain the data of its relationships with its existing customers. Creditworthiness is calculated on many factors especially the history of the relationship the customer had with bank with various products like savings account, credit cards, income and repayment history. When it comes to querying the relationships and analysing the data, we need a graph database. Amazon Neptune is a fully managed graph database service that work with highly connected datasets. It can scale to handle billions of relationships and lets you query them with milliseconds latency. It stores data items as vertices of the graph, and the relationships between them as edges. Loan application data can be ingested to Amazon Neptune and creditworthiness can be analysed.
    4. As per requirement 4, customer profiles and loan documents should be maintained with a content management system. Loan documents contain critical legal information which could change based on various products. The documents can differ based on the law of the different states in which the bank operates. To address these requirements, the schema of the database should be dynamic. We may need to query and process these documents within milliseconds. We need a No-SQL document database that scales for the content management system of loan documents. Amazon DocumentDB (with MongoDB compatibility) is a fully managed database service which supports both instance-based clusters, which can scale up to 128 TB, and Elastic Clusters, which can scale even to petabytes of data. We can put loan documents with dynamic schema as JSON documents in DocumentDB. We can use MongoDB drivers for developing our application with DocumentDB. Additionally, signed and scanned copies of the documents can be maintained in an S3 bucket with a reference to the document in DocumentDB.
    5. As per the requirement 6, Loan Account should be opened for the customer where transactions data should be maintained. Here we need to maintain the integrity of the transactions data with a fixed schema and Online Transaction Processing (OLTP ). SQL is more suitable for doing ad hoc queries for generating statements. A Relational Database is more suitable for this purpose. Amazon RDS which supports six SQL database engines (Aurora, MySQL, PostgreSQL, MariaDB, Oracle and MS SQL Server) is managed service for relational databases. Amazon RDS manages backups, software patching, automatic failure detection, and recovery which are tedious manual tasks if we maintain the database ourselves. We can focus on our application development instead of maintaining these tasks. If we are comfortable with MySQL and PostgreSQL we can choose Amazon Aurora based on version compatibility. Aurora gives more throughput than standard MySQL and PostgreSQL engines as it uses Clustered Volume storage which is native to cloud.
    6. As per requirement 7, loan application data has to be stored for marketing and customer analytics. The data warehouse also stores data from multiple sources, which could run into petabytes. The data could be analysed with machine learning algorithms which can help in targeted marketing. Amazon Redshift is a fully managed, petabyte-scale data warehouse service with a group of nodes called a cluster. The Amazon Redshift service manages provisioning capacity, monitoring and backing up the cluster, and applying patches and upgrades to the Amazon Redshift engine, which can run one or more databases. We can run SQL commands for analysis on the database. Amazon Redshift supports SQL client tools connecting through Java Database Connectivity (JDBC) and Open Database Connectivity (ODBC). Amazon Redshift also gives a serverless option where we need not provision any clusters; it automatically provisions data warehouse capacity and scales the underlying resources, and we pay only when the data warehouse is in use. We can use Amazon Redshift ML to train and deploy machine learning models with SQL, and Amazon SageMaker to train the model with data in Amazon Redshift for customer analytics.
    7. As per requirement 8, immutable records of loan application events should be maintained for regulatory and compliance purposes. We need a ledger database to maintain immutable records and also securely share the data with other stakeholders in blockchain applications. The insurer who is insuring the asset created out of the loan taken by the customer may need this data for insurance purposes. With Amazon Quantum Ledger Database (Amazon QLDB) we can maintain all the activities with respect to the loan in an immutable and cryptographically verifiable transaction log owned by the bank. We can track the history of credits and debits in loan transactions and also verify the data lineage of an insurance claim on the asset. Amazon QLDB is a fully managed service and we pay only for what we use.
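
As a sketch of point 2 above, ingesting a loan application into DynamoDB could look like the following (the table and attribute names are hypothetical); with Streams enabled on the table, each new item can then trigger the loan-processing Lambda function:

```python
import uuid
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("LoanApplications")  # hypothetical table with partition key "application_id"

# Store the mostly key-value loan application record.
table.put_item(Item={
    "application_id": str(uuid.uuid4()),
    "customer_id": "CUST-1001",
    "loan_type": "AUTO",
    "loan_amount": 25000,
    "status": "SUBMITTED",
})
```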

    In this post I have discussed how to choose a purpose-built database based on our application. I will be discussing designing and implementation of these databases in my future posts.

    Managed Services for Open-Source Technology on AWS Cloud

    In-Short

    CaveatWisdom

    Caveat: Developing solutions with open-source technologies gives us freedom from licensing and lets us run them anywhere we want; however, it becomes increasingly complex and difficult to scale and manage them at high velocity with open source alone.

    Wisdom:

    We can easily offload the management and scalability to managed services in the cloud and concentrate more on our required business functionality. This can also reduce the total cost of ownership (TCO) in the long term.

    In-Detail

    Let’s consider an On-premises solution stack which we would like to migrate to cloud for gaining scalability and high availability.

    Application Stack

    Legacy application stack hosted on the VMs in the On-Prem data centre can be easily moved to Amazon EC2 instances with lift and shift operations.

    Kubernetes Cluster

    Nowadays many organizations are using Kubernetes to orchestrate their containerised microservices. In Kubernetes, the installation and management of the control plane nodes and the etcd database cluster with high availability and scalability becomes a challenging task. Added to that, we also need to manage the worker nodes in the data plane where we run our applications.

    We can easily migrate our Kubernetes workloads to Amazon Elastic Kubernetes Service (EKS), which completely manages the control plane and the etcd cluster in it with high availability and reliability. Amazon EKS gives us three options for worker nodes: managed worker nodes, unmanaged worker nodes and Fargate, which is a fully managed serverless option for running containers with Amazon EKS. It is certified against the CNCF Kubernetes software conformance standards, so we need not worry about any portability issues. You can refer to my previous blog post on Planning and Managing Amazon VPC IP space in an Amazon EKS cluster for best practices.

    Amazon EKS Anywhere can also handle Kubernetes clusters on premises if you wish to keep some workloads there.

    Front-End Stack

    If we have front-end stack developed with Angular, React, Vue or any other static web sites then we can take advantage of web hosting capabilities of Amazon S3 which is fully managed object storage service.

    With AWS Amplify we can host both static and dynamic full stack web apps and also implement CI/CD pipelines for any new web or mobile projects.

    Load balancer and API stack

    We can use Amazon Elastic Load Balancer for handling the traffic, Application Load balancer can do content based routing to the microservices on the Amazon EKS cluster, this can be easily implemented with ingress controller provided by AWS for Kubernetes.

    If we have a REST API stack developed with the OpenAPI Specification (Swagger), we can easily import the Swagger files into Amazon API Gateway and implement the API stack.

    If we have a GraphQL API stack, we can implement it with the managed serverless service AWS AppSync, which supports any back-end data store on AWS, like DynamoDB.

    If we have Nginx load balancers on on-premises, we can take them on to Amazon Lightsail instances.

    Messaging and Streaming Stack

    It is important to decouple applications with a messaging layer; in microservices architectures the messaging layer is generally implemented with popular open-source technologies like Apache Kafka, Apache ActiveMQ and RabbitMQ. We can shift these workloads to managed services on AWS: Amazon Managed Streaming for Apache Kafka (Amazon MSK) and Amazon MQ.

    Amazon MSK is a fully managed service for Apache Kafka to process the streaming data. It manages the control-plane operations for creating, updating and deleting clusters and we can use the data-plane for producing and consuming the streaming data. Amazon MSK also creates the Apache ZooKeeper nodes which is an open-source server that makes distributed coordination easy. MSK serverless is also available in some regions with which you need not worry about cluster capacity.

    Amazon MQ is a managed message broker service that supports both Apache ActiveMQ and RabbitMQ. You can migrate existing message brokers that rely on compatibility with APIs such as JMS or protocols such as AMQP 0-9-1, AMQP 1.0, MQTT, OpenWire, and STOMP.

    Monitoring Stack

    Well-known tools for monitoring IT resources and applications are OpenTelemetry, Prometheus and Grafana. In many cases OpenTelemetry agents collect the metrics and distributed-trace data from containers and applications and pump it to the Prometheus server, and it can be visualized and analysed with Grafana.

    AWS Distro for OpenTelemetry consists of SDKs, auto-instrumentation agents, collectors and exporters to send data to back-end services, it is an upstream-first model which means AWS commits enhancements to the CNCF (Cloud Native Computing Foundation Project) and then builds the Distro from the upstream. AWS Distro for OpenTelemetry supports Java, .NET, JavaScript, Go, and Python. You can download AWS Distro for OpenTelemetry and implement it as a daemon in the Kubernetes cluster. It also supports sending the metrics to popular Amazon CloudWatch.

    Amazon Managed Service for Prometheus is a serverless, Prometheus-compatible monitoring service for container metrics which means you can use the same open-source Prometheus data model and query language that you are using on the on-premises for monitoring your containers on Kubernetes. You can integrate AWS Security with it securing your data. As it is a managed service high availability and scalability are built into the service. Amazon Managed Service for Prometheus is charged based on metrics ingested and metrics storage per GB per month and query samples processed, so you only pay for what you use.

    With Amazon Managed Grafana you can query and visualize metrics, logs and traces from multiple sources including Prometheus. It’s integrated with data sources like Amazon CloudWatch, Amazon OpenSearch Service, AWS X-Ray, AWS IoT SiteWise, Amazon Timestream so that you can have a central place for all your metrics and logs. You can control who can have access to Grafana with IAM Identity Center. You can also upgrade to Grafana Enterprise which gives you direct support from Grafana Labs.

    SQL DB Stack

    Most organizations certainly have a SQL DB stack, and many prefer to go with open-source databases like MySQL and PostgreSQL. We can easily move these database workloads to Amazon Relational Database Service (Amazon RDS), which supports six database engines: MySQL, PostgreSQL, MariaDB, MS SQL Server, Oracle and Aurora. Amazon RDS manages all common database administration tasks like backups and maintenance. It gives us the option of high availability with multi-AZ, which is just a switch; once enabled, Amazon RDS automatically manages the replication and failover to a standby database instance in another availability zone. With Amazon RDS we can create read replicas for read-heavy workloads. It makes sense to shift MySQL and PostgreSQL workloads to Amazon Aurora, with little or no change, if our versions are compatible with Aurora, because Amazon Aurora gives 5x more throughput than standard MySQL and 3x more throughput than standard PostgreSQL, as it takes advantage of cloud-native clustered volume storage. Aurora can scale up to 128 TB of storage.

    If you have SQL Server workloads and opt to go for Aurora PostgreSQL, then Babelfish can help to accept connections from your SQL Server clients.

    We can easily migrate your database workload with AWS Database Migration Service and Schema Conversion tool.

    No-SQL DB Stack

    If we have a No-SQL DB stack like MongoDB or Cassandra in our on-prem data center, then we can choose to run our workloads on Amazon DocumentDB (with MongoDB compatibility) and Amazon Keyspaces (for Apache Cassandra). Amazon DocumentDB (with MongoDB compatibility) is a fully managed database service which grows the size of the storage volume as the database storage grows. The storage volume grows in increments of 10 GB, up to a maximum of 64 TB. We can also create up to 15 read replicas in other availability zones for high read throughput. You may also consider the MongoDB Atlas on AWS offering from the AWS Marketplace, which is developed and supported by MongoDB.

    You can use same Cassandra application code and developer tools on Amazon Keyspaces (for Apache Cassandra) which gives high availability and scalability. Amazon Keyspaces is serverless, so you pay for only the resources that you use, and the service automatically scales tables up and down in response to application traffic.

    Caching Stack

    We use in-memory data store for very low latency applications for caching. Amazon ElastiCache supports the Memcached and Redis cache engines. We can go for Memcached if we need simple model and multithreaded performance. We can choose Redis if we require high availability with replication in another availability zone and other advanced features like pub/sub and complex data types. With Redis you go for Cluster mode enabled or disabled. ElastiCache for Redis manages backups, software patching, automatic failure detection, and recovery.
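
A minimal cache-aside sketch against an ElastiCache for Redis endpoint, using the open-source redis-py client (the endpoint hostname and the database lookup are hypothetical):

```python
import json
import redis

# Connect to the ElastiCache for Redis primary endpoint (hypothetical hostname).
cache = redis.Redis(host="my-cache.xxxxxx.use1.cache.amazonaws.com", port=6379)

def load_rates_from_database(product_id: str) -> dict:
    # Placeholder for the real database lookup.
    return {"product_id": product_id, "rate": 8.5}

def get_product_rates(product_id: str) -> dict:
    cached = cache.get(f"rates:{product_id}")
    if cached:
        return json.loads(cached)  # cache hit: served from memory
    rates = load_rates_from_database(product_id)
    cache.setex(f"rates:{product_id}", 300, json.dumps(rates))  # cache for 5 minutes
    return rates
```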

    DevOps Stack

    For DevOps and configuration management we could be using Chef Automate or Puppet Enterprise, when we shift to the cloud, we can use the same stack to configure and deploy our applications with AWS OpsWorks, which is a configuration management service that creates fully-managed Puppet Enterprise servers and Chef Automate servers.

    With OpsWorks for Puppet Enterprise we can configure, deploy and manage EC2-instances and as well as on-premises servers. It gives full-stack automation for OS configurations, update and install packages and change management.

    You can run your cookbooks, which contain recipes, on OpsWorks for Chef Automate to manage infrastructure and applications on EC2 instances.

    Data Analytics Stack

    OpenSearch is an open-source search and analytics suite made from a fork of the ALv2 version of Elasticsearch and Kibana (the last open-source version of Elasticsearch and Kibana). If you have workloads developed on the open-source version of Elasticsearch or OpenSearch, then you can easily migrate to Amazon OpenSearch Service, which is a managed service. AWS has added several new features to OpenSearch such as support for cross-cluster replication, trace analytics, data streams, transforms, a new observability user interface, and notebooks in OpenSearch Dashboards.

    Big data frameworks like Apache Hadoop and Apache Spark can be ported to Amazon EMR. We can process petabytes of data from multiple data stores like Amazon S3 and Amazon DynamoDB using open-source frameworks such as Apache Spark, Apache Hive, and Presto. We can create an Amazon EMR cluster, which is a collection of managed EC2 instances on which EMR installs different software components; each EC2 instance becomes a node in the Apache Hadoop framework. Amazon EMR Serverless is a newer option which makes it easy and cost-effective for data engineers and analysts to run applications without managing clusters.

    Workflow Stack

    For workflow orchestration, we can port Apache Airflow to Amazon Managed Workflows for Apache Airflow (Amazon MWAA), which is a managed service for Apache Airflow. As usual in the cloud, we gain scalability and high availability without the headache of maintenance.

    ML Stack

    Apart from the AWS native ML tools in Amazon SageMaker, AWS supports many opensource tools like MXNet, TensorFlow and PyTorch with managed services and Deep learning AMIs.

    Open-Source Technology Stack on AWS Cloud

    ‘Security Of the Pipeline’ and ‘Security In the Pipeline’ with AWS DevOps Tools By Design

    There are many great tools out there for building CI/CD pipelines on AWS Cloud, for the sake of simplicity I am limiting my discussion to AWS native tools.

    In-Short

    CaveatWisdom

    Caveat: Achieving Speed, Scale and Agility is important for any business however it should not be at the expense of Security.

    Wisdom: Security should be implemented by design in a CI/CD pipeline and not as an afterthought.

    Security Of the CI/CD Pipeline: It is about defining who can access the pipeline and what they can do. It is also about hardening the build servers and deployment conditions.

    Security In the CI/CD Pipeline: It is about static code analysis and validating the artifacts generated in the pipeline.

    In-Detail

    The challenge when automating the whole Continuous Integration and Continuous Delivery / Deployment (CI/CD) process is implementing the security at scale. This can be achieved in DevSecOps by implementing Security by Design in the pipeline.

    Security added as an afterthought

    Security by Design

    Security by Design will give confidence to deliver at high speed and improve the security posture of the organization. So, we should think of the security from every aspect and at every stage while designing the CI/CD pipeline and implement it while building the Pipeline.

    Some Basics

    Continuous Integration (CI): It is the process which starts with committing the code to the source control repository and includes building and testing the artifacts.

    Continuous Delivery / Deployment (CD): It is the process which extends CI up to deployment. The difference between Delivery and Deployment is that in Delivery there is a manual intervention phase (approval) before deploying to production, whereas Deployment is fully automated, with thorough automated testing and rollback in case of any deployment failure.

    DevSecOps AWS Toolset for Infrastructure and Application Pipelines

    By following the Security Perspective of AWS Cloud Adoption Framework (CAF) we can build the capabilities in implementing Security of the Pipeline and Security in the Pipeline.

    Implementing Security Of the CI/CD Pipeline

    While designing the security of the pipeline, we need to look at the whole pipeline as a resource, apart from the security of the individual elements in it. We need to consider the following factors from a security perspective:

    1. Who can access the pipeline
    2. Who can commit the code
    3. Who can build the code and who is responsible for testing
    4. Who is responsible for Infrastructure as a Code
    5. Who is responsible for deployment to the production

    Security Governance

    Once we establish what access controls for the pipeline are needed, we can develop security governance capability of pipeline by implementing directive and preventive controls with AWS IAM and AWS Organization Policies.

    We need to follow the least privilege principle and give only the necessary access; for example, read-only access to a pipeline can be given with the following policy:

    {
      "Statement": [
        {
          "Action": [
            "codepipeline:GetPipeline",
            "codepipeline:GetPipelineState",
            "codepipeline:GetPipelineExecution",
            "codepipeline:ListPipelineExecutions",
            "codepipeline:ListActionExecutions",
            "codepipeline:ListActionTypes",
            "codepipeline:ListPipelines",
            "codepipeline:ListTagsForResource",
            "iam:ListRoles",
            "s3:ListAllMyBuckets",
            "codecommit:ListRepositories",
            "codedeploy:ListApplications",
            "lambda:ListFunctions",
            "codestar-notifications:ListNotificationRules",
            "codestar-notifications:ListEventTypes",
            "codestar-notifications:ListTargets"
          ],
          "Effect": "Allow",
          "Resource": "arn:aws:codepipeline:us-west-2:123456789111:ExamplePipeline"
        },
        {
          "Action": [
            "codepipeline:GetPipeline",
            "codepipeline:GetPipelineState",
            "codepipeline:GetPipelineExecution",
            "codepipeline:ListPipelineExecutions",
            "codepipeline:ListActionExecutions",
            "codepipeline:ListActionTypes",
            "codepipeline:ListPipelines",
            "codepipeline:ListTagsForResource",
            "iam:ListRoles",
            "s3:GetBucketPolicy",
            "s3:GetObject",
            "s3:ListBucket",
            "codecommit:ListBranches",
            "codedeploy:GetApplication",
            "codedeploy:GetDeploymentGroup",
            "codedeploy:ListDeploymentGroups",
            "elasticbeanstalk:DescribeApplications",
            "elasticbeanstalk:DescribeEnvironments",
            "lambda:GetFunctionConfiguration",
            "opsworks:DescribeApps",
            "opsworks:DescribeLayers",
            "opsworks:DescribeStacks"
          ],
          "Effect": "Allow",
          "Resource": "*"
        },
        {
          "Sid": "CodeStarNotificationsReadOnlyAccess",
          "Effect": "Allow",
          "Action": [
            "codestar-notifications:DescribeNotificationRule"
          ],
          "Resource": "*",
          "Condition": {
            "StringLike": {
              "codestar-notifications:NotificationsForResource": "arn:aws:codepipeline:*"
            }
          }
        }
      ],
      "Version": "2012-10-17"
    }

    Many such policies can be found here in AWS Documentation.

    Usually, Production, Development and Testing environments are isolated with separate AWS Accounts each for maximum security under AWS Organizations and security controls are established with organizational SCP policies.

    Threat Detection and Configuration Changes

    It is a challenging task to ensure that everyone in the organization follows best practices in creating CI/CD pipelines. We can detect changes in the configuration of pipelines with AWS Config and take remediation actions by integrating Lambda functions.

    It is important to audit pipeline access with AWS CloudTrail to find the knowledge gaps in personnel and also to detect nefarious activities.

     Implementing Security In the CI/CD Pipeline

    Let us look at security in the pipeline at each stage. Although there could be many stages in a pipeline, we will consider the main stages, which are Code, Build, Test and Deploy. If any security test fails, the pipeline stops, and the code does not move to production.

    Code Stage

    AWS CodePipeline supports various source control services like GitHub, Bitbucket, S3 and CodeCommit. The advantage with CodeCommit and S3 is we can integrate them with AWS IAM security and define who can commit to the code repo and who can merge the pull requests.

    We can automate the process of Static Code Analysis with Amazon CodeGuru once the code is pushed to the source control. Amazon CodeGuru uses program analysis and machine learning to detect potential defects that are difficult for developers to find and offers suggestions for improving your Java and Python code.

    CodeGuru also detects hardcoded secrets in the code and recommends remediation steps to secure the secrets with AWS Secrets Manager.

    Build Stage

    Security of the build servers and their vulnerability detection is important when we maintain our own build servers like Jenkins, we can use Amazon GuardDuty for threat detection and Amazon Inspector for vulnerability scan.

    We can avoid the headache of maintaining the build server if we use the fully managed build service AWS CodeBuild.

    The artifacts built at the build stage can be stored securely in ECR and S3. Amazon ECR supports automated scanning of images for vulnerabilities.

    The code and build artifacts stored on S3, CodeCommit and ECR can be encrypted with AWS KMS keys.

    Test Stage

    AWS CodePipeline supports various services for test action including AWS CodeBuild, CodeCov, Jenkins, Lambda, DeviceFarm, etc.

    You can use AWS CodeBuild for unit testing and compliance testing by integrating test scripts in it and stop the pipeline if the test fails.

    Open-source vulnerabilities can be challenging; by integrating WhiteSource with CodeBuild we can automate the scanning of any source repository in the pipeline.

    Consider testing an end-to-end system including networking and communications between the microservices and other components such as databases.

    Consider mocking the end points of third-party APIs with lambda.

    AMI Automation and Vulnerability Detection

    It is recommended to build custom pre-baked AMIs, with all the necessary software installed, on top of the AWS Quick Start AMIs as a base, for faster deployment of instances.

    We can automate the creation of AMIs with Systems Manager and restrict the access to launch EC2 instances to only tagged AMIs while building the pipeline for deploying infrastructure with CloudFormation.

    It is important to periodically scan for vulnerabilities on the EC2 instances which are built from our custom AMIs. Amazon Inspector, which is a managed service, makes our life easy in finding vulnerabilities in EC2 instances and ECR images. We can integrate Amazon Inspector in our pipeline with a Lambda function, and we can also automate the remediation with the help of Systems Manager.

    Deploy

    When we have an isolated production environment in a separate Production account, consider building AWS CodePipeline which can span across multiple accounts and regions also. We just need to give necessary permissions with the help of IAM Roles.

    Even the users who control the production environment should consider accessing it with IAM Roles whenever required, which gives secure access with temporary credentials; this helps prevent you from accidentally making changes to the production environment. This technique of switching roles allows us to apply security best practices that implement the principle of least privilege: only take on 'elevated' permissions while doing a task that requires them, and then give them up when the task is done.
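
A minimal sketch of taking on such an elevated role only for the deployment task, using STS temporary credentials (the role ARN is hypothetical):

```python
import boto3

sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/ProdDeployRole",  # hypothetical deployment role
    RoleSessionName="prod-deployment",
    DurationSeconds=3600,
)["Credentials"]

# A session scoped to the temporary credentials of the elevated role.
prod_session = boto3.Session(
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
# Use prod_session clients only for the deployment task, then let the credentials expire.
```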

    Implement Multi-Factor Authentication (MFA) while deploying critical applications.

    Consider doing canary deployment, that is testing with a small amount of production traffic, which can be easily managed with AWS CodeDeploy.