A confusion matrix is a tabular summary of the prediction results for a classification problem. It is a table of the different combinations of predicted and actual values. The image below summarizes those combinations. Here, rows represent actual values while columns represent predicted values.
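To make the row/column convention concrete, here is a minimal sketch in plain Python that counts each (actual, predicted) pair. The function name, the labels and the toy data are made up for illustration; they are not taken from the original post.

```python
from collections import Counter


def confusion_matrix(actual, predicted, labels):
    """Return a nested list: rows are actual labels, columns are predicted labels."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]


# Hypothetical toy data for a binary classifier.
labels = ["pos", "neg"]
actual = ["pos", "pos", "neg", "neg", "pos", "neg"]
predicted = ["pos", "neg", "neg", "pos", "pos", "neg"]

matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
```

With "pos" as the positive class, the first row holds true positives and false negatives, and the second row holds false positives and true negatives.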
Cloud adoption is like dating. If you rush, a security incident will be around the corner. To avoid that, we need a good strategy. In this blog we will see some pointers for a secure relationship with the cloud.
Manual deployments are prone to misconfigurations. We need to develop a culture of minimizing console usage for deployments. All deployments in non-POC environments should come from a provisioning/deployment engine that takes code as input. This can be CFN-based, DSL-based (Terraform or Troposphere), or any other custom solution.
We should have a list of controls to check against and write infrastructure code such that security is built into the system. For example, we can implement intelligent identity- and network-based guardrails while we write our IaC. Below are some possible, actionable categories of goals. …
Apache Spark is a unified computing engine/framework for large-scale data processing.
Consume a service from a private network without traversing the internet.
This is what PrivateLink is at its core. It allows an application in a private subnet of a VPC to connect to a service outside the current VPC without leaving the AWS network. VPC peering allows us to do the same, but it establishes a connection whose scope is too big. Peering also requires non-overlapping CIDRs. PrivateLink allows us to connect only to the service we need. It gives control at both ends, i.e. the service provider end as well as the consumer end. …
EMR is a managed cluster platform that simplifies running big data frameworks, e.g. Hadoop, Spark, and Presto, on the AWS cloud.
Cluster: A cluster is simply a collection of EC2 instances called nodes. Based on a node's role, it falls into one of three types: Master, Core, and Task node.
Master Node: Manages & monitors the cluster, coordinates the distribution of data and tasks among other nodes. It also tracks the status of tasks. Every
Core Node: Nodes that hold data and execute tasks.
Task Node: Provides extra compute power. Does not hold any data.
The Amazon EMR service architecture comprises four basic layers. …
NLP is a sub-field of AI that enables computers to understand and process human-generated text data. In this blog we will learn the basic tasks of NLP and also some of its applications.
Once we have text, the first task is to pre-process the data.
Break the text into individual sentences.
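As a rough illustration of sentence splitting, the sketch below uses a naive regular expression that splits on whitespace following ".", "!" or "?". This is an assumption made for illustration, not the approach from the original post; it will split incorrectly on abbreviations such as "e.g.", which is why dedicated NLP libraries are normally used for this step.

```python
import re


def split_sentences(text):
    """Naively split text wherever whitespace follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


sentences = split_sentences("NLP is fun. Is it hard? Not always!")
for sentence in sentences:
    print(sentence)
```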
Logging is the mode an application uses to communicate with its audience. The key in any communication (logging included) is to think about your audience and what it needs. That is why we have different log levels for different audiences (developers, sysadmins, etc.).
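The idea of levels per audience can be sketched with Python's standard logging module. The logger name and the messages are hypothetical; the point is only that a logger set to INFO drops the developer-oriented DEBUG line while keeping messages an operator would care about.

```python
import io
import logging

# Capture log output in memory so the result is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

logger = logging.getLogger("demo_app")  # hypothetical application name
logger.setLevel(logging.INFO)           # operators rarely need DEBUG detail
logger.addHandler(handler)
logger.propagate = False                # keep the root logger out of the demo

logger.debug("cache miss for key=user:42")    # for developers, filtered out here
logger.info("service started on port 8080")   # for operators
logger.error("could not reach database")      # for whoever is on call

output = stream.getvalue()
print(output, end="")
```

Lowering the level to logging.DEBUG would let the first message through, which is the usual setting while debugging locally.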
AWS CloudWatch Logs Insights is an SQL-like interactive solution for querying, analysing and visualising log data from CloudWatch. CloudWatch logs can be VPC flow logs, CloudTrail logs, contact flow logs, RDS logs, service-specific logs, or custom application logs.
In this blog we will try to build a miniature version of AWS Simple Notification Service.
Simple Notification Service allows a user to publish messages to a topic. A user subscribes to one or more topics. Whenever a publisher publishes a message to a topic, each subscriber of that topic receives the message. Publisher and subscriber are unaware of each other and do not communicate directly.
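The publish/subscribe flow described above can be sketched in a few lines of in-memory Python. The MiniBroker class, its method names and the topic names are hypothetical and only illustrate the decoupling between publisher and subscriber; they say nothing about how AWS SNS is implemented.

```python
from collections import defaultdict


class MiniBroker:
    """In-memory broker: publishers and subscribers only know topic names."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver the message to every subscriber; return the delivery count."""
        callbacks = self._subscribers.get(topic, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


broker = MiniBroker()
inbox_a, inbox_b = [], []
broker.subscribe("orders", inbox_a.append)
broker.subscribe("orders", inbox_b.append)
broker.subscribe("alerts", inbox_b.append)

broker.publish("orders", "order-1 created")
broker.publish("alerts", "disk almost full")
print(inbox_a)
print(inbox_b)
```

Because delivery happens synchronously inside publish, messages reach each subscriber in publish order; a distributed version would need queues and persistence to keep that guarantee.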
We are required to design an SNS service that clients all over the world can use to read and write messages.
Message order: message order must be maintained.
Grouping by topic: messages must be grouped by topic. …
A stack is a collection of AWS resources that we can manage as a single unit. We can perform operations like CREATE, DELETE and UPDATE on this unit. While we perform these operations, the stack transitions from one state to another, for example from CREATE_IN_PROGRESS to CREATE_COMPLETE. Knowing the transition states helps in debugging issues. This blog tries to depict the state transitions visually.
At a high level, we perform CREATE | UPDATE | DELETE operations on a stack. First we create a stack using a template. Once it is created, we can either delete or update the stack. We can also delete a stack after an update. Internally, during each of these operations, the stack transitions through multiple states. …
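One way to reason about these transitions is a small lookup table. The sketch below is a deliberately simplified subset of CloudFormation stack statuses, written for illustration only: it omits several statuses and edges that the real service has, and the helper name is hypothetical.

```python
# Simplified, incomplete map of stack status -> statuses that may follow it.
TRANSITIONS = {
    "CREATE_IN_PROGRESS": {"CREATE_COMPLETE", "ROLLBACK_IN_PROGRESS"},
    "ROLLBACK_IN_PROGRESS": {"ROLLBACK_COMPLETE", "ROLLBACK_FAILED"},
    "ROLLBACK_COMPLETE": {"DELETE_IN_PROGRESS"},
    "CREATE_COMPLETE": {"UPDATE_IN_PROGRESS", "DELETE_IN_PROGRESS"},
    "UPDATE_IN_PROGRESS": {
        "UPDATE_COMPLETE_CLEANUP_IN_PROGRESS",
        "UPDATE_ROLLBACK_IN_PROGRESS",
    },
    "UPDATE_COMPLETE_CLEANUP_IN_PROGRESS": {"UPDATE_COMPLETE"},
    "UPDATE_COMPLETE": {"UPDATE_IN_PROGRESS", "DELETE_IN_PROGRESS"},
    "DELETE_IN_PROGRESS": {"DELETE_COMPLETE", "DELETE_FAILED"},
}


def is_valid_path(statuses):
    """Check that each consecutive pair of statuses is an allowed transition."""
    return all(
        nxt in TRANSITIONS.get(cur, set())
        for cur, nxt in zip(statuses, statuses[1:])
    )


lifecycle = [
    "CREATE_IN_PROGRESS",
    "CREATE_COMPLETE",
    "UPDATE_IN_PROGRESS",
    "UPDATE_COMPLETE_CLEANUP_IN_PROGRESS",
    "UPDATE_COMPLETE",
    "DELETE_IN_PROGRESS",
    "DELETE_COMPLETE",
]
print(is_valid_path(lifecycle))
print(is_valid_path(["CREATE_COMPLETE", "CREATE_IN_PROGRESS"]))
```

The first path follows the create, update, delete sequence described above; the second is rejected because a completed stack cannot go back to being created.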