8. Kinesis Firehose

Create the missing Kinesis Data Firehose

The Analytics Team has complained that even though the application is running, there is no data being sent to their data lake’s staging area. Let’s take another look at the Alien Attack architecture diagram:

[Image: Alien Attack architecture diagram]

According to this diagram, the Kinesis Data Stream sends the data through a Kinesis Data Firehose to a bucket in S3. The Analytics Team has appended the following information: “We have already created the S3 bucket to store this data in our AWS account. It has the suffix ‘raw’”.

Action Item: Before we move on, navigate to your S3 console and find the bucket that the Analytics Team is referring to. Click the checkbox to the left of the bucket, select Copy Bucket ARN, and paste the ARN into a separate text file so you have it at hand later.
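If you prefer to find the bucket programmatically, the lookup can be sketched in Python. This is a minimal sketch, not part of the workshop's tooling: the helper names are hypothetical, it assumes boto3 is installed with credentials for the workshop account, and it assumes the standard `aws` partition when building the ARN.

```python
def find_raw_buckets(bucket_names):
    """Return the bucket names that end with the 'raw' suffix."""
    return [name for name in bucket_names if name.endswith("raw")]


def bucket_arn(bucket_name):
    """Build an S3 bucket ARN (assumes the standard 'aws' partition)."""
    return f"arn:aws:s3:::{bucket_name}"


def list_bucket_names():
    """Fetch every bucket name in the account.

    Requires boto3 and valid AWS credentials; imported lazily so the
    pure helpers above can be used without either.
    """
    import boto3

    s3 = boto3.client("s3")
    return [bucket["Name"] for bucket in s3.list_buckets()["Buckets"]]
```

Calling `bucket_arn(name)` for each name returned by `find_raw_buckets(list_bucket_names())` should give the same ARN that Copy Bucket ARN puts on your clipboard.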

The team said that this fix should be pretty simple: “All you need to do is connect the Kinesis Data Firehose to our existing Kinesis Data Stream. If the Kinesis Data Firehose doesn’t exist, create one! Grant us access to the environment and we can help, or call us if you need to”.

Let’s get started.

What are we fixing? Fix or create a Kinesis Data Firehose so that it is properly sending data from our Kinesis Data Stream to the Analytics Team’s S3 bucket. We are currently missing a mechanism to do this within our AWS architecture.

Hint: Click here to see a diagram of your broken architecture.

Solution guidance

  1. From your AWS Management Console, navigate to the Amazon Kinesis Console (note that this is separate from the Kinesis Video console). Make sure you are still in the same region you chose at the beginning of this workshop.
  2. Inside of the Amazon Kinesis dashboard, you’ll see a panel for Kinesis data streams on the left and a panel for Kinesis Firehose delivery streams on the right.

Do you see any Kinesis Firehose Delivery Streams for your Alien Attack environment?
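The same check can be sketched with boto3 instead of the console. This is a hedged sketch: the helper names are hypothetical, and it assumes that delivery streams belonging to your environment are prefixed with the environment name followed by an underscore, as in the naming convention used in this workshop.

```python
def streams_for_environment(stream_names, env_name):
    """Keep only the streams whose name starts with '<ENV_NAME>_'.

    The comparison is case-insensitive because the workshop upper-cases
    the environment name in resource names.
    """
    prefix = env_name.upper() + "_"
    return [name for name in stream_names if name.upper().startswith(prefix)]


def list_delivery_stream_names(region_name):
    """List all Firehose delivery stream names in one region.

    Requires boto3 and valid AWS credentials (imported lazily).
    """
    import boto3

    firehose = boto3.client("firehose", region_name=region_name)
    names = []
    kwargs = {}
    while True:
        page = firehose.list_delivery_streams(**kwargs)
        names.extend(page["DeliveryStreamNames"])
        # Stop when the service reports no further pages.
        if not page["HasMoreDeliveryStreams"] or not page["DeliveryStreamNames"]:
            break
        kwargs["ExclusiveStartDeliveryStreamName"] = names[-1]
    return names
```

An empty result from `streams_for_environment(list_delivery_stream_names(region), env_name)` corresponds to an empty delivery streams panel in the console.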

It looks like our application doesn’t have any Kinesis Firehose Delivery Streams built, so let’s create one:

  1. In the Kinesis Firehose delivery streams panel, click Create Delivery Stream.
  2. Configure your delivery stream name and source:
    • Delivery stream name: YourEnvironmentName_Firehose (If our environment name is alienenv123456, then the stream name should be ALIENENV123456_Firehose. Note the capitalization.)
    • Source: Kinesis Data Stream
    • Choose Kinesis stream: YourEnvironmentName_InputStream
  3. Click Next. Decide how your delivery stream will process records:
    • Record transformation: Disabled
    • Record format conversion: Disabled
  4. Click Next. Let’s choose a destination for our delivery stream. In this case, our destination is the raw data S3 bucket created by our Analytics Team.
    • Destination: Amazon S3.
    • S3 bucket: Select the raw data bucket attached to your application. Its name ends with the suffix .raw.
  5. Click Next. Configure your settings:
    • Buffer size: 1 (MiB)
    • Buffer interval: 300 (seconds)
    • S3 compression: GZIP
    • S3 encryption: Disabled
    • Error logging: Enabled
    • IAM Role: Click on the radio button Create or update IAM role KinesisFirehoseServiceRole-<YourEnvironmentName_Firehose>-<region>-<epoch_time_in_ms>
  6. Click Next. Review your configuration. Click Create delivery stream. If the stream was properly created, you should see it appear in your Kinesis Dashboard.
Stuck? Click here for a Fast Fix
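
For reference, the console settings in the steps above map roughly onto a single Firehose `create_delivery_stream` call. The sketch below is an illustration under stated assumptions, not the workshop's Fast Fix: the helper names are hypothetical, the CloudWatch log group and log stream names are placeholders, and, unlike the console, the API does not create the IAM role for you, so an existing role ARN has to be passed in.

```python
def firehose_name(env_name):
    """Delivery stream name: upper-cased environment name plus '_Firehose'."""
    return f"{env_name.upper()}_Firehose"


def build_delivery_stream_request(env_name, stream_arn, bucket_arn, role_arn):
    """Build keyword arguments for firehose.create_delivery_stream."""
    name = firehose_name(env_name)
    return {
        "DeliveryStreamName": name,
        # Source: the existing Kinesis Data Stream.
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        # Destination: the Analytics Team's raw data bucket.
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            "BufferingHints": {"SizeInMBs": 1, "IntervalInSeconds": 300},
            "CompressionFormat": "GZIP",
            "EncryptionConfiguration": {"NoEncryptionConfig": "NoEncryption"},
            "CloudWatchLoggingOptions": {
                "Enabled": True,
                # Placeholder names (assumption); both must already exist.
                "LogGroupName": f"/aws/kinesisfirehose/{name}",
                "LogStreamName": "S3Delivery",
            },
        },
    }


def create_delivery_stream(region_name, request):
    """Send the request; requires boto3 and valid AWS credentials."""
    import boto3

    firehose = boto3.client("firehose", region_name=region_name)
    return firehose.create_delivery_stream(**request)["DeliveryStreamARN"]
```

Record transformation and record format conversion are simply left out of the request, which matches the Disabled settings chosen in the console.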