How to Install Apache Flink On Ubuntu

Hey, Tea Lovers! In this post, we will be talking about how to install Apache Flink on Ubuntu. It will be a very quick guide. We will be installing the latest version, Apache Flink 1.12.1, on Ubuntu 20.04, though the steps apply to older versions and most likely to newer ones too.

For installing Apache Flink on Mac OS, check out the post "How to Install Apache Flink On Mac OS".

And for Windows, check out the post "How to Install Apache Flink On Local Windows".

Requirements to Install Apache Flink on Ubuntu

I will be using Ubuntu 20.04 for this, but the steps apply to other versions of Ubuntu as well. The only requirement is Java, so make sure it is installed and that JAVA_HOME is set up on the machine.

If you don’t have Java installed yet or haven’t set JAVA_HOME, check this post: "How to Install Latest Java and Set JAVA_HOME on Ubuntu". You can use Java 8 or Java 11; it’s up to you.
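Before moving on, a quick sanity check of the Java setup can save some head-scratching later. The snippet below is only a minimal sketch: the function name `check_java_home` is made up for illustration, and the no-spaces check simply mirrors the advice about the JAVA_HOME path given at the end of this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: verifies that JAVA_HOME points at a usable Java install.
check_java_home() {
    # JAVA_HOME must be set and non-empty.
    if [ -z "${JAVA_HOME:-}" ]; then
        echo "JAVA_HOME is not set"
        return 1
    fi
    # The path should not contain a space.
    case "$JAVA_HOME" in
        *" "*)
            echo "JAVA_HOME contains a space: $JAVA_HOME"
            return 1
            ;;
    esac
    # There should be an executable java binary under JAVA_HOME/bin.
    if [ ! -x "$JAVA_HOME/bin/java" ]; then
        echo "No java executable found under $JAVA_HOME/bin"
        return 1
    fi
    echo "JAVA_HOME looks good: $JAVA_HOME"
}

# Usage: check_java_home && "$JAVA_HOME/bin/java" -version
```

If the function prints "JAVA_HOME looks good", you should be ready for the next step.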

Download Latest Apache Flink on Ubuntu

I will be installing Apache Flink version 1.12.1. To get the list of all stable Flink binaries released and available for download, head over to this link. Pick the version you want and download it.

[Image: select either of these two binaries to install Flink on Ubuntu]

You may be redirected to another page, where the mirror download link appears in the first paragraph.

[Image: Flink download page]
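If you prefer the terminal over the browser, the download could be sketched as below. This is only an illustration, not the route described above: it assumes the URL layout of the Apache archive (`archive.apache.org/dist/flink/...`) and defaults to the Scala 2.12 build, and both function names are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: builds the download URL for a Flink binary release.
# Assumes the archive.apache.org layout and a Scala 2.12 build by default.
flink_download_url() {
    local version="$1"
    local scala="${2:-2.12}"
    echo "https://archive.apache.org/dist/flink/flink-${version}/flink-${version}-bin-scala_${scala}.tgz"
}

# Hypothetical helper: downloads the release tarball into the current directory.
download_flink() {
    wget "$(flink_download_url "$@")"
}

# Usage: download_flink 1.12.1
```

The mirror link from the download page is usually the faster option; the archive URL is used here only because its layout is predictable.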

Download Old Stable Apache Flink on Ubuntu

For the latest release, we can just use the link in the first paragraph. For an older release such as Flink 1.7.1, you need to pick it from the list of old stable releases on the same page. In that list, select the binaries link next to the desired version, then download a binary file that contains -bin-scala_ in its name.

I will be selecting Flink version 1.7.1 for download and then picking the -bin-scala_ file (the one without Hadoop, as I don’t need it).
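To make the naming rule concrete, here is a small sketch that filters a list of file names down to the Hadoop-free binaries. The function name `pick_hadoop_free` is hypothetical, and the file names in the example are assumed for illustration; the actual listing for your version may look different.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads file names on stdin and keeps only the ones
# that contain "-bin-scala_" in the name, i.e. the binaries without Hadoop.
pick_hadoop_free() {
    grep -- '-bin-scala_'
}

# Example listing (file names are assumed for illustration):
printf '%s\n' \
    'flink-1.7.1-bin-hadoop27-scala_2.11.tgz' \
    'flink-1.7.1-bin-scala_2.11.tgz' \
    'flink-1.7.1-src.tgz' | pick_hadoop_free
# prints: flink-1.7.1-bin-scala_2.11.tgz
```

The Hadoop builds put the Hadoop version between `-bin-` and `-scala_`, so they do not match the pattern.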

Start and Stop Apache Flink Cluster on Ubuntu

Now that we have downloaded the Apache Flink binaries, let’s extract the tar file and copy it to a more appropriate location. For me, that would be:

/home/imran/programs/apache-flink/flink-1.12.1
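The extract-and-copy step can be done in one go from the terminal. The following is a minimal sketch, assuming the tarball unpacks into a single top-level folder such as `flink-1.12.1`; the function name `install_flink` and the download path in the usage line are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: extracts a Flink tarball into a target directory.
install_flink() {
    local tarball="$1"
    local target_dir="$2"
    # Create the target directory if it does not exist yet.
    mkdir -p "$target_dir"
    # Unpack the archive directly into the target directory.
    tar -xzf "$tarball" -C "$target_dir"
}

# Usage (paths are examples):
# install_flink ~/Downloads/flink-1.12.1-bin-scala_2.12.tgz /home/imran/programs/apache-flink
```

Extracting straight into the parent folder avoids a separate copy step.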

In the terminal, cd into the path and run the cluster with the following command.

./bin/start-cluster.sh

If it is successful, you will see a cluster start-up message, and the cluster will keep running in the background.

Then simply go to localhost:8081 in the browser to access the Flink UI Dashboard.
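If you would rather confirm from the terminal that the cluster is reachable, something like the sketch below could work. It assumes `curl` is installed and polls the `/overview` endpoint of Flink's REST API on the dashboard port; the function name `wait_for_flink` and the retry count are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: polls the Flink REST endpoint until it answers or we give up.
wait_for_flink() {
    local url="${1:-http://localhost:8081/overview}"
    local attempts="${2:-10}"
    local i=1
    while [ "$i" -le "$attempts" ]; do
        # --fail makes curl return non-zero on HTTP errors as well.
        if curl --silent --fail --max-time 2 "$url" > /dev/null; then
            echo "Flink is up at $url"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    echo "Flink did not respond at $url"
    return 1
}

# Usage: ./bin/start-cluster.sh && wait_for_flink
```

The cluster can take a few seconds to come up, which is why the check retries instead of failing on the first attempt.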

To stop the cluster use the following command.

./bin/stop-cluster.sh

Run a Flink Job on Cluster

There are 2 ways you can run a Flink job on the cluster: via the UI or via the command line.

Run Flink Job via Flink Dashboard

The simplest way is to use the UI. First, in the dashboard, go to the Submit New Job page and upload the Jar. Next, select the Jar from the list and pass the main class and the program arguments. Then submit the job; it will redirect you to the running job page, where you will see the pipelines and their status.

Run a Flink Job via Terminal

The second way is to run it via the Terminal or Command-Line. You can use the following command to run the Flink Job.

./bin/flink run \
    --detached \
    /path/to/jar/SampleFlinkJob.jar

This will run the Flink job in the background. If it was submitted successfully, you can then check its status in the Flink Dashboard on localhost.
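The main class and program arguments that the dashboard asks for can be passed on the command line too. Below is a minimal sketch using the `--class` option of `flink run`; the function name `submit_job`, the `FLINK_BIN` variable, the class name, and the arguments are all placeholders for illustration.

```shell
#!/usr/bin/env bash
# Path to the Flink CLI; override FLINK_BIN if you are not inside the Flink folder.
FLINK_BIN="${FLINK_BIN:-./bin/flink}"

# Hypothetical helper: submits a jar in detached mode with an explicit main class.
# Everything after the jar path is passed to the job as program arguments.
submit_job() {
    local main_class="$1"
    local jar="$2"
    shift 2
    "$FLINK_BIN" run --detached --class "$main_class" "$jar" "$@"
}

# Usage (class name and arguments are placeholders):
# submit_job com.example.SampleFlinkJob /path/to/jar/SampleFlinkJob.jar --input /tmp/in
# "$FLINK_BIN" list              # show scheduled and running jobs
# "$FLINK_BIN" cancel <jobId>    # stop a job by its ID
```

The `list` and `cancel` commands are handy when you want to check on or stop a detached job without opening the dashboard.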

Conclusion

That’s it for this post. This post is focused on Flink version 1.12.1 and running it on Ubuntu. Remember to set up JAVA_HOME properly and make sure the JAVA_HOME path doesn’t contain any spaces.

You can also check out my other post, "How to Select Specific Folders or Files As Input in Flink"; the code is on GitHub.

See you in the next post.

HAKUNA MATATA!!!
