Using and Installing the Alation Data Catalog

What is the Alation Data Catalog?

Alation is a data catalog tool for searching, querying and collaborating on large data sets, and it uses machine learning to surface insights quickly. In my environment, the source is a Teradata database that stores clickstream data recording user behavior on the website: clicks, orders and every other user interaction.

Alation’s enterprise data catalog dramatically improves the productivity of analysts, increases the accuracy of analytics, and drives confident data-driven decision-making while empowering everyone in your organization to find, understand, and govern data.

The Alation data catalog is used to make sense of these petabytes of data. Alation also connects to our AWS Redis data sources and a few Power BI instances on site.


Before installing the Alation Data Catalog, complete these key tasks to ensure a successful implementation:

  • Procure & configure Alation compute instance
  • Confirm network rules are in place
  • Obtain Alation email account and SMTP server details
  • Create DNS entries for Alation URL
  • Procure & configure Alation Analytics V2 compute instance
  • Prepare Service Accounts and collect connection details for in-scope data sources

Ports needed for your security group:

Service             Direction   Port   Destination
DNS                 outbound    53     DNS Server
SMTP (SSL)          outbound    465    Email server
SSH                 inbound     22     Alation Node
Management Console  inbound     443    Alation Node
LDAP                outbound    389    LDAP / AD Server
LDAPS               outbound    636    LDAP / AD Server
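
Before installing, it can help to confirm that the outbound rules actually let traffic through. The following is a minimal sketch, not an Alation tool: the helper names check_port and report are made up for illustration, and it only tests TCP (DNS on port 53 is usually UDP, so treat that row as a rough signal at best).

```shell
#!/usr/bin/env bash
# Sketch: test TCP reachability for the rules in the port list.
# check_port and report are hypothetical helper names, not Alation commands.

# check_port HOST PORT [TIMEOUT_SECONDS]
# Returns 0 if a TCP connection opens, non-zero otherwise.
check_port() {
    local host="$1" port="$2" wait="${3:-3}"
    # bash's /dev/tcp pseudo-device opens a TCP socket; timeout bounds the attempt.
    timeout "$wait" bash -c "exec 3<>/dev/tcp/$host/$port" 2>/dev/null
}

# report HOST PORT LABEL
# Prints one OK/FAIL line for a single rule.
report() {
    if check_port "$1" "$2"; then
        echo "OK    $3 ($1:$2)"
    else
        echo "FAIL  $3 ($1:$2)"
    fi
}
```

Run from the Alation node, a call such as `report ldap.example.internal 636 LDAPS` (the host name is a placeholder) prints one line per rule, which makes a blocked port easy to spot before the installer trips over it.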

How to install Alation

The first thing you need to do is reach out to Alation for a trial licence. Other members of the project did this step for me. This guide is a high-level overview of how to install Alation.

  • Reach out to Alation for a trial licence and the installation files. An install can be done offline or via RPM or YUM. I would only recommend running Alation on Linux.
  • Provision a server instance. I did this in AWS – here are the specs:

AWS Instance – m5.2xlarge (8 vCPU and 32GB RAM)
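
A quick way to confirm that the provisioned instance matches this sizing is sketched below. The function name meets_spec is hypothetical, and the 30 GiB floor is an assumption: Linux reports slightly less usable memory than the nominal 32GB, so an exact comparison would fail on a correctly sized host.

```shell
# Sketch: compare the host against the 8 CPU / 32GB sizing.
# meets_spec is a hypothetical helper name.

# meets_spec CPUS MEM_KB: succeeds if both values meet the minimums.
meets_spec() {
    want_cpus=8
    want_mem_kb=$((30 * 1024 * 1024))   # 30 GiB floor (assumed tolerance)
    [ "$1" -ge "$want_cpus" ] && [ "$2" -ge "$want_mem_kb" ]
}

# Read the live values on Linux.
cpus=$(nproc)
mem_kb=$(awk '/^MemTotal:/ {print $2}' /proc/meminfo)

if meets_spec "$cpus" "$mem_kb"; then
    echo "OK: $cpus CPUs, $mem_kb kB RAM"
else
    echo "Below spec: $cpus CPUs, $mem_kb kB RAM"
fi
```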

Configure Storage – 3x XFS file systems (80GB root partition, 500GB app partition, 750GB backup partition)

sudo mkdir /data
sudo mkdir /backup
# Check device names with lsblk first; /dev/nvme1n1 for the app disk is an assumption
sudo vgcreate vg_xfs /dev/nvme1n1
sudo lvcreate -l 100%FREE -n data vg_xfs
sudo mkfs.xfs /dev/vg_xfs/data
sudo vgcreate vg_backup_xfs /dev/nvme2n1
sudo lvcreate -l 100%FREE -n backup vg_backup_xfs
sudo mkfs.xfs /dev/vg_backup_xfs/backup

Then add the mounts to /etc/fstab:

UUID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx     /          xfs    defaults,noatime  1   1
UUID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx     /data      xfs    defaults,noatime  1   1
UUID=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx     /backup    xfs    defaults,noatime  1   1
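
Since the Alation init step later expects /data and /backup to exist, it is worth checking that both have XFS entries in /etc/fstab before going further. This is a small sketch under that assumption; check_mount is a hypothetical helper name, and the FSTAB variable exists only so the check can be pointed at a copy of the file.

```shell
# Sketch: confirm /data and /backup are listed as xfs in fstab.
# check_mount is a hypothetical helper name.

# check_mount FSTAB MOUNTPOINT FSTYPE
# Succeeds if FSTAB has a non-comment entry for MOUNTPOINT with type FSTYPE.
check_mount() {
    awk -v mp="$2" -v fs="$3" '
        /^[[:space:]]*#/ { next }            # skip comment lines
        $2 == mp && $3 == fs { found = 1 }   # field 2 = mount point, field 3 = type
        END { exit found ? 0 : 1 }
    ' "$1"
}

fstab="${FSTAB:-/etc/fstab}"
for mp in /data /backup; do
    if check_mount "$fstab" "$mp" xfs; then
        echo "OK       $mp is listed as xfs"
    else
        echo "MISSING  $mp has no xfs entry in $fstab"
    fi
done
```

This only inspects the file. Running `sudo mount -a` followed by `df -h /data /backup` confirms that the entries actually mount.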
  • Download the Alation package. This can be done offline using the Alation Customer Portal or via RPM. You get your access token from the Alation Customer Portal.
curl -kLH "Authorization: Token YOUR ACCESS TOKEN" > alation-2021.2-
  • Next, install Alation
sudo yum update -y
sudo rpm -ivh alation-
sudo service alation init /data /backup
  • Now enter the Alation Shell
sudo service alation shell
  • You can look at the existing configuration by typing
  • Here is my recommended Alation configuration
alation_conf alation.profiling.v2.distribution.show_distribution_chart -s True
alation_conf alation.profiling.v2.distribution.max_unbatched_values -s 10
alation_conf alation.profiling.v2.distribution.batch_count -s 10
alation_conf alation.feature_flags.enable_profiling_v2 -s True
alation_conf alation.taskserver_timeouts.profileColumnV2 -s 120
alation_conf alation.feature_flags.enable_gbm_v2_connector_strategy -s True
alation_conf alation.feature_flags.enable_permissions_middleware_feature -s True
alation_conf alation.feature_flags.enable_swagger -s True
alation_conf alation.authentication.token.disable_v0_api_token_auth -s True
alation_conf alation.feature_flags.enable_lineage_v2 -s True
alation_conf alation.backup_v2.incr_backup -s True
alation_conf alation.backup_v2.incr_backup_versions -s 6
alation_conf alation.install.is_trial -s true
alation_conf nginx.use_ssl -s False
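
Typing each setting by hand is error-prone, so the list can be driven from a small script. The sketch below is an assumption about workflow, not an Alation feature: it uses only the `alation_conf KEY -s VALUE` form shown in this guide, covers a subset of the settings for brevity, and prints the commands unless APPLY=1 is set. The names settings, apply_settings and APPLY are made up for illustration, and the script would need to run inside the Alation shell, where alation_conf is available.

```shell
# Sketch: apply alation_conf settings from a list, dry-run by default.
# settings, apply_settings and APPLY are hypothetical names.

# One "key value" pair per line (a subset of the recommended list).
settings='alation.feature_flags.enable_profiling_v2 True
alation.feature_flags.enable_lineage_v2 True
alation.backup_v2.incr_backup True
alation.backup_v2.incr_backup_versions 6'

# apply_settings: print each command; execute it only when APPLY=1.
apply_settings() {
    printf '%s\n' "$settings" | while read -r key value; do
        [ -z "$key" ] && continue
        if [ "${APPLY:-0}" = "1" ]; then
            alation_conf "$key" -s "$value"
        else
            echo "alation_conf $key -s $value"
        fi
    done
}

apply_settings
```

Reviewing the dry-run output first, then re-running with `APPLY=1`, keeps a record of exactly which values were changed.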
  • Now enable backups
alation_action enable_backupv2
  • Restart the Alation server
alation_action restart_alation
  • You now need to configure an AWS Application Load Balancer. Note: it MUST be an Application Load Balancer.