Analyzing your data with Apache Superset

You can use Apache Superset to analyze, explore, and visualize data stored in your Postgres clusters. You can open Superset from the BigAnimal portal using the Analyze links.

Note

Contact BigAnimal Support to enable the Apache Superset feature.

Default permissions include the ability to view data and create dashboards. To add or modify data sources, you need additional permissions. For more information on permissions to access Superset, see Managing Superset access.

Connecting Superset to your cluster

In addition to your password, you need your user, host, port, and database name to connect to Superset. To find this information for your cluster:

  1. Sign in to the BigAnimal portal.

  2. Go to the Clusters page.

  3. Select the name of your cluster.

  4. Select the Connect tab.

Note

To connect with Superset in a private network database cluster, both the cluster and Superset must be on the same network.

To connect to a BigAnimal cluster:

  1. Sign in to the BigAnimal portal.

  2. Select Analyze > Connections.

  3. Select + Database.

  4. In the Add Database dialog box, enter a value for Database Name.

  5. Enter the connection string for your cluster in the SQLALCHEMY URI field, using the following format:

    postgresql://{<username>}:{<password>}@{<host>}:{<port>}/{<dbname>}?sslmode=verify-full

    Note

    Your password is always encrypted before storage and never leaves your cloud environment. It's used only by the Superset software running in your BigAnimal infrastructure. As a defense-in-depth mechanism, we recommend using a Postgres user dedicated to Superset with a minimal set of privileges to just the database you're connecting. Never use your edb_admin user with Superset.

  6. Check the connection by selecting Test Connection. Select Add if the connection was successful.

For more information on connecting to Superset, see the Superset documentation:

Upon successful connection, you can add datasets, charts, and dashboards.

Using Superset dashboards

You can use Superset dashboards to analyze data stored in your cluster. For a tutorial on creating a simple dashboard, see Creating Your First Dashboard.

To view all available Superset dashboards, select Analyze > Dashboards.

To create a dashboard for monitoring EDB Postgres Distributed, you can use the template we provide. See Configuring an EDB Postgres Distributed dashboard for more information.

Configuring an EDB Postgres Distributed dashboard

We provide a template for an EDB Postgres Distributed dashboard in JSON format (utils/superset/pgd_monitoring_template.json) in the cloud-utilities repository. The JSON file includes the schema of the dashboard and the individual charts.

To add the dashboard:

  1. Clone the cloud-utilities repository on your local system.

  2. Using Python 3.4 or later, create an output JSON file:

    1. Change your working directory:

      cd cloud-utilities/utils/superset
    2. Change the permissions on the script to make it executable:

      chmod +x db_name_change.py
    3. Run the script:

      ./db_name_change.py <database_name> -i <input_file> -o <output_file>

      For example:

      ./db_name_change.py edb -i utils/superset/pgd_monitoring_template.json  -o utils/superset/upload.json

    To get more information on the db_name_change script, run:

    ./db_name_change.py -h
  3. In Superset, import your output file by selecting Analyze > Dashboards > Import dashboard.

Using Superset charts

You can use Superset charts to visualize data stored in your cluster. See the Superset documentation for instructions and examples of creating Superset charts.

To view all available Superset charts, select Analyze > Charts.

Using Superset SQL Lab

You can use Superset SQL Lab to write queries to access and modify data stored in your cluster. To access SQL Lab, select Analyze > SQL Editor.