Connect your Athena

Last updated 2 years ago

To connect your Athena cluster, Whaly needs some credentials. This guide details the necessary steps:

  1. Create an IAM User and generate an Access Key (+secret)

  2. Select your region & Workgroup

Prerequisites

To connect Athena to Whaly, you need the following:

  • An AWS account

  • Admin rights on the AWS account (to create an IAM User, a custom policy and an S3 bucket)

  • An S3 bucket to which the query results can be written. You can create one if needed.

To save cost on the Output Bucket, you can configure its Bucket Lifecycle rule to delete any file after 1 day, as the results won't be used after a query has resolved.
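As a rough illustration of the cleanup rule mentioned above, the sketch below builds a lifecycle configuration that expires every object one day after its creation. The structure follows the S3 lifecycle configuration format (`Rules`, `Filter`, `Expiration`); the helper name and the rule ID are hypothetical, and applying the configuration to your bucket is left out.

```python
import json


def build_expiration_rule(days: int = 1, rule_id: str = "expire-athena-results") -> dict:
    """Build an S3 lifecycle configuration that deletes objects after `days` days.

    `rule_id` is an illustrative name, not something Whaly requires.
    """
    if days < 1:
        raise ValueError("expiration must be at least 1 day")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Status": "Enabled",
                # An empty prefix applies the rule to every object in the bucket.
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": days},
            }
        ]
    }


lifecycle_config = build_expiration_rule(days=1)
print(json.dumps(lifecycle_config, indent=2))
```

One possible way to apply it is to save the printed JSON to a file and pass it to `aws s3api put-bucket-lifecycle-configuration --bucket <your-output-bucket> --lifecycle-configuration file://lifecycle.json`; the same rule can also be created from the S3 console.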

Create an IAM User and generate an Access Key (+secret)

To connect to your AWS Athena cluster, Whaly needs a User and its credentials (Access Key). To create such a User, please follow the AWS guide on creating an IAM User.

When asked which permissions and policies the user should have, please create a custom Policy that grants the following rights:

In the Policy definition, you need to fill in the ARNs of the S3 buckets that Whaly will have access to.

The Whaly user needs access to 2 kinds of S3 buckets:

  • Input buckets: These are the buckets that hold the data queried by Athena

  • A single Output bucket: This is the bucket that Whaly will use to store the query results

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "athena:GetTableMetadata",
                "athena:StartQueryExecution",
                "athena:ListDataCatalogs",
                "athena:GetQueryResults",
                "athena:GetDatabase",
                "athena:GetDataCatalog",
                "athena:ListWorkGroups",
                "athena:ListQueryExecutions",
                "athena:GetWorkGroup",
                "athena:StopQueryExecution",
                "athena:ListEngineVersions",
                "athena:GetQueryResultsStream",
                "athena:ListDatabases",
                "athena:GetQueryExecution",
                "athena:ListTableMetadata",
                "athena:BatchGetQueryExecution"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabases",
                "glue:GetDatabase",
                "glue:GetTables",
                "glue:GetTable",
                "glue:GetPartition",
                "glue:GetPartitions",
                "glue:BatchGetPartition"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucketMultipartUploads",
                "s3:PutBucketPublicAccessBlock",
                "s3:AbortMultipartUpload",
                "s3:CreateBucket",
                "s3:ListBucket",
                "s3:GetBucketLocation",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                // Replace these comments with the S3 ARNs of both the input and output buckets
                // (JSON does not allow comments, so remove them before saving the policy).
                // Ex:
                // Output bucket
                // "arn:aws:s3:::whaly-athena-output",
                // "arn:aws:s3:::whaly-athena-output/*",
                // Input buckets
                // "arn:aws:s3:::whaly-athena-input",
                // "arn:aws:s3:::whaly-athena-input/*",
                // ...
            ]
        }
    ]
}
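To make the S3 part of the policy concrete, here is a minimal sketch that generates the `Resource` list for one output bucket and any number of input buckets. Each bucket contributes two ARNs: one for the bucket itself and one (ending in `/*`) for the objects inside it. The function names are hypothetical, and the bucket names are the example ones from the policy comments; replace them with your own.

```python
import json


def bucket_arns(bucket_name: str) -> list[str]:
    """Return the two ARNs needed per bucket: the bucket itself and its objects."""
    return [f"arn:aws:s3:::{bucket_name}", f"arn:aws:s3:::{bucket_name}/*"]


def build_s3_resources(output_bucket: str, input_buckets: list[str]) -> list[str]:
    """List the ARNs of the single output bucket, followed by every input bucket."""
    resources = bucket_arns(output_bucket)
    for name in input_buckets:
        resources.extend(bucket_arns(name))
    return resources


# Example bucket names only; substitute the buckets used by your Athena setup.
resources = build_s3_resources("whaly-athena-output", ["whaly-athena-input"])
print(json.dumps(resources, indent=4))
```

The printed list can be pasted as the value of `"Resource"` in the third statement of the policy.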

Select your region & Workgroup

Once the IAM User is created with the proper policy, you can create an Access Key under it to retrieve the Access Key Id and the Access Key Secret.

To query your Athena data, Whaly needs to know in which region you want to run the compute. It should be one of the AWS Regions (ex. us-east-1). A good practice is to use the same region you use when running SQL queries in the console.

Also, you'll need to select an existing work group or create one. Inside Whaly, you'll need to pass the name of the Workgroup you wish to use when querying your data.
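Before entering these values in Whaly, a quick sanity check can catch a common mistake, such as pasting a region's display name instead of its code. The sketch below is only a heuristic: the `AthenaSettings` class and `validate` function are hypothetical, the pattern checks the shape of a region code rather than whether the region exists, and `primary` is used as the workgroup because it is the default Athena workgroup name.

```python
import re
from dataclasses import dataclass

# Heuristic shape of AWS region codes such as "us-east-1" or "ap-southeast-2".
# It does not verify that the region exists or that Athena is available there.
REGION_PATTERN = re.compile(r"^[a-z]{2}(-[a-z]+)+-\d+$")


@dataclass(frozen=True)
class AthenaSettings:
    """Hypothetical container for the region and Workgroup entered in Whaly."""
    region: str
    workgroup: str


def validate(settings: AthenaSettings) -> list[str]:
    """Return a list of readable problems; the list is empty when values look fine."""
    problems = []
    if not REGION_PATTERN.match(settings.region):
        problems.append(f"'{settings.region}' does not look like an AWS region code")
    if not settings.workgroup.strip():
        problems.append("workgroup name is empty")
    return problems


print(validate(AthenaSettings(region="us-east-1", workgroup="primary")))
print(validate(AthenaSettings(region="US East (N. Virginia)", workgroup="primary")))
```

The first call reports no problems, while the second flags the region because a console display name was given instead of the region code.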
