# Connect your Athena

To connect to Athena, Whaly needs some credentials. This guide details the necessary steps:

1. Create an IAM User and generate an Access Key (+secret)
2. Select your region & workgroup

### Prerequisites <a href="#prerequisites" id="prerequisites"></a>

To connect Athena to Whaly, you need the following:

* An AWS account
* Admin rights on the AWS account (to create an IAM User, a custom policy and an S3 bucket)
* An S3 bucket to which the query results can be written. [You can create one if needed.](https://docs.aws.amazon.com/AmazonS3/latest/userguide/create-bucket-overview.html)

{% hint style="info" %}
To save costs on the Output bucket, you can configure [its Bucket Lifecycle rule](https://docs.aws.amazon.com/AmazonS3/latest/userguide/object-lifecycle-mgmt.html) to delete files after 1 day, as the results are no longer used once a query has completed.
{% endhint %}
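
The one-day expiry suggested above could be expressed as an S3 lifecycle configuration. The sketch below builds such a document with Python's standard library. The rule ID `expire-athena-results` and the function name are hypothetical, and the empty prefix assumes the bucket holds nothing but query results.

```python
import json


def build_lifecycle_configuration(days: int = 1) -> dict:
    """Return an S3 lifecycle configuration that expires objects after `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": "expire-athena-results",  # hypothetical rule name
                "Status": "Enabled",
                # Assumption: an empty prefix, so the rule applies to every object.
                "Filter": {"Prefix": ""},
                "Expiration": {"Days": days},
            }
        ]
    }


if __name__ == "__main__":
    # Print the configuration as JSON so it can be saved to a file.
    print(json.dumps(build_lifecycle_configuration(), indent=2))
```

The printed JSON can be saved to a file and applied with whatever tooling you already use for the bucket, for example the AWS CLI `put-bucket-lifecycle-configuration` command or the S3 console.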

## Create an IAM User and generate an Access Key (+secret)

To connect to AWS Athena, Whaly needs a User and its credentials (an Access Key). To create this User, [please follow this guide.](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html)

When asked which permissions and policies the user should have, [please create a custom Policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create.html) that grants the following rights:

{% hint style="info" %}
In the Policy definition, you need to fill in the ARNs of the S3 buckets that Whaly will have access to.

The Whaly user needs access to 2 kinds of S3 buckets:

* Input buckets: the buckets that hold the data queried by Athena
* A single Output bucket: the bucket that Whaly will use to store the query results
  {% endhint %}

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "athena:GetTableMetadata",
                "athena:StartQueryExecution",
                "athena:ListDataCatalogs",
                "athena:GetQueryResults",
                "athena:GetDatabase",
                "athena:GetDataCatalog",
                "athena:ListWorkGroups",
                "athena:ListQueryExecutions",
                "athena:GetWorkGroup",
                "athena:StopQueryExecution",
                "athena:ListEngineVersions",
                "athena:GetQueryResultsStream",
                "athena:ListDatabases",
                "athena:GetQueryExecution",
                "athena:ListTableMetadata",
                "athena:BatchGetQueryExecution"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "glue:GetDatabases",
                "glue:GetDatabase",
                "glue:GetTables",
                "glue:GetTable",
                "glue:GetPartition",
                "glue:GetPartitions",
                "glue:BatchGetPartition"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "s3:GetObject",
                "s3:ListBucketMultipartUploads",
                "s3:PutBucketPublicAccessBlock",
                "s3:AbortMultipartUpload",
                "s3:CreateBucket",
                "s3:ListBucket",
                "s3:GetBucketLocation",
                "s3:ListMultipartUploadParts"
            ],
            "Resource": [
                // Replace these comments with the S3 ARNs of both the input and output buckets.
                // JSON does not allow comments, so remove these lines before saving the policy.
                // Example:
                // Output bucket
                // "arn:aws:s3:::whaly-athena-output",
                // "arn:aws:s3:::whaly-athena-output/*",
                // Input buckets
                // "arn:aws:s3:::whaly-athena-input",
                // "arn:aws:s3:::whaly-athena-input/*",
                // ...
            ]
        }
    ]
}
```
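
Each bucket appears twice in the `Resource` list: once as the bucket itself (for bucket-level actions such as `s3:ListBucket`) and once as `bucket/*` (for object-level actions such as `s3:GetObject`). A minimal sketch of how that list could be generated, assuming the standard `aws` partition; the function name `s3_resource_arns` is hypothetical and the bucket names are the placeholders from the policy comments:

```python
import json


def s3_resource_arns(output_bucket: str, input_buckets: list[str]) -> list[str]:
    """Return the bucket ARN and the bucket/* ARN for every bucket."""
    arns = []
    for bucket in [output_bucket, *input_buckets]:
        # Assumption: standard "aws" partition (not aws-cn or aws-us-gov).
        arns.append(f"arn:aws:s3:::{bucket}")    # bucket-level actions
        arns.append(f"arn:aws:s3:::{bucket}/*")  # object-level actions
    return arns


if __name__ == "__main__":
    resources = s3_resource_arns("whaly-athena-output", ["whaly-athena-input"])
    # Paste the printed list into the "Resource" field of the S3 statement.
    print(json.dumps(resources, indent=4))
```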

* Once the IAM User is created with the proper policy, [you can create an Access Key](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html) under it to retrieve the **Access Key Id** and the **Access Key Secret**.

## Select your region & Workgroup

To query your Athena data, Whaly needs to know in which region to run the compute. It should be one of the [AWS Regions](https://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/Concepts.RegionsAndAvailabilityZones.html) (e.g. `us-east-1`). A good practice is to use the same region you use when running SQL queries in the console:

<figure><img src="https://34758050-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MC1g4I2_CgXd0qAq1P7%2Fuploads%2Fq63SRMCHJEjH7hUsjOC1%2Fimage.png?alt=media&#x26;token=001d83a2-3e0d-4b94-b413-19922b98a0b1" alt=""><figcaption></figcaption></figure>
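
AWS region codes follow a fixed shape, such as `us-east-1`, with a hyphen before the trailing number. The sketch below is a format check only, written as an assumption about that shape; it does not confirm that a region exists or that Athena is available there, and the name `looks_like_aws_region` is hypothetical.

```python
import re

# Shape of a region code: a two-letter prefix, one or more lowercase words,
# then a number, all separated by hyphens (e.g. "us-east-1", "us-gov-west-1").
REGION_PATTERN = re.compile(r"[a-z]{2}(-[a-z]+)+-\d+")


def looks_like_aws_region(value: str) -> bool:
    """Return True if `value` has the shape of an AWS region code."""
    return REGION_PATTERN.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ("us-east-1", "us-east1", "eu-west-3"):
        print(candidate, looks_like_aws_region(candidate))
```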

You'll also need to [select an existing workgroup or create one.](https://docs.aws.amazon.com/athena/latest/ug/workgroups-create-update-delete.html) In Whaly, enter the name of the workgroup you wish to use when querying your data.

