Establishing a Connection
To connect to Apache Impala, set the Server, Port, and ProtocolVersion connection properties. You can optionally specify a default Database.
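As a rough illustration, the properties above can be assembled into a connection string. This is only a sketch: the semicolon-separated Key=Value layout, the helper name build_connection_string, and the sample host, port, and protocol version are assumptions made for the example, not values confirmed by the provider.

```python
def build_connection_string(properties):
    """Join connection properties into a 'Key=Value;' string.

    Assumption: the provider accepts semicolon-separated Key=Value pairs;
    check the provider's own reference for the exact syntax.
    """
    # Skip properties left unset (None) so optional ones can be omitted.
    parts = [f"{key}={value}" for key, value in properties.items() if value is not None]
    return ";".join(parts) + ";"


# Hypothetical sample values; replace them with your own server details.
connection_string = build_connection_string({
    "Server": "127.0.0.1",
    "Port": 21050,
    "ProtocolVersion": 7,
    "Database": None,  # optional default database, omitted here
})
print(connection_string)
```

Passing a value for Database instead of None would append it as a fourth pair.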
Authenticating to Apache Impala
There are several ways to authenticate to Apache Impala, including NOSASL, LDAP, and KERBEROS. The following sections cover how to connect with each.
Authenticating with NOSASL
When using NOSASL, no authentication is performed. It is used when you are connecting to a server from a trusted location, such as a test machine on your local network. NOSASL is the default AuthScheme, so no additional connection properties need to be set.
Authenticating with LDAP
To authenticate with LDAP, set the following connection properties:
- AuthScheme: Set this to LDAP.
- User: Set this to the user to log in as.
- Password: Set this to the password of the user.
Authenticating with Kerberos
Please see Using Kerberos for details on how to authenticate with Kerberos.
Using Kerberos
This section shows how to use the provider to authenticate to Apache Impala using Kerberos.
Authenticating with Kerberos
To authenticate to Apache Impala using Kerberos, set the following properties:
- AuthScheme: Set this to KERBEROS.
- KerberosKDC: Set this to the host name or IP Address of your Kerberos KDC machine.
- KerberosRealm: Set this to the realm of the Hive Kerberos principal. This is the part after the '@' symbol (for instance, EXAMPLE.COM) of the hive.metastore.kerberos.principal value (for instance, hive/_HOST@EXAMPLE.COM) in the hive-site.xml file.
- KerberosSPN: Set this to the service and host of the Hive Kerberos principal. This is the part before the '@' symbol (for instance, hive/_HOST) of the hive.metastore.kerberos.principal value (for instance, hive/_HOST@EXAMPLE.COM) in the hive-site.xml file. If '_HOST' is specified, the driver attempts to identify the host using a reverse DNS lookup. If the reverse DNS lookup fails, you may need to specify the host explicitly.
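The split described for KerberosRealm and KerberosSPN can be sketched as follows. The helper split_principal and the host name impala01.example.com are hypothetical and only illustrate how the two values relate to the principal string; the helper does not perform the reverse DNS lookup itself.

```python
def split_principal(principal, host=None):
    """Split 'service/host@REALM' into (spn, realm).

    If host is given, it replaces the '_HOST' placeholder explicitly,
    which may be needed when a reverse DNS lookup is not reliable.
    """
    # Split on the last '@': everything before it is the SPN, after it the realm.
    spn, separator, realm = principal.rpartition("@")
    if not separator or not spn or not realm:
        raise ValueError(f"expected 'service/host@REALM', got {principal!r}")
    if host is not None:
        spn = spn.replace("_HOST", host)
    return spn, realm


# Sample principal taken from the text above.
spn, realm = split_principal("hive/_HOST@EXAMPLE.COM")
print("KerberosSPN:", spn)
print("KerberosRealm:", realm)

# Hypothetical host name supplied explicitly instead of relying on '_HOST'.
explicit_spn, _ = split_principal("hive/_HOST@EXAMPLE.COM", host="impala01.example.com")
print("KerberosSPN (explicit host):", explicit_spn)
```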
Retrieve the Kerberos Ticket
You can use one of the following three options to retrieve the required Kerberos ticket.
MIT Kerberos Credential Cache File
This option enables you to use the MIT Kerberos Ticket Manager to get tickets. Note that you won't need to set the User or Password connection properties with this option.
- Ensure that you have an environment variable created called KRB5CCNAME.
- Set the KRB5CCNAME environment variable to a path pointing to your credential cache file (for instance, C:\krb_cache\krb5cc_0). This file will be created when generating your ticket with MIT Kerberos Ticket Manager.
- To obtain a ticket, open the MIT Kerberos Ticket Manager application, click Get Ticket, enter your principal name and password, then click OK. If successful, ticket information will appear in Kerberos Ticket Manager and will now be stored in the credential cache file.
- Now that the credential cache file has been created, the provider uses it to obtain the Kerberos ticket for connecting to Apache Impala.
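Before connecting, it can help to confirm that KRB5CCNAME is set and points to an existing file. The check below is a minimal sketch; the helper credential_cache_path is hypothetical and is not part of the provider.

```python
import os


def credential_cache_path(environ=os.environ):
    """Return the path stored in KRB5CCNAME, or None if it is unset or missing."""
    path = environ.get("KRB5CCNAME")
    if not path:
        return None
    # The cache file only exists after a ticket has been generated.
    return path if os.path.isfile(path) else None


cache = credential_cache_path()
if cache is None:
    print("KRB5CCNAME is not set or the credential cache file does not exist yet.")
else:
    print("Credential cache found at", cache)
```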
Keytab File
If the KRB5CCNAME environment variable has not been set, you can retrieve a Kerberos ticket using a keytab file. To do this, set the User property to the desired username and set the KerberosKeytabFile property to the path of the keytab file associated with that user.
User and Password
If neither the KRB5CCNAME environment variable nor the KerberosKeytabFile property has been set, you can retrieve a ticket using a User and Password combination. To do this, set the User and Password properties to the credentials that you use to authenticate with Apache Impala.
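The order in which the three options apply can be summarized in a short sketch. The precedence is inferred from the descriptions above; the function ticket_source and its return labels are hypothetical, and the provider's actual resolution logic may differ in detail.

```python
import os


def ticket_source(properties, environ=os.environ):
    """Return which option would supply the Kerberos ticket.

    Assumed order: credential cache, then keytab file, then user and password.
    """
    if environ.get("KRB5CCNAME"):
        return "credential_cache"  # User and Password are not needed
    if properties.get("KerberosKeytabFile"):
        if not properties.get("User"):
            raise ValueError("the keytab option also requires the User property")
        return "keytab"
    if properties.get("User") and properties.get("Password"):
        return "user_password"
    raise ValueError("no Kerberos ticket source is configured")


# Hypothetical property values, evaluated against an empty environment.
sample = {"User": "impala_user", "KerberosKeytabFile": "/etc/security/impala_user.keytab"}
print(ticket_source(sample, environ={}))
```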
Troubleshooting the Connection
To show provider activity from query execution to network traffic, use Logfile and Verbosity. The examples of common connection errors below show how to use these properties to get more context. Contact the support team for help tracing the source of an error or circumventing a performance issue.
- Authentication errors: Typically, recording a Logfile at Verbosity 4 is necessary to get full details on an authentication error.
- The certificate presented by the server cannot be validated: This error indicates that the provider cannot validate the server's certificate through the chain of trust. (If you are using a self-signed certificate, there is only one certificate in the chain.)
To resolve this error, you must verify yourself that the certificate can be trusted and specify to the provider that you trust the certificate. One way you can specify that you trust a certificate is to add the certificate to the trusted system store; another is to set SSLServerCert.
Other Properties
- Database: A default database to use when one is not supplied in the SQL query. This enables using table names without having to specify database.tablename in the query.
- PageSize: The number of results to pull per page from Apache Impala when selecting data.
- QueryPassthrough: Indicates whether the query should be passed to Impala as-is. When QueryPassthrough is set to false (the default), the Lyftrondata Provider for Apache Impala attempts to modify the query to conform to the format Impala requires.