Getting the most out of Oracle with Polybase & SQL Server 2019

Getting data out of Oracle database can be a real pain, but what if there was a way to leverage the toolset around SQL Server to do it? This is where Polybase comes in. A technology bundled with SQL Server that lets us map external data sources as tables in SQL Server. In this post we are going to look at how to connect to Oracle in order to get data from it using T-SQL.

Before we start, I am going to be making the following assumptions about the environment;

  1. Oracle has been installed and configured on a server that you have access to.
  2. SQL Server 2019 is installed with the Polybase feature.
  3. There is an Oracle user configured which you have the username and password for.

Enabling Polybase

Enabling Polybase is done by setting the appropriate sp_configure option within the SQL Server instance.

[T-SQL]
IF((SELECT SERVERPROPERTY ('IsPolyBaseInstalled'))= 1)
BEGIN

EXEC sp_configure 'polybase enabled', 1;
RECONFIGURE;

END
GO

Once this setting has been enabled then we are in a position to start working on connecting to Oracle to get at the data.

Note: It is recommended to make sure you are on the latest Cumulative Update for SQL Server 2019 as there have been several fixes deployed for Polybase since RTM.

Connecting to Oracle

Before we get into the details of how we connect to Oracle lets take a moment to have a quick look at what we want to achieve.

Diagram showing External Tables in SQL Server linking to tables in Oracle.

As we can see above we have three tables in the Oracle database which we want to surface in the SQL Server database. We will be using External Tables in SQL Server which we will query. These External Tables rely on External Data Sources which define where the target is. Finally, the External Data Source needs an identity with which to authenticate against the Oracle Instance, for this we will use a Database Scoped Credential.

Once all of this is in place we are then able to use T-SQL to query the objects in the Oracle database via the external tables in the SQL Server Database.

Database Setup

For the purposes of this I will create a new database which we will use for hosting the external tables. I am also going to create a dedicated schema to place these objects into as this then gives more versatility on how to grant access to users.

[T-SQL]
CREATE DATABASE OracleInterface;
GO

USE OracleInterface;
GO

CREATE SCHEMA ORA12
AUTHORIZATION DBO
;
GO

CREATE MASTER KEY
ENCRYPTION BY PASSWORD = 'MyRe@llyStr0ngPassw0rd!';
GO

Finally, we create a Database Master Key which allows us to use the Database Scoped Credentials mentioned earlier. Note: Make sure that you take a backup of the master key once you have created it.

Create Database Scoped Credential

The database scoped credential is where we set the username and password we will use to authenticate against the Oracle database. 

[T-SQL]
CREATE DATABASE SCOPED CREDENTIAL ora12_hr
WITH IDENTITY = 'HR', Secret = 'MyStr0ngP@ssw0rd'
;
GO

Once this has been completed we can look to define the target we want to connect to.

Create External Data Source

The scope and number of External Data Sources that you create can be defined by any number of factors. Typically I would look to think about how the target system is constructed and how likely parts are to be moved. As such I would look to create unique External Data Sources where the following are in play;

  • Different Oracle Schema’s that can be accessed by the same user.
  • Where the target table is located in a pluggable Oracle database.
  • Where you want to define different configurations for Pushdown in the External Data Source.

In this example I am going to create two External Data Sources to demonstrate that it is possible.

[T-SQL]
CREATE EXTERNAL DATA SOURCE ORA12
WITH
(
LOCATION = 'oracle://ORA12.jqmartin.info:1521',
CREDENTIAL = ora12_hr,
PUSHDOWN = ON
)
;
GO

CREATE EXTERNAL DATA SOURCE ORA12C
WITH
(
LOCATION = 'oracle://ORA12.jqmartin.info:1521',
CREDENTIAL = ora12_hr,
PUSHDOWN = ON
)
;
GO

Here we are using the same Database Scoped Credential for this example.

Create External Table

Now that we have the access components in place we can look to create our SQL Server based External Tables to map to the Oracle tables.

Important elements to take into account when defining the external tables is that it will be case sensitive for the object names in Oracle. We also want to make sure that we are using three part names to be certain that we are connecting to the right objects.

[T-SQL]
CREATE EXTERNAL TABLE ORA12.Employees
(
EMPLOYEE_ID INT NOT NULL ,
FIRST_NAME VARCHAR(20) COLLATE Latin1_General_100_BIN2_UTF8,
LAST_NAME VARCHAR(25) COLLATE Latin1_General_100_BIN2_UTF8 NOT NULL,
EMAIL VARCHAR(25) COLLATE Latin1_General_100_BIN2_UTF8 NOT NULL,
PHONE_NUMBER VARCHAR(20) COLLATE Latin1_General_100_BIN2_UTF8,
HIRE_DATE DATE NOT NULL,
JOB_ID VARCHAR(10) COLLATE Latin1_General_100_BIN2_UTF8 NOT NULL,
SALARY DECIMAL(8,2),
COMMISSION_PCT DECIMAL(2,2),
MANAGER_ID INT,
DEPARTMENT_ID SMALLINT
)
WITH
(
LOCATION = '[Ora12c].[HR].[EMPLOYEES]',
DATA_SOURCE = ORA12
)
;
GO

CREATE EXTERNAL TABLE ORA12.Jobs
(
JOB_ID VARCHAR(10) COLLATE Latin1_General_100_BIN2_UTF8 NOT NULL,
JOB_TITLE VARCHAR(35) COLLATE Latin1_General_100_BIN2_UTF8 NOT NULL,
MIN_SALARY INT,
MAX_SALARY INT
)
WITH
(
LOCATION = '[Ora12c].[HR].[JOBS]',
DATA_SOURCE = ORA12
)
;
GO

CREATE EXTERNAL TABLE ORA12.Departments
(
DEPARTMENT_ID INT NOT NULL,
DEPARTMENT_NAME VARCHAR(30) COLLATE Latin1_General_100_BIN2_UTF8 NOT NULL,
MANAGER_ID INT,
LOCATION_ID INT
)
WITH
(
LOCATION = '[Ora12c].[HR].[DEPARTMENTS]',
DATA_SOURCE = ORA12C
)
;
GO

You will notice in the T-SQL that we have to define the collation for string character columns. We also need to translate the Oracle data types to SQL Server types. If you want to see the structure of the table objects in Oracle you can use the DESCRIBE function to get the information.

[PL/SQL]
DESCRIBE HR.EMPLOYEES;
DESCRIBE HR.DEPARTMENTS;
DESCRIBE HR.JOBS;

The output from the Describe statement will look similar to this;

Name                        Null?              Type
————–                  ——–               ————
EMPLOYEE_ID        NOT NULL    NUMBER(6)
FIRST_NAME                                   VARCHAR2(20)
LAST_NAME            NOT NULL    VARCHAR2(25)
EMAIL                      NOT NULL     VARCHAR2(25)
PHONE_NUMBER                           VARCHAR2(20)
HIRE_DATE              NOT  NULL   DATE
JOB_ID                       NOT NULL   VARCHAR2(10)
SALARY                                            NUMBER(8,2)
COMMISSION_PCT                         NUMBER(2,2)
MANAGER_ID                                 NUMBER(6)
DEPARTMENT_ID                           NUMBER(4)

Once we have this we need to convert those types to SQL Server equivalents. In the case of NUMBER it is important to understand the behaviour for the definition. The Scale and Precision elements will help us decide on whether we select INT types or DECIMAL type. In this situation I defaulted to INT for the columns with no precision. In reality profiling the data to understand the min/max values present currently and what they can hold.

Querying Data

Now that we have the External Tables built we can select the data from the Oracle system using T-SQL.

[T-SQL]
SELECT e.Employee_id,
e.First_name,
e.last_name,
e.Hire_date,
j.job_title,
d.department_name
FROM ORA12.Employees AS e
JOIN ORA12.Jobs AS j ON e.Job_id = j.Job_id
JOIN ORA12.Departments AS d ON e.department_id = d.department_id
WHERE Employee_id BETWEEN 159 AND 197
;

Here we are joining to external tables in Oracle so when we look at the execution plan we can see it is treated as a Remote Query.

SQL Server Execution Plan showing Remote Query to Oracle system.

 

Summary

As we have moved through this post we have looked at how to connect SQL Server to Oracle via Polybase. Allowing us to pull data together from multiple systems and potentially any other data sources which Polybase can connect to. This concept is referred to as Data Virtualization and is becoming an alternative to traditional ETL processing.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.