Using Sqoop to move data between Hadoop and SQL Server

 

Following the post on how to load data from Hadoop into SQL Server, here is an example of how you can use a different technology, Sqoop, to transfer the data.

 

Move data from a Hive table in Hadoop into SQL Server.

 

1. The Hadoop service at https://www.hadooponazure.com/ comes with a Hive sample table called HiveSampleTable. For simplicity I will use this table. If you want to create your own, you can follow the steps in the appendix, which describe how to create your own source table.

2. On the SQL Server side we need to make sure we have a table that will receive the data.

Here is the T-SQL syntax for the TARGET table:

CREATE TABLE [dbo].[TARGET](
	[clientid] [nvarchar](334) NULL,
	[querytime] [nvarchar](334) NULL,
	[market] [nvarchar](334) NULL,
	[deviceplatform] [nvarchar](334) NULL,
	[devicemake] [nvarchar](334) NULL,
	[devicemodel] [nvarchar](334) NULL,
	[state] [nvarchar](334) NULL,
	[country] [nvarchar](334) NULL,
	[querydwelltime] [float] NULL,
	[sessionid] [bigint] NULL,
	[sessionpagevieworder] [bigint] NULL
) ON [PRIMARY]

3. You also need to make sure that you can move data in and out of Hadoop and that the appropriate ODBC ports are open:

[screenshot: opened ODBC ports on the Hadoop cluster]

4. Open the Hadoop command prompt and type the following command to start the Sqoop export:

c:\Apps\dist\sqoop>bin\sqoop.cmd export --connect "jdbc:sqlserver://alexeik0N:1433;database=RelationalDB;user=sa;password=Password" --table TARGET --export-dir /hive/warehouse/hivesampletable --input-fields-terminated-by "\t"

where:

jdbc:sqlserver://alexeik0N:1433 – your destination SQL Server

database=RelationalDB – destination database

user=sa;password=Password – credentials Sqoop uses to connect to the database

--table TARGET – destination table. Sqoop doesn't allow spaces in the name!

--export-dir /hive/warehouse/hivesampletable – directory of the Hive table

--input-fields-terminated-by "\t" – delimiter symbol. In this particular case the source table uses the horizontal TAB delimiter.
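To make the delimiter handling concrete, here is a minimal Python sketch of what the export effectively does with each line of the Hive table files: split on the delimiter and map the fields positionally onto the TARGET columns. The sample row values below are invented for illustration; Sqoop's actual parsing happens inside its generated map tasks.

```python
# Sketch: how --input-fields-terminated-by "\t" is interpreted.
# Each line in /hive/warehouse/hivesampletable is split on TAB and the
# fields are mapped positionally onto the TARGET table columns.
# The sample row below is hypothetical.

columns = [
    "clientid", "querytime", "market", "deviceplatform", "devicemake",
    "devicemodel", "state", "country", "querydwelltime", "sessionid",
    "sessionpagevieworder",
]

line = ("8\t18:54:20\ten-US\tAndroid\tSamsung\tSGH-T999\t"
        "California\tUnited States\t13.92\t0\t0")

# Split the raw line into fields and pair them with the column names.
row = dict(zip(columns, line.split("\t")))

print(row["devicemake"])   # Samsung
print(row["country"])      # United States
```

If a line contains fewer fields than expected, or uses a different delimiter than the one you pass to Sqoop, the export fails or inserts misaligned values, which is why the delimiter must match the Hive table's storage format.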

5. Execute this command and monitor the execution in the Hadoop Command Prompt:

[screenshot: Sqoop export output in the Hadoop command prompt]

Verify that the records were exported.

6. Now, on the SQL Server side, make sure the data successfully landed in the TARGET table:

[screenshot: data in the TARGET table]

Move data from a SQL Server table into a Hadoop table.

 

1. Select the SQL Server table that will be used as the source. Here is an example of the table I will be using:

SELECT [clientid]
      ,[querytime]
      ,[market]
      ,[deviceplatform]
      ,[devicemake]
      ,[devicemodel]
      ,[state]
      ,[country]
      ,[querydwelltime]
      ,[sessionid]
      ,[sessionpagevieworder]
FROM [RelationalDB].[dbo].[SOURCE]

2. Now let's write the Sqoop command to load data from SQL Server into HDFS:

c:\Apps\dist\sqoop\bin>sqoop.cmd import --connect "jdbc:sqlserver://alexeik0N:1433;database=RelationalDB;user=sa;password=Password" --table SOURCE --target-dir /hive/warehouse/SQLDATA/ --as-textfile --split-by clientid --fields-terminated-by \t -m 1

Where:

jdbc:sqlserver://alexeik0N:1433 – source SQL Server

database=RelationalDB – database on the server that stores the source table and data

user=sa;password=Password – user credentials Sqoop will use to connect to SQL Server

--table SOURCE – name of the source table in SQL Server. Sqoop doesn't allow spaces in the name!

--target-dir /hive/warehouse/SQLDATA/ – HDFS directory that will be created for the uploaded data

--as-textfile – store the imported data as plain text files

--split-by clientid – column of the table used to split work units

--fields-terminated-by \t – here you can define how columns are separated in the text file

-m 1 – number of map tasks
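The interplay of --split-by and -m can be sketched as follows: Sqoop queries the minimum and maximum of the split column, then cuts that range into one interval per map task, and each mapper imports its own slice in parallel. Sqoop's real splitters live in its JDBC code; this Python sketch, with hypothetical clientid bounds, only illustrates the idea.

```python
# Sketch: how Sqoop divides work between map tasks using --split-by and -m.
# Sqoop runs SELECT MIN(col), MAX(col) on the split column, then cuts the
# range into num_mappers intervals. The bounds below are hypothetical.

def compute_splits(min_val: int, max_val: int, num_mappers: int):
    """Return (lo, hi) boundaries, one pair per map task."""
    step = (max_val - min_val) / num_mappers
    splits = []
    lo = min_val
    for i in range(1, num_mappers + 1):
        # The last interval always ends exactly at max_val.
        hi = max_val if i == num_mappers else round(min_val + step * i)
        splits.append((lo, hi))
        lo = hi
    return splits

# e.g. clientid ranging from 0 to 100 with -m 4:
print(compute_splits(0, 100, 4))  # [(0, 25), (25, 50), (50, 75), (75, 100)]
```

With -m 1, as in the command above, there is a single interval and a single map task, so --split-by has little effect; raising -m only helps when the split column's values are reasonably evenly distributed.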

 

Important note: you may be concerned about putting the password in the connection string. In that case you can use the -P option, which will prompt you to enter the password at the console.

-- HIVE IMPORT APPEND with -P (console password)

sqoop.cmd import --connect "jdbc:sqlserver://sqlserver.database.windows.net:1433;database=someDB1gb" --username username@sqlserverP -P --table lineitem --split-by l_orderkey --hive-import --append

3. Monitor the execution of the script in the Hadoop command prompt:

[screenshot: Sqoop import progress in the Hadoop command prompt]

You can also check that the folder was created and the data arrived there successfully:

[screenshot: SQLDATA folder in HDFS]

And then in the folder below SQLDATA, you will find your text file:

[screenshot: the exported text file inside the SQLDATA folder]

That’s all!
