Setting up Data Sources

www.altova.com Print this Topic Previous Page Up One Level Next page

Home >  Comparing Database Data >

Setting up Data Sources

Before you can perform comparison of tabular data originating from a database, a data source which includes the database connection details and other configuration parameters must be configured. One data source must exist for each table, view, or custom SQL query that you want to include in the comparison.

 

To configure a data source, open the client configuration file and add a new [datasouce:<name>] section, where <name> is the unique name by which you want to identify this data source. The parameters that may be present in a [datasource] section are listed below.

 

You can also add data sources to the server configuration file. This is useful if you need to connect to the data source from the machine where DiffDog Server (not the DiffDog Command Line Client) runs. However, note that server-side connections that require Windows authentication for databases like SQL Server may encounter issues in this case, because the DiffDog Server service runs as a separate user who may not have access to the database. The workaround in that case is to define the data source in the client configuration file—the DiffDog Command Line Client runs as the user running the data-diff command.

 

As stated previously, each data source must provide one table, view or some custom named recordset retrieved by a SQL query. To specify this, set the table or sql options below (they are mutually exclusive). Importantly, for each table, view, or recordset, there should be at least one column which guarantees the uniqueness of each row. This column (or multiple columns) are subsequently referred to as the "key" columns. If the table has a primary key, a key column already exists and the uniqueness of the row is guaranteed. Otherwise, it is recommended that you specify one or more columns that should be treated as "key" columns. You can do this by setting  the key-columns-by-name and key-columns-by-ordinal options below.

 

type

Mandatory parameter. Specifies the type of the database connection. Valid values are: ado, adonet, jdbc, odbc, postgresql, and sqlite. Example:

 

type=sqlite

 

Connections of type postgresql and sqlite are built-in and do not require any drivers. For other database kinds, choose a database connection method that is suitable for the machine that will be running the comparison job. Such connections require that the corresponding database drivers be installed on the machine. For more information, see Supported Databases.

 

path

This parameter specifies the path to the database file. It is applicable only for SQLite databases. Do not set this parameter if you set the connection parameter, because they are mutually exclusive.

 

path=C:\Articles.sqlite

SQLite

 

connection

Conditional parameter. Specifies the database connection string. Do not set this parameter if you set the path parameter, because they are mutually exclusive.

 

connection=Data Source=DBSQLSERV;User ID=altova_user;Password=dhjdf84h;Provider=SQLNCLI11.1;Initial Catalog=NANONULL;Persist Security Info=true

SQL Server (ADO)

connection=jdbc:db2://mydb2-105:50000/NANONULL

DB2 (JDBC)

connection=jdbc:oracle:thin:@dbora12c:1521:orcl12c

Oracle (JDBC)

connection=my.dbserver.com

PostgreSQL (Native)

Note:For native PostgreSQL connections, the connection can also be the IP address of the database server, for example, 127.0.0.1. The database and port of a native PostgreSQL connection are specified in the database and port options.

 

Call the db-drivers command to view connections available on the local machine, displayed in a format that you can copy-paste. Also, if you have licensed DatabaseSpy, you can create a database connection from there, and then reuse the connection details as displayed in the Properties window.

 

database

Mandatory and applicable only for native PostgreSQL connections. Specifies the database of the PostgreSQL connection.

 

database = ZooDB

 

port

Specifies the port of the PostgreSQL database connection. Applicable only for native PostgreSQL database connections.

 

table

Conditional parameter. Specifies the full name of the database table or view to be used in the left or right part of the comparison. If you configure this parameter, do not set the sql parameter, because they are mutually exclusive. Examples:

 

table=[HR].[dbo].[Employees]

SQL Server

table="ALTOVA_ADMIN"."customers"

IBM DB2

Tip:With DatabaseSpy, you can easily obtain the full table name by right-clicking the table in Online Browser, and then selecting Show in new SQL Editor | Path from the context menu.

 

key-columns-by-name

For each table, view, or recordset, there should be at least one column which guarantees the uniqueness of each row. This column (or multiple columns) are subsequently referred to as the "key" columns. If the table has a primary key, a key column already exists and the uniqueness of the row is guaranteed. Otherwise, it is recommended that you specify one or more columns that should be treated as "key" columns. These don't necessarily have to be columns that are part of an actual primary or unique key on the table. Examples:

 

key-columns-by-name="guid"

 

 

key-columns-by-name="first_name","last_name","email"

 

Defining "key" columns is optional but should be done as best practice. If you don't define the "key" columns, the application will determine them automatically, but this may not work consistently across all database drivers. Also, detection works for tables or views only, so it is recommended that you define the "key" columns explicitly if you are using SQL queries that return computed values or results returned by stored procedures.

 

If you specify key columns, then the number of key columns must be the same for the "left" and "right" data source. For example, if you specify two key columns for the left data source and only one key column for the right data source, the comparison will fail with an error.

 

key-columns-by-ordinal

You can specify a "key" column either by name (see above) or by its ordinal number in the table, starting with 1. This option lets you specify key columns by their ordinal number. Examples:

 

key-columns-by-ordinal=1

 

 

key-columns-by-ordinal=1,2,5

 

If you specify key columns, then the number of key columns must be the same for the "left" and "right" data source.

 

user

Conditional parameter. Specifies the username required to connect to the database. This parameter is not necessary if the database accepts anonymous connections, or if the connection string already includes the username (see the connection parameter).

 

pass

Conditional parameter. Specifies the password required to connect to the database. This parameter is not necessary if the database accepts anonymous connections, or if the connection string already includes the password (see the connection parameter).

 

assembly

This option is mandatory for ADO.NET connections. Examples:

 

assembly=System.Data.SqlClient.SqlClientFactory, System.Data, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089

.NET Framework Data Provider for SQL Server

assembly=IBM.Data.DB2.iSeries.iDB2Factory, IBM.Data.DB2.iSeries, Version=12.0.0.0, Culture=neutral, PublicKeyToken=9cdb2ebfb1f93a26

.NET Framework Data Provider for IBM i

Run the db-drivers command to view .NET assemblies available on the local machine, displayed in a format that you can copy-paste.

 

 

class

This option is mandatory for ADO.NET and JDBC connections. Examples:

 

class=com.ibm.db2.jcc.DB2Driver

IBM DB2 via JDBC

class=com.mysql.jdbc.Driver

MySQL via JDBC

class=System.Data.SqlClient

SQL Server via ADO.NET

Run the db-drivers command to view the JDBC drivers available on the local machine, displayed in a format that you can copy-paste.

 

sql

Conditional parameter. Specifies a SELECT query in the syntax of the current database. The recordset retrieved by the SELECT query may be used as the left or right part of the comparison. If you set this parameter, do not set the table parameter, because they are mutually exclusive. Examples:

 

sql="SELECT * FROM employees WHERE a > 5 and b < 6 ORDER BY c"

SELECT query

sql="EXEC dbo.uspGetEmployeeManagers @BusinessEntityID = 50"

Stored procedure

If you specify a stored procedure call, it is expected that the stored procedure returns a recordset, not just output parameters.


© 2019 Altova GmbH