Monday, 11 July 2016

Getting Started with JSON Support in SQL Server 2016 – Part 1

Introduction


SQL Server 2005 introduced native support for storing, managing and processing XML data when XML had become the de-facto standard for data interchange. This feature still exists in SQL Server as XML has some inherent advantages and is still used widely, but because many new applications have started using JSON for a light-weight data interchange mechanism, SQL Server 2016 introduces built-in support for storing, managing and processing JSON data. In this article series, I will explain this new feature and demonstrate how to use it in your applications in different scenarios.


JSON Data Processing in SQL Server


SQL Server 2016 has introduced built-in support for storing, managing and parsing JSON data. Though it is possible to store, manage and parse JSON data even in the earlier versions of SQL Server, it requires more effort on the development side to implement it. With the built-in support, you can do it very quickly while focusing on your data parsing or your application logic rather than focusing on writing functions to parse JSON data.

JSON stands for JavaScript Object Notation and is a light-weight, human-readable text data exchange format, based on a subset of the JavaScript programming language (though it’s  language independent and hence these days used in many programming languages).

With this new feature, you can convert to the JSON format from tabular data available in relational tables as well as parse JSON data to bring it to the tabular format for reporting, joining with other tables, or passing it to other applications that expect tabular data etc.

Unlike XML native support, where you have the XML data type to store XML data or documents, you use NVARCHAR data to store JSON data or documents in SQL Server 2016. This means you are not constrained; you can store and parse JSON data almost everywhere, wherever NVARCHAR is supported. Also, you don’t need any special type of indexes on JSON data, you can still use regular indexes, which you are already familiar with.

Exporting Tabular Data as JSON Data


SQL Server 2016 introduces the FOR JSON [ AUTO | PATH ] clause to be used with your query to format your query result in JSON format before returning it to the client. If you have prior experience working with the FOR XML clause, you might find this new clause quite similar.

When you specify FOR JSON AUTO, SQL Server automatically formats the nested JSON sub arrays of the query result based on table hierarchy and column order used in the query.

Let me demonstrate this with an example. Assuming you have AdventureWorks2014 database (you can download it from here), you can run this query to get products ordered in these specific two orders. As you can see in the image below, for the OrderID 43663 there is only one product whereas OrderID 43687 has two products.

SELECT   H.SalesOrderID,   H.Status, H.PurchaseOrderNumber, H.ShipDate, P.ProductID  

FROM   [Sales].[SalesOrderHeader] H
INNER JOIN [Sales].SalesOrderDetail   D ON H.SalesOrderID   = D.SalesOrderID
INNER JOIN [Production].[Product]   P ON D.ProductID   = P.ProductID
WHERE H.SalesOrderID IN (43663, 43687)
GO

Getting Started with JSON Support in SQL Server 2016 – Part 1
Query to order products
Now let’s add the FOR JSON AUTO clause to the end of the above query and run it. As you can see in the image below, this time SQL Server took the result of the query, formatted it as a JSON document and finally returned it back to client:

SELECT   H.SalesOrderID,   H.Status, H.PurchaseOrderNumber, H.ShipDate, P.ProductID 
FROM   [Sales].[SalesOrderHeader] H
INNER JOIN [Sales].SalesOrderDetail   D ON H.SalesOrderID   = D.SalesOrderID
INNER JOIN [Production].[Product]   P ON D.ProductID   = P.ProductID
WHERE H.SalesOrderID IN (43663, 43687)
FOR JSON AUTO 
GO
Getting Started with JSON Support in SQL Server 2016 – Part 1
Query formatted as a JSON document

SQL Server does not format the JSON output by including whitespaces like tabs or spaces for formatting for the obvious reasons, though you can do it manually or use a local or online tool for formatting it for better readability.

When you format the JSON data from the above query, you will notice, with the FOR JSON AUTO clause, SQL Server automatically formatted the JSON output based on the query structure (order of the columns in the SELECT list and their source table).

In this case, since SalesOrderHeader is the first table and SalesOrderDetail is the second table, columns from the SalesOrderHeader are generated as properties of the parent object whereas columns from the SalesOrderDetail are generated as properties of the nested object:

Getting Started with JSON Support in SQL Server 2016 – Part 1
SQL Server automatically formatted the JSON output
The FOR JSON AUTO clause should be sufficient in most scenarios but there might be some other specific scenario where you would like to have control in how JSON data is generated or nested. The FOR JSON PATH clause gives you full control to specify the output format of the JSON data; it lets you create wrapper objects and nest complex properties.

If your query contains two or more tables, it returns a flat result by default where each column in the result becomes the property of the JSON object.

You can further use dot-separated column names for nested results; in the case below I have use Order as object name and this is what you will see as result of it:

SELECT   H.SalesOrderID AS   'Order.OrderID',   H.Status AS 'Order.Status', 
H.PurchaseOrderNumber AS   'Order.PONumber',   H.ShipDate AS   'Order.ShipDate',   
P.ProductID AS 'Order.ProductDI'
FROM   [Sales].[SalesOrderHeader] H
INNER JOIN [Sales].SalesOrderDetail   D ON H.SalesOrderID   = D.SalesOrderID
INNER JOIN [Production].[Product]   P ON D.ProductID   = P.ProductID
WHERE H.SalesOrderID IN (43663, 43687)
FOR JSON PATH

Getting Started with JSON Support in SQL Server 2016 – Part 1
Order as object name results
You can nest the result by using dot-separated column names with a different suffix or object name (by specifying column aliases that define the structure of your JSON data). For example, in this case you have a top level object, which has two Order and Product objects nested inside. With each of the objects we have specified some properties to be encapsulated within.

SELECT   H.SalesOrderID AS   'Order.OrderID',   H.Status AS 'Order.Status', 
H.PurchaseOrderNumber AS   'Order.PONumber',   H.ShipDate AS   'Product.ShipDate',   
P.ProductID AS 'Product.ProductID'
FROM   [Sales].[SalesOrderHeader] H
INNER JOIN [Sales].SalesOrderDetail   D ON H.SalesOrderID   = D.SalesOrderID
INNER JOIN [Production].[Product]   P ON D.ProductID   = P.ProductID
WHERE H.SalesOrderID IN (43663, 43687)
FOR JSON PATH
GO

Getting Started with JSON Support in SQL Server 2016 – Part 1
Result of using dot-separated column names
By default, JSON output does not include a root element and hence you can use a ROOT keyword with either FOR JSON AUTO or FOR JSON PATH clause to include a single, top-level element to the JSON output. For example, with the below query I am adding a top-level Orders element and the result will be nested inside as an array, as you can see in the image below.

You might also notice that, unlike in previous cases, this time the outer element has been changed from an array to an object which contains Orders as a property, this property further contains array of objects (Order and Product):

SELECT   H.SalesOrderID AS   'Order.OrderID',   H.Status AS 'Order.Status', 
H.PurchaseOrderNumber AS   'Order.PONumber',   H.ShipDate AS   'Product.ShipDate',   
P.ProductID AS 'Product.ProductID'
FROM   [Sales].[SalesOrderHeader] H
INNER JOIN [Sales].SalesOrderDetail   D ON H.SalesOrderID   = D.SalesOrderID
INNER JOIN [Production].[Product]   P ON D.ProductID   = P.ProductID
WHERE H.SalesOrderID IN (43663, 43687)
FOR JSON PATH, ROOT ('Orders')
GO

Getting Started with JSON Support in SQL Server 2016 – Part 1
Result nested inside an array
Like ROOT, you can also use the INCLUDE_NULL_VALUES keyword, with either the FOR JSON AUTO or FOR JSON PATH clause, to include null values (as JSON properties) in the JSON output as they are not included in the output by default.