Vinay Baliyan

Sunday, March 21, 2010

How to remove duplicate rows from a table with using primary key or identity column

The following sql query shows how to remove duplicate values with using primary key field, identity column or any temporary table.

IF OBJECT_ID('dups') IS NOT NULL
DROP TABLE Dups

CREATE TABLE Dups(ID int)
Go

INSERT INTO Dups(ID) VALUES(1)
INSERT INTO Dups(ID) VALUES(2)
INSERT INTO Dups(ID) VALUES(2)
INSERT INTO Dups(ID) VALUES(3)
INSERT INTO Dups(ID) VALUES(4)
INSERT INTO Dups(ID) VALUES(4)
INSERT INTO Dups(ID) VALUES(4)
INSERT INTO Dups(ID) VALUES(5)
INSERT INTO Dups(ID) VALUES(5)
GO

SELECT * FROM Dups
GO

WITH CTE AS
(
select row_number() over (order by id) AS RowNumber,ID from Dups
)
DELETE FROM CTE
WHERE RowNumber IN (
SELECT C1.RowNumber
FROM CTE C1
JOIN CTE C2 ON C1.ID = C2.ID and C1.RowNumber > C2.RowNumber
)
GO

SELECT * FROM Dups
GO

Tuesday, March 9, 2010

Interesting Join in SQL Server 2005

The following shows that some interesting joins:

IF OBJECT_Id('tempdb..#Table1') IS NOT NULL
DROP TABLE #Table1

IF OBJECT_Id('tempdb..#Table2') IS NOT NULL
DROP TABLE #Table2

CREATE TABLE #Table1(ID int)

CREATE TABLE #Table2(ID int)

INSERT INTO #Table1 VALUES(1)
INSERT INTO #Table1 VALUES(1)
INSERT INTO #Table1 VALUES(1)
INSERT INTO #Table1 VALUES(1)
INSERT INTO #Table1 VALUES(1)

INSERT INTO #Table2 VALUES(1)
INSERT INTO #Table2 VALUES(1)
INSERT INTO #Table2 VALUES(1)
INSERT INTO #Table2 VALUES(1)

SELECT T1.*
FROM #Table1 T1
LEFT JOIN #Table2 T2 ON T1.ID = T2.ID

The output is 20 rows regardless of any type join you use here.

Write sql query to find the employees who are making more than 80% of average salary

There is a table as below

Emp (EmpID, Salary)

Write query to find the employees who are making more than 80% of average salary. Discuss the query plan for written query.

Solution:

IF OBJECT_ID('tempdb..#Employee') IS NOT NULL
DROP TABLE #Employee

CREATE TABLE #Employee (EmpID int , Salary int)

INSERT INTO #Employee VALUES(1,3000)
INSERT INTO #Employee VALUES(2,4500)
INSERT INTO #Employee VALUES(3,5000)
INSERT INTO #Employee VALUES(4,6500)
INSERT INTO #Employee VALUES(5,4500)
INSERT INTO #Employee VALUES(6,2000)
INSERT INTO #Employee VALUES(7,1000)

SELECT *
FROM #Employee
WHERE Salary > (SELECT AVG(Salary)*.8 as [80%Salary] FROM #Employee)

This is the estimated execution plan:

|--Nested Loops(Inner Join, WHERE:([tempdb].[dbo].[#Employee].[Salary]>CONVERT_IMPLICIT(numeric(10,0),[Expr1008],0)*(0.8)))
|--Compute Scalar(DEFINE:([Expr1008]=CASE WHEN [Expr1015]=(0) THEN NULL ELSE [Expr1016]/CONVERT_IMPLICIT(int,[Expr1015],0) END))
| |--Stream Aggregate(DEFINE:([Expr1015]=COUNT_BIG([tempdb].[dbo].[#Employee].[Salary]), [Expr1016]=SUM([tempdb].[dbo].[#Employee].[Salary])))
| |--Table Scan(OBJECT:([tempdb].[dbo].[#Employee]))
|--Table Scan(OBJECT:([tempdb].[dbo].[#Employee]))

Monday, March 8, 2010

Write the query to find values where Gaps are starting

/*
There is a table like below

#Numbers (SeqNumber INT)

The data in the column is like this. 1, 2, 4, 5, 10, 11, 15, 1 billion

Write the query to find values where Gaps are starting. Like 2, 5, 11…
*/

IF OBJECT_ID('tempdb..#Numbers') IS NOT NULL
DROP TABLE #Numbers

CREATE TABLE #Numbers (SeqNumber INTEGER NOT NULL PRIMARY KEY);

INSERT INTO #Numbers VALUES (0); --sentinel
INSERT INTO #Numbers VALUES (2);
INSERT INTO #Numbers VALUES (3);
INSERT INTO #Numbers VALUES (5);
INSERT INTO #Numbers VALUES (7);
INSERT INTO #Numbers VALUES (8);
INSERT INTO #Numbers VALUES (14);
INSERT INTO #Numbers VALUES (20);

SELECT s1.SeqNumber+1
FROM #Numbers s1
LEFT JOIN #Numbers s2
ON s1.SeqNumber = s2.SeqNumber -1
WHERE s2.SeqNumber IS NULL

output:

1
4
6
9
16
21

SELECT s1.SeqNumber+1 Start,MIN(s3.SeqNumber)-1 Finish
FROM #Numbers s1
LEFT JOIN #Numbers s2 ON s1.SeqNumber = s2.SeqNumber -1
LEFT JOIN #Numbers s3 ON s1.SeqNumber < s3.SeqNumber
WHERE s2.SeqNumber IS NULL
GROUP BY s1.SeqNumber

Start Finish
1 1
4 4
6 6
9 13
15 19
21 NULL

How to display table records as comma separated values in sql server 2005

IF OBJECT_ID(N'tempdb..#Employee') IS NOT NULL
DROP TABLE #Employee

IF OBJECT_ID(N'tempdb..#Dept') IS NOT NULL
DROP TABLE #Dept

CREATE TABLE #Employee (EmpID int, DeptID int, FName varchar(50), LName varchar(50))

CREATE TABLE #Dept(DeptID int,DeptName varchar(100))

INSERT INTO #Employee VALUES(1,1,'Vinay','Kumar')
INSERT INTO #Employee VALUES(2,1,'Piyush','Mundra')
INSERT INTO #Employee VALUES(3,2,'Srinivasan','Nagarajan')
INSERT INTO #Employee VALUES(4,2,'Bharat','Prasad')
INSERT INTO #Employee VALUES(5,3,'Varun','Rokkam')

INSERT INTO #Dept VALUES(1,'Finance')
INSERT INTO #Dept VALUES(2,'IT')
INSERT INTO #Dept VALUES(3,'Sales')

SELECT D.DeptName,
(SELECT FName + ',' FROM #Employee E WHERE E.DeptID = D.DeptId FOR XML PATH(''))
FROM #Dept D

Thursday, June 4, 2009

Generate SQL Index scripts using C#

Problem Statement:

SQL Server 2005 management studio does not provide any method to generate Index scripts all at once. A simple C# program can help in this case:

Sample Console Program:

Make sure following dlls are referenced in your program:

using System;
using System.Collections.Generic;
using Microsoft.SqlServer.Management.Common;
using Microsoft.SqlServer.Management.Smo;
using System.Text;
using System.Collections.Specialized;

namespace DBScripts
{
class Program
{
static void Main(string[] args)
{
string servername;
string databasename;
servername = "localhost";
databasename = "Northwind";
Server server = new Server(servername);
Database db = server.Databases[databasename];
ScriptingOptions so = new ScriptingOptions();
so.ScriptDrops = false;
so.NonClusteredIndexes = true;
so.IncludeIfNotExists = true;
so.DriForeignKeys = false;
so.FileName = DateTime.Now.ToString().Replace("/"," ").Replace(":"," ") + "indexes.sql";
so.AppendToFile = true;
foreach (Table t in db.Tables)
{
foreach (Index i in t.Indexes)
i.Script(so);
Console.WriteLine(t.Name);
}
}
}
}

Monday, October 13, 2008

LinQ to SQL - Compiled Queries

Descrption:

This article demonstrates how to use compiled queries with LinQ. To eliminate the overhead of building query pipeline, compiled queries should be used.

e.g. This is the query to find all customers and employees living in "USA".

db.Customers
  .Select(c => new Person { PersonName = c.ContactName,CountryName= c.Country})
  .Union(db.Employees
  .Select(c => new Person { PersonName = c.FirstName, CountryName = c.Country}))
  .Where(c => c.CountryName == "USA"));

If this query is used frequently, we can avoid creation of query pipeline everytime. To compile the query, we can use CompiledQuery.Compile method. This method takes an argument list as input and a result type. It returns a delegate as a variable that we can use later. So here, we can pass the NorthwindDataContext instance and the country variable in. We will return a IQueryable(of Person) object. So the definition of our compiled query as follows:


public static Func<NorthwindDataContext,string,IQueryable<Person>>
            GetCustomernEmployees =
            CompiledQuery.Compile((NorthwindDataContext db,string country) =>
            db.Customers
                    .Select(c => new Person { PersonName = c.ContactName,CountryName= c.Country})
                    .Union(db.Employees
                    .Select(c => new Person { PersonName = c.FirstName, CountryName = c.Country}))
                    .Where(c => c.CountryName == "USA"));