How To

Removing Duplicate Data

Data Sync can remove duplicate items from a SharePoint List with a little configuration.

Imagine you have a DataSet like this, where ProductID has been duplicated, i.e. ProductID=1 appears more than once.

Data Set

ProductID is our key and is supposed to be unique. Because of this error, however, Data Sync flags the row as a duplicate. This appears in Data Sync like this.


To remove the invalid row, we create a new project where the source and target point to the same SharePoint List. This project should include only the ID column, since we know it will always be unique and therefore will not be automatically excluded from the Data Sync results.

Schema Mapping

Next, we use the Data Sync function ISDUPLICATE to return only those rows that are not duplicates. The duplicate rows are then missing from the source side, so the comparison produces a delete action for each of them.

In Dynamic Columns, return the inverse of ISDUPLICATE for the column we're testing, i.e. !ISDUPLICATE(ProductID), from the BeginRow() method.

partial class DataSourceRowOverride : Simego.DataSync.DynamicColumns.DataSourceRowInternal
{
    public override bool BeginRow()
    {
        return !ISDUPLICATE(ProductID); // false skips the row, so duplicates drop out of the source
    }
}
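
To see why this filter leads to delete actions, the idea can be sketched outside Data Sync in plain C#. This is only an illustration and not how Data Sync is implemented: it assumes ISDUPLICATE returns false the first time a value is seen and true on later occurrences, and the Row class, DuplicateSketch class, and both helper methods are hypothetical names invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateSketch
{
    // Hypothetical row shape: ID is the list item ID, ProductID is the business key.
    public sealed class Row
    {
        public int ID { get; }
        public int ProductID { get; }
        public Row(int id, int productId) { ID = id; ProductID = productId; }
    }

    // Stand-in for the BeginRow() filter (assumed behaviour): keep a row only
    // the first time its ProductID is seen. HashSet<T>.Add returns false when
    // the value is already present.
    public static List<Row> KeepFirstOccurrences(IEnumerable<Row> rows)
    {
        var seen = new HashSet<int>();
        return rows.Where(r => seen.Add(r.ProductID)).ToList();
    }

    // Source and target are the same list; any ID that exists in the target
    // but was filtered out of the source shows up as a delete.
    public static List<int> IdsToDelete(IEnumerable<Row> list)
    {
        var target = list.ToList();
        var sourceIds = new HashSet<int>(KeepFirstOccurrences(target).Select(r => r.ID));
        return target.Select(r => r.ID).Where(id => !sourceIds.Contains(id)).ToList();
    }

    public static void Main()
    {
        var list = new[]
        {
            new Row(1, 1),
            new Row(2, 2),
            new Row(3, 1), // ProductID=1 appears a second time
            new Row(4, 3),
        };

        Console.WriteLine(string.Join(", ", IdsToDelete(list))); // prints "3"
    }
}
```

Under that assumption the first row with each ProductID is kept and only the later copies are deleted, so it is worth checking the compare results before synchronising.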

Now when we run Compare A->B, the duplicate items are listed as deletes. We then just need to Synchronise to apply the changes.

Compare Results