How to detect and remove duplicate rows from a CSV file

SplitCSV.com is the fastest and easiest way to remove duplicates from a large CSV file

SplitCSV.com is not only the easiest way to split a large CSV file; it is also the fastest and easiest way to detect and remove duplicate rows from your CSV file. If you have a large CSV file that is running slowly in Microsoft Excel or Google Sheets, you can use SplitCSV.com to break it apart into smaller files and remove duplicate rows in the process. This lets you make sure every row of your CSV data is unique before you use it for analysis, load it into a database, or do anything else with it. Here's how it works:

  1. First, head over to https://www.splitcsv.com.
  2. Next, select "Choose File" and pick your CSV file (ending in .csv). The upload will begin right away.
  3. Tell us if your file has headers and, if so, how many header rows should be copied into each split file.
  4. Choose how to split the file. You can limit the number of rows in each output file (select the Rows tab and enter the maximum number of rows per file), limit the size of each output file (select the Size tab and enter the size in bytes), or specify the exact number of files to produce (select the Files tab and enter the number of files to output). The image below shows an example of limiting by output file size. (Image: Choose how to split your file)
  5. Select the "Detect Duplicates" toggle on the left if you'd like to only detect duplicates. Select "Remove Duplicates" on the right if you'd like your output files to be duplicate-free.
  6. Press the Advanced Options button to move to the next step, then press the Confirm button to verify your choices. Press the Split button to pay, queue the split, and view the receipt.
  7. The split will be queued and should run shortly. The receipt page will refresh to include a link to the split results. All results will be zipped up for download and available as Excel files.
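
If you are curious what steps 3 to 5 amount to, here is a minimal Python sketch of splitting by row count while detecting or removing duplicates. It is an illustration only, not SplitCSV.com's actual implementation. The function names (dedupe_rows, split_csv_text) and parameters are hypothetical, and it assumes that a "duplicate" means an exact match on every cell, that the first occurrence is the one kept, and that the whole file fits in memory.

```python
import csv
import io


def dedupe_rows(rows):
    """Return (unique_rows, duplicate_rows), keeping the first occurrence of each row.

    Assumption: two rows are duplicates only if every cell matches exactly.
    """
    seen = set()
    unique, duplicates = [], []
    for row in rows:
        key = tuple(row)  # lists are not hashable, so use a tuple of the cells
        if key in seen:
            duplicates.append(row)
        else:
            seen.add(key)
            unique.append(row)
    return unique, duplicates


def split_csv_text(text, header_rows=1, max_rows=2, remove_duplicates=True):
    """Split CSV text into chunks of at most max_rows data rows each.

    The first header_rows rows are copied to the top of every chunk.
    Duplicates are always reported; they are dropped from the output
    only when remove_duplicates is True.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[:header_rows], rows[header_rows:]
    unique, duplicates = dedupe_rows(data)
    if remove_duplicates:
        data = unique
    chunks = [header + data[i:i + max_rows] for i in range(0, len(data), max_rows)]
    return chunks, duplicates


sample = "id,name\n1,a\n2,b\n1,a\n3,c\n"
chunks, duplicates = split_csv_text(sample, header_rows=1, max_rows=2)
for number, chunk in enumerate(chunks, start=1):
    print(f"file {number}: {chunk}")
print(f"duplicates found: {duplicates}")
```

Calling split_csv_text with remove_duplicates=False corresponds to the detect-only choice: the duplicates are still reported, but the output chunks keep every row.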

We automatically remove any row that can't be opened successfully in Excel. Typically this happens because a cell contains more than 32,767 characters, which is Excel's limit for the contents of a single cell. A separate excelerrors.csv file is added to the results zip file, and it includes a copy of every row that was removed.
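
The check described above can be sketched in a few lines of Python. This is a hedged illustration, not the service's own code: the function name partition_excel_safe is hypothetical, and the sketch assumes that the only reason a row is rejected is a cell longer than 32,767 characters.

```python
EXCEL_MAX_CELL_CHARS = 32767  # Excel's limit for the contents of a single cell


def partition_excel_safe(rows, limit=EXCEL_MAX_CELL_CHARS):
    """Split rows into (kept_rows, error_rows).

    A row goes to error_rows if any of its cells is longer than limit;
    those are the rows that would be written to a separate errors file.
    """
    kept, errors = [], []
    for row in rows:
        if any(len(cell) > limit for cell in row):
            errors.append(row)
        else:
            kept.append(row)
    return kept, errors


rows = [
    ["1", "short value"],
    ["2", "x" * 32767],  # exactly at the limit, so it is kept
    ["3", "y" * 32768],  # one character over, so it is set aside
]
kept, errors = partition_excel_safe(rows)
print(len(kept), "rows kept;", len(errors), "row(s) set aside")
```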

All uploaded files are kept for at most 7 days before being deleted; depending on volume, they may be deleted earlier. At no time is the original source file downloadable, and the link to download the results is available only on the receipt page, nowhere else.

Happy splitting!
