Home | Accounts | Credentials | Peers | Projects | Upload | De-duplicate | Cluster | View | Browse | Search | Buckets | Datasets | Assign | Notifications | Toolbox | Code | Bookmarks | Validate | Report | FAQ | Contact

PCAT can de-duplicate your public comment archive

PCAT contains a “one-click” method for removing duplicate public comments from an archive. Once you have uploaded an archive with suspected duplicates, click the arrow at the end of the archive's row to expose the “Archive Menu”, then select “Generate Exact Duplicates”.



A prompt will ask you to confirm that you want to run the de-duplication algorithm.



Click OK. PCAT will issue a notification when the duplicate clustering is complete. A new sub-archive will appear with the title “Non-exact Duplicates” and the document count for the sub-archive in parentheses.
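
For readers curious about what exact-duplicate clustering involves in general, here is a minimal sketch in Python. It is not PCAT's actual algorithm, which this page does not describe. The function name group_exact_duplicates, the comment IDs, and the choice to hash whitespace-trimmed text with SHA-256 are all assumptions made for illustration.

```python
import hashlib
from collections import defaultdict


def group_exact_duplicates(comments):
    """Return lists of comment IDs whose text is identical.

    `comments` maps a comment ID to its text. Hypothetical helper for
    illustration only; PCAT's real implementation may differ.
    """
    groups = defaultdict(list)
    for comment_id, text in comments.items():
        # Assumption: leading/trailing whitespace is ignored; everything
        # else must match character for character.
        digest = hashlib.sha256(text.strip().encode("utf-8")).hexdigest()
        groups[digest].append(comment_id)
    # Keep only the clusters that actually contain duplicates.
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up sample archive: c1 and c2 carry the same text.
archive = {
    "c1": "Please protect the wolves.",
    "c2": "Please protect the wolves.",
    "c3": "I oppose this rule.",
}
clusters = group_exact_duplicates(archive)
```

In this sketch, comments that differ by even one character inside the text land in different clusters, which is what distinguishes exact de-duplication from near-duplicate clustering.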




© 2009 - 2010 Qualitative Data Analysis Program (QDAP), in the University Center for Social and Urban Research, at the University of Pittsburgh, and QDAP-UMass, in the College of Social and Behavioral Sciences, at the University of Massachusetts Amherst. As of 2010, PCAT and this PCAT Help Wiki are maintained and improved by personnel from Texifter, LLC, which is a software start-up located in North Amherst & Springfield, MA and online at http://texifter.com.

Content on this website was made possible with the following grants from the National Science Foundation: III-0705566 “Collaborative Research III-COR: From a Pile of Documents to a Collection of Information: A Framework for Multi-Dimensional Text Analysis” and IIS-0429293 “Collaborative Research: Language Processing Technology for Electronic Rulemaking.” We are also grateful for financial support from the U.S. Environmental Protection Agency and the U.S. Fish & Wildlife Service. Any opinions, findings and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect those of the National Science Foundation.




 
C:/PCAT/new-pcat-help/data/data/pages/de-duplicate.txt · Last modified: 2010/07/04 08:08 by stu
 
Except where otherwise noted, content on this wiki is licensed under the following license: CC Attribution-Noncommercial-Share Alike 3.0 Unported