This shows you the differences between two versions of the page.
|
does_pcat_identify_duplicates [2010/02/24 12:52] stu |
does_pcat_identify_duplicates [2010/06/10 14:01] (current) stu |
||
|---|---|---|---|
| Line 1: | Line 1: | ||
| - | ===== Duplicate Detection ===== | + | [[Home Page|Home]] | [[create_a_new_account|Accounts]] | [[credentials|Credentials]] | [[form_a_peer_network|Peers]] | [[create_a_new_project|Projects]] | [[upload_fdms_public_comment_data|Upload]] | [[de-duplicate|De-duplicate]] | [[cluster|Cluster]] | [[View all archives|View]] | [[browse_an_archive|Browse]] | [[search_for_key_concepts|Search]] | [[create_buckets|Buckets]] | [[create_datasets|Datasets]] | [[assign_coders|Assign]] | [[notifications|Notifications]] | [[toolbox_functions|Toolbox]] | [[code_data|Code]] | [[work_with_memos|Bookmarks]] | [[validate_the_work_of_coders|Validate]] | [[generate_summary_reports|Report]] | [[faq|FAQ]] | [[Contact QDAP|Contact]]\\ |
| - | [[Home Page|Home]] | [[create_a_new_account|Accounts]] | [[credentials|Credentials]] | [[form_a_peer_network|Peers]] | [[upload_fdms_public_comment_data|Upload]] | [[View all archives|View]] | [[browse_an_archive|Browse]] | [[search_for_key_concepts|Search]] | [[create_buckets|Buckets]] | [[create_datasets|Datasets]] | [[assign_coders|Assign]] | [[toolbox_functions|Toolbox]] | [[code_data|Code]] | [[work_with_memos|Bookmarks]] | [[validate_the_work_of_coders|Validate]] | [[generate_summary_reports|Report]] | [[faq|FAQ]] | [[Contact QDAP|Contact]]\\ | + | ===== Duplicate detection and clustering ===== |
| - | ==== Does PCAT identify duplicates? ==== | + | It sure does. If you currently have an FDMS bulk download, or a large collection of emails in Lotus Notes or Microsoft Outlook, and you suspect it contains mass email campaign duplicates, you can now run the de-duplication and clustering algorithms inside PCAT. Here is an example:\\ |
| - | It sure does, though we have not yet plugged that advanced "Sifter" feature into the public user interface; we will very soon. We have completed and implemented a functional system for analysts in the US Fish & Wildlife Service that displays only comments that are not identical to other comments. If you currently have an FDMS bulk download, or a large collection of emails in Lotus Notes or Microsoft Outlook, and you suspect it contains mass email campaign duplicates, we can run the de-duplication algorithm for you at the QDAP laboratory before you upload your data to PCAT. Users of PCAT can ask QDAP technicians to relax the threshold for identifying duplicates and sample "clusters" of near-duplicate comments generated by form letter and talking point campaigns. This step makes it easier to document the dimensions of the central "talking points" while also focusing attention on unique or otherwise unexpected contributions. | + | \\ |
| + | {{:dedupe1.jpg|}}\\ | ||
| + | \\ | ||
| + | Users of PCAT can relax the threshold for identifying duplicates and sample "clusters" of near-duplicate comments generated by form letter and talking point campaigns. This step makes it easier to document the dimensions of the central "talking points" while also focusing attention on unique or otherwise unexpected contributions.\\ | ||
| + | \\ | ||
| + | {{:cluster1.jpg|}}\\ | ||
| + | \\ | ||
| + | {{:cluster22.jpg|}} | ||
| ==== Most Frequently Asked Questions ==== | ==== Most Frequently Asked Questions ==== | ||
| - | [[Why would I use this system?]] | [[Where do I get FDMS bulk downloads?]] | [[Does PCAT identify duplicates?]] | [[What is QDAP?]]\\ | + | [[Why would I use this system?]] | [[Where do I get FDMS bulk downloads?]] | **[[Does PCAT identify duplicates?]]** | [[What is QDAP?]]\\ |
| \\ | \\ | ||
| - | [[http://www.pcat.qdap.net/|{{:pcatneweyeslongblue241x45.png|}}]] [[http://pcat.qdap.net|pcat.qdap.net]]\\ | + | © 2009-2010 - [[http://www.qdap.pitt.edu/|Qualitative Data Analysis Program]] (QDAP), in the [[http://www.ucsur.pitt.edu/index.php|University Center for Social and Urban Research]], at the [[http://www.pitt.edu|University of Pittsburgh]], and |
| - | \\ | + | |
| - | © 2009 - [[http://www.qdap.pitt.edu/|Qualitative Data Analysis Program]] (QDAP), in the [[http://www.ucsur.pitt.edu/index.php|University Center for Social and Urban Research]], at the [[http://www.pitt.edu|University of Pittsburgh]], and | + | |
| - | [[http://www.umass.edu/qdap/|QDAP-UMass]], in the [[http://www.umass.edu/sbs/|College of Social and Behavioral Sciences]], at the [[http://umass.edu/|University of Massachusetts Amherst]]. | + | [[http://www.umass.edu/qdap/|QDAP-UMass]], in the [[http://www.umass.edu/sbs/|College of Social and Behavioral Sciences]], at the [[http://umass.edu/|University of Massachusetts Amherst]]. |