<?xml version="1.0" encoding="utf-8" ?>
<?xml-stylesheet type="text/xsl" href="RSS_xslt_style.asp" version="1.0" ?>
<rss version="2.0" xmlns:WebWizForums="http://syndication.webwiz.co.uk/rss_namespace/">
 <channel>
  <title>DevForce Community Forum : validating duplicate value</title>
  <link>http://www.ideablade.com/forum/</link>
  <description>This is an XML content feed of: DevForce Community Forum : DevForce Classic : validating duplicate value</description>
  <pubDate>Thu, 11 Jun 2026 04:48:00 -0700</pubDate>
  <lastBuildDate>Mon, 14 Apr 2008 23:09:29 -0700</lastBuildDate>
  <docs>http://blogs.law.harvard.edu/tech/rss</docs>
  <generator>Web Wiz Forums 9.69</generator>
  <ttl>360</ttl>
  <WebWizForums:feedURL>www.ideablade.com/forum/RSS_post_feed.asp?TID=760</WebWizForums:feedURL>
  <image>
   <title>DevForce Community Forum</title>
   <url>http://www.ideablade.com/forum/forum_images/IdeaBlade_logo_tm.png</url>
   <link>http://www.ideablade.com/forum/</link>
  </image>
  <item>
   <title>validating duplicate value : The issue of finding duplicate...</title>
   <link>http://www.ideablade.com/forum/forum_posts.asp?TID=760&amp;PID=2828#2828</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.ideablade.com/forum/member_profile.asp?PF=103" rel="nofollow">WildHog</a><br /><strong>Subject:</strong> 760<br /><strong>Posted:</strong> 14-Apr-2008 at 11:09pm<br /><br />Finding duplicate records, record linkage,&nbsp;and updating dependent tables are complex, critical subjects for an enterprise database.&nbsp; I do not claim to understand them fully, but some issues that I have come across and can recall are: <DIV>1. You cannot rely on one column of data, whether that is Last Name, Social Security Number, or even the address fields.&nbsp; For example, Smith might be entered as Symith.&nbsp; Can you rely on a Soundex or Metaphone stored procedure?&nbsp; How about First and Last Name?&nbsp; William Smith and John Smith are entered, pass the validation rules, and are persisted to the database as different persons when in fact they are the same.&nbsp; Add address fields to&nbsp;the match algorithm: does an old address produce a different person compared to the newly entered address for a person of the same name?&nbsp; Add SSN: the user is off by one digit.&nbsp; Certainly you can use a regular expression to accept only SSNs that are valid, i.e. that 
do not break the rules of the Social Security Administration, which has more rules than just requiring a 9-digit number.&nbsp;&nbsp; But SSNs do not have a check digit, so a mistyped SSN that breaks no rules cannot be excluded.&nbsp;Mother's maiden name and other fields may be tie breakers.&nbsp; Nevertheless, a robust system should have a validation algorithm to prevent duplicate person or other entries, one that returns a probability of possible duplication and raises appropriate alerts to the end user.&nbsp; The Census Bureau&nbsp; and others use such a matching algorithm.&nbsp; Anyway, a robust master person index relying on probabilistic algorithms is the way to go.&nbsp;</DIV><DIV>See these open source probabilistic algorithms.&nbsp; Perhaps IdeaBlade can reverse engineer one of them into managed code as an add-in module.</DIV><DIV><U><FONT color=#ff6600><a href="http://datamining.anu.edu.au/projects/linkage.html#project_description" target="_blank">http://datamining.anu.edu.au/projects/linkage.html#project_description</A></FONT></U></DIV><DIV><a href="http://en.wikipedia.org/wiki/Record_linkage" target="_blank">http://en.wikipedia.org/wiki/Record_linkage</A></DIV><DIV><a href="http://www.cdc.gov/cancer/npcr/tools/registryplus/lp_tech_info.htm" target="_blank">http://www.cdc.gov/cancer/npcr/tools/registryplus/lp_tech_info.htm</A></DIV><DIV><a href="http://sourceforge.net/projects/simmetrics/" target="_blank">http://sourceforge.net/projects/simmetrics/</A>&nbsp;&nbsp;&nbsp; BTW, this open source project has both Java and .NET 2.0 versions and includes multiple algorithms.</DIV><DIV>&nbsp;</DIV><DIV>&nbsp;</DIV><DIV>2. 
You&nbsp;should have a method of discovering and cleaning your database of duplicate or inappropriately merged records&nbsp;subsequent to persistence.&nbsp;&nbsp;That means discovering duplicate and incorrectly merged&nbsp;records and acting upon them by merging, unmerging&nbsp;incorrect merges, deleting, etc.</DIV><DIV>&nbsp;</DIV><DIV>3.&nbsp;Off the subject: aside from prevention of duplicate records and database scrubbing, verifying that the data is correct is another set of issues.&nbsp; What I have gleaned from my reading is that Validation is a filter based upon business rules, whereas Verification is the process of confirming that the&nbsp;entered data is correct: that John Smith with SSN 999-99-9999 actually lives at 123 Hogs Hollow, or even that the address exists.&nbsp; This&nbsp;Verification may come from one or more third parties such as the US Postal Service, Melissa&nbsp;Data, Acxiom, even credit agencies, etc., &nbsp;via a web service.&nbsp; If the person&nbsp;has moved, you will need some form of Skip Tracing, usually provided by credit agencies.&nbsp; Once verified, the records should be updated with the correct information.&nbsp;&nbsp;</DIV><DIV>&nbsp;</DIV><DIV>I would like to hear others comment on their ideas about this problem.&nbsp;&nbsp;</DIV><DIV>&nbsp;</DIV><span style="font-size:10px"><br /><br />Edited by WildHog - 15-Apr-2008 at 6:19pm</span>]]>
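    <![CDATA[<DIV>To make the idea of a probability-like duplicate score concrete, here is a minimal sketch in Python. It is not taken from any of the linked libraries and it is not a true probabilistic record-linkage model: it simply combines a simplified Soundex comparison on last name, a difflib similarity on first name, and the share of matching SSN digits. The function names, the field names, and the weights (0.3, 0.2, 0.5) are all assumptions chosen for illustration.</DIV>
<PRE>
```python
from difflib import SequenceMatcher

# Letter-to-digit table for a simplified American Soundex.
SOUNDEX_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        SOUNDEX_CODES[_ch] = _digit


def soundex(name):
    """Return a 4-character Soundex key, or "" if the name has no letters."""
    letters = [c for c in name.upper() if c.isalpha()]
    if not letters:
        return ""
    key = letters[0]
    prev = SOUNDEX_CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = SOUNDEX_CODES.get(ch, "")
        if digit and digit != prev:
            key += digit
        if ch not in "HW":  # H and W do not separate equal codes
            prev = digit
    return (key + "000")[:4]


def text_similarity(a, b):
    """Case-insensitive similarity between 0.0 and 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def ssn_similarity(a, b):
    """Share of the 9 digit positions that agree (0.0 if either is malformed)."""
    da = [c for c in a if c.isdigit()]
    db = [c for c in b if c.isdigit()]
    if len(da) != 9 or len(db) != 9:
        return 0.0
    return sum(x == y for x, y in zip(da, db)) / 9


def match_score(p, q):
    """Weighted score; the weights are illustrative assumptions, not tuned."""
    if soundex(p["last"]) == soundex(q["last"]):
        last = 1.0
    else:
        last = text_similarity(p["last"], q["last"])
    first = text_similarity(p["first"], q["first"])
    ssn = ssn_similarity(p["ssn"], q["ssn"])
    return 0.3 * last + 0.2 * first + 0.5 * ssn


if __name__ == "__main__":
    existing = {"first": "John", "last": "Smith", "ssn": "123-45-6789"}
    # Misspelled last name and an SSN that is off by one digit.
    entered = {"first": "John", "last": "Symith", "ssn": "123-45-6788"}
    print(round(match_score(existing, entered), 3))
```
</PRE>
<DIV>Pairs that score close to 1 would be flagged to the end user as possible duplicates rather than rejected outright. Any threshold would need tuning against real data.</DIV>]]>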
   </description>
   <pubDate>Mon, 14 Apr 2008 23:09:29 -0700</pubDate>
   <guid isPermaLink="true">http://www.ideablade.com/forum/forum_posts.asp?TID=760&amp;PID=2828#2828</guid>
  </item> 
  <item>
   <title>validating duplicate value : If you want to know if &#8220;XYZ&#8221; is...</title>
   <link>http://www.ideablade.com/forum/forum_posts.asp?TID=760&amp;PID=2795#2795</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.ideablade.com/forum/member_profile.asp?PF=23" rel="nofollow">davidklitzke</a><br /><strong>Subject:</strong> 760<br /><strong>Posted:</strong> 09-Apr-2008 at 10:05am<br /><br /><P style="MARGIN: 0in 0in 0pt"><SPAN style="FONT-SIZE: 10pt; COLOR: navy; FONT-FAMILY: Arial">If you want to know if “XYZ” is a duplicate for the “Code” column, you’ll have to query the database to see if there are any Customer rows that have a value of “XYZ” for the “Code” column.&nbsp; If you find such a row, you have found a duplicate.</SPAN></P>]]>
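    <![CDATA[<DIV>As a rough illustration of the query-based check described above, the sketch below uses Python with an in-memory SQLite table as a stand-in for the real data source. The TblCustomer columns come from the original question. The helper names make_demo_db and code_is_duplicate, and the exclude_id parameter, are assumptions made for this example and are not part of DevForce.</DIV>
<PRE>
```python
import sqlite3


def make_demo_db():
    """Build an in-memory table shaped like TblCustomer from the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE TblCustomer ("
        "ID INTEGER PRIMARY KEY, Code TEXT UNIQUE, Name TEXT)"
    )
    conn.execute(
        "INSERT INTO TblCustomer (ID, Code, Name) VALUES (1, 'XYZ', 'Acme')"
    )
    return conn


def code_is_duplicate(conn, code, exclude_id=None):
    """Return True if a row other than exclude_id already uses this Code."""
    row = conn.execute(
        "SELECT 1 FROM TblCustomer "
        "WHERE Code = ? AND (? IS NULL OR ID != ?) LIMIT 1",
        (code, exclude_id, exclude_id),
    ).fetchone()
    return row is not None


if __name__ == "__main__":
    conn = make_demo_db()
    # A new customer trying to reuse an existing code.
    print(code_is_duplicate(conn, "XYZ"))
    # A new customer with an unused code.
    print(code_is_duplicate(conn, "ABC"))
    # Editing customer 1 while keeping its own code.
    print(code_is_duplicate(conn, "XYZ", exclude_id=1))
```
</PRE>
<DIV>The exclude_id argument lets a row that is being edited ignore its own current value. A check like this can still race with another user's save, so the Unique constraint on Code in the database remains the final guard.</DIV>]]>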
   </description>
   <pubDate>Wed, 09 Apr 2008 10:05:53 -0700</pubDate>
   <guid isPermaLink="true">http://www.ideablade.com/forum/forum_posts.asp?TID=760&amp;PID=2795#2795</guid>
  </item> 
  <item>
   <title>validating duplicate value : Hi;  TblCustomer ID (Primary) Code...</title>
   <link>http://www.ideablade.com/forum/forum_posts.asp?TID=760&amp;PID=2792#2792</link>
   <description>
    <![CDATA[<strong>Author:</strong> <a href="http://www.ideablade.com/forum/member_profile.asp?PF=45" rel="nofollow">erturkcevik</a><br /><strong>Subject:</strong> 760<br /><strong>Posted:</strong> 09-Apr-2008 at 8:28am<br /><br /><DIV><FONT color=#000080>Hi,</FONT></DIV><DIV><FONT color=#000080></FONT>&nbsp;</DIV><DIV><FONT color=#000080><U>TblCustomer</U></FONT></DIV><DIV><FONT color=#000080>ID (Primary)</FONT></DIV><DIV><FONT color=#000080>Code (Unique)</FONT></DIV><DIV><FONT color=#000080>Name</FONT></DIV><DIV><FONT color=#000080></FONT>&nbsp;</DIV><DIV><FONT color=#000080>How can I detect and validate a&nbsp;duplicate entry value&nbsp;when editing data?</FONT></DIV><DIV><FONT color=#000080>The "Code" field must be unique; its values cannot&nbsp;be duplicated.</FONT></DIV><DIV><FONT color=#000080></FONT>&nbsp;</DIV><DIV><FONT color=#000080></FONT>&nbsp;</DIV><DIV><FONT color=#000080>Best Regards</FONT></DIV>]]>
   </description>
   <pubDate>Wed, 09 Apr 2008 08:28:10 -0700</pubDate>
   <guid isPermaLink="true">http://www.ideablade.com/forum/forum_posts.asp?TID=760&amp;PID=2792#2792</guid>
  </item> 
 </channel>
</rss>