Eustat

Seminar

FUSIÓN DE REGISTROS / ERREGISTROEN BATERATZEA / RECORD LINKAGE

PDF (0.6 MB)

CONTENTS

  1. Introduction
  2. Definition
  3. Terminology
  4. Uses
  5. Capture-Recapture
  6. Record Linkage Basics
  7. Context
  8. Deterministic Record Linkage
  9. Probabilistic Record Linkage
  10. Not Statistical Matching
  11. Need for Automated Record Linkage
  12. Record Linkage Theory: Fellegi & Sunter
  13. Basic Definitions and Notation
  14. Agreement Patterns
  15. Example Comparison Space
  16. Conditional Probabilities
  17. Linkage Rule
  18. Error Rates
  19. Clerical Region
  20. Fundamental Theorem
  21. Weight Distribution for Matches
  22. Weight Distribution for Non-Matches
  23. Idealized Distributions
  24. Error Rates, Clerical Review Region
  25. Conditional Independence Assumption
  26. Conditional Independence Example
  27. Fellegi-Sunter Summary
  28. Record Linkage Methodology
  29. Choosing Parameters
  30. Informal Methods
  31. EM Algorithm
  32. Likelihood Function
  33. Complete-data Likelihood Function
  34. Expectation Step
  35. Maximization Step
  36. EM Algorithm
  37. EM Algorithm Remarks
  38. Blocking
  39. Blocking Criteria
  40. Record Linkage Refinements
  41. String Comparator
  42. String Comparator Context
  43. Some String Comparator Types
  44. Bigrams
  45. Jaro-Winkler Comparator
  46. Jaro-Winkler Example
  47. Jaro-Winkler Variations
  48. Similar Characters
  49. Common Prefix
  50. Long String Adjustment
  51. Jaro-Winkler Comparator
  52. Edit Distance String Comparators
  53. Edit Distance Algorithm
  54. Edit Distance Similarity Function
  55. Edit Distance Example
  56. Longest Common Subsequence
  57. LCS Similarity Function
  58. Combination Similarity Function
  59. Evaluating String Comparators
  60. Results of String Comparator Evaluation
  61. Jaro-Winkler Anomaly
  62. Hybrid Comparator
  63. String Comparator Summary
  64. More Than Two Latent Classes
  65. EM for Three Classes
  66. More Than Two Comparison Values
  67. One-to-one Matching
  68. Linear Assignment Algorithm
  69. Error Rates
  70. Practical Considerations
  71. False Non-Match Rate
  72. False Match Rate
  73. Belin-Rubin
  74. Larsen
  75. Improved Parameter Estimates
  76. Extended Likelihood Function
  77. Larsen, Rubin
  78. Winkler
  79. Data Preparation
  80. Basic Preparation
  81. Address Parsing
  82. Business Lists
  83. Example of Business Name Parsing
  84. Two Kinds of Standardizer
  85. Rule-Based Standardizer
  86. Hidden Markov Standardizer
  87. Hidden Markov Standardizer Reference
  88. Hidden Markov Model
  89. Viterbi Algorithm
  90. HMM Diagram
  91. Standardization Summary
  92. U.S. Census Bureau Software
  93. Matching Programs: Matcher
  94. Matching Programs: BigMatch
  95. Auxiliary Programs: Counter
  96. Auxiliary Programs: EM
  97. Auxiliary Programs: Standardizer