
CJKI's Database of Arabic Names (DAN) is a database of Arabic names and their romanized variants that currently contains 2.4 million entries. As part of its feature set, DAN also provides comprehensive coverage of the Arabic names and aliases found in the OFAC list.
OFAC stands for the Office of Foreign Assets Control, which publishes the Specially Designated Nationals (SDN) List, referred to here simply as the OFAC list. This list contains individuals and organizations that are considered a security risk and with whom United States citizens are prohibited from doing business. It is used worldwide in security and anti-money laundering (AML) applications, border control and watch list filtering.
The US government's watch lists have come under fire from members of Congress as being "crippled by technical flaws." One of the major factors behind these assertions is the inability to correctly identify and process the numerous variants of the names appearing in the watch lists. That is, there are numerous potential and actual variants of OFAC names which are not found in the OFAC list. This poses a real danger that criminals and suspicious individuals will go unidentified.
Highlights
To address the shortcomings of OFAC's SDN List, CJKI has exploited the linguistic and technical resources used to create the DAN database to develop a comprehensive "Expanded OFAC" database (referred to as XOFAC) of OFAC full name variants, the vast majority of which are not listed in OFAC.
Containing millions of potential and actual variants of the Arabic names in OFAC's SDN List, XOFAC is ideal for agencies and institutions that require maximum recall in their compliance and watch list filtering applications. The examples below show real cases where variants in XOFAC are not listed in OFAC.
Please note that there is no guarantee that the millions of entries in XOFAC actually refer to the individual in question, or that they even exist. That is, they consist of potential and actual combinations of orthographically valid variants, some of which are legitimate and some of which could be used by other persons of the same name, or not at all. However, if the goal is to achieve maximum recall, comprehensive variant coverage is of great benefit and has no negative effects except possibly for system resources.
It is also important to note that XOFAC is structured differently from DAN. While DAN is a database of name components (surnames, given names, name elements), XOFAC consists of variants of full names only; in other words, names of actual and potential individuals and their variants.
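As a rough illustration of how component-level variants multiply into full-name variants, the sketch below combines one spelling per name component. It is not CJKI's actual generation method; the function name `full_name_variants` is hypothetical, and the component lists are a tiny sample of spellings taken from the examples on this page rather than DAN data.

```python
from itertools import product

# Illustrative component spellings only (assumption: a tiny sample,
# not the real DAN inventory of name components).
GIVEN = ["Hatem", "Hatim", "Hattem", "Hatam"]
MIDDLE = ["Ahmed", "Ahmad", "Ahmet"]
FAMILY = ["Barakat", "Bereket", "Baraket"]


def full_name_variants(*components):
    """Yield every combination of one spelling per name component."""
    for parts in product(*components):
        yield " ".join(parts)


variants = list(full_name_variants(GIVEN, MIDDLE, FAMILY))
print(len(variants))  # 4 * 3 * 3 = 36
print(variants[0])    # Hatem Ahmed Barakat
```

Because the count is the product of the per-component counts, even modest component lists yield very large numbers of full-name variants, which is why some combinations are only potential rather than attested.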
Let us take the name Hatim Ahmad BARAKAT (خاتم أحمد بركات) as an example. The OFAC list has the following entries for this name:
| Main OFAC entry | OFAC aliases |
|---|---|
| Hatim Ahmad BARAKAT | Hatam Ahmad BARAKAT |
| | Hatem Ahmad BARAKAT |
| | Hattem Ahmad BARAKAT |
| | Hotem Ahmad BARAKAT |
The table below lists the top 15 out of about 130,000 actual and potential variants of the OFAC name Hatim Ahmad BARAKAT, which appear in CJKI's XOFAC database. The table has the following fields:
| Field | Description |
|---|---|
| Rank | relative ranking based on component frequencies |
| Variant | variants of OFAC name, mostly not appearing in the OFAC list |
| Freq1 | frequency of occurrence on the web of Hatim variants |
| Freq2 | frequency of occurrence on the web of Ahmad variants |
| Freq3 | frequency of occurrence on the web of Barakat variants |
| Rank | Variant | Freq1 | Freq2 | Freq3 |
|---|---|---|---|---|
| 000001 | Hatem Ahmed Barakat | 0001580000 | 0039000000 | 0001180000 |
| 000002 | Hatim Ahmed Barakat | 0000925000 | 0039000000 | 0001180000 |
| 000003 | Hatem Ahmed Bereket | 0001580000 | 0039000000 | 0000651000 |
| 000004 | Hatem Ahmad Barakat | 0001580000 | 0025600000 | 0001180000 |
| 000005 | Khadem Ahmed Barakat | 0000194000 | 0039000000 | 0001180000 |
| 000006 | Hatem Ahmed Bareket | 0001580000 | 0039000000 | 0000057200 |
| 000007 | Hattem Ahmed Barakat | 0000180000 | 0039000000 | 0001180000 |
| 000008 | Hatem Ahmed Berekat | 0001580000 | 0039000000 | 0000033400 |
| 000009 | Hatem Ahmet Barakat | 0001580000 | 0018400000 | 0001180000 |
| 000010 | Hadim Ahmed Barakat | 0000114000 | 0039000000 | 0001180000 |
| 000011 | Hatem Ahmed Baraket | 0001580000 | 0039000000 | 0000016300 |
| 000012 | Hatam Ahmed Barakat | 0000081300 | 0039000000 | 0001180000 |
| 000013 | Hatem Ahmed Barakaat | 0001580000 | 0039000000 | 0000014300 |
| 000014 | Hatem Achmed Barakat | 0001580000 | 0000777000 | 0001180000 |
| 000015 | Hetem Ahmed Barakat | 0000065600 | 0039000000 | 0001180000 |
Only the fourth entry (Hatem Ahmad Barakat) in the top 15 above actually appears in the OFAC list. Therefore, even among the top 15 full name variants in XOFAC, this person's official OFAC name and three of his listed aliases are not to be found.
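The overlap claim above can be checked mechanically. The following is a minimal sketch, assuming a simple case-insensitive, whitespace-normalized comparison; the helper name `key` is hypothetical, and the two lists are copied from the tables on this page.

```python
# OFAC main entry and aliases for this person, as listed above.
OFAC_ENTRIES = [
    "Hatim Ahmad BARAKAT",   # main entry
    "Hatam Ahmad BARAKAT",
    "Hatem Ahmad BARAKAT",
    "Hattem Ahmad BARAKAT",
    "Hotem Ahmad BARAKAT",
]

# Top 15 XOFAC variants, in rank order.
TOP_15 = [
    "Hatem Ahmed Barakat", "Hatim Ahmed Barakat", "Hatem Ahmed Bereket",
    "Hatem Ahmad Barakat", "Khadem Ahmed Barakat", "Hatem Ahmed Bareket",
    "Hattem Ahmed Barakat", "Hatem Ahmed Berekat", "Hatem Ahmet Barakat",
    "Hadim Ahmed Barakat", "Hatem Ahmed Baraket", "Hatam Ahmed Barakat",
    "Hatem Ahmed Barakaat", "Hatem Achmed Barakat", "Hetem Ahmed Barakat",
]


def key(name):
    """Comparison key: case-folded, with runs of whitespace collapsed."""
    return " ".join(name.casefold().split())


ofac_keys = {key(name) for name in OFAC_ENTRIES}

# Keep the (rank, variant) pairs whose spelling is also an OFAC entry.
overlap = [(rank, name)
           for rank, name in enumerate(TOP_15, start=1)
           if key(name) in ofac_keys]
print(overlap)  # [(4, 'Hatem Ahmad Barakat')]
```

An exact-match screen built only on the five OFAC spellings would therefore miss 14 of the 15 highest-ranked variants.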
Hattem Ahmed Barakat (Rank 000007 above) is an example of an XOFAC entry that is an alias not found in the OFAC list that does indeed refer to the person in question. That is, it is an actual, not a potential, variant of that person's name.
Another example is the name Mohamed Ben Belgacem AOUADI, which has just one alias in OFAC. However, in addition to thousands of potential variants, our XOFAC database contains three additional attested variants, as illustrated below:
| Main OFAC entry | OFAC alias | Validated full name variants NOT listed in OFAC |
|---|---|---|
| Mohamed Ben Belgacem AOUADI | Mohamed Ben Belkacem AOUADI | Mohammed Ben Belgacem AOUADI |
| | | Muhammad Bin Belgacem AWADI |
| | | Mohamed Ben Belgacem AL AOUADI |
The above demonstrates the important role that the XOFAC database can play in avoiding false negatives; that is, in ensuring that the person in question is correctly identified no matter how his or her name is spelled or misspelled.
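One way an expanded variant list might be used in a screening application is as a lookup index from each variant back to its main OFAC entry. The sketch below is an assumption about such a setup, not a description of XOFAC's actual format or API; `normalize`, `build_index`, `screen` and the `RECORDS` layout are hypothetical names, and the data is the AOUADI example above.

```python
import unicodedata


def normalize(name):
    """Case-fold, strip diacritics, and collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_marks = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    return " ".join(no_marks.casefold().split())


def build_index(records):
    """Map every normalized spelling to the main entries it may refer to."""
    index = {}
    for main_entry, variants in records.items():
        for spelling in (main_entry, *variants):
            index.setdefault(normalize(spelling), set()).add(main_entry)
    return index


def screen(name, index):
    """Return the main entries whose variants match the given name."""
    return sorted(index.get(normalize(name), ()))


# Hypothetical record layout: main OFAC entry -> known variants.
RECORDS = {
    "Mohamed Ben Belgacem AOUADI": [
        "Mohamed Ben Belkacem AOUADI",     # OFAC alias
        "Mohammed Ben Belgacem AOUADI",    # variants not listed in OFAC
        "Muhammad Bin Belgacem AWADI",
        "Mohamed Ben Belgacem AL AOUADI",
    ],
}

index = build_index(RECORDS)
print(screen("muhammad bin belgacem awadi", index))
# ['Mohamed Ben Belgacem AOUADI']
print(screen("John Smith", index))
# []
```

Each key maps to a set of main entries because, as noted earlier, the same spelling may be shared by more than one person, so a hit is a candidate for review rather than a confirmed identification.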