Ontology Alignment Evaluation Initiative - OAEI-2019 Campaign

Results OAEI 2019::Large BioMed Track

Contact

If you have any questions or suggestions related to the results of this track, or if you notice any kind of error (wrong numbers, incorrect information about a matching system, etc.), feel free to write an email to ernesto [.] jimenez [.] ruiz [at] gmail [.] com

Evaluation setting

We ran the evaluation on an Ubuntu 18 laptop with an Intel Core i5-6300HQ CPU @ 2.30GHz x 4, allocating 15 GB of RAM.

Precision, Recall and F-measure have been computed with respect to a UMLS-based reference alignment. Systems have been ordered in terms of F-measure.
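As a minimal sketch (system output and reference alignment modelled as sets of entity-pair tuples; the URIs below are made up for illustration), these scores can be computed as:

```python
def precision_recall_fmeasure(system, reference):
    """Score a set of system mappings against a reference alignment.

    Both arguments are sets of (source_entity, target_entity) pairs.
    """
    tp = len(system & reference)  # correct mappings found by the system
    precision = tp / len(system) if system else 0.0
    recall = tp / len(reference) if reference else 0.0
    f = (2 * precision * recall / (precision + recall)
         if precision + recall else 0.0)
    return precision, recall, f

# toy example: 2 of the system's 3 mappings appear in the reference
reference = {("fma:A", "nci:A"), ("fma:B", "nci:B"), ("fma:C", "nci:C")}
system = {("fma:A", "nci:A"), ("fma:B", "nci:B"), ("fma:D", "nci:D")}
p, r, f = precision_recall_fmeasure(system, reference)  # each ≈ 0.667
```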

The unique mappings column shows the number of mappings computed by a system that are not predicted by any of the other participants (including variants).
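In set terms, a system's unique mappings are its output minus the union of everyone else's. A small sketch (system names and mappings below are hypothetical):

```python
def unique_mappings(outputs, system):
    """Mappings produced by `system` that no other participant produced.

    `outputs` maps each system name to its set of (source, target) pairs.
    """
    others = set().union(*(m for name, m in outputs.items() if name != system))
    return outputs[system] - others

# toy run: only SysA finds the Y mapping
outputs = {
    "SysA": {("fma:X", "nci:X"), ("fma:Y", "nci:Y")},
    "SysB": {("fma:X", "nci:X")},
    "SysC": {("fma:X", "nci:X"), ("fma:Z", "nci:Z")},
}
uniq = unique_mappings(outputs, "SysA")  # {("fma:Y", "nci:Y")}
```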

Check out the supporting scripts to reproduce the evaluation: https://github.com/ernestojimenezruiz/oaei-evaluation

Participation and success

In the OAEI 2019 largebio track, 10 participating systems were able to complete at least one of the tasks of the track within a 6-hour timeout. Eight systems were able to complete all six largebio tasks.

Use of background knowledge

LogMapBio uses BioPortal as a mediating-ontology provider; that is, it retrieves from BioPortal the 10 ontologies most suitable for the matching task.

LogMap uses normalisations and spelling variants from the general-purpose biomedical SPECIALIST Lexicon.

AML has three sources of background knowledge which can be used as mediators between the input ontologies: the Uber Anatomy Ontology (Uberon), the Human Disease Ontology (DOID) and the Medical Subject Headings (MeSH).

Alignment coherence

Together with precision, recall, F-measure and runtimes, we have also evaluated the coherence of the computed alignments. We report (1) the number of unsatisfiable classes when reasoning with the input ontologies together with the computed mappings, and (2) the ratio/degree of unsatisfiable classes with respect to the total number of classes in the union of the input ontologies.
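The degree in (2) is simply the unsatisfiable-class count divided by the number of classes in the union of the input ontologies; a one-line sketch (the figures below are made up):

```python
def incoherence_degree(n_unsat, n_classes_union):
    """Ratio of unsatisfiable classes to the number of classes
    in the union of the two input ontologies."""
    return n_unsat / n_classes_union

# e.g. 2 unsatisfiable classes in a union of 10,000 classes -> 0.02%
degree_pct = round(100 * incoherence_degree(2, 10_000), 3)  # 0.02
```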

We used the OWL 2 reasoner HermiT to compute the number of unsatisfiable classes. For the cases in which HermiT could not cope with the input ontologies and the mappings within 2 hours, we provide a lower bound on the number of unsatisfiable classes (indicated by ≥) computed with the OWL 2 EL reasoner ELK. In a couple of cases not even ELK could finish within 2 hours; these are indicated with "-".

As in previous OAEI editions, only two systems provide mapping repair facilities, namely AML and LogMap (including its LogMapBio variant). Both systems produce relatively clean outputs in the FMA-NCI and FMA-SNOMED tasks; in the SNOMED-NCI tasks, however, AML's mappings lead to a number of unsatisfiable classes. The results also show that even the most precise alignment sets can lead to a large number of unsatisfiable classes, which underlines the importance of techniques to assess the coherence of the generated alignments.


1. System runtimes and task completion

System  Task 1 (FMA-NCI)  Task 2 (FMA-NCI)  Task 3 (FMA-SNOMED)  Task 4 (FMA-SNOMED)  Task 5 (SNOMED-NCI)  Task 6 (SNOMED-NCI)  Average  # Tasks
LogMapLt 1 9 2 15 7 16 8 6
DOME 3 21 4 38 17 38 20 6
FCAMapKG 4 300 7 116 62 403 149 6
AML 32 75 91 152 531 331 202 6
LogMap 9 82 50 394 220 590 224 6
LogMapBio 1,690 2,072 1,551 2,853 2,666 4,586 2,570 6
AGM 223 3,325 368 4,227 2,264 5,016 2,571 6
Wiktionary 111 4,699 251 12,633 1,143 9,208 4,674 6
POMAP++ 289 - 899 - 9,706 - 3,631 3
SANOM 2,233 - - - - - 2,233 1
# Systems 10 8 9 8 9 8 1,853 52
Table 1: System runtimes (s) and task completion.


2. Results for the FMA-NCI matching problem

Task 1: FMA-NCI small fragments

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 32 2,723 59 0.958 0.910 0.933 2 0.020%
LogMap 9 2,747 3 0.944 0.897 0.920 2 0.020%
LogMapBio 1,690 2,869 61 0.919 0.912 0.915 2 0.020%
POMAP++ 289 2,414 10 0.979 0.814 0.889 993 9.8%
LogMapLt 1 2,480 7 0.967 0.819 0.887 2,104 20.7%
FCAMapKG 4 2,508 17 0.967 0.817 0.886 4,209 41.3%
SANOM 2,233 2,385 6 0.979 0.803 0.882 288 2.8%
DOME 3 2,264 0 0.984 0.766 0.861 344 3.4%
Wiktionary 111 1,760 4 0.991 0.608 0.754 114 1.1%
AGM 223 2,707 1,294 0.495 0.481 0.488 9,298 91.3%
Table 2: Results for the largebio task 1.

Task 2: FMA-NCI whole ontologies

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 75 3,110 276 0.805 0.881 0.841 4 0.012%
LogMap 82 2,701 0 0.856 0.808 0.831 3 0.009%
LogMapBio 2,072 3,104 139 0.779 0.850 0.813 3 0.009%
LogMapLt 9 3,458 75 0.676 0.819 0.741 8,925 27.3%
Wiktionary 4,699 1,873 56 0.927 0.607 0.734 3,476 10.6%
DOME 21 2,413 7 0.796 0.669 0.727 1,033 3.2%
FCAMapKG 300 3,765 316 0.622 0.817 0.706 10,708 32.8%
AGM 3,325 7,648 6,819 0.079 0.224 0.117 28,537 87.4%
Table 3: Results for the largebio task 2.


3. Results for the FMA-SNOMED matching problem

Task 3: FMA-SNOMED small fragments

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 91 6,988 947 0.923 0.762 0.835 0 0%
LogMapBio 1,551 6,516 117 0.931 0.703 0.801 1 0.004%
LogMap 50 6,282 1 0.947 0.690 0.798 1 0.004%
AGM 368 5,856 2,681 0.463 0.365 0.408 22,530 95.6%
POMAP++ 899 2,163 139 0.906 0.260 0.404 894 3.8%
FCAMapKG 7 1,720 6 0.973 0.222 0.362 864 3.7%
LogMapLt 2 1,642 0 0.968 0.208 0.342 771 3.3%
DOME 4 1,531 0 0.988 0.198 0.330 754 3.2%
Wiktionary 251 1,297 21 0.965 0.170 0.289 525 2.2%
Table 4: Results for the largebio task 3.

Task 4: FMA whole ontology with SNOMED large fragment

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
LogMap 394 6,393 0 0.840 0.645 0.730 0 0%
LogMapBio 2,853 6,926 280 0.792 0.667 0.724 0 0%
AML 152 8,163 2,525 0.685 0.710 0.697 0 0%
FCAMapKG 116 1,863 77 0.881 0.222 0.355 1,527 2.0%
LogMapLt 15 1,820 47 0.851 0.208 0.334 1,386 1.8%
DOME 38 1,589 1 0.939 0.197 0.326 1,348 1.8%
Wiktionary 12,633 1,486 143 0.819 0.170 0.282 790 1.0%
AGM 4,227 11,896 10,644 0.069 0.133 0.091 70,923 92.7%
Table 5: Results for the largebio task 4.


4. Results for the SNOMED-NCI matching problem

Task 5: SNOMED-NCI small fragments

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 531 14,741 1,768 0.906 0.746 0.818 ≥3,967 ≥5.3%
LogMapBio 2,666 13,725 774 0.905 0.695 0.786 ≥0 ≥0%
LogMap 220 12,433 0 0.957 0.665 0.785 ≥0 ≥0%
LogMapLt 7 10,921 91 0.949 0.566 0.709 ≥60,447 ≥80.5%
POMAP++ 9,706 10,906 285 0.938 0.561 0.702 ≥52,305 ≥69.7%
FCAMapKG 62 10,910 333 0.937 0.555 0.697 ≥53,761 ≥71.6%
DOME 17 9,392 6 0.978 0.503 0.664 ≥45,039 ≥60.0%
Wiktionary 1,143 8,889 120 0.967 0.471 0.633 ≥41,335 ≥55.1%
AGM 2,264 15,466 8,886 0.413 0.363 0.386 - -
Table 6: Results for the largebio task 5.


Task 6: NCI whole ontology with SNOMED large fragment

System  Time (s)  # Mappings  # Unique  Precision  Recall  F-measure  Unsat.  Degree
AML 331 14,200 2,656 0.862 0.687 0.765 ≥578 ≥0.5%
LogMapBio 4,586 13,732 940 0.814 0.627 0.708 ≥1 ≥0.001%
LogMap 590 12,276 0 0.867 0.596 0.706 ≥1 ≥0.001%
LogMapLt 16 12,864 658 0.798 0.566 0.662 ≥91,207 ≥84.7%
FCAMapKG 403 12,813 1,115 0.789 0.555 0.652 ≥84,579 ≥78.5%
DOME 38 9,806 26 0.905 0.489 0.635 ≥66,317 ≥61.6%
Wiktionary 9,208 9,585 518 0.895 0.472 0.618 ≥65,968 ≥61.2%
AGM 5,016 21,600 16,253 0.227 0.282 0.252 - -
Table 7: Results for the largebio task 6.