We ran the evaluation on an Ubuntu 20 laptop with an Intel Core i5-6300HQ CPU @ 2.30GHz x 4, allocating 15 GB of RAM.
Precision, Recall and F-measure have been computed with respect to a UMLS-based reference alignment. Systems have been ordered in terms of F-measure.
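The scores can be read as set operations over mappings. The following is a minimal sketch, assuming each mapping is a hypothetical (source entity, target entity) pair; the function names and the toy data are illustrative and are not taken from the evaluation scripts.

```python
def f_measure(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(system, reference):
    """Return (precision, recall, F-measure) for two sets of mappings."""
    correct = len(system & reference)  # mappings also found in the reference
    precision = correct / len(system) if system else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall, f_measure(precision, recall)


# Toy data (hypothetical entity names, for illustration only).
reference = {
    ("fma:Heart", "nci:Heart"),
    ("fma:Lung", "nci:Lung"),
    ("fma:Liver", "nci:Liver"),
    ("fma:Skin", "nci:Skin"),
}
system = {
    ("fma:Heart", "nci:Heart"),
    ("fma:Lung", "nci:Lung"),
    ("fma:Nail", "nci:Hair"),  # not in the reference
}

p, r, f = precision_recall_f1(system, reference)
print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```

The same harmonic mean can be used to sanity-check a reported row: a precision of 0.958 and a recall of 0.910 give `round(f_measure(0.958, 0.910), 3) == 0.933`.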
Unique mappings show the number of mappings computed by a system that are not predicted by any of the other participants (including variants).
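As a rough illustration of this definition, unique mappings can be computed as a set difference against the union of all other systems' outputs. The sketch below assumes alignments are held as Python sets keyed by system name; the system names `A`, `B`, `C` and the mappings are hypothetical.

```python
def unique_mappings(alignments):
    """For each system, return the mappings that no other system produced."""
    unique = {}
    for name, mappings in alignments.items():
        # Union of the mappings of every other participant (variants included).
        others = set().union(
            *(m for other, m in alignments.items() if other != name)
        )
        unique[name] = mappings - others
    return unique


# Toy alignments (hypothetical systems and mappings).
alignments = {
    "A": {("x1", "y1"), ("x2", "y2"), ("x3", "y3")},
    "B": {("x1", "y1"), ("x2", "y2")},
    "C": {("x1", "y1"), ("x4", "y4")},
}

for name, mappings in unique_mappings(alignments).items():
    print(name, len(mappings))
```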
Check out the supporting scripts to reproduce the evaluation: https://github.com/ernestojimenezruiz/oaei-evaluation
In the OAEI 2021 largebio track, 12 participating systems were able to complete at least one of the track's tasks within an 8-hour timeout. ALOD2Vec and ATMatcher could not complete some tasks due to a runtime error. Five systems completed all 6 tasks. GMap and AMD produced an "OutOfMemoryException", while Lily gave an error during the matching process.
LogMapBio uses BioPortal as a mediating ontology provider, that is, it retrieves from BioPortal the 10 most suitable ontologies for the matching task.
LogMap uses normalisations and spelling variants from the general-purpose (biomedical) SPECIALIST Lexicon.
AML has three sources of background knowledge which can be used as mediators between the input ontologies: the Uber Anatomy Ontology (Uberon), the Human Disease Ontology (DOID) and the Medical Subject Headings (MeSH).
Together with Precision, Recall, F-measure and runtimes, we have also evaluated the coherence of the alignments. We have reported (1) the number of unsatisfiable classes obtained when reasoning with the input ontologies together with the computed mappings, and (2) the ratio/degree of unsatisfiable classes with respect to the size of the union of the input ontologies.
We have used the OWL 2 reasoner HermiT to compute the number of unsatisfiable classes. For the cases in which HermiT could not cope with the input ontologies and the mappings (in less than 2 hours) we have provided a lower bound on the number of unsatisfiable classes (indicated by ≥) using the OWL 2 EL reasoner ELK.
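The incoherence degree described above is a simple ratio, and the ≥ marker only changes how a value is reported. A minimal sketch, assuming the number of unsatisfiable classes has already been obtained from a reasoner; the function names and the numbers used are hypothetical and do not correspond to any row of the results.

```python
def incoherence_degree(unsat_classes, union_size):
    """Percentage of unsatisfiable classes over the union of the input ontologies."""
    return 100.0 * unsat_classes / union_size


def format_result(unsat_classes, union_size, lower_bound=False):
    """Format the 'Unsat.' and 'Degree' cells.

    lower_bound=True marks values that are only a lower bound, e.g. when the
    complete reasoner timed out and an incomplete one was used instead.
    """
    prefix = "≥" if lower_bound else ""
    degree = incoherence_degree(unsat_classes, union_size)
    return f"{prefix}{unsat_classes:,}", f"{prefix}{degree:.1f}%"


# Hypothetical values: 250 unsatisfiable classes in a union of 10,000 classes.
print(format_result(250, 10_000))
print(format_result(250, 10_000, lower_bound=True))
```

The one-decimal formatting is a choice made for this sketch; the tables report very small degrees with more decimals.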
As in previous OAEI editions, only two systems have shown mapping repair facilities, namely AML and LogMap (including its LogMapBio variant). Both systems produce relatively clean outputs for all 6 tasks. The results also show that even the most precise alignment sets may lead to a large number of unsatisfiable classes. This underlines the importance of using techniques to assess the coherence of the generated alignments.
1. System runtimes (s) and task completion
System | Task 1 | Task 2 | Task 3 | Task 4 | Task 5 | Task 6 | Average | # Tasks |
KGMatcher | 7 | 18 | 9 | 31 | 27 | 39 | 22 | 6 |
LogMapLt | 13 | 28 | 16 | 36 | 27 | 40 | 27 | 6 |
AML | 44 | 92 | 124 | 183 | 1,026 | 375 | 307 | 6 |
LogMap | 24 | 142 | 95 | 761 | 397 | 1,386 | 468 | 6 |
LogMapBio | 1,190 | 2,582 | 1,434 | 4,921 | 4,652 | 10,486 | 4,211 | 6 |
ATMatcher | 19 | - | 30 | 77 | - | - | 42 | 3 |
OTMapOnto | 42 | - | 326 | - | - | - | 184 | 2 |
ALOD2Vec | 193 | - | 674 | - | - | - | 434 | 2 |
LSMatch | 4,854 | - | 8,632 | - | - | - | 6,743 | 2 |
TOM | 231 | - | - | - | - | - | 231 | 1 |
Fine-TOM | 231 | - | - | - | - | - | 231 | 1 |
Wiktionary | 13,435 | - | - | - | - | - | 13,435 | 1 |
# Systems | 12 | 5 | 9 | 6 | 5 | 5 | 2,194 | 42 |
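The Average and # Tasks columns can be reproduced from the per-task runtimes by ignoring the tasks a system did not complete. The sketch below uses three rows of the table above, encodes "-" as `None`, and assumes averages are rounded to the nearest integer; the helper name `summarise` is illustrative.

```python
def summarise(times):
    """Return (average runtime over completed tasks, number of completed tasks)."""
    completed = [t for t in times if t is not None]
    if not completed:
        return None, 0
    return round(sum(completed) / len(completed)), len(completed)


# Runtimes in seconds for Tasks 1-6; None stands for "-" (task not completed).
runtimes = {
    "KGMatcher": [7, 18, 9, 31, 27, 39],
    "ATMatcher": [19, None, 30, 77, None, None],
    "OTMapOnto": [42, None, 326, None, None, None],
}

for system, times in runtimes.items():
    average, tasks = summarise(times)
    print(f"{system}: average={average}, tasks={tasks}")

# Number of systems (among these three rows) that completed each task.
per_task = [sum(t is not None for t in column) for column in zip(*runtimes.values())]
print(per_task)
```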
2. Results for the FMA-NCI matching problem
Task 1
System | Time (s) | # Mappings | # Unique | Precision | Recall | F-measure | Unsat. classes | Incoherence degree |
AML | 44 | 2,723 | 60 | 0.958 | 0.910 | 0.933 | 2 | 0.020% |
LogMap | 24 | 2,769 | 5 | 0.940 | 0.898 | 0.919 | 2 | 0.020% |
Wiktionary | 13,435 | 2,611 | 2 | 0.967 | 0.864 | 0.913 | 2,552 | 25.1% |
LogMapBio | 1,190 | 2,949 | 110 | 0.904 | 0.920 | 0.912 | 4 | 0.039% |
ALOD2Vec | 193 | 2,751 | 114 | 0.918 | 0.868 | 0.892 | 7,844 | 77.0% |
LogMapLt | 13 | 2,477 | 7 | 0.967 | 0.818 | 0.886 | 2,104 | 20.7% |
LSMatch | 4,854 | 2,355 | 0 | 0.979 | 0.792 | 0.876 | 1,549 | 15.2% |
ATMatcher | 19 | 2,332 | 20 | 0.974 | 0.781 | 0.867 | 314 | 3.1% |
OTMapOnto | 42 | 5,217 | 2,664 | 0.451 | 0.840 | 0.587 | 10,122 | 99.4% |
Fine-TOM | 231 | 870 | 19 | 0.949 | 0.277 | 0.429 | 43 | 0.4% |
TOM | 231 | 867 | 19 | 0.946 | 0.275 | 0.426 | 791 | 7.8% |
KGMatcher | 7 | 242 | 0 | 0.981 | 0.077 | 0.143 | 18 | 0.2% |
Task 2
System | Time (s) | # Mappings | # Unique | Precision | Recall | F-measure | Unsat. classes | Incoherence degree |
AML | 92 | 3,109 | 313 | 0.806 | 0.881 | 0.842 | 2 | 0.015% |
LogMap | 142 | 2,702 | 0 | 0.845 | 0.796 | 0.820 | 2 | 0.015% |
LogMapBio | 2,582 | 3,371 | 288 | 0.726 | 0.861 | 0.788 | 4 | 0.029% |
LogMapLt | 28 | 3,471 | 798 | 0.673 | 0.818 | 0.738 | 5,190 | 38.1% |
KGMatcher | 18 | 303 | 5 | 0.754 | 0.076 | 0.138 | 68 | 0.5% |
3. Results for the FMA-SNOMED matching problem
Task 3
System | Time (s) | # Mappings | # Unique | Precision | Recall | F-measure | Unsat. classes | Incoherence degree |
AML | 124 | 6,988 | 454 | 0.923 | 0.762 | 0.835 | 0 | 0% |
LogMapBio | 1,434 | 6,725 | 175 | 0.911 | 0.711 | 0.799 | 1 | 0.004% |
LogMap | 95 | 6,313 | 4 | 0.941 | 0.689 | 0.796 | 1 | 0.004% |
ATMatcher | 30 | 6,226 | 163 | 0.959 | 0.647 | 0.773 | 8,563 | 36.3% |
OTMapOnto | 326 | 12,797 | 6,486 | 0.381 | 0.678 | 0.488 | 23,565 | 100.0% |
LogMapLt | 16 | 1,641 | 2 | 0.969 | 0.208 | 0.342 | 774 | 3.3% |
LSMatch | 8,632 | 1,535 | 0 | 0.988 | 0.198 | 0.330 | 760 | 3.2% |
ALOD2Vec | 674 | 3,755 | 1,616 | 0.446 | 0.246 | 0.317 | 19,431 | 82.4% |
KGMatcher | 9 | 236 | 0 | 0.983 | 0.038 | 0.073 | 0 | 0% |
Task 4
System | Time (s) | # Mappings | # Unique | Precision | Recall | F-measure | Unsat. classes | Incoherence degree |
LogMap | 761 | 6,463 | 0 | 0.825 | 0.644 | 0.723 | 0 | 0% |
LogMapBio | 4,921 | 7,377 | 529 | 0.753 | 0.682 | 0.716 | 0 | 0% |
AML | 183 | 8,163 | 2,567 | 0.685 | 0.710 | 0.697 | 0 | 0% |
LogMapLt | 36 | 1,820 | 31 | 0.852 | 0.208 | 0.334 | 983 | 3.0% |
ATMatcher | 77 | 1,890 | 162 | 0.794 | 0.206 | 0.327 | 962 | 2.9% |
KGMatcher | 31 | 252 | 0 | 0.920 | 0.038 | 0.073 | 0 | 0% |
4. Results for the SNOMED-NCI matching problem
Task 5
System | Time (s) | # Mappings | # Unique | Precision | Recall | F-measure | Unsat. classes | Incoherence degree |
AML | 1,026 | 14,739 | 2,573 | 0.906 | 0.746 | 0.818 | ≥8 | ≥0.011% |
LogMapBio | 4,652 | 13,704 | 817 | 0.909 | 0.696 | 0.788 | ≥0 | ≥0% |
LogMap | 397 | 12,522 | 11 | 0.954 | 0.667 | 0.785 | ≥0 | ≥0% |
LogMapLt | 27 | 10,891 | 311 | 0.949 | 0.564 | 0.708 | ≥60,366 | ≥80.4% |
KGMatcher | 27 | 2,358 | 0 | 0.977 | 0.124 | 0.220 | ≥14,216 | ≥18.9% |
Task 6
System | Time (s) | # Mappings | # Unique | Precision | Recall | F-measure | Unsat. classes | Incoherence degree |
AML | 375 | 14,195 | 2,380 | 0.862 | 0.687 | 0.765 | ≥0 | ≥0% |
LogMapBio | 10,486 | 14,594 | 1,026 | 0.825 | 0.676 | 0.743 | ≥0 | ≥0% |
LogMap | 1,386 | 12,298 | 41 | 0.866 | 0.598 | 0.707 | ≥0 | ≥0% |
LogMapLt | 40 | 12,837 | 1,568 | 0.797 | 0.564 | 0.661 | ≥71,454 | ≥87.6% |
KGMatcher | 39 | 2,494 | 2 | 0.916 | 0.124 | 0.218 | ≥19,777 | ≥24.2% |