To capture biomedical literature-based associations, the nferX platform defines two scores: a ‘local score’ and a ‘global score’, as described previously (Park et al., 2020). Briefly, the local score is obtained by applying a traditional natural language processing technique that captures the strength of association between two concepts in a selected corpus of biomedical literature, based on the frequency of their co-occurrence normalized by the frequency of each individual concept throughout the corpus. A higher local score between Concept X and Concept Y indicates that these concepts are mentioned in close proximity to each other more frequently than would be expected by chance. The global score, on the other hand, builds on the recent neural network-driven advances in Natural Language Processing (NLP). To compute global scores, all tokens (e.g. words and phrases) are projected into a high-dimensional vector space of word embeddings. Each vector represents the ‘neighborhood’ of concepts that occur around a given concept. The cosine similarity between any two vectors measures the similarity of these neighborhoods and is the basis for our global score metric: concepts that are more similar in this vector space have a higher global score.
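The two ideas can be illustrated with a minimal sketch. It is not the nferX implementation, whose exact formulas are given in Park et al. (2020). Here the local score is assumed to take a pointwise-mutual-information-style form (observed co-occurrence rate divided by the rate expected under independence), and the global score is plain cosine similarity. The function names, the toy text windows, and the representation of each window as a set of tokens are all hypothetical choices made for illustration.

```python
import math


def local_score(windows, x, y):
    """PMI-style association (an assumed form, not the nferX formula).

    `windows` is a list of token sets, each standing for one span of text
    in which concepts count as being "in close proximity".
    """
    n = len(windows)
    n_x = sum(1 for w in windows if x in w)
    n_y = sum(1 for w in windows if y in w)
    n_xy = sum(1 for w in windows if x in w and y in w)
    if n_xy == 0:
        # The concepts never co-occur, so the log ratio is unbounded below.
        return float("-inf")
    # Observed co-occurrence rate over the rate expected by chance.
    return math.log((n_xy * n) / (n_x * n_y))


def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy corpus: six hypothetical text windows.
windows = [
    {"ace2", "lung"},
    {"ace2", "lung", "virus"},
    {"ace2", "kidney"},
    {"virus", "cell"},
    {"lung", "cell"},
    {"kidney", "cell"},
]

# Positive: "ace2" and "lung" co-occur more often than chance predicts.
ace2_lung = local_score(windows, "ace2", "lung")

# Zero: "ace2" and "kidney" co-occur exactly as often as chance predicts.
ace2_kidney = local_score(windows, "ace2", "kidney")

# Hand-picked 2-d vectors standing in for learned word embeddings.
similar = cosine_similarity([1.0, 2.0], [2.0, 4.0])     # same direction
unrelated = cosine_similarity([1.0, 0.0], [0.0, 1.0])   # orthogonal
```

Under this assumed form, a local score above zero corresponds to co-occurrence above the chance level, which matches the interpretation in the text. The cosine-based score depends only on the direction of the embedding vectors, not on their magnitude.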