Figure 1—figure supplement 1E shows the probability density function (PDF) of the cosine similarity for the word2vec-like Vector Space Model implemented in spaces of various dimensionality N. As described in the Materials and methods section, the system is a Semantic Bio-Knowledge Graph whose nodes are the words/phrases chosen to be represented as vectors and whose edge weights are determined by measures of Semantic Association Strength (e.g. the cosine similarity between a pair of word embeddings represented as vectors in a high-dimensional space). The cosine similarity ranges from 0 (representing no semantic association) to 1 (representing the strongest association). This metric of association can reflect the contextual similarity of the entities in the Biomedical Corpora. The typical dimensionality used by our neural network for generating the Global Scores is n = 300 dimensions. We use this dimensionality because, as can be seen in the graph, the distribution is highly peaked, with most of the mass centered around 0; that is, a randomly chosen pair of vectors is typically orthogonal or close to orthogonal. Furthermore, over 300 dimensions, the distributions all have sufficiently long tails, which contain the most interesting (salient) biomedical associations.
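
The concentration of cosine similarity around 0 in high-dimensional spaces can be illustrated with a minimal simulation. This sketch is not the authors' pipeline: it assumes random vectors drawn from an isotropic Gaussian rather than trained word embeddings, and the function names (`random_vector`, `cosine_similarity`, `sample_cosines`) are hypothetical. Note that for such random vectors the raw cosine takes values in [-1, 1], so the simulated distribution is symmetric about 0.

```python
import math
import random
import statistics


def random_vector(n_dims, rng):
    """Draw a vector with independent standard-normal components.

    Assumption: isotropic Gaussian vectors stand in for embeddings here;
    trained word vectors are not distributed this way in general.
    """
    return [rng.gauss(0.0, 1.0) for _ in range(n_dims)]


def cosine_similarity(u, v):
    """Cosine of the angle between u and v: dot(u, v) / (|u| * |v|)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def sample_cosines(n_dims, n_pairs, seed=0):
    """Cosine similarities of n_pairs independent random vector pairs."""
    rng = random.Random(seed)
    return [
        cosine_similarity(random_vector(n_dims, rng), random_vector(n_dims, rng))
        for _ in range(n_pairs)
    ]


if __name__ == "__main__":
    # Compare the spread of the distribution at several dimensionalities.
    for n_dims in (3, 30, 300):
        sims = sample_cosines(n_dims, n_pairs=1000)
        print(
            f"N={n_dims:4d}  mean={statistics.mean(sims):+.3f}  "
            f"std={statistics.pstdev(sims):.3f}"
        )
```

Under this assumption the standard deviation of the cosine similarity is approximately 1/sqrt(N), so the density becomes more sharply peaked around 0 as N grows. Pairs whose similarity lies far out in the tail are therefore unlikely to arise by chance, which is consistent with reading the tails as the salient associations.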