Id | Subject | Object | Predicate | Lexical cue |
T417 | 0-318 | Sentence | denotes | One way to view the word2vec ‘black box’ operation from a Genes/Diseases perspective (cosine of <Gene, Disease> for all Genes and Diseases) is as a Transfer Function which changed the input probability distribution (pre-training randomly assigned word vectors for Genes and Diseases) to a new probability distribution. |
T418 | 319-527 | Sentence | denotes | The ‘null hypothesis’ (which seems to be well preserved in actuality in the way word2vec assigns random values to vectors initially) is the ‘green colored’ Cosine Distribution (Figure 1—figure supplement 1D). |
T419 | 528-781 | Sentence | denotes | Once word2vec training is over, the final word vectors are placed in specific positions in the 300-dimensional space so as to present the ‘blue colored’ Empirical distribution (the actual cosine similarity between <Gene, Disease> pairs that we observe). |
T420 | 782-966 | Sentence | denotes | The ‘orange curve’ is the 2-Gamma mixture (the parametric distribution that captures the ‘empirical distribution’ with just eight parameters (two alphas, two betas, 2 ts and two phis)). |
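
The ‘null hypothesis’ described in T418 can be illustrated with a minimal sketch: draw random 300-dimensional vectors for hypothetical genes and diseases, then compute the cosine of every <Gene, Disease> pair. The vector counts, the uniform initialization range, and the helper name `cosine_matrix` are assumptions made for illustration only; they are not taken from the annotated text, and this is not the authors' code.

```python
import numpy as np


def cosine_matrix(a, b):
    """Cosine similarity between every row of a and every row of b."""
    a_unit = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_unit = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_unit @ b_unit.T


rng = np.random.default_rng(0)
dim = 300  # dimensionality stated in the text

# Assumption: untrained vectors are drawn uniformly from a small symmetric
# range. The scale does not affect the cosine, only the direction does.
genes = rng.uniform(-0.5, 0.5, size=(200, dim)) / dim     # 200 hypothetical genes
diseases = rng.uniform(-0.5, 0.5, size=(100, dim)) / dim  # 100 hypothetical diseases

# One cosine value per <Gene, Disease> pair: the pre-training 'null' sample.
null_cosines = cosine_matrix(genes, diseases).ravel()

print("pairs:", null_cosines.size)
print("mean:", null_cosines.mean())
print("std:", null_cosines.std())
```

For independent random vectors with zero-mean, symmetric components, the cosine is centered on zero with a standard deviation of about 1/sqrt(300), so the null sample forms a narrow bell around zero. Repeating the same pairwise computation on trained vectors would give the empirical sample that the text contrasts with this null.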