Logistic Regression Modeling of Self-Identity

We examine the probabilistic relationship between self-identity and genetically inferred ancestry. To explore the interaction between genetic ancestry and self-reported identity, we estimated the proportion of individuals who identify as African American and as European American, partitioned by level of African ancestry. Considering the European American and African American cohorts jointly, we examined the relationship between an individual’s genome-wide African ancestry proportion and whether they self-report as European American or African American. Self-identification depends strongly on the amount of African ancestry: individuals carrying less than 20% African ancestry identify largely as European American, and those with greater than 50% report as African American. To test the significance of this relationship, we fit a logistic regression model, using Python’s statsmodels package, that predicts self-reported ancestry from proportion of African ancestry, sex, and age, together with an intercept and interaction terms.
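A model of the kind described above could be fit roughly as follows. This is a minimal sketch under stated assumptions, not the analysis code used in the study: the data are simulated, the column names (`african_ancestry`, `sex`, `age_c`, `identifies_aa`) are hypothetical, and the choice of interaction terms (ancestry with sex and ancestry with age) is a guess, since the text does not say which interactions were included.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data (NOT the study's data); all values are made up.
rng = np.random.default_rng(0)
n = 2000
df = pd.DataFrame({
    "african_ancestry": rng.uniform(0.0, 1.0, n),  # genome-wide proportion
    "sex": rng.choice(["F", "M"], n),
    "age": rng.integers(18, 90, n),
})
# Center and rescale age (in decades) to keep the fit numerically stable.
df["age_c"] = (df["age"] - df["age"].mean()) / 10.0

# Assumed data-generating process: identity depends on ancestry only,
# with the 50% crossover placed arbitrarily at 35% African ancestry.
true_logit = 10.0 * (df["african_ancestry"].to_numpy() - 0.35)
df["identifies_aa"] = rng.binomial(1, 1.0 / (1.0 + np.exp(-true_logit)))

# Observed proportion identifying as African American per ancestry level.
bins = pd.cut(df["african_ancestry"], [0.0, 0.2, 0.5, 1.0], include_lowest=True)
observed = df.groupby(bins, observed=True)["identifies_aa"].mean()
print(observed)

# Logistic regression. The formula interface adds the intercept itself,
# and "a * b" expands to both main effects plus their interaction.
model = smf.logit(
    "identifies_aa ~ african_ancestry * C(sex) + african_ancestry * age_c",
    data=df,
)
result = model.fit(disp=False)

# Wald z-tests per coefficient; the ancestry row tests the main effect.
print(result.summary())
```

In this sketch the significance of the ancestry effect is read from the Wald p-value in `result.pvalues["african_ancestry"]`. A likelihood-ratio test against a model without the ancestry terms would be a reasonable alternative; the text does not say which test was used.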