Help to understand categorical variable results in regression

Hello again. What I understand is that ideally you would like to model the probability of someone getting nausea directly, using the cancer types as your model's covariates. That is called relative risk regression, and it is an alternative to logistic regression.

Logistic regression, by contrast, models the odds of someone feeling nausea, defined as P(nausea=yes|covariates)/(1-P(nausea=yes|covariates)). One of the main reasons we use odds is that P(nausea=yes|covariates)=p is a probability and thus ranges in [0,1]; hence the odds p/(1-p) span the entire positive half-line, and the logarithm of the odds spans the whole real line.

If instead you tried to model the probability itself, you would have log(p) = b0 + b1x1 + … + bnxn. However, since p is in [0,1], log(p) must be nonpositive, and thus you would need to impose the linear constraint b0 + b1x1 + … + bnxn <= 0. This creates problems in finding the MLE, as the standard Fisher scoring algorithm may fail to converge to a solution, given that it is an unconstrained optimization algorithm.

You may find these links useful: https://www.theanalysisfactor.com/the-difference-between-relative-risk-and-odds-ratios/ and https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3305608/.

Does that make any sense, or am I way off and this wasn't what you wanted to ask?

/r/statistics Thread