Commit

trying to fix figs + titles pt3
donizk committed May 4, 2023
1 parent 766f214 commit d949bec
Showing 1 changed file with 9 additions and 3 deletions.
12 changes: 9 additions & 3 deletions thesis.md
@@ -461,9 +461,9 @@ Then this is factored into the following regression for the project.
$logit(Y_j) = \beta_0 + \beta_1 \mathrm{GENDER} + \beta_2 \mathrm{RACE} + \beta_3 \mathrm{HISPANIC}$

Due to the fact that this model encompasses explanatory variables that are both binary and categorical in nature, further data manipulation was needed to convert these variables into a form that yields valid, interpretable results in a regression model. To achieve this, the unique values of each variable are recoded into dummy variables. There are two exceptions. The first is the RACE variable: the mixed race categories represented in the data were merged into a single mixed race category to simplify the interpretation of the model's results. The first code snippet below shows the recoding of the individual mixed race group values into a universal "catch-all" value, and the second shows the recoding of the RACE variable into individual dummy variables.
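
As a rough illustration of the two recoding steps described above, a minimal sketch on toy data might look like the following. Only the catch-all code `"999"` and the `mixed_race` column come from the project code. The other RACE codes, the `mixed_codes` vector, and the `black` column are hypothetical placeholders and not the actual code list used in the thesis.

```r
# Toy data frame. Every RACE code except "999" is a placeholder for illustration.
result <- data.frame(
  RACE = c("100", "200", "801", "802", "100", "803"),
  stringsAsFactors = FALSE
)

# Hypothetical set of individual mixed race codes to collapse into "999".
mixed_codes <- c("801", "802", "803")
result$RACE[result$RACE %in% mixed_codes] <- "999"

# One 0/1 dummy per category of interest. The category left without a dummy
# ("100" in this toy data) acts as the reference level in the regression.
result$black      <- ifelse(result$RACE == "200", 1, 0)
result$mixed_race <- ifelse(result$RACE == "999", 1, 0)

# Quick check of the recoded categories.
table(result$RACE)
```

Using `%in%` with a vector of codes keeps the merge to a single assignment. One assignment per code, as in the Hispanic merging snippet, is equivalent.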






**Mixed Race Grouping**

@@ -493,6 +493,7 @@ Due to the fact that this model encompasses explanatory variables that are both

```



**Race Binary Recoding**

@@ -508,6 +509,7 @@ result$mixed_race <- ifelse(result$RACE == "999", 1, 0)
The second exception is the HISPAN variable: the Central and South American populations were merged into the Other Hispanic population to consolidate results, because each of these populations has a low count in isolation relative to the other Hispanic groups in the analysis. The code snippets for merging the Hispanic groups and for the binary recoding of the HISPAN variable can be observed below.




**Other Hispanic Group Merging**

@@ -519,6 +521,7 @@ result$HISPAN[result$HISPAN == "611"]<-"650" # central american, excluding salva
result$HISPAN[result$HISPAN == "612"]<-"650" # south american
```



**Hispanic Ethnicity Binary Recoding**

@@ -578,6 +581,7 @@ $logit(Y_j) = \beta_0 + \beta_1 \mathrm{FEMALE} + \beta_2 \mathrm{BLACK} + \beta

In order to construct this additional binary logistic regression in R, the HISPAN variable was recoded to merge all of the Hispanic values together. This merged value was then recoded into a binary variable taking the value 0 (non-Hispanic) or 1 (Hispanic). The code snippets for the recoding of HISPAN and for the newly constructed regression are below.
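
For orientation, the recode-then-regress pattern described above can be sketched on simulated data as follows. This is an assumption-laden sketch rather than the project's code: the outcome `y`, the `female` column, the non-Hispanic code `"0"`, and the code `"100"` are placeholders, and only the codes `"611"`, `"612"`, `"650"` and the `hispanic` column appear in the thesis snippets.

```r
set.seed(1)
n <- 200

# Simulated data. "0" is assumed to mean non-Hispanic, and "100" is a placeholder code.
result <- data.frame(
  HISPAN = sample(c("0", "100", "611", "612", "650"), n, replace = TRUE),
  female = rbinom(n, 1, 0.5),
  y      = rbinom(n, 1, 0.3),
  stringsAsFactors = FALSE
)

# Merge every Hispanic code into the single value "650".
result$HISPAN[result$HISPAN != "0"] <- "650"

# Binary recode: 1 = Hispanic, 0 = non-Hispanic.
result$hispanic <- ifelse(result$HISPAN == "650", 1, 0)

# Binary logistic regression with a logit link.
fit <- glm(y ~ female + hispanic, data = result,
           family = binomial(link = "logit"))
summary(fit)
```

Exponentiating the fitted coefficients with `exp(coef(fit))` converts them from log-odds to odds ratios.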



**Hispanic Recoding**

@@ -597,7 +601,9 @@ result$HISPAN[result$HISPAN == "612"]<-"650" # south american
# binary recode
result$hispanic <- ifelse(result$HISPAN == "650", 1, 0)
```



**2nd Regression**


@@ -610,7 +616,7 @@ In order to get more interpretable results, the odds ratio was also generated fo

The results of running the code for both regressions and subsequent odds ratios are as follows:


**First Regression and Odds Ratio**

![Regression 1](images/reg1.jpg)