Unobserved subunit confounder variables #241

Open
adamrupe opened this issue Sep 13, 2024 · 4 comments

@adamrupe
Collaborator

How do we want to handle unobserved subunit variables? They are discussed in Algorithm 1, but none of the examples have unobserved subunits. The current implementation of Algorithm 1, `collapse_HCM`, raises a `ValueError` saying that they are not currently supported. If we allow unobserved subunit variables in HCMs that are passed to Algorithm 1 for collapse, we will need to decide how to handle their edges (i.e., whether they connect only to other subunit variables or also to unit variables).

@adamrupe adamrupe self-assigned this Sep 13, 2024
@djinnome
Contributor

djinnome commented Sep 13, 2024

I think Figure 6b shows how to handle unobserved subunit variables that connect to a single subunit variable, and Figure 6d shows how to handle unobserved subunit variables that connect to a single unit variable: you just marginalize them out.
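As a hedged illustration of "marginalize them out" (generic notation, not taken from Figure 6): assume a latent variable $U$ with no parents whose only child is $W$, and write $\text{pa}(w)$ for the values of the remaining parents of $W$. Under those assumptions $U$ is independent of the other parents of $W$, so it can be integrated out of the mechanism for $W$:

$$P(w \mid \text{pa}(w)) = \int P(w \mid \text{pa}(w), u)\, P(u)\, du$$

The graph then keeps $W$ and its remaining parents, and $U$ disappears without introducing any new edges.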


But this raises a new question: how do you handle the case where an unobserved subunit variable connects to two or more subunit or unit variables? This is the case of unobserved confounding, and it may require more careful thinking.
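A minimal sketch of how that confounding case could be detected before collapsing. The function name, the edge-list representation, and the toy graph are all hypothetical and are not part of the repository's API; the only idea taken from the discussion is that a latent subunit confounder is an unobserved subunit variable with two or more children.

```python
def find_latent_subunit_confounders(edges, subunit, observed):
    """Return unobserved subunit nodes that point at two or more variables.

    edges:    iterable of (parent, child) pairs (hypothetical representation)
    subunit:  set of subunit-level node names
    observed: set of observed node names
    """
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    return sorted(
        node
        for node in subunit
        if node not in observed and len(children.get(node, ())) >= 2
    )


# Hypothetical toy HCM: U is an unobserved subunit variable pointing at the
# subunit variables A and B; Y is a unit variable.
edges = [("U", "A"), ("U", "B"), ("A", "Y"), ("B", "Y")]
subunit = {"U", "A", "B"}
observed = {"A", "B", "Y"}
confounders = find_latent_subunit_confounders(edges, subunit, observed)
```

Unobserved subunit variables with a single child would not be returned, which matches the single-connection cases that can simply be marginalized out.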

@adamrupe
Collaborator Author

I should have clarified, but yes, I meant specifically the case of unobserved confounding due to an unobserved subunit variable.

@djinnome changed the title from "Unobserved subunit variables" to "Unobserved subunit confounder variables" on Sep 17, 2024
@djinnome
Contributor

djinnome commented Sep 17, 2024

Let's break down the latent subunit confounders into two categories:

  1. Those without observed parents
  2. Those with observed parents.

For latent subunit confounders without observed parents, we can just follow Algorithm 1.
For each subunit endogenous variable $v \in \mathcal{S}$:

  1. Create a unit endogenous variable $Q^v$.
  2. Mark $Q^v$ as hidden.
  3. For each direct unit descendant $w \in \text{dd}_v(\mathcal{S})$, connect $Q^v$ to $X^w$.
  4. Erase the subunit variable $X^v$.
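The four steps above can be sketched in plain Python for a single latent subunit variable. This is an illustration under stated assumptions, not the `collapse_HCM` implementation: the function names and the edge-set representation are hypothetical, and "direct unit descendant" is assumed here to mean a unit variable reachable from $v$ along a directed path whose intermediate nodes are all subunit variables.

```python
def direct_unit_descendants(v, edges, subunit):
    """Unit nodes reachable from v through subunit-only paths (assumed definition)."""
    children = {}
    for parent, child in edges:
        children.setdefault(parent, set()).add(child)
    found, stack, seen = set(), [v], {v}
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child in subunit:
                if child not in seen:  # keep walking through subunit nodes
                    seen.add(child)
                    stack.append(child)
            else:
                found.add(child)  # stop at the first unit node on the path
    return found


def promote_latent_subunit(v, edges, subunit, hidden):
    """Apply steps 1-4 to one latent subunit variable v."""
    q = f"Q_{v}"                                             # step 1
    new_hidden = hidden | {q}                                # step 2
    targets = direct_unit_descendants(v, edges, subunit)
    new_edges = {(p, c) for p, c in edges if v not in (p, c)}
    new_edges |= {(q, w) for w in targets}                   # step 3
    return new_edges, subunit - {v}, new_hidden              # step 4: v is gone


# Hypothetical toy graph: latent subunit U confounds subunit A and B,
# which both feed the unit variable Y.
edges = {("U", "A"), ("U", "B"), ("A", "Y"), ("B", "Y")}
new_edges, new_subunit, new_hidden = promote_latent_subunit(
    "U", edges, subunit={"U", "A", "B"}, hidden=set()
)
```

In this toy graph the sketch connects `Q_U` to `Y` and drops the edges from `U` into `A` and `B`, which is exactly the spot where the confounding between `A` and `B` needs the extra care discussed in this thread.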

For latent subunit confounders with observed parents, we can still follow Algorithm 1, but it makes a difference whether the parents are subunit variables or unit variables.

If the parents are subunit variables, they are no longer parents of the latent $Q^v$ variable.

If the parents are unit variables, they are disconnected from the subunit variable $v$ and connected to the unit variable $Q^{v|pa_{\mathscr V}}$.
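A rough sketch of that parent handling, assuming the same hypothetical edge-set representation as a set of `(parent, child)` pairs. The function name and the string label used for the promoted variable are illustrative only; the label is just a stand-in for $Q^{v|pa_{\mathscr V}}$.

```python
def rewire_parents(v, edges, subunit):
    """Split the parents of latent subunit v into rewired and dropped edges.

    Returns the label of the promoted unit variable, the new edges from unit
    parents into it, and the subunit-parent edges that are dropped.
    """
    unit_parents = sorted(p for p, c in edges if c == v and p not in subunit)
    # Hypothetical naming scheme standing in for Q^{v | pa}.
    q = f"Q_{v}|{','.join(unit_parents)}" if unit_parents else f"Q_{v}"
    q_edges = {(p, q) for p in unit_parents}          # unit parents move to Q
    dropped = {(p, c) for p, c in edges if c == v and p in subunit}
    return q, q_edges, dropped


# Hypothetical toy graph: Z is a unit parent and S a subunit parent of the
# latent subunit confounder U.
edges = {("Z", "U"), ("S", "U"), ("U", "A"), ("U", "B")}
q, q_edges, dropped = rewire_parents("U", edges, subunit={"S", "U", "A", "B"})
```

Here the unit parent `Z` ends up pointing at the promoted variable, while the edge from the subunit parent `S` is simply removed.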

@adamrupe
Collaborator Author

Does it matter if you have a chain of latent confounding variables? To be specific, if a subunit variable has unobserved subunit parents, its promoted Q variable is unobserved. Does it matter that this promoted Q variable is not connected to the promoted Q variable of its subunit parent?

I think the conditions in the Algorithm 1 pseudocode in the paper might sufficiently cover this. Now that I'm splitting the code for Algorithm 1 into creating HCGMs and then collapsing them, I understand this better. Edges in the collapsed model (specifically undirected edges) are all at the unit level, and the Algorithm 1 pseudocode outlines whether the promoted Q variables are observed or not. The undirected edges are then created based on unobserved unit variables (including Q variables).
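To make the last step concrete, here is a simplified sketch of deriving undirected edges from hidden unit variables. It assumes each hidden variable has no parents and only non-hidden children, so it is not a full latent projection, and the function name and graph encoding are hypothetical rather than the repository's API.

```python
from itertools import combinations


def undirected_edges_from_hidden(edges, hidden):
    """Replace each hidden unit variable by undirected edges among its children."""
    children = {}
    for parent, child in edges:
        if parent in hidden and child not in hidden:
            children.setdefault(parent, set()).add(child)
    undirected = set()
    for kids in children.values():
        # Every pair of children of the same hidden variable is confounded.
        for a, b in combinations(sorted(kids), 2):
            undirected.add(frozenset((a, b)))
    # Keep only directed edges that do not touch a hidden variable.
    directed = {(p, c) for p, c in edges if p not in hidden and c not in hidden}
    return directed, undirected


# Hypothetical collapsed model: the hidden Q_U points at Q_A and Y.
edges = {("Q_U", "Q_A"), ("Q_U", "Y"), ("Q_A", "Y")}
directed, undirected = undirected_edges_from_hidden(edges, hidden={"Q_U"})
```

Under these assumptions the hidden `Q_U` is replaced by a single undirected edge between `Q_A` and `Y`, alongside the directed edge that was already there.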
