ABSTRACT:
SUBMITTER: Egami N
PROVIDER: S-EPMC9581481 | biostudies-literature | 2022 Oct
REPOSITORIES: biostudies-literature
Egami Naoki, Fong Christian J, Grimmer Justin, Roberts Margaret E, Stewart Brandon M
Science Advances, 2022 Oct 19, issue 42
Text as data techniques offer a great promise: the ability to inductively discover measures that are useful for testing social science theories with large collections of text. Nearly all text-based causal inferences depend on a latent representation of the text, but we show that estimating this latent representation from the data creates underacknowledged risks: we may introduce an identification problem or overfit. To address these risks, we introduce a split-sample workflow for making rigorous ...
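The abstract is cut off in this record, so the details of the authors' workflow are not available here. As a rough illustration of the general split-sample idea only, the sketch below learns a text measure on one random half of the documents and estimates a treatment effect on the other half. Every name in it (`split_sample`, `discover_measure`, `difference_in_means`, `estimate_on_held_out`) is hypothetical, and the "most common word" rule is a toy stand-in for whatever latent representation an analyst would actually discover; none of this is taken from the paper.

```python
import random
from collections import Counter
from statistics import mean


def split_sample(n_docs, train_frac=0.5, seed=0):
    """Randomly partition document indices into a discovery set and an estimation set."""
    idx = list(range(n_docs))
    random.Random(seed).shuffle(idx)
    cut = int(n_docs * train_frac)
    return idx[:cut], idx[cut:]


def discover_measure(texts):
    """Toy discovery step (an assumption, not the paper's method): choose the word
    that appears in the most discovery-set documents and return a function that
    codes any document as 1 if it contains that word, else 0."""
    counts = Counter(w for t in texts for w in sorted(set(t.lower().split())))
    keyword = counts.most_common(1)[0][0]
    return keyword, (lambda text: int(keyword in text.lower().split()))


def difference_in_means(outcomes, treated):
    """Mean outcome among treated units minus mean outcome among control units."""
    y1 = [y for y, t in zip(outcomes, treated) if t == 1]
    y0 = [y for y, t in zip(outcomes, treated) if t == 0]
    if not y1 or not y0:
        raise ValueError("need at least one treated and one control unit")
    return mean(y1) - mean(y0)


def estimate_on_held_out(docs, treated, seed=0):
    """Discover the measure on one split, then estimate the effect only on the other,
    so the data used to define the measure are never reused for estimation."""
    train_idx, test_idx = split_sample(len(docs), seed=seed)
    keyword, g = discover_measure([docs[i] for i in train_idx])
    outcomes = [g(docs[i]) for i in test_idx]
    return keyword, difference_in_means(outcomes, [treated[i] for i in test_idx])
```

The point of the design is the separation in `estimate_on_held_out`: the coding function is fixed before the estimation documents are touched, which is what guards against overfitting the measure to the same data used for inference.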