Published Works

Document Type

Conference Proceeding


Collection Development and Management | Scholarly Communication


This presentation examines how business researchers use databases and datasets. Our study analyzes the full text of hundreds of journal articles published between 2020 and 2023 by researchers from two business schools at R1 universities in the Northeast U.S.: New York University, a top private research university, and the University of Connecticut, a public land-grant university.

Our presentation will list the most commonly cited datasets, covering both commercial data and public, open data. We discuss the similarities and differences between the two business schools and offer insights on how dataset usage affects research outcomes and visibility.

In addition, we carry out experiments using machine-learning topic modeling techniques, including Latent Dirichlet Allocation (LDA), along with other textual analysis methods. Our objective is to cluster content found in abstracts and parts of the full text to identify a specific number of prominent topics within our corpora. We determine the dominant topic in each article and plan to cross-compare the databases and datasets employed in the relevant studies with the identified topic clusters.
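
The topic-modeling and cross-comparison steps described above can be illustrated with a small sketch. This is not the study's actual pipeline: it is a minimal collapsed Gibbs sampler for LDA written with the Python standard library only, and the tokenized abstracts, the dataset labels (`dataset_a`, `dataset_b`), the number of topics, and the hyperparameters `alpha` and `beta` are all invented for illustration.

```python
import random
from collections import Counter


def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    doc_topic = [[0] * n_topics for _ in docs]               # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]    # tokens per (topic, word)
    topic_total = [0] * n_topics                             # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    return doc_topic, topic_word, vocab


def dominant_topics(doc_topic):
    """Index of the topic with the most assigned tokens in each document."""
    return [max(range(len(row)), key=row.__getitem__) for row in doc_topic]


# Invented, pre-tokenized "abstracts" (assumption: real input would be cleaned text).
abstracts = [
    "stock returns volatility market returns investors".split(),
    "market liquidity stock prices investors trading".split(),
    "consumer brand advertising purchase loyalty brand".split(),
    "advertising consumer survey purchase intention brand".split(),
]
# Hypothetical dataset labels cited by each article, in the same order.
datasets_cited = [["dataset_a"], ["dataset_a", "dataset_b"], ["dataset_b"], ["dataset_b"]]

doc_topic, topic_word, vocab = fit_lda(abstracts, n_topics=2)
dominant = dominant_topics(doc_topic)

# Cross-tabulate dominant topic against the datasets each article cites.
cross_tab = Counter(
    (topic, name)
    for topic, names in zip(dominant, datasets_cited)
    for name in names
)
for (topic, name), count in sorted(cross_tab.items()):
    print(f"topic {topic} / {name}: {count}")
```

Topic indices are arbitrary labels, and on a corpus this small the clusters are not meaningful; the sketch only shows the shape of the computation. A corpus of realistic size would more likely be handled with an established topic-modeling library than with a hand-written sampler.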

We believe that our findings have important implications for acquisitions librarians, business librarians, and business researchers. Librarians can use our findings to help researchers identify and access the datasets that are most relevant to their research. Libraries can help researchers find and understand open data, and can provide tools and resources for data stewardship. Researchers can use our findings to understand how their peers are using datasets and to identify new opportunities for research.


2023 Charleston Conference, virtual concurrent session.