Construction of the literature graph in Semantic Scholar W Ammar, D Groeneveld, C Bhagavatula, I Beltagy, M Crawford, ... arXiv preprint arXiv:1805.02262, 2018 | 461 | 2018 |
Documenting large webtext corpora: A case study on the Colossal Clean Crawled Corpus J Dodge, M Sap, A Marasović, W Agnew, G Ilharco, D Groeneveld, ... arXiv preprint arXiv:2104.08758, 2021 | 305 | 2021 |
Generating search result summaries D Groeneveld, D Meyerzon, D Mowatt US Patent 8,285,699, 2012 | 117 | 2012 |
From ‘F’ to ‘A’ on the N.Y. Regents Science Exams: An overview of the Aristo project P Clark, O Etzioni, T Khot, D Khashabi, B Mishra, K Richardson, ... AI Magazine 41 (4), 39-53, 2020 | 107 | 2020 |
Name search using a ranking function DH Groeneveld, D Meyerzon, D Mowatt, JA Alspaugh US Patent 8,645,417, 2014 | 57 | 2014 |
Generating search result summaries D Groeneveld, D Meyerzon, D Mowatt US Patent 7,853,587, 2010 | 50 | 2010 |
A simple yet strong pipeline for HotpotQA D Groeneveld, T Khot, A Sabharwal arXiv preprint arXiv:2004.06753, 2020 | 41 | 2020 |
IKE - an interactive tool for knowledge extraction B Dalvi, S Bhakthavatsalam, C Clark, P Clark, O Etzioni, A Fader, ... Proceedings of the 5th Workshop on Automated Knowledge Base Construction, 12-17, 2016 | 31 | 2016 |
OLMo: Accelerating the science of language models D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord, AH Jha, ... arXiv preprint arXiv:2402.00838, 2024 | 22 | 2024 |
What's In My Big Data? Y Elazar, A Bhagia, I Magnusson, A Ravichander, D Schwenk, A Suhr, ... arXiv preprint arXiv:2310.20707, 2023 | 21 | 2023 |
Dolma: An Open Corpus of Three Trillion Tokens for Language Model Pretraining Research L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ... arXiv preprint arXiv:2402.00159, 2024 | 15 | 2024 |
Generating search result summaries D Groeneveld, D Meyerzon, D Mowatt US Patent 8,032,519, 2011 | 15 | 2011 |
Name search using a ranking function DH Groeneveld, D Meyerzon, D Mowatt, JA Alspaugh US Patent 9,727,639, 2017 | 12 | 2017 |
Ananya Harsh Jha D Groeneveld, I Beltagy, P Walsh, A Bhagia, R Kinney, O Tafjord | 10 | 2024 |
Dolma: An open corpus of 3 trillion tokens for language model pretraining research L Soldaini, R Kinney, A Bhagia, D Schwenk, D Atkinson, R Authur, ... Allen Institute for AI, Tech. Rep, 5998-6008, 2023 | 8 | 2023 |
Construction of the literature graph in Semantic Scholar. NAACL W Ammar, D Groeneveld, C Bhagavatula, I Beltagy, M Crawford, ... URL: https://www.semanticscholar.org/paper …, 2018 | 4 | 2018 |
Catwalk: A unified language model evaluation framework for many datasets D Groeneveld, A Awadalla, I Beltagy, A Bhagia, I Magnusson, H Peng, ... arXiv preprint arXiv:2312.10253, 2023 | 3 | 2023 |
Large Language Model Distillation Doesn't Need a Teacher AH Jha, D Groeneveld, E Strubell, I Beltagy arXiv preprint arXiv:2305.14864, 2023 | 3 | 2023 |
Continued pretraining for better zero- and few-shot promptability Z Wu, RL Logan IV, P Walsh, A Bhagia, D Groeneveld, S Singh, I Beltagy arXiv preprint arXiv:2210.10258, 2022 | 3 | 2022 |
A flexible software middleware for interactive learning environments D Groeneveld, TC Hutchinson, F Kuester International Conference on Engineering Education and Research (iCEER) 6, 2004 | 2 | 2004 |