Jialin Wu
Google DeepMind
Verified email at google.com
Title
Cited by
Year
RT-2: Vision-language-action models transfer web knowledge to robotic control
B Zitkovich, T Yu, S Xu, P Xu, T Xiao, F Xia, J Wu, P Wohlhart, S Welker, ...
Conference on Robot Learning, 2165-2183, 2023
Cited by: 729*; Year: 2023
Open X-Embodiment: Robotic learning datasets and RT-X models
A O'Neill, A Rehman, A Gupta, A Maddukuri, A Gupta, A Padalkar, A Lee, ...
arXiv preprint arXiv:2310.08864, 2023
Cited by: 284*; Year: 2023
Self-Critical Reasoning for Robust Visual Question Answering
J Wu, RJ Mooney
Proceedings of the Thirty-third Conference on Advances in Neural Information …, 2019
Cited by: 173; Year: 2019
PaLI-X: On scaling up a multilingual vision and language model
X Chen, J Djolonga, P Padlewski, B Mustafa, S Changpinyo, J Wu, ...
arXiv preprint arXiv:2305.18565, 2023
Cited by: 137; Year: 2023
Multi-modal answer validation for knowledge-based VQA
J Wu, J Lu, A Sabharwal, R Mottaghi
Proceedings of the AAAI conference on artificial intelligence 36 (3), 2712-2721, 2022
Cited by: 135; Year: 2022
Faithful Multimodal Explanation for Visual Question Answering
J Wu, RJ Mooney
Proceedings of the Second BlackboxNLP Workshop on Analyzing and Interpreting …, 2018
Cited by: 110; Year: 2018
Generating question relevant captions to aid visual question answering
J Wu, Z Hu, RJ Mooney
arXiv preprint arXiv:1906.00513, 2019
Cited by: 65*; Year: 2019
Dynamic filtering with large sampling field for convnets
J Wu, D Li, Y Yang, C Bajaj, X Ji
Proceedings of the European Conference on Computer Vision (ECCV), 185-200, 2018
Cited by: 61*; Year: 2018
PaLI-3 vision language models: Smaller, faster, stronger
X Chen, X Wang, L Beyer, A Kolesnikov, J Wu, P Voigtlaender, B Mustafa, ...
arXiv preprint arXiv:2310.09199, 2023
Cited by: 55; Year: 2023
Open X-Embodiment: Robotic learning datasets and RT-X models
OXE Collaboration, A Padalkar, A Pooley, A Jain, A Bewley, A Herzog, ...
arXiv preprint arXiv:2310.08864 1 (2), 2023
Cited by: 54; Year: 2023
Improving VQA and its explanations by comparing competing explanations
J Wu, L Chen, RJ Mooney
arXiv preprint arXiv:2006.15631, 2020
Cited by: 24; Year: 2020
GeomVerse: A systematic evaluation of large models for geometric reasoning
M Kazemi, H Alvari, A Anand, J Wu, X Chen, R Soricut
arXiv preprint arXiv:2312.12241, 2023
Cited by: 22; Year: 2023
Action Recognition with Joint Attention on Multi-Level Deep Features
J Wu, G Wang, W Yang, X Ji
arXiv preprint arXiv:1607.02556, 2016
Cited by: 21; Year: 2016
Visual question answering based on local-scene-aware referring expression generation
JJ Kim, DG Lee, J Wu, HG Jung, SW Lee
Neural Networks 139, 158-167, 2021
Cited by: 19; Year: 2021
CausalLM is not optimal for in-context learning
N Ding, T Levinboim, J Wu, S Goodman, R Soricut
arXiv preprint arXiv:2308.06912, 2023
Cited by: 16; Year: 2023
CoNAN: A complementary neighboring-based attention network for referring expression generation
J Kim, H Ko, J Wu
Proceedings of the 28th International Conference on Computational …, 2020
Cited by: 14; Year: 2020
Entity-focused dense passage retrieval for outside-knowledge visual question answering
J Wu, RJ Mooney
arXiv preprint arXiv:2210.10176, 2022
Cited by: 12; Year: 2022
Distilling vision-language models on millions of videos
Y Zhao, L Zhao, X Zhou, J Wu, CT Chu, H Miao, F Schroff, H Adam, T Liu, ...
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 9; Year: 2024
Non-intrusive adaptation: Input-centric parameter-efficient fine-tuning for versatile multimodal modeling
Y Wang, J Wu, T Dabral, J Zhang, G Brown, CT Lu, F Liu, Y Liang, B Pang, ...
arXiv preprint arXiv:2310.12100, 2023
Cited by: 8; Year: 2023
Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts
J Wu, X Hu, Y Wang, B Pang, R Soricut
Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern …, 2024
Cited by: 6; Year: 2024