Dong Yan
Baichuan Inc. Head of Reinforcement Learning Team.
Verified email at baichuan-inc.com
Title
Cited by
Year
Baichuan 2: Open large-scale language models
A Yang, B Xiao, B Wang, B Zhang, C Bian, C Yin, C Lv, D Pan, D Wang, ...
arXiv preprint arXiv:2309.10305, 2023
Cited by: 129 | Year: 2023
Tianshou: A highly modularized deep reinforcement learning library
J Weng, H Chen, D Yan, K You, A Duburcq, M Zhang, Y Su, H Su, J Zhu
Journal of Machine Learning Research 23 (267), 1--6, 2022
Cited by: 101 | Year: 2022
Learning task-distribution reward shaping with meta-learning
H Zou, T Ren, D Yan, H Su, J Zhu
Proceedings of the AAAI Conference on Artificial Intelligence 35 (12), 11210 …, 2021
Cited by: 90* | Year: 2021
Playing FPS Games With Environment-Aware Hierarchical Reinforcement Learning.
S Song, J Weng, H Su, D Yan, H Zou, J Zhu
IJCAI, 3475-3482, 2019
Cited by: 32 | Year: 2019
Towards safe reinforcement learning via constraining conditional value-at-risk
C Ying, X Zhou, H Su, D Yan, N Chen, J Zhu
IJCAI, 3673-3680, 2022
Cited by: 20 | Year: 2022
Deep reinforcement learning with credit assignment for combinatorial optimization
D Yan, J Weng, S Huang, C Li, Y Zhou, H Su, J Zhu
Pattern Recognition 124, 108466, 2022
Cited by: 19 | Year: 2022
Lazy-CFR: fast and near-optimal regret minimization for extensive games with imperfect information
Y Zhou, T Ren, J Li, D Yan, J Zhu
International Conference on Learning Representations, 2020
Cited by: 17* | Year: 2020
Using memory in the right way to accelerate Big Data processing
D Yan, XS Yin, C Lian, X Zhong, X Zhou, GS Wu
Journal of Computer Science and Technology 30, 30-41, 2015
Cited by: 16 | Year: 2015
NativeTask: a Hadoop compatible framework for high performance
D Yang, X Zhong, D Yan, F Dai, X Yin, C Lian, Z Zhu, W Jiang, G Wu
2013 IEEE International Conference on Big Data, 94-101, 2013
Cited by: 12 | Year: 2013
Pspec: A formal specification language for fine-grained control on distributed data analytics
C Luo, F He, D Yan, D Zhang, X Zhou, BY Wang
2017 IEEE/ACM 39th International Conference on Software Engineering …, 2017
Cited by: 6 | Year: 2017
Policy learning for robust markov decision process with a mismatched generative model
J Li, T Ren, D Yan, H Su, J Zhu
Proceedings of the AAAI conference on artificial intelligence 36 (7), 7417-7425, 2022
Cited by: 5 | Year: 2022
Reward informed dreamer for task generalization in reinforcement learning
C Ying, Z Hao, X Zhou, H Su, S Liu, J Li, D Yan, J Zhu
arXiv preprint arXiv:2303.05092, 2023
Cited by: 3 | Year: 2023
Combining Tree Search and Action Prediction for State-of-the-Art Performance in DouDiZhu
Y Zhang, D Yan, B Shi, H Fu, Q Fu, H Su, J Zhu, N Chen
International Joint Conferences on Artificial Intelligence Organization …, 2021
Cited by: 3 | Year: 2021
PSpec-SQL: Enabling Fine-Grained Control for Distributed Data Analytics
C Luo, F He, F Peng, D Yan, D Zhang, X Zhou
IEEE Transactions on Dependable and Secure Computing 18 (2), 810-824, 2019
Cited by: 3 | Year: 2019
Advanced graph model for tainted variable tracking
C Ma, D Yan, YP Wang, SM Hu
Science China Information Sciences 56, 1-12, 2013
Cited by: 3 | Year: 2013
Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective
T Qiu, F Zeng, J Ji, D Yan, K Wang, J Zhou, Y Han, J Dai, X Pan, Y Yang
https://arxiv.org/pdf/2402.10184.pdf, 2024
Cited by: 2 | Year: 2024
On the reuse bias in off-policy reinforcement learning
C Ying, Z Hao, X Zhou, H Su, D Yan, J Zhu
arXiv preprint arXiv:2209.07074, 2022
Cited by: 1 | Year: 2022
Model-based Reinforcement Learning with a Hamiltonian Canonical ODE Network
Y Feng, Y Jiang, H Su, D Yan, J Zhu
arXiv preprint arXiv:2211.00942, 2022
Year: 2022
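Big data processing based on memory access optimization (基于内存访问优化的大数据处理)
D Yan, XS Yin, C Lian, X Zhong, X Zhou, GS Wu
Journal of Computer Science and Technology 30 (1), 30-41, 2015
Year: 2015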
Special Section on Computer Architecture and Systems for Big Data
PWG Chen, R Chen, JX Shi, HB Chen, BY Zang, D Yan, XS Yin, C Lian, ...
Journal of Computer Science and Technology 30, 2015
Year: 2015