Hiteshi Sharma

Orcid: 0000-0002-4057-0302

According to our database, Hiteshi Sharma authored at least 22 papers between 2014 and 2024.

Bibliography

2024
Progressive Safeguards for Safe and Model-Agnostic Reinforcement Learning.
CoRR, 2024

Phi-3 Safety Post-Training: Aligning Language Models with a "Break-Fix" Cycle.
CoRR, 2024

Cost-Effective Proxy Reward Model Construction with On-Policy and Active Learning.
CoRR, 2024

Self-Exploring Language Models: Active Preference Elicitation for Online Alignment.
CoRR, 2024

Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone.
CoRR, 2024

Language Models can be Deductive Solvers.
Proceedings of the Findings of the Association for Computational Linguistics: NAACL 2024, 2024

Enhancing Language Model Alignment: A Confidence-Based Approach to Label Smoothing.
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, 2024

2023
Language Models can be Logical Solvers.
CoRR, 2023

ALLURE: Auditing and Improving LLM-based Evaluation of Text using Iterative In-Context-Learning.
CoRR, 2023

Fine-Tuning Language Models with Advantage-Induced Policy Alignment.
CoRR, 2023

Evaluating Cognitive Maps and Planning in Large Language Models with CogEval.
Proceedings of the Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, 2023

2020
A Universal Empirical Dynamic Programming Algorithm for Continuous State MDPs.
IEEE Trans. Autom. Control, 2020

Randomized Policy Learning for Continuous State and Action MDPs.
CoRR, 2020

Model-free Reinforcement Learning in Infinite-horizon Average-reward Markov Decision Processes.
Proceedings of the 37th International Conference on Machine Learning, 2020

Finite Time Guarantees for Continuous State MDPs with Generative Model.
Proceedings of the 59th IEEE Conference on Decision and Control, 2020

2019
Approximate Relative Value Learning for Average-reward Continuous State MDPs.
Proceedings of the Thirty-Fifth Conference on Uncertainty in Artificial Intelligence, 2019

An Empirical Relative Value Learning Algorithm for Non-parametric MDPs with Continuous State Space.
Proceedings of the 17th European Control Conference, 2019

Empirical Algorithms for General Stochastic Systems with Continuous States and Actions.
Proceedings of the 58th IEEE Conference on Decision and Control, 2019

An Approximately Optimal Relative Value Learning Algorithm for Averaged MDPs with Continuous States and Actions.
Proceedings of the 57th Annual Allerton Conference on Communication, Control, and Computing, 2019

2017
Randomized function fitting-based empirical value iteration.
Proceedings of the 56th IEEE Annual Conference on Decision and Control, 2017

2016
A dynamical systems framework for stochastic iterative optimization.
Proceedings of the 55th IEEE Conference on Decision and Control, 2016

2014
Optimal Spectrum Sensing for Cognitive Radio with Imperfect Detector.
Proceedings of the IEEE 79th Vehicular Technology Conference, 2014

