LLM assistant for STEM education
Fine-tuned a GPT-2 model as a chatbot for answering STEM questions and built a reward model to enable reinforcement learning from human feedback (RLHF)