hlfshell


Maker. Roboticist. Person.
Keith Chester

LATEST ARTICLE:

GRPO in DeepSeek-R1

Lately I'm thinking about...

Interview Practice App

One of the great parts of building out tools like arkaine is that it allows me to sit down and just build for a bit to test the framework. After some quick experimentation I have a great agent that:

  1. Takes in a resume and a job description

  2. Considers additional topics to research to build out a knowledge base of what certain acronyms, skills, and technologies mean, and what industry standards and expectations are

  3. Searches the web for those topics with arkaine’s research agents, building a knowledge base of relevant information

  4. Builds a list of questions that an interviewer should ask and…

  5. Converts the questions to natural sounding language and uses TTS to create a virtual interviewer. (This part is… it works. But sometimes it’s too heavy on the pausing and filler words)

So far it’s working surprisingly well for a simple prototype script. I’ll probably expand on it quite a bit, especially since I haven’t utilized speech-to-text to allow the user to answer (and an agent to judge their response) yet.
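The pipeline above can be sketched as a simple chain of stages. This is a hedged, illustrative sketch with stubbed helper functions, not arkaine’s actual API - the research and TTS stages are stand-ins for the real agents:

```python
# Illustrative sketch of the interview-prep pipeline; function names and
# logic are placeholders, not arkaine's actual API.

def extract_topics(resume: str, job_description: str) -> list[str]:
    """Stage 2: pull out acronyms/skills worth researching (stubbed as a keyword scan)."""
    known = {"GRPO", "TTS", "ROS2", "LLM"}
    words = set(resume.split()) | set(job_description.split())
    return sorted(w for w in words if w in known)

def research(topics: list[str]) -> dict[str, str]:
    """Stage 3: stand-in for arkaine's web research agents (topic -> notes)."""
    return {t: f"notes about {t}" for t in topics}

def build_questions(knowledge: dict[str, str]) -> list[str]:
    """Stage 4: turn the knowledge base into interview questions."""
    return [f"Can you explain your experience with {t}?" for t in knowledge]

def run_pipeline(resume: str, job_description: str) -> list[str]:
    topics = extract_topics(resume, job_description)
    knowledge = research(topics)
    return build_questions(knowledge)

questions = run_pipeline("Built LLM agents with TTS", "Seeking GRPO experience")
print(questions)
```

Stage 5 (TTS delivery) would then consume each question, but that depends on the audio backend and is omitted here.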

Here are some examples (the TTS is handled by OpenAI’s gpt-4o-mini-tts):

#AI #arkaine
arkaine 0.0.21; next steps

Version 0.0.21 of arkaine is out, including the finalized format for the text-to-speech tooling, equivalent speech-to-text tooling (though, admittedly, I currently lack a locally hosted option for this), and the think tool I mentioned earlier.

There are still a lot of features I want to add, and some I’m in the middle of: adding a chat interface to Spellbook and expanding the number of possible chat interfaces would be fun, and I already started that process a month ago. Similarly, I have a ~60%-completed OCR implementation and a redefinition of how researchers utilize resources (necessary for handling non-website object search, like your own books/documents)… but right now I’m thinking of taking a moment to just build with what I have and create something useful for people as is.

#AI #arkaine
Just give me a second to think...

I simply love when a simple idea gets tested and proves to be quite effective. It’s a clear sign of slowly feeling out how best to understand the system at hand. Such a delight popped up when I saw Anthropic reveal that adding a no-op tool called “think”, with a single “thought” argument that lets the agent output its reasoning within the chain of Action -> Result generation, improved performance on complicated tasks.

…of course, I also have already implemented it in arkaine; I’ll give it a more thorough testing with some more complicated agents later.
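The trick is small enough to sketch in full. The tool does nothing - its value is purely in giving the model a sanctioned place to write out reasoning mid-chain. The registration format below is illustrative, not arkaine’s or Anthropic’s exact schema:

```python
# A minimal sketch of the "think" tool idea: a no-op tool whose only job is
# to give the model a place to articulate its reasoning between actions.
# The spec format here is illustrative, not a specific framework's schema.

def think(thought: str) -> str:
    """No-op: nothing is fetched or changed; the thought itself is the point."""
    return "OK"

think_tool_spec = {
    "name": "think",
    "description": (
        "Use this to think about something before acting. "
        "It does not fetch new information or change any state."
    ),
    "parameters": {
        "thought": {"type": "string", "description": "Your reasoning."}
    },
}

print(think("First check the cache, then fall back to the API."))
```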

#AI #arkaine
arkaine docs

My framework arkaine, which I quickly presented a bit ago, finally has some nice documentation for it. I had v0 do an initial pass on it, which I rather liked. After two quick rounds of prompting on their free tier I downloaded the project and tried my hand at expanding it from there. It’s my first tailwind/next.js project, but it was surprisingly easy. Granted it’s a simple page relative to a typical SPA or backend service, but hey, I’ll take the wins where I can get them.

Check out the documentation, especially the toolbox, and see if you can get inspired to build anything cool with arkaine.

#arkaine #AI
BitNet b1.58 Reloaded

The paper club I host will be covering BitNet b1.58 Reloaded: State-of-the-art Performance Also on Smaller Networks! I’ve been looking forward to this one since I read the original paper on 1.58-bit nets. Join us and learn about the future of ternary networks and LLMs!
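For the curious before the session: the “1.58 bits” comes from ternary weights {-1, 0, +1} carrying log2(3) ≈ 1.58 bits each. A quick sketch of the absmean quantization the b1.58 papers describe (scale by the mean absolute weight, then round and clip to ternary) - example values are mine, not from the paper:

```python
# Why "1.58-bit": ternary weights {-1, 0, +1} carry log2(3) ~ 1.58 bits each.
# Below, a sketch of absmean quantization as described in the BitNet b1.58
# work: scale by the mean absolute value, then round-and-clip to {-1, 0, 1}.
import math

print(round(math.log2(3), 2))  # 1.58

def absmean_ternary(weights):
    gamma = sum(abs(w) for w in weights) / len(weights)  # mean |W|
    return [max(-1, min(1, round(w / (gamma + 1e-8)))) for w in weights]

print(absmean_ternary([0.9, -0.05, -1.2, 0.1]))  # [1, 0, -1, 0]
```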

#AI
Mathematica
Attached image

I picked up David Bessis’ Mathematica on a whim. It focuses on what math truly is to those accomplished in it, and on how the public’s understanding of what mathematicians do is wildly, grossly inaccurate. The book’s premise: language is a poor medium for transmitting intuition itself, while math and logical proofs are an overkill-but-required way to express it. Mathematics, he argues, is not calculation but the art of intuition and imagination trying its damndest to define itself. And since intuition lies so far beyond language, we have to invent new concepts and symbols with which to grasp and communicate it.

I enjoyed it, and have certainly spent time thinking on its lessons, but wished it delved more into direct hands-on walkthroughs of intuition to further illustrate the separation of language and logic, or perhaps offered more solid advice on directly attacking the problem of growing intuition.

#books #math
Liquid Time Constant Neural Networks
Attached image

Last night Adam gave a great presentation at the SDx paper club. The idea of using ODE solvers as an activation function was 🤯. It’s heavily used in robotics, so I’ll likely be doing a deep dive at some point; specifically building a neuron that uses the paper’s techniques to better understand the inner workings.
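As a teaser for that deep dive: the LTC papers model each neuron’s state with an ODE of the form dx/dt = -x/τ + f(x, I)(A - x), where f is a bounded nonlinearity on the input. A hedged sketch using plain Euler integration (the paper uses a fused explicit-implicit solver; parameter values here are illustrative):

```python
# A sketch of a single liquid time-constant (LTC) neuron, following the
# paper's core ODE: dx/dt = -x/tau + f(x, I)(A - x). Solved here with naive
# Euler steps for clarity; the paper uses a fused solver. Parameters are
# illustrative, not taken from the paper.
import math

def ltc_step(x, I, tau=1.0, A=1.0, w=2.0, b=0.0, dt=0.01):
    f = 1.0 / (1.0 + math.exp(-(w * I + b)))  # sigmoid gate on the input
    dxdt = -x / tau + f * (A - x)             # leak term + gated drive toward A
    return x + dt * dxdt

# Drive with a constant input; the state settles between 0 and A, and the
# *effective* time constant depends on the input - the "liquid" part.
x = 0.0
for _ in range(1000):
    x = ltc_step(x, I=1.0)
print(x)
```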

#AI #math
DeepSeek R1

A few weeks ago I gave a talk at an SDx paper club covering the DeepSeek R1 Paper. I talked in depth about the advancements made and the implications of their success with GRPO (group relative policy optimization) powered reinforcement learning.
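GRPO’s core trick is compact enough to show: instead of training a separate value model as a baseline, sample a group of responses per prompt and normalize each response’s reward within its group, roughly A_i = (r_i - mean(r)) / std(r). A toy sketch (the degenerate-group guard is my addition, and whether sample or population std is used is an implementation detail I’m glossing over):

```python
# Sketch of GRPO's group-relative advantage: normalize each sampled
# response's reward against its own group, removing the need for a
# learned value baseline. The all-equal-rewards guard is my addition.
import statistics

def group_relative_advantages(rewards):
    mu = statistics.mean(rewards)
    sigma = statistics.pstdev(rewards) or 1.0  # guard the degenerate group
    return [(r - mu) / sigma for r in rewards]

# Two correct answers (reward 1.0) and two wrong ones (reward 0.0):
print(group_relative_advantages([1.0, 0.0, 1.0, 0.0]))  # [1.0, -1.0, 1.0, -1.0]
```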

The recording at the event got borked, so I re-recorded it the next day. Enjoy!

#AI
eli5-equations
Attached image

I’ve been working on arkaine’s OCR service all weekend and needed a break. I’ve been toying with the idea of an equation explainer that copies the style in which I present complicated math at my paper club presentations. I decided to step away from arkaine and try using it a bit in a prototype. Hence: eli5-equations.

Want a walkthrough of a complicated equation? Pass it in with some context and see if your evening is a bit enlightened. I’ll probably do a further write-up on this later.

#arkaine #AI #math
Mini hack-a-thon

Today I attended a mini-hackathon via SDx. I went to solo-work on some arkaine agents and to be present in a mentor/advisory role for other attendees. It was a short 6-hour affair, mainly focused on playing with the new OpenAI o3-mini. It also helps to be inspired by seeing other people creatively applying AI to a quick weekend project.

I ended up building a great prototype of a research agent - the original goal of arkaine for myself. It needs some work - I definitely ran into rate-limiting issues and need to get the agent to better understand report generation at the end. Expect this to get added into arkaine soon. Pushing myself to finish the project in the time allotted was also a great exercise in rapid prototyping. As for the other projects - there were quite a few that wowed me. I’m certainly looking forward to the next time I can dive in and code surrounded by other makers.

#arkaine #AI #SDx
Increased creativity by thinking longer
Attached image

Here’s an ingenious set of hacks to cheaply modify the behavior of existing LLMs so they reason better. Most notable was detecting the model’s initial use of the </think> tag and replacing it with a second-guessing term (the best performing was “Wait”). This forces the model to think longer, which in turn significantly improved performance on tasks.
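The intercept itself is tiny. A hedged sketch of the idea - real implementations hook the decoding loop of the model; here a toy list of tokens stands in for the generation stream:

```python
# Sketch of the "think longer" intercept described above: when the model
# first emits </think>, suppress it and substitute a second-guessing token
# ("Wait") so decoding continues in the reasoning phase. A toy token list
# stands in for a real decoding loop.

def budget_force(tokens, max_suppressions=1, nudge="Wait"):
    out, suppressed = [], 0
    for tok in tokens:
        if tok == "</think>" and suppressed < max_suppressions:
            suppressed += 1
            out.append(nudge)  # second-guess instead of ending the thought
        else:
            out.append(tok)   # subsequent </think> tags pass through
    return out

print(budget_force(["2+2", "is", "4", "</think>", "answer:", "4"]))
```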

I’ll likely be doing a deeper dive for my upcoming paper club presentation.

#AI
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

We’re kicking off 2025’s paper club series via SDx again on February 18th @ 6:30 pm. I’ll be presenting DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning. Join in if you’re in the area and want to dive deep into some of the recent cutting-edge discoveries.

#AI #Meetup
I'm afraid I can't do that, Dave...

I found myself looking into the effects of censorship removal on LLMs - particularly the recent popular kid on the block, DeepSeek R1. It seems that the model becomes uncooperative on certain topics that don’t align with party doctrine. I came across a generic refusal-removal repository linked here, which made me chuckle - it’s just control vectors fine-tuned into the model, which I discussed here.
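For anyone who missed the earlier post: the control-vector mechanic is just adding a fixed “behavior direction” (found by contrasting activations on paired prompts) to a layer’s hidden state, scaled by a coefficient. A toy sketch of the application step only - the vector-finding step, and baking the vector into weights as that repository does, are omitted:

```python
# Toy sketch of applying a control vector: nudge a layer's hidden state
# along a fixed direction v, scaled by a steering coefficient. Values are
# illustrative; finding v (via contrastive activations) is not shown.

def apply_control_vector(hidden, v, coeff=1.0):
    return [h + coeff * vi for h, vi in zip(hidden, v)]

# A 2-dim "hidden state" nudged halfway along direction [1, 0]:
print(apply_control_vector([0.5, -0.2], [1.0, 0.0], coeff=0.5))  # [1.0, -0.2]
```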

#AI
(Rapidly) introducing arkaine

I recently gave (an unfortunately rushed) talk about arkaine - a maker-focused agentic AI framework I’ve been spending most of my time building. Slides for the talk are here.

#AI

Diffusion Models Are Real-Time Game Engines

Google DeepMind's Grandmaster-Level Chess Without Search

Representation Engineering and Control Vectors - Neuroscience for LLMs

Nerd Sniped - Solving for Jumbles and Letter Boxed

Utilizing LLMs as a Task Planning Agent for Robotics

A Corollary to Conway's Law - Build for The Team You Have

Repeatable Dev Environments for ROS2

State of the art in LLMs + Robotics - 2023

Reinforcement Learning with a Pick and Place Robotic Arm