vLLM Tuning For Low Memory
GLM-4.7-Flash has been released! It's a 30B-A3B MoE model, so surely I can run it on two 5090s... or can I?
Working notes and small writeups.
Notes from building software, mostly the parts worth remembering, by Jason Brown (Loktar).
For the last two years I've worked as an Engineering Manager at a small startup, feeling every day like I'm failing at the job because what I do doesn't fit the typical EM role. I realized I was measuring myself against a whole support structure I don't have.
I've been running local LLMs for over a year at this point, and while it's been great, family members have asked for access, including some who don't live in the house anymore. There are myriad services that do this. In the past I've used ngrok, or even just set up a VPN through my home router to give my kids access. None of them were as straightforward as what I found using Cloudflare.
If your day job repo is not on GitHub, here is how I export commit dates, bucket them, and replay them so my GitHub activity graph matches the work I actually did.
I got tired of fighting WordPress and Gatsby just to publish a post, so I built a small Node script that turns markdown (front matter included) into a static site using EJS templates, plus a couple of helpers for gists and code blocks.
I started pulling my scattered demos from places like CodePen and JSFiddle into one repo and gallery. The goal is to keep everything versioned and make it easy to generate previews.
Deep copying in JavaScript is a mess if you do not know the tradeoffs. This post covers shallow vs deep copies, where `JSON.stringify` and spread/`Object.assign` break down, and when `structuredClone` is the right tool.
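As a quick illustration of those tradeoffs, here is a minimal sketch (not taken from the post itself; the object and its field names are made up for the example). It assumes a runtime that ships `structuredClone`, such as Node 17+ or a current browser.

```javascript
// Sample object (hypothetical): a Date and a nested object make the differences visible.
const original = { name: "demo", created: new Date(0), nested: { count: 1 } };

// Shallow copy: top-level keys are copied, but nested objects are still shared.
const shallow = { ...original };
shallow.nested.count = 2; // this also changes original.nested.count

// JSON round trip: a deep copy, but Dates come back as strings,
// and undefined values and functions are silently dropped.
const viaJson = JSON.parse(JSON.stringify(original));

// structuredClone: a deep copy that keeps Dates, Maps, Sets, and cyclic references.
const cloned = structuredClone(original);
cloned.nested.count = 99; // original.nested is unaffected
```

One caveat worth knowing: `structuredClone` throws on values it cannot clone, such as functions, so it is not a drop-in replacement for every object.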
A quick JS/CSS experiment where text trails the cursor, inspired by a tweet. I fed it a random Nostradamus paragraph and called it a nice little deviation.
A minimal toggle built from a single checkbox input. It is pure CSS using pseudo-elements for the track and thumb, so it drops into a form without extra markup.
A simple Express pattern for reading the host header, pulling out the subdomain, and normalizing hyphens so you can drive behavior off `whatever.yourdomain.com` with a wildcard DNS entry.
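A rough sketch of what that pattern could look like, under a few assumptions: the helper names `subdomainFromHost` and `subdomainMiddleware` and the `req.subdomainLabel` property are hypothetical, and "normalizing hyphens" is assumed here to mean turning hyphens into spaces. The post's actual code may differ.

```javascript
// Hypothetical helper: pull the subdomain out of a Host header value.
function subdomainFromHost(host, baseDomain) {
  if (!host) return null;
  const hostname = host.split(":")[0].toLowerCase(); // drop any port
  const suffix = "." + baseDomain.toLowerCase();
  if (!hostname.endsWith(suffix)) return null; // bare domain or a different host
  const sub = hostname.slice(0, -suffix.length);
  return sub.replace(/-+/g, " "); // assumed normalization: hyphens become spaces
}

// Express-style middleware factory. It only relies on the (req, res, next)
// signature, so it can be passed to app.use(...) in an Express app.
function subdomainMiddleware(baseDomain) {
  return (req, res, next) => {
    req.subdomainLabel = subdomainFromHost(req.headers.host, baseDomain);
    next();
  };
}
```

With a wildcard DNS record pointing `*.yourdomain.com` at the server, `app.use(subdomainMiddleware("yourdomain.com"))` would let later handlers branch on `req.subdomainLabel`.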