Or, By The Sorceries of Python and LM Studio Combined

(Note: this is me rewriting a Claude draft post about this development. I had Claude do the initial draft inside the same chat where most of this python coding had been done).

After I wrote about my n8n automations back in February, my computer died in April, and was replaced by something newer and pinker. I resolved to do something different this time around: without API costs, without cloud dependencies, and without the kind of platform risk that bit me with Perplexity. The goal was to run my automations locally: local models, local inference, local Python scripts. No subscription fees beyond what I was already paying for Claude Pro, no data leaving my machine, no terms of service surprises. I have studied Python, but the actual coding was mostly done by Claude Sonnet 4.6. I’m not including the code itself, because your use cases may be different and your pet chatbot is probably just as good at writing python scripts as mine.

What follows is a report from the other side of that transition. TL;DR version: you can get an AI chatbot to write and troubleshoot python scripts which talk to LM Studio and do various useful support tasks for writers. Once those python scripts are finalized, you will be that much less dependent on the chatbots living out there on other people’s servers. Your “Skynet Secretary” will be living at home with you, instead of out there online. I also include some asides on how to do something similar with just a chatbot and without the python scripts and LM Studio, for people who care more about keeping things simple than keeping them local.

The Stack

The short version: Python 3.11, Whisper for transcription, LM Studio as the local model runner, and a handful of Python scripts that talk to LM Studio via its SDK. The model doing most of the heavy lifting is Gemma 4 26B A4B — a Google model running in “thinking” mode, meaning it reasons through the task before producing output, which turns out to matter quite a lot for the kind of structured analysis I’m asking it to do.

LM Studio deserves a mention on its own. It’s a desktop application that lets you download and run open-source models locally, manages the technical bits of inference, and exposes a local API that Python can talk to. If you’ve been curious about running models locally but found the command-line setup intimidating, LM Studio is the friendliest on-ramp I’ve found. Currently, the best American-made models available through LM Studio are the Gemma 4 family from Google.

Summary Plus, Rebuilt

The first thing I rebuilt was the Summary Plus workflow from the n8n post. This is the one that goes through a manuscript chapter by chapter and extracts structured information. The Python version first runs a separate manuscript splitter script (also written by Claude) to allow it to see the manuscript as a series of chapters. It then does everything the n8n version did. I asked to have these scripts set to look at a particular folder on my computer where I drop the files I want them to look at, and that has worked pretty well so far. I got the idea from something my friend Cedar Sanderson mentioned in this post, about having whisper transcribe all the audio files in a specific folder.

The output is JSON: chapter summaries, plot events, settings with sensory details and tonal weight, characters with their roles and relationship developments, new information revealed, tropes, and three to five high-impact quotes per chapter. For a 41-chapter novel this runs in a few hours and produces a structured document you can drop directly into a chatbot conversation to get it up to speed on your book without feeding it the whole manuscript.

Gemma 4 runs in thinking mode, which means it produces a lot of internal reasoning before it commits to output. This is mostly a feature, but it creates an interesting engineering problem: the model occasionally leaks fragments of its thinking process into the JSON. Things like stray letters, garbled keys, prose annotations mid-field. I spent several debugging sessions building a repair kit of regex fixes that catches and corrects these artifacts before the JSON parser sees them. The current version handles about 95% of them automatically. The remaining failures land in a separate errors file for manual review.

I then take the JSON output, flaws and all, to Claude and ask it to synthesize a human-readable story bible. The chatbots can supposedly read even imperfect JSON more efficiently than they can human prose, which is why I went with JSON output for this script. This has worked reasonable well so far, although it still requires me to review it and tweak it. This is pretty useful for any writer who spends more time on sequels and series continuity than on first drafts.

Chatbot Version of Summary Plus

Cedar Sanderson talks here about taking the short novels in her Groundskeeper series to Grok so that Grok can summarize them and find loose threads or world-building details Cedar might want to use in sequels. Now, I’ve seen ancedotal evidence suggesting that Grok might be better than Claude when it comes to juggling large amounts of context, and the Groundskeeper books are fairly short, with the first one running 57 pages in print. So, for works in the 50K-100K words range, like mine, it might be better to feed the chatbot only a few chapters at a time, ask for a stitched together summary of the book to date when the chat starts getting too long, and then bring that partial summary along to a new chat and start feeding more chapters. Here’s the prompt section of the Summary Plus python script, to get you started:

			
You are a story analyst extracting structured information from a
novel chapter for use in a story bible, developmental review, and marketing
pipeline. Return your analysis as a single JSON object. No preamble, no
explanation, no text outside the JSON object. Be specific. Quote directly
from the text for quotes. Do not pad thin chapters.
Return this exact JSON structure:
{
  "chapter_summary": "2-4 sentences: core dramatic event, what changes,
                      where chapter leaves off.",
  "plot_events": ["discrete event moving story forward or revealing info"],
  "settings": [{"location": "", "sensory_details": "",
                 "tonal_weight": ""}],
  "characters": [{"name": "", "role": "protagonist|antagonist|
                   supporting|mentioned", "new_information": "",
                   "relationship_development": "string or null"}],
  "new_information": ["fact or revelation established for first time"],
  "tropes": [{"name": "", "deployment": ""}],
  "quotes": [{"text": "exact quotation", "significance": ""}]
}
Rules: plot_events 3-8 items; tropes 3-5 functional only; quotes 3-5
standalone power. Valid JSON only. No markdown fences.

		

Developmental Editing

The DevEditor from the n8n post has been rebuilt as two separate scripts: one for general fiction, one for cozy mysteries. The mystery version has an extended focus on what I think of as the cozy contract: the genre’s specific requirements around fair play, clue distribution, suspect management, and the tonal balance between wit and darkness. The general fiction version covers pacing, character arcs, and worldbuilding. Like Summary Plus, it starts by running the manuscript splitter so that it can see where the chapter breaks are.

I made a deliberate decision to strip out the “prior content” feature from the n8n version. This was the rolling window of earlier chapter summaries that let the model check for continuity. It just didn’t seem to work that well with the local model, and the real continuity synthesis happens when you take the full output to the chatbot of your choice afterward, and ask it to put together a synthesis of recurring issues in the manuscript, and note which chapter-specific issues are not addressed elsewhere. Better to have the local model give honest chapter-by-chapter analysis of what it can actually see than confident pronouncements about continuity it’s inferring from fragments.

The output is a Markdown file, one section per chapter, ready to bring to a chatbot for the big-picture pass: identifying recurring issues across the whole manuscript, noting which chapter-level issues are already addressed elsewhere, and producing the kind of top-level editorial summary that would take a human reader multiple passes to assemble.

If the n8n DevEditor was a couple of bucks per pass after the initial tests, this one is completely free beyond the initial hardware costs. This is not remotely human-level developmental editing. It is however a thorough first-pass reader that will work through your entire manuscript at 3am without complaint and flag things you’d otherwise not notice until the fifth reread.

Chatbot Version of DevEditor

For this I think you would want what Claude/ChatGPT call a project and the other chatbots possibly call something else, where you have a fixed set of instructions and reference documents and then any chat in the project has access to it. Here’s the prompt component of the general DevEditor python script, made a bit more human-readable. You would want something like this as either a chat prompt or project instructions.

			
"You are acting as a professional developmental editor. 
Be honest and specific. Mix clear praise with direct, actionable criticism."
"Focus on:
**Pacing & Tension**
- Slow or draggy sections that could be tightened
- Moments where tension drops unexpectedly
- Missed opportunities to increase stakes or suspense
- Scene openings/endings that feel weak or abrupt
**Character Arcs**
- Whether character actions/emotions match their prior arc
- Clear goals, conflicts, and internal changes
- Missed opportunities for deeper emotion or conflict
**Worldbuilding & Setting**
- Clarity and vividness of setting
- Where sensory detail could deepen immersion
- Over/under-explaining world details that affect pacing"

		

Copyediting

This one I only just finished tweaking. The copyediting script asks the model to return a full cleaned version of each chapter alongside notes on what it changed. The challenge is that asking a model to reproduce thousands of words verbatim with only small corrections pushes against the same limits as asking a human copyeditor to retype rather than mark up — it’s slower and more error-prone than the task warrants.

My current view is that this script is most useful for catching consistent spelling patterns and obvious typos across a long manuscript, and least useful as a substitute for careful line-by-line review. The output gives you a second opinion worth checking, not a clean manuscript ready to publish.

Gemma 4 A4B tended to overthink this task and make mistakes when I tested it in the chatbot area of LM Studio. For this task I’ve switched to the smaller Gemma 4 E4B model, which is faster and uses a fraction of the VRAM, and doesn’t get as turned around. Copyediting doesn’t require the same depth of reasoning as structural analysis. In fact, “reasoning,” however you define that for LLMs, seems downright unhelpful for this task.

Chatbot Version of Copyediting

Here’s the copyediting prompt Perplexity originally wrote for me, before its TOS got ridiculous. You could set it as instructions in a chatbot project if you don’t want to mess with n8n or python:

			
You are a professional fiction copyeditor. You are conservative with changes and respect the existing style. 
You are acting as a professional copyeditor and proofreader for a fantasy novel.
STYLE AND PUNCTUATION RULES:
Prefer commas, periods, and semicolons over em-dashes.
Do NOT convert any form of existing punctuation or absence of punctuation into em-dashes.
Preserve existing punctuation if it already reads well.
FANTASY NAMES AND TERMS:
Treat unusual capitalized words as intentional names or world terms unless the context clearly shows they are typos.
Do NOT “correct” unfamiliar names to more common spellings.
Preserve internal capitalization (e.g., Ael’Thar, DeVere, McAllan) unless obviously inconsistent within this text.
Your job is to:
Fix spelling mistakes and obvious typos.
Fix clear grammar and punctuation errors that hurt clarity.
Preserve the author's voice, tone, and dialect.
Avoid unnecessary rewrites; only change wording when needed for clarity or correctness.
Return your answer in exactly this markdown structure:
CLEANED TEXT
[Full chapter text with typos and obvious grammar errors corrected.]
CHANGE NOTES
Bullet list of notable corrections and patterns you noticed 
(recurring misspellings, punctuation issues, consistency choices, etc.)
Now copyedit the following: 

		

Transcription Cleanup

The Whisper transcription script I was already running got a LM Studio cleanup pass added to it. Whisper produces a raw transcript; the cleanup pass removes filler words, corrects mishears, reorganizes ideas into coherent paragraphs, and flags significant ambiguities in curly brackets. It’s not as good as a Claude project combining the same instructions with some kind of story bible, but it just feels like less of a pain to have dictation+initial cleanup done in one pass, then correct and expand it myself and show the result to Claude/a similar chatbot for feedback on tone and typos.

What I Have Learned

A few things that might save you time if you try any of this:

Take a belt and suspenders approach to context window size for the local models. Set it both in LM Studio’s server area and in the python script, just to be on the safe side. If you’re unsure how large to set it, take the wordcount of the longest chapter that you can find in your manuscripts, and consult with your preferred chatbot about what size of context window would be needed to handle that.

Thinking models leak. If you are using a model that shows its reasoning process, build a thinking-block stripper into any script that expects structured output. The artifacts are small and inconsistent, which makes them annoying rather than catastrophic. If you’re just taking the JSON to a chatbot so the chatbot can advise you on marketing and sequels, it can generally read slightly flawed JSON without too much trouble.

Local models are not Claude or Grok or ChatGPT. Gemma 4 26B is genuinely impressive for its size, and it handles structured extraction tasks well. It also sometimes misidentifies which character is the protagonist when reading a chapter in isolation, has a lot of trouble putting names to first person narrators, and the chapter by chapter approach requires it to have the attention of a goldfish. The output is a starting point, not a verdict.

The devEditor pipeline works best when you treat the local model as a first reader and Claude as the editor who synthesizes the first reader’s notes. Neither one replaces actually reading your own manuscript. What the pipeline replaces is the feeling of not having anyone you can ask for help.

Since you’re probably going to be running the Python scripts in PowerShell, I find it helpful to keep a kind of checklist of commands in a txt file, that I can copy and paste into PowerShell. Here’s an example, with comments/advice to myself in hashtags. (Please note that the wispr-env stuff is mostly because I originally set up the Python stuff with whisper in mind, and Whisper is finicky about which version of Python it plays with.)

			
#Run in PowerShell, regular user, copy paste one uncommented line at a time. 
#Source document should be in docx form in C:\Users\UserName\Documents\LMStudio Automation Output
#LM Studio should be open, with the model named in the script (usually gemma-4-26b-a4b) running, and server under developer's tab toggled on.
cd C:\Users\UserName\
wispr-env\Scripts\activate
cd C:\Users\UserName\wispr-env\Scripts
python summary_plus.py
#or
batch_reverse_prompt.py
#or
python devEditor.py
#or
python devEditorMystery.py
#or, with gemma-4-e4b,
copyedit.py
#devEditors and copyedit do more analyzing and write more output, so may take longer than the 3-4 minutes per chapter that summary_plus takes. reverse prompt seems to take a couple minutes per image
deactivate

		

What’s Next

I’ve also had Claude write a few purely deterministic Python scripts for formatting Word documents for Amazon, no LM Studio or Gemma involved. If you’ve seen the new Ancestors of Jaiya Series Collection on preorder, collecting four of my earlier works into one volume, that is a project that sat around my documents folder for years, gathering dust, until I got the formatting scripts. I’ve taken my first stumbling steps into generating images locally with comfyui. The long term goal there is to get image and video results decent enough to where I can cut the cord on Midjourney. I’ll report back when there’s more to report. In the meantime, if any of this is useful to you, the best advice I can give is the same advice I gave in February: go to the chatbot of your choice and get it to walk you through building what you want. The barrier to entry for this kind of tool-building is lower than it looks from the outside.

Jaglion Press

AI as Writer’s Assistant: the Local Edition

Or, By The Sorceries of Python and LM Studio Combined

Leave a comment Cancel reply

Or, By The Sorceries of Python and LM Studio Combined

Share this:

Related

Leave a comment Cancel reply