Introducing Gdump: A Command Line Tool for Managing Gemini API Files

I just published my second open source repo. This one's called Gdump. It turns out that the Gemini API a) stores any files you upload for 48 hours, and b) has a stupid 20 GB file storage limit, and if you hit it, it just stops working. So I built a simple command line script that dumps (deletes) all the files you didn't know you were storing. It works with multiple projects. You can set it to run as a cron job that'll dump everything on any schedule. Pretty useful if you're running large volumes through the Gemini API.
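
For anyone curious what the core of a cleanup like this might look like, here's a minimal sketch. It is not Gdump's actual code. It assumes the `google-generativeai` Python SDK (its `configure`, `list_files`, and `delete_file` calls) and a hypothetical `GEMINI_API_KEYS` environment variable holding one comma-separated API key per project.

```python
import os
from typing import Callable, Iterable, List


def dump_files(list_files: Callable[[], Iterable],
               delete_file: Callable[[str], None]) -> List[str]:
    """Delete every file that list_files() yields; return the deleted names."""
    deleted = []
    # Materialize the listing first so deletions can't disturb pagination.
    for f in list(list_files()):
        delete_file(f.name)
        deleted.append(f.name)
    return deleted


def dump_project(api_key: str) -> List[str]:
    """Delete all stored files for the project behind one API key."""
    # Assumption: the google-generativeai SDK is installed. It is imported
    # lazily so dump_files() stays usable (and testable) without it.
    import google.generativeai as genai

    genai.configure(api_key=api_key)
    return dump_files(genai.list_files, genai.delete_file)


if __name__ == "__main__":
    # GEMINI_API_KEYS is a hypothetical variable name chosen for this sketch.
    raw = os.environ.get("GEMINI_API_KEYS", "")
    keys = [k.strip() for k in raw.split(",") if k.strip()]
    for i, key in enumerate(keys, start=1):
        names = dump_project(key)
        print(f"project {i}: deleted {len(names)} files")
```

Keeping the list and delete calls as parameters means the loop can be exercised with fakes before pointing it at a real project, which matters for a script whose whole job is deleting things.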

  • Messina

    How did you reach 20GB of stuff?

  • Matt Mireles

    I'm uploading our entire codebase to Gemini and having it document the whole thing in extensive detail. The Spaghetti Monster in question is 1.8M tokens, and that's just the codebase for the local macOS client app, excluding third-party packages and dependencies. The goal is to build an AI agent that can understand and accurately document the whole codebase for me. There’s simply no way I could understand this all myself without AI in any reasonable period of time. Getting something is easy, but actually getting an accurate representation of how the system works is basically a research problem. My approach starts by generating an outline structure and a rough draft, then iterates on individual sections one at a time until each is 100% accurate. But... apparently every time you make an API call, even if it's with the same file, the Gemini API stores a different copy of it. This adds up very fast, as I have learned. I was pretty shocked: I dumped all the files last night and then woke up this morning at 40% capacity. Mind you, I'm distributing this workload across 4 different Google Cloud projects in order to bypass the 4M tokens per minute per model rate limit, so really we're talking about 80 GB of capacity (20 GB x 4). I don't understand why they built the system this way, but it is what it is.

  • Messina

    Interesting...! Have you looked into dedicated products like Qodo's AlphaCodium?

  • Matt Mireles

    First time I’ve heard of it, but the challenge we have right now is understanding how the system works. Basically, we just inherited a mountain of technical debt and don’t have access to the primary person who built it. Which is a different problem than what this does.

  • Messina

    Got it, I see. Yeah, def a harder problem!

  • Seth Blank

    Hi Messina, it’s been years and years! 👋

  • Messina

    Seth Blank sup dude! 🙂
