Ever since DeepSeek launched, I’ve been tinkering with building my own product.

At first, I knew nothing, so I used a "decoupled workflow": write a .md file for every little step, read it, confirm the logic, then move on to the next step.

Then I learned to write JSON, tried putting the data into SQLite, and figured out how to inspect the database. But once the data grew, SQLite I/O just couldn't keep up.
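For anyone retracing this stage, here's roughly what it looked like: a minimal sketch using Python's built-in sqlite3 module, where the table and column names are invented for illustration.

```python
# A minimal sketch of the SQLite stage; table/column names are made up.
import json
import sqlite3

conn = sqlite3.connect("app.db")
conn.execute("CREATE TABLE IF NOT EXISTS records (id INTEGER PRIMARY KEY, payload TEXT)")

# Store each record as a JSON blob in a TEXT column.
conn.execute(
    "INSERT INTO records (payload) VALUES (?)",
    (json.dumps({"symbol": "BTC", "price": 64000.0}),),
)
conn.commit()

# Inspecting the data: json_extract comes from SQLite's JSON1 functions,
# which most bundled builds include.
for row in conn.execute("SELECT json_extract(payload, '$.price') FROM records"):
    print(row[0])
```

This is perfectly fine at small scale; it's only once writes and ad-hoc queries pile up on one file that the I/O wall shows up.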

So my teammate and I moved to Redis for acceleration, then to distributed Kafka streaming. I even looked into RisingWave to run directly on top of Kafka.
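The cache-plus-stream idea, in a minimal sketch: this assumes the redis-py and confluent-kafka client libraries, and the key prefix, TTL, and topic name are all invented for illustration.

```python
# A minimal sketch, assuming redis-py and confluent-kafka are installed;
# key prefix, TTL, and topic name are invented for illustration.
import json

import redis
from confluent_kafka import Producer

cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
producer = Producer({"bootstrap.servers": "localhost:9092"})

def record_event(event_id: str, payload: dict) -> None:
    body = json.dumps(payload)
    # Redis as the acceleration layer: hot reads hit memory, not the database.
    cache.setex(f"event:{event_id}", 3600, body)
    # Kafka as the distributed stream: downstream consumers pick this up.
    producer.produce("events", key=event_id, value=body)

record_event("42", {"symbol": "BTC", "price": 64000.0})
producer.flush()  # block until queued messages are delivered
```

(RisingWave's pitch is to point streaming SQL at a topic like this instead of writing consumers by hand.)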

But eventually the streaming computation itself became the bottleneck, so I jumped to vectorized processing with Polars.
And for storage? Went all the way back to simple parquet files.
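And the ending, in code: a minimal sketch assuming Polars, with the file and column names invented for illustration.

```python
# A minimal sketch of the final Polars + parquet setup;
# file and column names are invented for illustration.
import polars as pl

df = pl.DataFrame(
    {"symbol": ["BTC", "ETH", "BTC"], "price": [64000.0, 3100.0, 64500.0]}
)
df.write_parquet("prices.parquet")  # columnar storage, no server needed

# scan_parquet builds a lazy query: Polars reads only the columns the
# aggregation touches and runs it on vectorized columnar kernels.
result = (
    pl.scan_parquet("prices.parquet")
    .group_by("symbol")
    .agg(pl.col("price").mean().alias("avg_price"))
    .collect()
)
print(result)
```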

Looking back, I can't help but laugh:
If I had just learned how to read parquet files in the beginning, none of this would've happened 😂