mikeblas said:<Regarding working per-character>
For slickness? It would be kind of neat; you'd invent a nifty little state machine. That's kind of sexy, in a CS nerd way.
For performance? It would suck. You have to do a call to get each character, juggle around a buffer, and then finally get the character. In my implementation (IIRC) I'm using strchr() to find things. strchr() is very highly optimized, if you have a compiler with a decent and mature runtime library implementation. Like, instead of using an eight-bit register to get a character and decide, it's going to get four bytes at a time in a 32-bit register and do fancy shifting and masking to find the beginning of your string. Doesn't that sound lots faster than setting up a function call for each character?
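To make the comparison concrete, here is a minimal sketch of the two styles. It is not mikeblas's actual implementation; `countWithStrchr` and `countPerChar` are hypothetical names, and the input is assumed to be a NUL-terminated buffer. Whether the strchr() version really is faster depends on the runtime library and the data, so measure before relying on it.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: count occurrences of `delim` by letting strchr()
// do the scanning. Assumes `text` is NUL-terminated.
std::size_t countWithStrchr(const char* text, char delim) {
    if (delim == '\0') {
        return 0;  // strchr() would match the terminator; treat as "none"
    }
    std::size_t count = 0;
    for (const char* p = std::strchr(text, delim); p != nullptr;
         p = std::strchr(p + 1, delim)) {
        ++count;  // p points at a match; resume the search just after it
    }
    return count;
}

// The per-character alternative: inspect every byte in a hand-written loop.
std::size_t countPerChar(const char* text, char delim) {
    std::size_t count = 0;
    for (; *text != '\0'; ++text) {
        if (*text == delim) {
            ++count;
        }
    }
    return count;
}
```

Both functions return the same count; the difference is only in who does the scanning, the library or your own loop.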
This is basically how I thought about it: map the file to memory and step through it one byte at a time. The OS will still read it in larger chunks (pages), but that is entirely hidden from me, which keeps the code fairly neat.
(Ok, the neatness of my code can probably be disputed, but it's still conceptually simpler than having to handle a buffer by hand.)
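Roughly, the memory-mapped approach looks like the sketch below. It assumes a POSIX system (`open`, `fstat`, `mmap`), and `countNewlinesMapped` is a made-up example task, not the code discussed in this thread. The point is that the core loop just indexes bytes while the OS pages the file in behind the scenes.

```cpp
#include <cstddef>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: count '\n' bytes in a file by mapping it into memory.
// Returns the count, or -1 if the file cannot be opened or mapped.
long countNewlinesMapped(const char* path) {
    int fd = open(path, O_RDONLY);
    if (fd < 0) {
        return -1;
    }

    struct stat st;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {
        close(fd);
        return 0;  // mmap() rejects zero-length mappings
    }

    const std::size_t length = static_cast<std::size_t>(st.st_size);
    void* mapped = mmap(nullptr, length, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping remains valid after the descriptor is closed
    if (mapped == MAP_FAILED) {
        return -1;
    }

    // The core loop: plain per-byte access, no explicit buffer handling.
    const unsigned char* bytes = static_cast<const unsigned char*>(mapped);
    long count = 0;
    for (std::size_t i = 0; i < length; ++i) {
        if (bytes[i] == '\n') {
            ++count;
        }
    }

    munmap(mapped, length);
    return count;
}
```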
Of course, given that it's C++, you could make an object around the file stream that contained a buffer and had a popByte() method (or something prettier with streams; I'm not used to C++), allowing the core loop to be written as if you were reading a single byte at a time. While the actual operations performed would be similar, it could be a useful level of abstraction.
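Something like the following is what I have in mind. It is only a sketch: `ByteReader`, its 4096-byte default buffer, and `countNewlines` are all made up for illustration. Note that std::istream already buffers internally, so in real code this wrapper may buy you a tidy interface more than speed.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical wrapper: hides the buffer handling behind popByte().
class ByteReader {
public:
    explicit ByteReader(std::istream& in, std::size_t bufferSize = 4096)
        : in_(in), buffer_(bufferSize > 0 ? bufferSize : 1) {}

    // Returns the next byte as a value in 0..255, or -1 at end of input.
    int popByte() {
        if (pos_ == filled_ && !refill()) {
            return -1;
        }
        return static_cast<unsigned char>(buffer_[pos_++]);
    }

private:
    // Reads the next chunk from the stream; false if nothing was read.
    bool refill() {
        in_.read(buffer_.data(), static_cast<std::streamsize>(buffer_.size()));
        filled_ = static_cast<std::size_t>(in_.gcount());
        pos_ = 0;
        return filled_ > 0;
    }

    std::istream& in_;
    std::vector<char> buffer_;
    std::size_t pos_ = 0;     // index of the next unread byte in buffer_
    std::size_t filled_ = 0;  // number of valid bytes in buffer_
};

// Example core loop, written as if the input arrived one byte at a time.
std::size_t countNewlines(const std::string& text) {
    std::istringstream in(text);
    ByteReader reader(in);
    std::size_t count = 0;
    for (int b = reader.popByte(); b != -1; b = reader.popByte()) {
        if (b == '\n') {
            ++count;
        }
    }
    return count;
}
```

The same class works with a std::ifstream opened in binary mode; the string stream is used here only to keep the example self-contained.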